Response caching stores model responses so that repeated identical requests are served from the cache instead of hitting the API, which can significantly improve response times and reduce API costs during development and testing.
Basic Usage
Enable caching by setting cache_response=True when initializing the model. The first call will hit the API and cache the response, while subsequent identical calls will return the cached result.
import time

from agno.agent import Agent
from agno.models.openai import OpenAIResponses

agent = Agent(model=OpenAIResponses(id="gpt-5.2", cache_response=True))

# Run the same query twice to demonstrate caching
for i in range(1, 3):
    print(f"\n{'=' * 60}")
    print(
        f"Run {i}: {'Cache Miss (First Request)' if i == 1 else 'Cache Hit (Cached Response)'}"
    )
    print(f"{'=' * 60}\n")

    response = agent.run(
        "Write me a short story about a cat that can talk and solve problems."
    )
    print(response.content)
    print(f"\nElapsed time: {response.metrics.duration:.3f}s")

    # Small delay between iterations for clarity
    if i == 1:
        time.sleep(0.5)
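If you want to measure the speedup directly, the sketch below times an uncached call against a cached one. It uses only the Agent and OpenAIResponses classes from the example above; the prompt is arbitrary and the timings are illustrative, varying with your model and network.

import time

from agno.agent import Agent
from agno.models.openai import OpenAIResponses

agent = Agent(model=OpenAIResponses(id="gpt-5.2", cache_response=True))
prompt = "Write me a short story about a cat that can talk and solve problems."

# First call: cache miss, the request goes to the API
start = time.perf_counter()
agent.run(prompt)
uncached = time.perf_counter() - start

# Second identical call: cache hit, served locally
start = time.perf_counter()
agent.run(prompt)
cached = time.perf_counter() - start

print(f"Uncached: {uncached:.3f}s, cached: {cached:.3f}s")

The second call should complete in a small fraction of the first call's time, since the response is served from the cache without a network round trip.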
Usage
Save the example above as cache_model_response.py, then follow these steps:

1. Set up your virtual environment:

uv venv --python 3.12
source .venv/bin/activate

2. Set your API key:

export OPENAI_API_KEY=xxx

3. Install dependencies:

uv pip install -U openai agno

4. Run the agent:

python cache_model_response.py
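When the script runs, the first iteration takes as long as a normal API call, while the second, identical request should return almost instantly with the same content, confirming the cache hit.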