Code
cookbook/11_models/llama_cpp/basic_stream.py
from typing import Iterator  # noqa

from agno.agent import Agent, RunOutputEvent  # noqa
from agno.models.llama_cpp import LlamaCpp

agent = Agent(model=LlamaCpp(id="ggml-org/gpt-oss-20b-GGUF"), markdown=True)

# Get the response in a variable
# run_response: Iterator[RunOutputEvent] = agent.run("Share a 2 sentence horror story", stream=True)
# for chunk in run_response:
#     print(chunk.content)

# Print the response in the terminal
agent.print_response("Share a 2 sentence horror story", stream=True)
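The commented-out lines show how to capture the stream in a variable instead of printing it: iterate the events and accumulate each chunk's content. As a plain-Python sketch of that accumulation pattern (the `Chunk` dataclass and `fake_stream` generator below are hypothetical stand-ins for `RunOutputEvent` and `agent.run(..., stream=True)`, not part of the agno API):

```python
from dataclasses import dataclass
from typing import Iterator


@dataclass
class Chunk:
    # Stand-in for a RunOutputEvent carrying a piece of the response
    content: str


def fake_stream() -> Iterator[Chunk]:
    # Stand-in for agent.run("...", stream=True): yields the response
    # incrementally, one chunk at a time
    for piece in ["The door creaked open. ", "No one was home."]:
        yield Chunk(content=piece)


# Accumulate the streamed chunks into the full response text
story = "".join(chunk.content for chunk in fake_stream())
print(story)  # The door creaked open. No one was home.
```

With a real agent, the same join over `run_response` collects the full response while still letting you act on each chunk as it arrives.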
Usage
Set up your virtual environment
uv venv --python 3.12
source .venv/bin/activate
Install llama.cpp
Follow the llama.cpp installation guide, then start the server:
llama-server -hf ggml-org/gpt-oss-20b-GGUF --ctx-size 0 --jinja -ub 2048 -b 2048
Run Agent
python cookbook/11_models/llama_cpp/basic_stream.py