This example shows how to create an agent that transcribes an audio conversation and labels each speaker in the transcript.
Code
import requests
from agno.agent import Agent
from agno.media import Audio
from agno.models.google import Gemini

agent = Agent(
    model=Gemini(id="gemini-2.0-flash-exp"),
    markdown=True,
)

# Download the sample recording and fail early on an HTTP error.
url = "https://agno-public.s3.us-east-1.amazonaws.com/demo_data/QA-01.mp3"
response = requests.get(url)
response.raise_for_status()
audio_content = response.content

# Pass the raw audio bytes to the agent and stream the transcript to the terminal.
agent.print_response(
    "Give a transcript of this audio conversation. Use speaker A, speaker B to identify speakers.",
    audio=[Audio(content=audio_content)],
    stream=True,
)
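The prompt asks the model to label turns as speaker A and speaker B, so the transcript can be split into structured turns afterwards. The following is a minimal sketch, assuming the model emits one `Speaker X: text` line per turn (possibly wrapped in markdown bold). That output format is not guaranteed, and `parse_transcript` is a hypothetical helper, not part of Agno.

```python
import re
from typing import List, Tuple

# Assumed turn format: "Speaker A: some text" (case-insensitive).
SPEAKER_LINE = re.compile(r"^(speaker\s+\w+)\s*:\s*(.*)$", re.IGNORECASE)


def parse_transcript(text: str) -> List[Tuple[str, str]]:
    """Split a transcript into (speaker, utterance) turns.

    Hypothetical helper: lines that do not start with a speaker label are
    appended to the previous turn; text before the first label is ignored.
    """
    turns: List[Tuple[str, str]] = []
    for raw in text.splitlines():
        # Drop markdown bold markers such as "**Speaker A:**".
        line = raw.replace("*", "").strip()
        if not line:
            continue
        match = SPEAKER_LINE.match(line)
        if match:
            # Normalize the label to "Speaker <ID>" with an upper-case ID.
            speaker = "Speaker " + match.group(1).split()[-1].upper()
            turns.append((speaker, match.group(2).strip()))
        elif turns:
            # Continuation of the previous speaker's utterance.
            speaker, said = turns[-1]
            turns[-1] = (speaker, (said + " " + line).strip())
    return turns
```

To use it, the transcript text has to be captured as a string rather than printed; in Agno this would typically mean calling `agent.run(...)` and reading the response content, which is an assumption about how the helper is wired in and is not shown in the example above.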
Usage
Set up your virtual environment
uv venv --python 3.12
source .venv/bin/activate
Install dependencies
uv pip install -U agno google-genai requests
Export your Google API key
export GOOGLE_API_KEY="your_google_api_key_here"
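If the variable is missing, the failure may only surface once the model is called. A minimal sketch of an early check, assuming you want a clear error before any audio is downloaded; `require_env` is a hypothetical helper, not part of Agno.

```python
import os
from typing import Mapping, Optional


def require_env(name: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Return the value of an environment variable or raise a clear error.

    Hypothetical helper: `env` defaults to os.environ and can be replaced
    with a plain dict for testing.
    """
    source = os.environ if env is None else env
    value = source.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; export it before running the example.")
    return value
```

Calling `require_env("GOOGLE_API_KEY")` at the top of the script would stop the run immediately when the key has not been exported.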