Image to Text Team

This example shows an Agno Team in which two agents collaborate: an image analyst describes an image in detail, and a creative writer turns that description into a short fictional story.

Code

cookbook/02_examples/teams/multimodal/image_to_text.py

from pathlib import Path

from agno.agent import Agent
from agno.media import Image
from agno.models.openai import OpenAIResponses
from agno.team import Team

image_analyzer = Agent(
    name="Image Analyst",
    role="Analyze and describe images in detail",
    model=OpenAIResponses(id="gpt-5.2"),
    instructions=[
        "Analyze images carefully and provide detailed descriptions",
        "Focus on visual elements, composition, and key details",
    ],
)

creative_writer = Agent(
    name="Creative Writer",
    role="Create engaging stories and narratives",
    model=OpenAIResponses(id="gpt-5.2"),
    instructions=[
        "Transform image descriptions into compelling fiction stories",
        "Use vivid language and creative storytelling techniques",
    ],
)

# Create a team for collaborative image-to-text processing
image_team = Team(
    name="Image Story Team",
    model=OpenAIResponses(id="gpt-5.2"),
    members=[image_analyzer, creative_writer],
    instructions=[
        "Work together to create compelling fiction stories from images.",
        "Image Analyst: First analyze the image for visual details and context.",
        "Creative Writer: Transform the analysis into engaging fiction narratives.",
        "Ensure the story captures the essence and mood of the image.",
    ],
    markdown=True,
)

# Resolve sample.jpg relative to this script, not the current working directory
image_path = Path(__file__).parent.joinpath("sample.jpg")
image_team.print_response(
    "Write a 3 sentence fiction story about the image",
    images=[Image(filepath=image_path)],
)
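Because the script looks for sample.jpg next to itself, it can help to confirm the file is there before the team is invoked. The following is a minimal sketch using only the standard library; `sample_image_path` is a hypothetical helper introduced here for illustration and is not part of Agno.

```python
from pathlib import Path


def sample_image_path(script_file: str, name: str = "sample.jpg") -> Path:
    """Return the path of `name` next to the script, or raise if it is missing.

    Hypothetical helper for illustration; not part of the Agno API.
    """
    # Same location the example uses: the directory containing the script.
    path = Path(script_file).resolve().parent / name
    if not path.is_file():
        raise FileNotFoundError(f"Expected an image at {path}")
    return path


# Possible use inside the example script (assumption, shown as a comment):
# image_path = sample_image_path(__file__)
```

With this in place, a missing image is reported with the full expected path before any request is made.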

Usage

Set up your virtual environment

uv venv --python 3.12
source .venv/bin/activate

Install required libraries

uv pip install agno

Set environment variables

export OPENAI_API_KEY=****

Add sample image

# Add a sample.jpg image file in the same directory as the script

Run the team

python cookbook/02_examples/teams/multimodal/image_to_text.py
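The setup steps above can be verified with a small preflight check before running the script. This is a sketch under stated assumptions: `missing_prerequisites` is a hypothetical function, and it only checks what this page asks for (the `OPENAI_API_KEY` variable and a sample.jpg file beside the script).

```python
import os
from pathlib import Path
from typing import Mapping


def missing_prerequisites(
    script_dir: Path, env: Mapping[str, str] = os.environ
) -> list[str]:
    """List the setup steps from this page that have not been completed.

    Hypothetical helper for illustration; not part of the Agno API.
    """
    problems = []
    # The example authenticates through the OPENAI_API_KEY environment variable.
    if not env.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is not set")
    # The example reads sample.jpg from the script's own directory.
    if not (script_dir / "sample.jpg").is_file():
        problems.append(f"sample.jpg not found in {script_dir}")
    return problems
```

An empty list means both prerequisites are in place; otherwise each entry names the step to revisit.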
