CSV row chunking is a method of splitting CSV files into smaller chunks based on the number of rows, rather than character count. This approach is particularly useful for structured data where you want to process CSV files in manageable row-based chunks while preserving the integrity of individual records.
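To illustrate the idea itself (this is a standalone sketch, not Agno's internal implementation), a CSV file can be split into row-based chunks with only the standard library; `chunk_csv_rows` is a hypothetical helper name used here for demonstration:

```python
import csv
import io


def chunk_csv_rows(csv_text: str, rows_per_chunk: int = 2):
    """Split CSV text into chunks of at most rows_per_chunk data rows,
    keeping every record intact (no mid-row splits)."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    # Each chunk carries the header so it stays self-describing.
    return [
        [header] + data[i:i + rows_per_chunk]
        for i in range(0, len(data), rows_per_chunk)
    ]


sample = "title,year\nGuardians of the Galaxy,2014\nInterstellar,2014\nArrival,2016"
chunks = chunk_csv_rows(sample, rows_per_chunk=2)
print(len(chunks))  # 2 chunks: two data rows in the first, one in the second
```

Because splits happen only at row boundaries, a record like a movie title is never cut in half, which is the property that makes this approach safer for structured data than character-based chunking.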
Create a Python file named `csv_row_chunking.py`:

```python
import asyncio

from agno.agent import Agent
from agno.knowledge.chunking.row import RowChunking
from agno.knowledge.knowledge import Knowledge
from agno.knowledge.reader.csv_reader import CSVReader
from agno.vectordb.pgvector import PgVector

db_url = "postgresql+psycopg://ai:ai@localhost:5532/ai"

knowledge_base = Knowledge(
    vector_db=PgVector(table_name="imdb_movies_row_chunking", db_url=db_url),
)

asyncio.run(knowledge_base.ainsert(
    url="https://agno-public.s3.amazonaws.com/demo_data/IMDB-Movie-Data.csv",
    reader=CSVReader(
        chunking_strategy=RowChunking(),
    ),
))

# Initialize the Agent with the knowledge_base
agent = Agent(
    knowledge=knowledge_base,
    search_knowledge=True,
)

# Use the agent
agent.print_response("Tell me about the movie Guardians of the Galaxy", markdown=True)
```
Set up your virtual environment:

```shell
uv venv --python 3.12
source .venv/bin/activate
```
Install dependencies:

```shell
uv pip install -U agno sqlalchemy psycopg pgvector
```
Run PgVector:

```shell
docker run -d \
  -e POSTGRES_DB=ai \
  -e POSTGRES_USER=ai \
  -e POSTGRES_PASSWORD=ai \
  -e PGDATA=/var/lib/postgresql/data/pgdata \
  -v pgvolume:/var/lib/postgresql/data \
  -p 5532:5432 \
  --name pgvector \
  agno/pgvector:16
```
Run the script:

```shell
python csv_row_chunking.py
```
CSV Row Chunking Params

| Parameter | Type | Default | Description |
|---|---|---|---|
| `rows_per_chunk` | `int` | `100` | The number of rows to include in each chunk. |
| `skip_header` | `bool` | `False` | Whether to skip the header row when chunking. |
| `clean_rows` | `bool` | `True` | Whether to clean and normalize row data. |
| `include_header_in_chunks` | `bool` | `False` | Whether to include the header row in each chunk. |
| `max_chunk_size` | `int` | `5000` | Maximum character size for each chunk (fallback limit). |
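As a rough sketch of how `rows_per_chunk`, `include_header_in_chunks`, and the `max_chunk_size` fallback could interact (a simplified stand-in written for this page, not Agno's actual `RowChunking` code):

```python
import csv
import io


def row_chunks(csv_text, rows_per_chunk=100,
               include_header_in_chunks=False, max_chunk_size=5000):
    """Group data rows into chunks of up to rows_per_chunk rows,
    starting a new chunk early when adding a row would push the
    current chunk past max_chunk_size characters."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    chunks, current, size = [], [], 0
    for row in data:
        row_len = len(",".join(row)) + 1  # +1 for the newline
        if current and (len(current) >= rows_per_chunk
                        or size + row_len > max_chunk_size):
            chunks.append(current)
            current, size = [], 0
        current.append(row)
        size += row_len
    if current:
        chunks.append(current)
    if include_header_in_chunks:
        chunks = [[header] + c for c in chunks]
    return chunks


demo = "title,year\na,1\nb,2\nc,3\nd,4\ne,5"
print(len(row_chunks(demo, rows_per_chunk=2)))  # 3 chunks of [2, 2, 1] rows
```

The character-size fallback keeps any single chunk from growing past the embedding-friendly limit even when `rows_per_chunk` is large, at the cost of producing more, smaller chunks for wide rows.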