CSV Upload

Import existing Q&A datasets from CSV files

Last updated: August 20, 2025
Category: benchmarks

Import an existing question-answer dataset by uploading a CSV file. This is useful when you already have manually curated test cases or want to bring in datasets from other tools.

CSV Format

Required Columns

Column      Description
question    The query text
answer      The expected answer

Optional Columns

Column        Format                     Description
id            String                     Unique entry ID (auto-generated if omitted)
chunk_ids     Pipe-separated             Ground-truth chunk IDs (e.g., chunk_1|chunk_2)
page_nums     Pipe-separated integers    Page numbers (e.g., 1|2|3)
source_paths  Pipe-separated             Document names (e.g., report.pdf|manual.pdf)

Example CSV

question,answer,chunk_ids,page_nums,source_paths
"What is the return policy?","Items can be returned within 30 days.","chunk_12|chunk_15","1|2","returns_policy.pdf"
"How do I reset my password?","Go to Settings > Security > Reset Password.","chunk_42","5","user_guide.pdf"
"What are the shipping options?","Standard (5-7 days) and Express (1-2 days).","chunk_8|chunk_9|chunk_10","3|4","shipping_faq.pdf|logistics.pdf"

Important: Multi-value fields use the pipe character (|) as a separator, not commas, because the file itself is comma-separated.
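
Before uploading, it can help to check a file locally against the format described above. The following is a minimal sketch using only the Python standard library. The function name parse_benchmark_csv is hypothetical and not part of the platform, and the platform's own validation may be stricter or more lenient than this.

```python
import csv
import io

REQUIRED = ("question", "answer")
MULTI_VALUE = ("chunk_ids", "page_nums", "source_paths")


def parse_benchmark_csv(text):
    """Parse benchmark CSV text into dicts, splitting pipe-separated fields into lists.

    Hypothetical helper for local checks; it is not the platform's parser.
    """
    reader = csv.DictReader(io.StringIO(text))
    header = reader.fieldnames or []
    missing = [col for col in REQUIRED if col not in header]
    if missing:
        raise ValueError(f"missing required columns: {', '.join(missing)}")

    rows = []
    for raw in reader:
        row = dict(raw)
        # Split each multi-value column that is present on the pipe character.
        for col in MULTI_VALUE:
            if col in row:
                value = row[col] or ""
                row[col] = [part.strip() for part in value.split("|") if part.strip()]
        # page_nums is documented as pipe-separated integers.
        if "page_nums" in row:
            try:
                row["page_nums"] = [int(p) for p in row["page_nums"]]
            except ValueError:
                raise ValueError(
                    f"line {reader.line_num}: page_nums must be integers"
                ) from None
        rows.append(row)
    return rows
```

Calling parse_benchmark_csv on the example file above yields one dict per row, with chunk_ids and source_paths as lists of strings and page_nums as a list of integers.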

Uploading via the Platform

  1. Navigate to Benchmarks
  2. Click Upload CSV
  3. Provide a benchmark name
  4. Select your CSV file
  5. Click Upload

The benchmark is created with status active and is immediately ready for evaluation.

Uploading via the API

The CSV upload endpoint is available through the backend API:

POST /benchmarks/upload-csv
Content-Type: multipart/form-data

Fields:
  - name: "My Benchmark"
  - description: "Optional description"
  - csv_file: <your_file.csv>
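
As an illustration, the request above could be assembled in Python with only the standard library. This is a sketch under assumptions: the base URL is a placeholder, build_upload_request is a hypothetical helper, and any authentication headers your deployment requires are not shown because they are not described here. Only the path and the three field names come from the endpoint description above.

```python
import urllib.request
import uuid


def build_upload_request(base_url, name, csv_bytes, filename="benchmark.csv", description=None):
    """Build a multipart/form-data POST for /benchmarks/upload-csv (hypothetical helper)."""
    boundary = uuid.uuid4().hex
    parts = []

    def add_field(field, value):
        # Plain text form field.
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; name="{field}"\r\n\r\n{value}\r\n'.encode("utf-8")
        )

    add_field("name", name)
    if description is not None:
        add_field("description", description)

    # File field carrying the CSV content.
    file_header = (
        f'--{boundary}\r\n'
        f'Content-Disposition: form-data; name="csv_file"; filename="{filename}"\r\n'
        'Content-Type: text/csv\r\n\r\n'
    ).encode("utf-8")
    parts.append(file_header + csv_bytes + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))

    return urllib.request.Request(
        url=base_url.rstrip("/") + "/benchmarks/upload-csv",
        data=b"".join(parts),
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )
```

To send it, pass the result to urllib.request.urlopen, for example urllib.request.urlopen(build_upload_request("https://your-backend.example", "My Benchmark", open("benchmark.csv", "rb").read())). The host name here is a placeholder, and the shape of the response is not documented in this section.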

When to Use CSV Upload

  • Manual curation — You have hand-crafted test cases from domain experts
  • Migration — Moving evaluation data from another tool
  • Generation-only benchmarks — You only need question and answer columns (no chunk_ids needed)
  • External pipelines — Your Q&A pairs are produced by a separate script or system
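
For the external-pipeline case, a script can write its Q&A pairs straight into the expected format. The sketch below assumes each pair is a Python dict with question and answer keys plus optional list-valued chunk_ids, page_nums, and source_paths. The function name to_benchmark_csv is hypothetical. Optional columns are only emitted when at least one pair uses them, so a generation-only dataset produces a two-column file.

```python
import csv
import io

OPTIONAL = ["chunk_ids", "page_nums", "source_paths"]


def to_benchmark_csv(pairs):
    """Serialize a list of Q&A dicts to benchmark CSV text (hypothetical helper)."""
    # Include an optional column only if some pair has a non-empty value for it.
    used = [col for col in OPTIONAL if any(pair.get(col) for pair in pairs)]
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["question", "answer"] + used, lineterminator="\n"
    )
    writer.writeheader()
    for pair in pairs:
        row = {"question": pair["question"], "answer": pair["answer"]}
        for col in used:
            # Multi-value fields are joined with the pipe character, not commas.
            row[col] = "|".join(str(value) for value in pair.get(col, []))
        writer.writerow(row)
    return buffer.getvalue()
```

The csv module quotes any field that contains a comma or a quote character, so questions and answers can contain commas safely.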

Tips

  • Benchmarks without chunk_ids can still be used for generation-only evaluations
  • Benchmarks without page_nums will skip page-level metrics (chunk and document metrics still work)
  • Exported benchmarks use the same CSV format, so you can download, edit, and re-upload
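
The download, edit, and re-upload round trip can also be scripted. As a rough sketch, the hypothetical helper below replaces the answer for selected questions in exported CSV text and leaves every other column and row as it was. Matching rows by exact question text is an assumption made for illustration; if your export includes the id column, matching on that would be more robust.

```python
import csv
import io


def update_answers(csv_text, new_answers):
    """Return CSV text with answers replaced for matching questions (hypothetical helper).

    new_answers maps exact question text to the replacement answer.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    buffer = io.StringIO()
    # Reuse the original header so the column set and order are preserved.
    writer = csv.DictWriter(
        buffer, fieldnames=reader.fieldnames or [], lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        if row["question"] in new_answers:
            row["answer"] = new_answers[row["question"]]
        writer.writerow(row)
    return buffer.getvalue()
```

The returned text can be saved to a file and uploaded like any other benchmark CSV.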
