Synthetic Generation
Auto-generate benchmarks from your knowledge base
Vecta can automatically generate question-answer pairs grounded in your knowledge base. The synthetic benchmark includes multi-hop retrievals, edge cases, and ground-truth citations — providing comprehensive test coverage for your RAG system.
How It Works
- Sampling — Vecta randomly samples chunks from your data source (controlled by `random_seed` for reproducibility).
- Question generation — An LLM generates a question that the sampled chunk can answer, along with a canonical answer.
- Citation discovery — Vecta performs a similarity sweep across your knowledge base and runs parallel LLM-as-a-judge calls to find all chunks that can answer the question — not just the original chunk. This ensures ground-truth recall is comprehensive.
- Assembly — Each entry is assembled with `question`, `answer`, `chunk_ids`, `page_nums`, and `source_paths`.
Quality check: For every synthetic Q&A pair, the SDK runs a panel of LLM-as-a-judge calls. Any chunk that the judges deem relevant is automatically merged into the benchmark's ground-truth citations, ensuring your downstream recall/precision numbers are accurate.
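The citation-discovery and merge steps above can be sketched as follows. This is a minimal illustration, not the SDK's actual implementation: `discover_citations` and the `judge` callable (standing in for one LLM-as-a-judge call) are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor

def discover_citations(question, seed_chunk_id, candidates, judge, max_workers=8):
    """Merge every chunk the judges deem relevant into the ground-truth citations.

    `candidates` maps chunk_id -> chunk text returned by the similarity sweep;
    `judge(question, chunk_text) -> bool` stands in for one LLM-as-a-judge call.
    """
    # Run the judge calls in parallel, one per candidate chunk.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        verdicts = list(pool.map(
            lambda item: (item[0], judge(question, item[1])),
            candidates.items(),
        ))
    # Always keep the chunk the question was generated from.
    relevant = {seed_chunk_id}
    relevant.update(cid for cid, ok in verdicts if ok)
    return sorted(relevant)
```

With a toy keyword judge, a question generated from chunk `c1` also picks up any other chunk that can answer it, so recall over the merged citation set is not penalized for duplicated knowledge.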
Using the API Client
```python
from vecta import VectaAPIClient

client = VectaAPIClient()

benchmark = client.create_benchmark(
    data_source_id="your-data-source-id",
    questions_count=100,
    random_seed=42,
    description="Q4 knowledge base eval",
)

print(f"Benchmark ID: {benchmark['id']}")
print(f"Status: {benchmark['status']}")
print(f"Questions generated: {benchmark['questions_count']}")
```
| Parameter | Type | Default | Description |
|---|---|---|---|
| `data_source_id` | `str` | required | ID of the connected data source |
| `questions_count` | `int` | `100` | Number of Q&A pairs to generate |
| `random_seed` | `int` | `None` | Seed for reproducible generation |
| `description` | `str` | `None` | Optional description |
| `wait_for_completion` | `bool` | `True` | Block until generation finishes |
The `create_benchmark` method creates the benchmark and triggers generation in a single call. When `wait_for_completion=True`, it polls until the benchmark status becomes `active`.
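The blocking behavior amounts to a poll loop like the sketch below. The `get_benchmark` accessor, the `"failed"` status, and the timeout handling are assumptions about the API surface, not documented behavior.

```python
import time

def wait_until_active(client, benchmark_id, poll_interval=5.0, timeout=600.0):
    """Poll a benchmark until its status becomes "active".

    `client.get_benchmark` and the "failed" status are hypothetical;
    this only illustrates what wait_for_completion=True does conceptually.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        benchmark = client.get_benchmark(benchmark_id)
        if benchmark["status"] == "active":
            return benchmark
        if benchmark["status"] == "failed":
            raise RuntimeError(f"Benchmark generation failed: {benchmark_id}")
        time.sleep(poll_interval)
    raise TimeoutError(f"Benchmark {benchmark_id} not active after {timeout}s")
```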
Using the Local Client
```python
import chromadb

from vecta import VectaClient, ChromaLocalConnector, VectorDBSchema

# A local Chroma client holding the collection to benchmark against
chroma_client = chromadb.Client()

schema = VectorDBSchema(
    id_accessor="id",
    content_accessor="document",
    metadata_accessor="metadata",
    source_path_accessor="metadata.source_path",
    page_nums_accessor="metadata.page_nums",
)

connector = ChromaLocalConnector(
    client=chroma_client,
    collection_name="my_docs",
    schema=schema,
)

vecta = VectaClient(
    data_source_connector=connector,
    openai_api_key="sk-...",  # required for LLM generation
)

# Load chunks from the data source
chunks = vecta.load_knowledge_base()
print(f"Loaded {len(chunks)} chunks")

# Generate synthetic benchmark
entries = vecta.generate_benchmark(
    n_questions=50,
    random_seed=42,
)
print(f"Generated {len(entries)} benchmark entries")

# Save for later use
vecta.save_benchmark("my_benchmark.csv")
```
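The accessor strings in `VectorDBSchema` read like dot paths into each stored record. How the SDK resolves them internally is an assumption, but a minimal resolver for such paths looks like this:

```python
def resolve_accessor(record, path):
    """Walk a dot-separated accessor such as "metadata.source_path"
    through nested dicts. A sketch of how accessor strings could be
    interpreted; the SDK's actual resolution logic may differ.
    """
    value = record
    for key in path.split("."):
        value = value[key]
    return value
```

This is why flat fields like `id` and nested metadata fields like `metadata.page_nums` can share one schema definition.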
Local Client Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| `n_questions` | `int` | required | Number of Q&A pairs |
| `random_seed` | `int` | `None` | Seed for reproducibility |
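The role of `random_seed` in the sampling stage can be sketched with Python's `random.Random`. This is an illustration of seeded, reproducible sampling under the documented minimum-chunks requirement, not the SDK's internals; `sample_chunks` is a hypothetical name.

```python
import random

def sample_chunks(chunks, n_questions, random_seed=None):
    """Reproducibly sample chunks to seed question generation.

    The same seed over the same chunk list yields the same sample,
    which is what makes benchmark generation repeatable.
    """
    if n_questions > len(chunks):
        raise ValueError("Data source needs at least as many chunks as questions")
    rng = random.Random(random_seed)
    return rng.sample(chunks, n_questions)
```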
Using the Platform
- Navigate to Benchmarks
- Click Create Benchmark
- Select a connected data source
- Set the number of questions and random seed
- Click Generate Benchmark
The generation runs server-side. Once complete, the benchmark status changes from `draft` to `active` and you can view individual entries.
Requirements
- Data source must be connected and have chunks available
- Minimum chunks — The data source must have at least as many chunks as the requested question count
- OpenAI API key — Required for the LLM calls (configured server-side for the platform, or passed to `VectaClient` for local use)
Saving and Loading
```python
# Save benchmark to CSV
vecta.save_benchmark("benchmark.csv")

# Load benchmark in another session
vecta.load_benchmark("benchmark.csv")

# Or download from the API
entries = client.download_benchmark("benchmark-id")
```
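Since each entry carries list-valued fields (`chunk_ids`, `page_nums`, `source_paths`), a CSV round trip has to encode those lists somehow. The sketch below JSON-encodes list columns; this is one plausible serialization, assumed for illustration — the SDK's actual on-disk format may differ.

```python
import csv
import json

FIELDS = ["question", "answer", "chunk_ids", "page_nums", "source_paths"]
LIST_FIELDS = ("chunk_ids", "page_nums", "source_paths")

def save_entries(entries, path):
    """Write benchmark entries to CSV, JSON-encoding list-valued columns."""
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        writer.writeheader()
        for entry in entries:
            row = {k: json.dumps(entry[k]) if k in LIST_FIELDS else entry[k]
                   for k in FIELDS}
            writer.writerow(row)

def load_entries(path):
    """Read entries back, decoding the JSON-encoded list columns."""
    with open(path, newline="") as f:
        entries = []
        for row in csv.DictReader(f):
            for k in LIST_FIELDS:
                row[k] = json.loads(row[k])
            entries.append(row)
        return entries
```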
Next Steps
- CSV Upload — Import existing datasets instead
- Evaluations — Run evaluations against your benchmark