
Synthetic Generation

Auto-generate benchmarks from your knowledge base

Last updated: August 20, 2025
Category: benchmarks

Synthetic Benchmark Generation

Vecta can automatically generate question-answer pairs grounded in your knowledge base. The generated benchmark includes multi-hop questions, edge cases, and ground-truth citations, providing comprehensive test coverage for your RAG system.

How It Works

  1. Sampling — Vecta randomly samples chunks from your data source (controlled by random_seed for reproducibility).
  2. Question generation — An LLM generates a question that the sampled chunk can answer, along with a canonical answer.
  3. Citation discovery — Vecta performs a similarity sweep across your knowledge base and runs parallel LLM-as-a-judge calls to find all chunks that can answer the question — not just the original chunk. This ensures ground-truth recall is comprehensive.
  4. Assembly — Each entry is assembled with question, answer, chunk_ids, page_nums, and source_paths.
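The assembled entry can be pictured as a small record. The sketch below is illustrative only: the field names mirror step 4 above (question, answer, chunk_ids, page_nums, source_paths), but the SDK's actual entry class may differ.

```python
from dataclasses import dataclass, field

@dataclass
class BenchmarkEntry:
    # Field names mirror the assembly step above; the SDK's real
    # entry type may differ in naming and structure.
    question: str
    answer: str
    chunk_ids: list[str] = field(default_factory=list)
    page_nums: list[int] = field(default_factory=list)
    source_paths: list[str] = field(default_factory=list)

entry = BenchmarkEntry(
    question="What is the refund window?",
    answer="30 days from delivery.",
    chunk_ids=["chunk-0412", "chunk-0973"],
    page_nums=[12],
    source_paths=["policies/refunds.pdf"],
)
```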

Quality check: For every synthetic Q&A pair, the SDK runs a panel of LLM-as-a-judge calls. Any chunk that the judges deem relevant is automatically merged into the benchmark's ground-truth citations, ensuring your downstream recall/precision numbers are accurate.
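The citation-discovery step can be sketched in plain Python. Everything here is a stand-in, not the SDK's internals: `cosine`, the toy knowledge base, the similarity threshold, and `judge_votes` (which replaces the parallel LLM-as-a-judge calls with a keyword check) are all illustrative assumptions.

```python
def cosine(a, b):
    # Plain cosine similarity over two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(y * y for y in b) ** 0.5
    return dot / (na * nb)

# Toy knowledge base: chunk_id -> (embedding, text).
kb = {
    "c1": ([1.0, 0.0], "Refunds are accepted within 30 days."),
    "c2": ([0.9, 0.1], "Our refund window is 30 days from delivery."),
    "c3": ([0.0, 1.0], "Shipping is free on orders over $50."),
}

def judge_votes(question, text):
    # Stand-in for the LLM-as-a-judge panel: a trivial keyword check.
    return "refund" in text.lower()

def discover_citations(question, question_emb, original_id, threshold=0.8):
    # Similarity sweep, then judge every candidate; merge all relevant
    # chunks into the ground truth, not just the originally sampled one.
    citations = {original_id}
    for cid, (emb, text) in kb.items():
        if cid == original_id:
            continue
        if cosine(question_emb, emb) >= threshold and judge_votes(question, text):
            citations.add(cid)
    return sorted(citations)

merged = discover_citations("What is the refund window?", [1.0, 0.0], "c1")
```

Here the sweep finds that c2 also answers the question, so it is merged alongside the sampled chunk c1.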

Using the API Client

from vecta import VectaAPIClient

client = VectaAPIClient()

benchmark = client.create_benchmark(
    data_source_id="your-data-source-id",
    questions_count=100,
    random_seed=42,
    description="Q4 knowledge base eval",
)

print(f"Benchmark ID: {benchmark['id']}")
print(f"Status: {benchmark['status']}")
print(f"Questions generated: {benchmark['questions_count']}")
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| data_source_id | str | required | ID of the connected data source |
| questions_count | int | 100 | Number of Q&A pairs to generate |
| random_seed | int | None | Seed for reproducible generation |
| description | str | None | Optional description |
| wait_for_completion | bool | True | Block until generation finishes |

The create_benchmark method creates the benchmark and triggers generation in a single call. When wait_for_completion=True, it polls until the benchmark status becomes active.
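With wait_for_completion=False you poll yourself. The loop below sketches that pattern against a stub client: the status names mirror this page (draft to active), but the `get_benchmark` method name is an assumption, not a documented part of VectaAPIClient.

```python
import time

class StubClient:
    # Stand-in for VectaAPIClient; advances one status per poll.
    def __init__(self):
        self._statuses = iter(["draft", "processing", "active"])

    def get_benchmark(self, benchmark_id):  # hypothetical method name
        return {"id": benchmark_id, "status": next(self._statuses)}

def wait_until_active(client, benchmark_id, interval=0.01, timeout=5.0):
    # Poll until the benchmark reaches "active" or the deadline passes.
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        benchmark = client.get_benchmark(benchmark_id)
        if benchmark["status"] == "active":
            return benchmark
        time.sleep(interval)
    raise TimeoutError(f"benchmark {benchmark_id} never became active")

result = wait_until_active(StubClient(), "bench-123")
```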

Using the Local Client

from vecta import VectaClient, ChromaLocalConnector, VectorDBSchema

schema = VectorDBSchema(
    id_accessor="id",
    content_accessor="document",
    metadata_accessor="metadata",
    source_path_accessor="metadata.source_path",
    page_nums_accessor="metadata.page_nums",
)

# Assumes an existing Chroma client, e.g.:
#   import chromadb
#   chroma_client = chromadb.PersistentClient(path="./chroma")
connector = ChromaLocalConnector(
    client=chroma_client,
    collection_name="my_docs",
    schema=schema,
)

vecta = VectaClient(
    data_source_connector=connector,
    openai_api_key="sk-...",  # required for LLM generation
)

# Load chunks from the data source
chunks = vecta.load_knowledge_base()
print(f"Loaded {len(chunks)} chunks")

# Generate synthetic benchmark
entries = vecta.generate_benchmark(
    n_questions=50,
    random_seed=42,
)

print(f"Generated {len(entries)} benchmark entries")

# Save for later use
vecta.save_benchmark("my_benchmark.csv")

Local Client Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| n_questions | int | required | Number of Q&A pairs |
| random_seed | int | None | Seed for reproducibility |

Using the Platform

  1. Navigate to Benchmarks
  2. Click Create Benchmark
  3. Select a connected data source
  4. Set the number of questions and random seed
  5. Click Generate Benchmark

The generation runs server-side. Once complete, the benchmark status changes from draft to active, and you can view individual entries.

Requirements

  • Data source must be connected and have chunks available
  • Minimum chunks — The data source must have at least as many chunks as the requested question count
  • OpenAI API key — Required for the LLM calls (configured server-side for the platform, or passed to VectaClient for local use)
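The minimum-chunks requirement is easy to verify up front, before spending on LLM calls. This sketch applies the check to the chunk list that load_knowledge_base() returns; the ValueError guard is our own, not an SDK exception.

```python
def check_benchmark_feasible(chunks, n_questions):
    # Each Q&A pair is seeded from a distinct sampled chunk, so the
    # data source must hold at least n_questions chunks.
    if len(chunks) < n_questions:
        raise ValueError(
            f"need at least {n_questions} chunks, found {len(chunks)}"
        )

chunks = [f"chunk-{i}" for i in range(40)]
check_benchmark_feasible(chunks, 25)  # 40 chunks >= 25 questions: fine
```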

Saving and Loading

# Save benchmark to CSV
vecta.save_benchmark("benchmark.csv")

# Load benchmark in another session
vecta.load_benchmark("benchmark.csv")

# Or download from the API
entries = client.download_benchmark("benchmark-id")
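The saved CSV can be inspected with the standard library. The column names below follow the entry fields described earlier (question, answer, chunk_ids, page_nums, source_paths), but the exact header layout that save_benchmark emits is an assumption; adjust to what your file contains.

```python
import csv
import io
import json

rows = [
    {
        "question": "What is the refund window?",
        "answer": "30 days from delivery.",
        "chunk_ids": json.dumps(["c1", "c2"]),
        "page_nums": json.dumps([12]),
        "source_paths": json.dumps(["policies/refunds.pdf"]),
    }
]

# Write the benchmark to CSV (in-memory here; a file path works the same).
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=list(rows[0]))
writer.writeheader()
writer.writerows(rows)

# Read it back and decode the list-valued columns.
buf.seek(0)
loaded = list(csv.DictReader(buf))
chunk_ids = json.loads(loaded[0]["chunk_ids"])
```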

Next Steps

  • CSV Upload — Import existing datasets instead
  • Evaluations — Run evaluations against your benchmark

Need Help?

Can't find what you're looking for? Our team is here to help.