Experiments

Overview

Group evaluations, attach metadata, and visualize comparisons

Last updated: August 20, 2025
Category: experiments

An experiment groups related evaluations so you can compare configuration changes side by side: for example, sweeping top_k values, testing different embedding models, or comparing prompt strategies.

Creating an Experiment

API Client

from vecta import VectaAPIClient

client = VectaAPIClient()

experiment = client.create_experiment(
    name="Chunk Size Sweep",
    description="Testing 256 / 512 / 1024 chunk sizes",
)

print(f"Experiment ID: {experiment['id']}")

Platform UI

Experiments are managed through the Experiments dashboard, where you can create, rename, and delete experiments and view their grouped evaluations.

Running Evaluations Within an Experiment

Pass the experiment_id and metadata to any evaluation call:

for chunk_size in [256, 512, 1024]:
    results = client.evaluate_retrieval(
        benchmark_id="your-benchmark-id",
        retrieval_function=make_retriever(chunk_size=chunk_size),
        evaluation_name=f"chunk-{chunk_size}",
        experiment_id=experiment["id"],
        metadata={
            "chunk_size": chunk_size,
            "model": "text-embedding-3-small",
            "top_k": 10,
        },
    )
    print(f"Chunk size {chunk_size}: F1 = {results.chunk_level.f1_score:.2%}")

The metadata is stored alongside each evaluation and used for comparison and plotting.
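The loop above assumes a make_retriever helper; it is your own code, not part of the vecta client, and only needs to return a retrieval function built with the given chunk size. A minimal, purely illustrative sketch (the file path, chunking, and ranking below are placeholder assumptions, not part of the API):

from pathlib import Path

# Hypothetical stand-in for your own retrieval pipeline; replace the
# chunking and ranking with your real embedding / vector-store code.
corpus = Path("docs.txt").read_text()  # example source text (assumption)

def make_retriever(chunk_size: int):
    # Fixed-size character chunks, just to illustrate the parameter sweep.
    chunks = [corpus[i:i + chunk_size] for i in range(0, len(corpus), chunk_size)]

    def retrieve(query: str) -> list[str]:
        # Naive word-overlap ranking so the sketch runs end to end;
        # a real retriever would rank by embedding similarity.
        overlap = lambda c: len(set(query.lower().split()) & set(c.lower().split()))
        return sorted(chunks, key=overlap, reverse=True)[:10]

    return retrieve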

Viewing Experiment Results

API Client

exp_detail = client.get_experiment(experiment["id"])

print(f"Evaluations: {len(exp_detail['evaluations'])}")
print(f"Metadata keys: {exp_detail['metadata_keys']}")

for ev in exp_detail["evaluations"]:
    print(f"  {ev['name']}: chunk F1 = {ev.get('chunk_level', {}).get('f1_score', 'N/A')}")

Plotting

Use the built-in plotting module to visualize results across metadata values:

from vecta import plot_experiment, get_metadata_keys

exp_detail = client.get_experiment(experiment["id"])
evaluations = exp_detail["evaluations"]

# See available metadata keys
keys = get_metadata_keys(evaluations)
print(f"Available keys: {keys}")
# e.g., ["chunk_size", "model", "top_k"]

# Plot — auto-detects value type:
#   Numeric values → line chart
#   String values  → grouped bar chart
plot_experiment(evaluations, metadata_key="chunk_size")
plot_experiment(evaluations, metadata_key="model")

Managing Experiments

# List all experiments
experiments = client.list_experiments()

# Rename
from vecta import RenameRequest
client.rename_experiment(experiment["id"], RenameRequest(name="New Name"))

# Delete (evaluations are unlinked, not deleted)
client.delete_experiment(experiment["id"])

Note: Deleting an experiment does not delete its evaluations; they are unlinked from the experiment and remain visible in the evaluations list.
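Because deletion only unlinks evaluations, it is safe to clean up scratch experiments programmatically. A small sketch, assuming each entry returned by list_experiments exposes "id" and "name" keys:

# Remove throwaway experiments by name prefix. The "scratch-" naming
# convention is just an example.
for exp in client.list_experiments():
    if exp["name"].startswith("scratch-"):
        client.delete_experiment(exp["id"])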

Example: Comparing Embedding Models

experiment = client.create_experiment(name="Embedding Model Comparison")

for model_name in ["text-embedding-3-small", "text-embedding-3-large", "e5-large"]:
    retriever = build_retriever(embedding_model=model_name)

    results = client.evaluate_retrieval(
        benchmark_id="bm-id",
        retrieval_function=retriever,
        evaluation_name=f"embed-{model_name}",
        experiment_id=experiment["id"],
        metadata={"model": model_name},
    )

# Visualize
exp = client.get_experiment(experiment["id"])
plot_experiment(exp["evaluations"], metadata_key="model")
