API Quickstart
Get started with Vecta's Cloud API in minutes
Last updated: December 19, 2024
Get up and running with Vecta's Cloud API in under five minutes. This guide walks you through creating your first evaluation on our hosted infrastructure.
Prerequisites
- A Vecta account (sign up for free)
- Python 3.9+ (for the SDK)
- Your RAG system or vector database connection details
Step 1: Install the Python SDK
pip install vecta
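To confirm the install succeeded, you can ask pip for the package metadata:
pip show vecta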
Step 2: Get Your API Key
- Head to Vecta settings
- Click Create New API Key
- Copy your API key and store it securely (for example in an environment variable, as shown below)
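A common way to keep the key out of source code is to export it as an environment variable and read it at runtime. The variable name VECTA_API_KEY below is just a convention for this guide, not something the SDK reads automatically:
import os
from vecta import VectaAPIClient

# Assumes you've run: export VECTA_API_KEY="your-api-key"
client = VectaAPIClient(api_key=os.environ["VECTA_API_KEY"])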
Step 3: Connect Your Vector Database
Using the Web Interface
- Go to Platform → Vector Databases in your dashboard
- Click Add Database
- Choose your database type (Pinecone, ChromaDB, Weaviate, etc.)
- Enter your connection details
- Test the connection and save
Using the SDK
from vecta import VectaAPIClient

client = VectaAPIClient(api_key="your-api-key")

# Add a Pinecone database
database = client.add_vector_database(
    name="my-knowledge-base",
    type="pinecone",
    config={
        "api_key": "your-pinecone-key",
        "environment": "us-west1-gcp",
        "index_name": "my-index"
    }
)

print(f"Database connected: {database.id}")
Step 4: Create a Benchmark
Benchmarks are test datasets that define what "good" retrieval looks like for your domain.
Auto-Generate from Your Data
# Generate benchmark from your vector database
benchmark = client.create_benchmark(
    name="Customer Support Evaluation",
    vector_db_id=database.id,
    description="Evaluating our support chatbot knowledge base",
    num_questions=10,  # Generate 10 test questions
)

print(f"Benchmark created: {benchmark.id}")
print(f"Generated {benchmark.questions_count} questions")
Upload Your Own Test Data
If you have existing test questions and answers:
# Upload custom benchmark
benchmark = client.upload_benchmark(
    name="Custom Support Benchmark",
    file_path="my_benchmark.csv"  # CSV with question,answer,chunk_ids columns
)
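If you're assembling the CSV yourself, it needs just those three columns. Here's a minimal sketch using Python's standard csv module; note that the semicolon used to separate multiple chunk IDs within one cell is an assumption, so check the benchmark format reference for the exact convention:
import csv

# Sketch of the expected layout: question,answer,chunk_ids.
# The ";" separator for multiple chunk IDs is an assumption, not a
# documented Vecta convention.
rows = [
    {
        "question": "How do I reset my password?",
        "answer": "Use the 'Forgot password' link on the login page.",
        "chunk_ids": "chunk_12;chunk_13",
    },
]

with open("my_benchmark.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["question", "answer", "chunk_ids"])
    writer.writeheader()
    writer.writerows(rows)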
Step 5: Run Your First Evaluation
Now let's evaluate how well your RAG system retrieves relevant information:
Retrieval-Only Evaluation
def my_retrieval_function(query: str) -> list[str]:
    """Your retrieval function that returns chunk IDs."""
    # Example: query your vector database
    # results = your_vector_db.search(query, k=10)
    # return [result.id for result in results]

    # For this example, we'll use a simple mock
    return ["chunk_1", "chunk_2", "chunk_3"]

# Run the evaluation
results = client.evaluate_retrieval(
    benchmark_id=benchmark.id,
    retrieval_function=my_retrieval_function,
    evaluation_name="Support Bot Retrieval v1.0"
)

# View results
print(f"Chunk-level F1: {results.chunk_level.f1_score:.3f}")
print(f"Page-level F1: {results.page_level.f1_score:.3f}")
print(f"Document-level F1: {results.document_level.f1_score:.3f}")
Full RAG Evaluation
To evaluate both retrieval and generation:
def my_rag_function(query: str) -> tuple[list[str], str]:
    """Your RAG function that returns (chunk_ids, generated_text)."""
    # 1. Retrieve relevant chunks
    chunk_ids = my_retrieval_function(query)

    # 2. Generate answer using your LLM
    # context = get_context_from_chunks(chunk_ids)
    # generated_text = your_llm.generate(query, context)

    # For this example:
    generated_text = f"Generated answer based on {len(chunk_ids)} retrieved chunks."
    return chunk_ids, generated_text

# Run full RAG evaluation
results = client.evaluate_retrieval_and_generation(
    benchmark_id=benchmark.id,
    retrieval_generation_function=my_rag_function,
    evaluation_name="Support Bot Full Pipeline v1.0"
)

print(f"Retrieval F1: {results.chunk_level.f1_score:.3f}")
print(f"Generation Accuracy: {results.generation_metrics.accuracy:.3f}")
print(f"Generation Factuality: {results.generation_metrics.factuality:.3f}")
Step 6: View Results in Dashboard
- Go to Platform → Evaluations in your dashboard
- Click on your evaluation to see detailed results
- Explore metrics by query, performance over time, and failure cases
- Download reports or set up alerts for performance regressions
Troubleshooting
Common Issues
API Key Issues
# Verify your API key is working
client = VectaAPIClient(api_key="your-api-key")
user = client.get_current_user()
print(f"Authenticated as: {user.email}")
Connection Timeouts
# Increase timeout for large datasets
client = VectaAPIClient(
    api_key="your-api-key",
    timeout=300  # 5 minutes
)
Rate Limiting
# Add delays between requests
import time

def my_retrieval_function_with_delays(query: str) -> list[str]:
    time.sleep(0.1)  # 100ms delay
    return my_retrieval_function(query)
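If you're still being throttled, a fixed delay can be upgraded to retries with exponential backoff. This is a generic pattern, not Vecta-specific behavior; in practice you'd narrow the except clause to whatever rate-limit error the SDK raises:
import random
import time

def with_backoff(call, max_retries: int = 5):
    """Retry call() with exponential backoff plus jitter."""
    for attempt in range(max_retries):
        try:
            return call()
        except Exception:  # narrow to the SDK's rate-limit exception in real code
            if attempt == max_retries - 1:
                raise
            time.sleep(2 ** attempt + random.random())  # ~1s, 2s, 4s, ... plus jitter

# Example: results = with_backoff(lambda: client.evaluate_retrieval(...))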
Need help? Contact our support team or book a demo.