API Quickstart
Get started with Vecta's Cloud API in minutes
Last updated: December 19, 2024
Get up and running with Vecta's Cloud API in under five minutes. This guide walks you through creating your first evaluation on our hosted infrastructure.
Prerequisites
- A Vecta account (sign up for free)
- Python 3.9+ (for the SDK)
- Your RAG system or vector database connection details
Step 1: Install the Python SDK
pip install vecta
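To confirm the install succeeded, you can ask pip for the package metadata:
pip show vecta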
Step 2: Get Your API Key
- Head to Vecta settings
- Click Create New API Key
- Copy your API key and store it securely (for example in an environment variable, as shown below)
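A common way to keep the key out of source code is to export it as an environment variable and read it at runtime. The variable name VECTA_API_KEY below is just a convention for this guide, not something the SDK reads automatically:
import os
from vecta import VectaAPIClient

# Assumes you've run: export VECTA_API_KEY="your-api-key"
client = VectaAPIClient(api_key=os.environ["VECTA_API_KEY"])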
Step 3: Connect Your Vector Database
Using the Web Interface
- Go to Platform → Vector Databases in your dashboard
- Click Add Database
- Choose your database type (Pinecone, ChromaDB, Weaviate, etc.)
- Enter your connection details
- Test the connection and save
Using the SDK
from vecta import VectaAPIClient

client = VectaAPIClient(api_key="your-api-key")

# Add a Pinecone database
database = client.add_vector_database(
    name="my-knowledge-base",
    type="pinecone",
    config={
        "api_key": "your-pinecone-key",
        "environment": "us-west1-gcp",
        "index_name": "my-index"
    }
)

print(f"Database connected: {database.id}")
Step 4: Create a Benchmark
Benchmarks are test datasets that define what "good" retrieval looks like for your domain.
Auto-Generate from Your Data
# Generate benchmark from your vector database
benchmark = client.create_benchmark(
    name="Customer Support Evaluation",
    vector_db_id=database.id,
    description="Evaluating our support chatbot knowledge base",
    num_questions=10,  # Generate 10 test questions
)

print(f"Benchmark created: {benchmark.id}")
print(f"Generated {benchmark.questions_count} questions")
Upload Your Own Test Data
If you have existing test questions and answers:
# Upload custom benchmark
benchmark = client.upload_benchmark(
    name="Custom Support Benchmark",
    file_path="my_benchmark.csv"  # CSV with question,answer,chunk_ids columns
)
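If you're assembling the CSV yourself, it needs just those three columns. Here's a minimal sketch using Python's standard csv module; note that the semicolon used to separate multiple chunk IDs within one cell is an assumption, so check the benchmark format reference for the exact convention:
import csv

# Sketch of the expected layout: question,answer,chunk_ids.
# The ";" separator for multiple chunk IDs is an assumption, not a
# documented Vecta convention.
rows = [
    {
        "question": "How do I reset my password?",
        "answer": "Use the 'Forgot password' link on the login page.",
        "chunk_ids": "chunk_12;chunk_13",
    },
]

with open("my_benchmark.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["question", "answer", "chunk_ids"])
    writer.writeheader()
    writer.writerows(rows)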
Step 5: Run Your First Evaluation
Now let's evaluate how well your RAG system retrieves relevant information:
Retrieval-Only Evaluation
def my_retrieval_function(query: str) -> list[str]:
    """Your retrieval function that returns chunk IDs."""
    # Example: query your vector database
    # results = your_vector_db.search(query, k=10)
    # return [result.id for result in results]

    # For this example, we'll use a simple mock
    return ["chunk_1", "chunk_2", "chunk_3"]

# Run the evaluation
results = client.evaluate_retrieval(
    benchmark_id=benchmark.id,
    retrieval_function=my_retrieval_function,
    evaluation_name="Support Bot Retrieval v1.0"
)

# View results
print(f"Chunk-level F1: {results.chunk_level.f1_score:.3f}")
print(f"Page-level F1: {results.page_level.f1_score:.3f}")
print(f"Document-level F1: {results.document_level.f1_score:.3f}")
Full RAG Evaluation
To evaluate both retrieval and generation:
def my_rag_function(query: str) -> tuple[list[str], str]:
    """Your RAG function that returns (chunk_ids, generated_text)."""
    # 1. Retrieve relevant chunks
    chunk_ids = my_retrieval_function(query)

    # 2. Generate answer using your LLM
    # context = get_context_from_chunks(chunk_ids)
    # generated_text = your_llm.generate(query, context)

    # For this example:
    generated_text = f"Generated answer based on {len(chunk_ids)} retrieved chunks."
    return chunk_ids, generated_text

# Run full RAG evaluation
results = client.evaluate_retrieval_and_generation(
    benchmark_id=benchmark.id,
    retrieval_generation_function=my_rag_function,
    evaluation_name="Support Bot Full Pipeline v1.0"
)

print(f"Retrieval F1: {results.chunk_level.f1_score:.3f}")
print(f"Generation Accuracy: {results.generation_metrics.accuracy:.3f}")
print(f"Generation Factuality: {results.generation_metrics.factuality:.3f}")
Step 6: View Results in Dashboard
- Go to Platform → Evaluations in your dashboard
- Click on your evaluation to see detailed results
- Explore metrics by query, performance over time, and failure cases
- Download reports or set up alerts for performance regressions
Troubleshooting
Common Issues
API Key Issues
# Verify your API key is working
client = VectaAPIClient(api_key="your-api-key")
user = client.get_current_user()
print(f"Authenticated as: {user.email}")
Connection Timeouts
# Increase timeout for large datasets
client = VectaAPIClient(
    api_key="your-api-key",
    timeout=300  # 5 minutes
)
Rate Limiting
# Add delays between requests
import time

def my_retrieval_function_with_delays(query: str) -> list[str]:
    time.sleep(0.1)  # 100ms delay
    return my_retrieval_function(query)
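If you're still being throttled, a fixed delay can be upgraded to retries with exponential backoff. This is a generic pattern, not Vecta-specific behavior; in practice you'd narrow the except clause to whatever rate-limit error the SDK raises:
import random
import time

def with_backoff(call, max_retries: int = 5):
    """Retry call() with exponential backoff plus jitter."""
    for attempt in range(max_retries):
        try:
            return call()
        except Exception:  # narrow to the SDK's rate-limit exception in real code
            if attempt == max_retries - 1:
                raise
            time.sleep(2 ** attempt + random.random())  # ~1s, 2s, 4s, ... plus jitter

# Example: results = with_backoff(lambda: client.evaluate_retrieval(...))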
Need help? Contact our support team or book a demo.