Data Sources
Vector Databases
Connect ChromaDB, Pinecone, Weaviate, and more
Last updated: August 19, 2025
Category: data-sources
Vector Database Connectors
Connect Vecta to your vector database. All connectors use configurable schemas to adapt to your data structure.
Quick Examples
ChromaDB Cloud
from vecta import VectaAPIClient
client = VectaAPIClient(api_key="your-key")
db = client.connect_chroma_cloud(
    tenant="your-tenant",
    database="your-db",
    api_key="your-chroma-key",
    collection_name="documents"
)
Pinecone
# Uses the VectaAPIClient instance (`client`) created in the ChromaDB Cloud example
db = client.connect_pinecone(
    api_key="your-pinecone-key",
    index_name="your-index",
    namespace=""  # optional
)
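The placeholder keys in the examples above are easier to manage when they live outside the source code. The following is a minimal sketch, assuming the keys are stored in environment variables. The helper `require_env` and the variable names `VECTA_API_KEY` and `PINECONE_API_KEY` are illustrative and are not defined by Vecta.

```python
import os


def require_env(name: str) -> str:
    """Return an environment variable's value, or fail with a clear message."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


# Hypothetical variable names; use whatever your deployment defines.
# vecta_key = require_env("VECTA_API_KEY")
# pinecone_key = require_env("PINECONE_API_KEY")
```

The returned strings can then be passed wherever the examples use a literal such as `api_key="your-key"`.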
PostgreSQL + pgvector
from vecta.connectors.pgvector_connector import PgVectorConnector
from vecta.core.schema_helpers import SchemaTemplates
connector = PgVectorConnector(
    dsn="postgresql://user:pass@host:5432/db",
    table="chunks",
    schema=SchemaTemplates.pgvector_standard()
)
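A PostgreSQL connection URI like the `dsn` above breaks if the username or password contains characters such as `@`, `:`, or `/`, which must be percent-encoded. The sketch below shows one way to assemble the string using only the standard library. `build_dsn` is a hypothetical helper and is not part of Vecta.

```python
from urllib.parse import quote


def build_dsn(user: str, password: str, host: str, port: int, database: str) -> str:
    """Assemble a PostgreSQL connection URI, percent-encoding the credentials."""
    # safe="" also encodes "/" and ":", which would otherwise break URI parsing.
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{quote(database, safe='')}"
    )


dsn = build_dsn("user", "p@ss:w/rd", "host", 5432, "db")
print(dsn)  # postgresql://user:p%40ss%3Aw%2Frd@host:5432/db
```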
All Supported Databases
ChromaDB (Local)
from vecta import VectaClient
from vecta.connectors.chroma_local_connector import ChromaLocalConnector
from vecta.core.schema_helpers import SchemaTemplates
from chromadb import Client
chroma = Client()
connector = ChromaLocalConnector(
    client=chroma,
    collection_name="documents",
    schema=SchemaTemplates.chroma_default()
)
vecta = VectaClient(vector_db_connector=connector)
Weaviate
from vecta.connectors.weaviate_connector import WeaviateConnector
from vecta.core.schema_helpers import SchemaTemplates
connector = WeaviateConnector(
    cluster_url="https://your-cluster.weaviate.cloud",
    api_key="your-key",
    collection_name="Documents",
    schema=SchemaTemplates.weaviate_default()
)
Azure Cosmos DB
from vecta.connectors.azure_cosmos_connector import AzureCosmosConnector
from vecta.core.schemas import VectorDBSchema
connector = AzureCosmosConnector(
    endpoint="https://your-account.documents.azure.com:443/",
    key="your-key",
    database_name="your-db",
    container_name="chunks",
    schema=VectorDBSchema(
        id_accessor="id",
        content_accessor="content",
        source_path_accessor="metadata.source",
        page_nums_accessor="metadata.pages"
    )
)
Databricks
from vecta.connectors.databricks_connector import DatabricksConnector
from vecta.core.schema_helpers import SchemaTemplates
connector = DatabricksConnector(
    workspace_url="https://your-workspace.cloud.databricks.com",
    index_name="your_index",
    personal_access_token="your-token",
    schema=SchemaTemplates.databricks_indexed()
)
Framework Integrations
LangChain
from vecta.connectors.langchain_connector import LangChainVectorStoreConnector
from vecta.core.schema_helpers import SchemaTemplates
# Works with any LangChain VectorStore
connector = LangChainVectorStoreConnector(
    vectorstore=your_langchain_store,
    schema=SchemaTemplates.chroma_default()  # Adjust for your store
)
LlamaIndex
from vecta.connectors.llama_index_connector import LlamaIndexConnector
from vecta.core.schemas import VectorDBSchema
connector = LlamaIndexConnector(
    index=your_llama_index,
    schema=VectorDBSchema(
        id_accessor="node_id",
        content_accessor="content",
        source_path_accessor="metadata.file_name",
        page_nums_accessor="metadata.pages"
    )
)
Custom Connectors
Build a connector for any database:
from vecta.connectors.base import BaseVectorDBConnector
from vecta.core.schemas import ChunkData, VectorDBSchema
class MyConnector(BaseVectorDBConnector):
    def __init__(self, client, schema: VectorDBSchema):
        super().__init__(schema)
        self.client = client
    def get_all_chunks(self) -> list[ChunkData]:
        results = self.client.fetch_all()
        return [self._create_chunk_data_from_raw(r) for r in results]
    def semantic_search(self, query: str, k: int) -> list[ChunkData]:
        results = self.client.search(query, limit=k)
        return [self._create_chunk_data_from_raw(r) for r in results]
    def get_chunk_by_id(self, chunk_id: str) -> ChunkData:
        result = self.client.get(chunk_id)
        return self._create_chunk_data_from_raw(result)
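To exercise a connector like this without a live database, a stub client can stand in for the real one. The sketch below assumes the client interface is exactly the three calls used above (`fetch_all()`, `search(query, limit=...)`, and `get(id)`). `InMemoryClient` and its record layout (`id`, `content`, `metadata`) are hypothetical, and the search is naive keyword overlap rather than semantic search.

```python
class InMemoryClient:
    """Hypothetical stand-in exposing fetch_all(), search(), and get()."""

    def __init__(self, records: list[dict]):
        # Index records by id for constant-time lookup.
        self._records = {r["id"]: r for r in records}

    def fetch_all(self) -> list[dict]:
        return list(self._records.values())

    def search(self, query: str, limit: int = 5) -> list[dict]:
        # Naive keyword overlap, NOT semantic search; enough for wiring tests.
        terms = set(query.lower().split())

        def score(record: dict) -> int:
            return len(terms & set(record["content"].lower().split()))

        ranked = sorted(self._records.values(), key=score, reverse=True)
        return [r for r in ranked if score(r) > 0][:limit]

    def get(self, record_id: str) -> dict:
        return self._records[record_id]


client = InMemoryClient([
    {"id": "a", "content": "vector databases store embeddings",
     "metadata": {"source": "intro.pdf", "pages": [1]}},
    {"id": "b", "content": "connectors adapt schemas",
     "metadata": {"source": "guide.pdf", "pages": [2, 3]}},
])
```

Passing `client` to `MyConnector` together with a schema whose accessors match this record layout should be enough to check the wiring before pointing the connector at a real database.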
Schema Templates
Pre-built schemas for popular databases:
from vecta.core.schema_helpers import SchemaTemplates
# ChromaDB
schema = SchemaTemplates.chroma_default()
# Pinecone
schema = SchemaTemplates.pinecone_default()
# PostgreSQL (standard - metadata as JSON column)
schema = SchemaTemplates.pgvector_standard()
# PostgreSQL (flat - separate columns)
schema = SchemaTemplates.pgvector_flat(
    id_col="id",
    content_col="content",
    source_path_col="file_name",
    page_nums_col="pages"
)
# Weaviate
schema = SchemaTemplates.weaviate_default()
# Databricks
schema = SchemaTemplates.databricks_indexed()
Troubleshooting
Connection fails:
- Verify credentials and URLs
- Check network access/firewall rules
- Ensure database/collection exists
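Before digging into credentials, it can help to confirm that the host is reachable at all. The following is a rough sketch using only the standard library. `host_and_port` and `can_reach` are hypothetical helpers, not Vecta functions, and a successful TCP connection says nothing about whether the credentials or collection name are valid.

```python
import socket
from urllib.parse import urlparse


def host_and_port(url: str) -> tuple[str, int]:
    """Extract the host and port from a connection URL, with scheme defaults."""
    parsed = urlparse(url)
    if parsed.hostname is None:
        raise ValueError(f"No host found in {url!r}")
    default = {"https": 443, "http": 80, "postgresql": 5432}.get(parsed.scheme, 443)
    return parsed.hostname, parsed.port or default


def can_reach(url: str, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to the URL's host and port succeeds."""
    host, port = host_and_port(url)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

For example, `can_reach("https://your-cluster.weaviate.cloud")` returning `False` points to a DNS, firewall, or URL problem rather than a bad API key.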
Schema extraction returns None:
- Print a sample record to see structure
- Adjust accessor syntax to match your data
- See Accessor Syntax
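One way to see why an accessor yields None is to walk the path by hand against a sample record. The sketch below assumes dot-separated accessors such as `metadata.source`, as used in the schemas on this page. `resolve_accessor` is a hypothetical helper and not Vecta's implementation, and the real accessor syntax may support more than plain dotted paths.

```python
from typing import Any


def resolve_accessor(record: Any, accessor: str) -> Any:
    """Follow a dot-separated path through nested dicts or objects."""
    current = record
    for part in accessor.split("."):
        if isinstance(current, dict):
            current = current.get(part)
        else:
            current = getattr(current, part, None)
        if current is None:
            # Report which step failed so the accessor can be corrected.
            print(f"accessor {accessor!r} stops at {part!r}")
            return None
    return current


sample = {"id": "c1", "content": "text", "metadata": {"source": "a.pdf", "pages": [1, 2]}}
print(resolve_accessor(sample, "metadata.source"))     # a.pdf
print(resolve_accessor(sample, "metadata.file_name"))  # None, after reporting 'file_name'
```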
Missing metadata:
- Ensure chunks include source_path and page_nums
- Add metadata when creating/updating chunks
- Use defaults in the schema if metadata is inconsistent
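A quick audit over raw records can show which chunks lack these fields before a connector is involved. This is a minimal sketch. It assumes each record is a dict with an `id` and a `metadata` dict keyed by `source_path` and `page_nums`, so adjust the field names to your own layout. `find_incomplete` is a hypothetical helper.

```python
def find_incomplete(
    records: list[dict],
    required: tuple[str, ...] = ("source_path", "page_nums"),
) -> dict[str, list[str]]:
    """Map each record id to the required metadata fields it lacks."""
    incomplete = {}
    for record in records:
        metadata = record.get("metadata") or {}
        # Treat None, empty strings, and empty lists as missing.
        absent = [f for f in required if metadata.get(f) in (None, "", [])]
        if absent:
            incomplete[record["id"]] = absent
    return incomplete


records = [
    {"id": "c1", "metadata": {"source_path": "a.pdf", "page_nums": [1]}},
    {"id": "c2", "metadata": {"source_path": "b.pdf"}},
    {"id": "c3"},
]
print(find_incomplete(records))
# {'c2': ['page_nums'], 'c3': ['source_path', 'page_nums']}
```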
Next Steps
- Accessor Syntax: Learn accessor patterns
- Benchmarks: Generate test datasets
- Evaluations: Run evaluations