Vector Databases

Connect ChromaDB, Pinecone, Weaviate, and more

Last updated: August 19, 2025
Category: data-sources

Vector Database Connectors

Connect Vecta to your vector database. All connectors use a configurable schema that tells Vecta where to find each chunk's ID, content, source path, and page numbers in your data.
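The schema examples on this page use dotted accessor strings such as "metadata.source". As a rough illustration of the idea only, the sketch below resolves such a path against a nested record. The helper `resolve_accessor` and the sample record are hypothetical and are not part of Vecta; the library's real accessor syntax may support more than plain dotted keys.

```python
from typing import Any


def resolve_accessor(record: dict[str, Any], accessor: str) -> Any:
    """Follow a dot-separated path through nested dicts.

    Hypothetical helper for illustration; returns None if any step is missing.
    """
    current: Any = record
    for key in accessor.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


# Sample record shaped like the accessors used in the examples on this page.
record = {
    "id": "chunk-1",
    "content": "Vector databases store embeddings.",
    "metadata": {"source": "docs/intro.pdf", "pages": [1, 2]},
}

source = resolve_accessor(record, "metadata.source")
pages = resolve_accessor(record, "metadata.pages")
missing = resolve_accessor(record, "metadata.author")  # path does not exist

print(source, pages, missing)
```

A path that does not exist resolves to None here, which mirrors the "Schema extraction returns None" symptom listed under Troubleshooting.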

Quick Examples

ChromaDB Cloud

from vecta import VectaAPIClient

client = VectaAPIClient(api_key="your-key")

db = client.connect_chroma_cloud(
    tenant="your-tenant",
    database="your-db",
    api_key="your-chroma-key",
    collection_name="documents"
)

Pinecone

# Reuses the VectaAPIClient instance created in the ChromaDB Cloud example
db = client.connect_pinecone(
    api_key="your-pinecone-key",
    index_name="your-index",
    namespace=""  # optional
)

PostgreSQL + pgvector

from vecta.connectors.pgvector_connector import PgVectorConnector
from vecta.core.schema_helpers import SchemaTemplates

connector = PgVectorConnector(
    dsn="postgresql://user:pass@host:5432/db",
    table="chunks",
    schema=SchemaTemplates.pgvector_standard()
)

All Supported Databases

ChromaDB (Local)

from vecta import VectaClient
from vecta.connectors.chroma_local_connector import ChromaLocalConnector
from vecta.core.schema_helpers import SchemaTemplates
from chromadb import Client

chroma = Client()
connector = ChromaLocalConnector(
    client=chroma,
    collection_name="documents",
    schema=SchemaTemplates.chroma_default()
)

vecta = VectaClient(vector_db_connector=connector)

Weaviate

from vecta.connectors.weaviate_connector import WeaviateConnector
from vecta.core.schema_helpers import SchemaTemplates

connector = WeaviateConnector(
    cluster_url="https://your-cluster.weaviate.cloud",
    api_key="your-key",
    collection_name="Documents",
    schema=SchemaTemplates.weaviate_default()
)

Azure Cosmos DB

from vecta.connectors.azure_cosmos_connector import AzureCosmosConnector
from vecta.core.schemas import VectorDBSchema

connector = AzureCosmosConnector(
    endpoint="https://your-account.documents.azure.com:443/",
    key="your-key",
    database_name="your-db",
    container_name="chunks",
    schema=VectorDBSchema(
        id_accessor="id",
        content_accessor="content",
        source_path_accessor="metadata.source",
        page_nums_accessor="metadata.pages"
    )
)

Databricks

from vecta.connectors.databricks_connector import DatabricksConnector
from vecta.core.schema_helpers import SchemaTemplates

connector = DatabricksConnector(
    workspace_url="https://your-workspace.cloud.databricks.com",
    index_name="your_index",
    personal_access_token="your-token",
    schema=SchemaTemplates.databricks_indexed()
)

Framework Integrations

LangChain

from vecta.connectors.langchain_connector import LangChainVectorStoreConnector
from vecta.core.schema_helpers import SchemaTemplates

# Works with any LangChain VectorStore
connector = LangChainVectorStoreConnector(
    vectorstore=your_langchain_store,
    schema=SchemaTemplates.chroma_default()  # Adjust for your store
)

LlamaIndex

from vecta.connectors.llama_index_connector import LlamaIndexConnector
from vecta.core.schemas import VectorDBSchema

connector = LlamaIndexConnector(
    index=your_llama_index,
    schema=VectorDBSchema(
        id_accessor="node_id",
        content_accessor="content",
        source_path_accessor="metadata.file_name",
        page_nums_accessor="metadata.pages"
    )
)

Custom Connectors

Build a connector for any database:

from vecta.connectors.base import BaseVectorDBConnector
from vecta.core.schemas import ChunkData, VectorDBSchema

class MyConnector(BaseVectorDBConnector):
    """Adapter around your database client; fetch_all, search, and get stand in for its real methods."""

    def __init__(self, client, schema: VectorDBSchema):
        super().__init__(schema)
        self.client = client

    def get_all_chunks(self) -> list[ChunkData]:
        # Fetch every stored record and convert each one to ChunkData
        results = self.client.fetch_all()
        return [self._create_chunk_data_from_raw(r) for r in results]

    def semantic_search(self, query: str, k: int) -> list[ChunkData]:
        # Return up to k records that the database ranks as closest to the query
        results = self.client.search(query, limit=k)
        return [self._create_chunk_data_from_raw(r) for r in results]

    def get_chunk_by_id(self, chunk_id: str) -> ChunkData:
        # Look up a single record by its ID
        result = self.client.get(chunk_id)
        return self._create_chunk_data_from_raw(result)
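Before wiring a custom connector to a live database, it can help to exercise it against a small in-memory stand-in. The sketch below is one possible shape for such a stand-in, assuming the client only needs the three methods called in the connector above (`fetch_all`, `search`, `get`). `InMemoryClient` is hypothetical and not part of Vecta, and its word-overlap ranking is a placeholder for real vector similarity.

```python
class InMemoryClient:
    """Hypothetical test double exposing fetch_all, search, and get."""

    def __init__(self, records: list[dict]):
        # Index records by ID so lookups by chunk ID are direct.
        self._records = {r["id"]: r for r in records}

    def fetch_all(self) -> list[dict]:
        return list(self._records.values())

    def search(self, query: str, limit: int) -> list[dict]:
        # Rank by the number of words shared with the query.
        # This is NOT semantic search, only a deterministic placeholder.
        terms = set(query.lower().split())
        ranked = sorted(
            self._records.values(),
            key=lambda r: len(terms & set(r["content"].lower().split())),
            reverse=True,
        )
        return ranked[:limit]

    def get(self, chunk_id: str) -> dict:
        return self._records[chunk_id]


client = InMemoryClient([
    {"id": "a", "content": "pgvector stores embeddings in postgres",
     "metadata": {"source": "pg.md", "pages": [1]}},
    {"id": "b", "content": "pinecone is a managed vector index",
     "metadata": {"source": "pc.md", "pages": [2]}},
])

top = client.search("postgres embeddings", limit=1)
print(top[0]["id"])
```

An instance like this could be passed as the `client` argument of a custom connector in a unit test, so that schema mapping errors surface without network access.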

Schema Templates

Pre-built schemas for popular databases:

from vecta.core.schema_helpers import SchemaTemplates

# ChromaDB
schema = SchemaTemplates.chroma_default()

# Pinecone
schema = SchemaTemplates.pinecone_default()

# PostgreSQL (standard - metadata as JSON column)
schema = SchemaTemplates.pgvector_standard()

# PostgreSQL (flat - separate columns)
schema = SchemaTemplates.pgvector_flat(
    id_col="id",
    content_col="content",
    source_path_col="file_name",
    page_nums_col="pages"
)

# Weaviate
schema = SchemaTemplates.weaviate_default()

# Databricks
schema = SchemaTemplates.databricks_indexed()

Troubleshooting

Connection fails:

  • Verify credentials and URLs
  • Check network access/firewall rules
  • Ensure database/collection exists
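To separate network problems from credential problems, a plain TCP check against the endpoint can be a useful first step. The sketch below uses only the Python standard library; the helper names `host_and_port` and `can_connect` and the default-port table are assumptions made for illustration, not part of Vecta. A successful TCP connection says nothing about whether the credentials or collection name are valid.

```python
import socket
from urllib.parse import urlsplit

# Assumed default ports for URLs that omit one.
DEFAULT_PORTS = {"https": 443, "http": 80, "postgresql": 5432}


def host_and_port(url: str) -> tuple[str, int]:
    """Extract the host and port from a URL or DSN."""
    parts = urlsplit(url)
    if parts.hostname is None:
        raise ValueError(f"no host found in {url!r}")
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    if port is None:
        raise ValueError(f"no port given and no default known for {parts.scheme!r}")
    return parts.hostname, port


def can_connect(url: str, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to the endpoint opens within the timeout."""
    host, port = host_and_port(url)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


print(host_and_port("postgresql://user:pass@host:5432/db"))
```

If `can_connect` returns False for your endpoint, firewall rules or the URL itself are the likelier cause; if it returns True, check credentials and the database or collection name next.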

Schema extraction returns None:

  • Print a sample record to see structure
  • Adjust accessor syntax to match your data
  • See Accessor Syntax
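One way to inspect a sample record is to list every dotted path it contains, then compare those paths with the accessors in your schema. The sketch below does this for nested dictionaries. The helper `dotted_paths` and the sample record are hypothetical, and the sketch assumes records can be represented as plain dicts.

```python
from typing import Any, Iterator


def dotted_paths(value: Any, prefix: str = "") -> Iterator[tuple[str, str]]:
    """Yield (path, type name) for every leaf reachable through nested dicts."""
    if isinstance(value, dict) and value:
        for key, child in value.items():
            path = f"{prefix}.{key}" if prefix else str(key)
            yield from dotted_paths(child, path)
    else:
        # Leaves include scalars, lists, and empty dicts.
        yield prefix, type(value).__name__


# Replace this with one raw record fetched from your own database.
sample = {
    "id": "chunk-1",
    "content": "Vector databases store embeddings.",
    "metadata": {"source": "docs/intro.pdf", "pages": [1, 2]},
}

paths = dict(dotted_paths(sample))
for path, type_name in paths.items():
    print(f"{path}: {type_name}")
```

Any accessor in your schema that does not appear in the printed paths is a likely cause of a None result.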

Missing metadata:

  • Ensure chunks include source_path and page_nums
  • Add metadata when creating/updating chunks
  • Use schema defaults if metadata is inconsistent across chunks
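If you control ingestion, another option is to normalize metadata before chunks are written. The sketch below fills in `source_path` and `page_nums` when they are absent or None. The helper `with_metadata_defaults`, the placement of these fields under a `metadata` key, and the chosen default values are all assumptions for illustration; adapt them to your own record layout.

```python
def with_metadata_defaults(record: dict, defaults: dict) -> dict:
    """Return a copy of record whose metadata has defaults for absent or None fields."""
    metadata = dict(record.get("metadata") or {})
    for key, value in defaults.items():
        if metadata.get(key) is None:
            metadata[key] = value
    # Build a new dict so the caller's record is left untouched.
    return {**record, "metadata": metadata}


raw = {"id": "chunk-7", "content": "text", "metadata": {"source_path": "a.pdf"}}
fixed = with_metadata_defaults(raw, {"source_path": "unknown", "page_nums": []})

print(fixed["metadata"])
```

Existing values are kept and only the gaps are filled, so well-formed chunks pass through unchanged.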
