What Are Vector Databases?
Vector databases store and search high-dimensional vectors—numerical representations of data like text, images, or audio. They're the foundation of modern AI search, RAG systems, and recommendation engines.
Why Vector Databases Matter
The Embedding Revolution
```
Traditional Search:
  "machine learning" → keyword match → results

Vector Search:
  "machine learning" → [0.1, 0.5, 0.8, ...] → similarity search
                        (embedding captures meaning)
                      → semantically similar results
```
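To make the contrast concrete, here is a minimal sketch of vector search in plain Python. It assumes the sentence-transformers package and the open all-MiniLM-L6-v2 model (384-dimensional embeddings); any embedding model works the same way.

```python
# pip install sentence-transformers numpy
import numpy as np
from sentence_transformers import SentenceTransformer

# Assumption: the open all-MiniLM-L6-v2 model (384-dim embeddings).
model = SentenceTransformer("all-MiniLM-L6-v2")

documents = [
    "Neural networks learn patterns from data",
    "The stock market closed higher today",
    "Gradient descent optimizes model parameters",
]
doc_vectors = model.encode(documents)            # shape: (3, 384)
query_vector = model.encode("machine learning")  # shape: (384,)

# Cosine similarity: dot product divided by the vectors' lengths.
def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

scores = [cosine(query_vector, d) for d in doc_vectors]
best = int(np.argmax(scores))
print(documents[best])  # the ML-related sentences score highest, despite sharing no keywords
```

A vector database does the same similarity lookup, but over millions of stored vectors with an index instead of a Python loop.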
Use Cases
| Use Case | Description |
|---|---|
| RAG | Retrieve context for LLMs |
| Semantic search | Find by meaning, not keywords |
| Recommendations | Similar products, content |
| Anomaly detection | Find outliers |
| Duplicate detection | Similar images, text |
| Personalization | User preference matching |
How They Work
Core Concepts
| Concept | Description |
|---|---|
| Vector | Array of floats representing data |
| Embedding | Vector from ML model |
| Dimension | Number of floats (e.g., 1536) |
| Similarity | How close vectors are |
| Index | Data structure for fast search |
Indexing Algorithms
| Algorithm | Characteristics | Best For |
|---|---|---|
| HNSW (graph-based) | Fast, accurate approximate search | General use |
| IVF (inverted file) | Memory efficient, clusters vectors | Large datasets |
| Flat (brute force) | Exhaustive search, perfect accuracy | Small datasets |
| PQ (product quantization) | Heavily compressed vectors | Very large datasets where some accuracy loss is acceptable |
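As a rough illustration of the Flat-vs-approximate tradeoff, here is a sketch using the faiss library (an assumption on our part; the hosted databases above build comparable indexes internally). IndexFlatL2 performs exhaustive, exact search; IndexHNSWFlat builds an HNSW graph for approximate but much faster search on large collections.

```python
# pip install faiss-cpu numpy
import numpy as np
import faiss

dim = 128
vectors = np.random.rand(10_000, dim).astype("float32")
query = np.random.rand(1, dim).astype("float32")

# Flat: brute-force, exact nearest neighbors. Fine for small datasets.
flat = faiss.IndexFlatL2(dim)
flat.add(vectors)
exact_dist, exact_ids = flat.search(query, 5)

# HNSW: graph-based approximate search. Scales to much larger datasets.
hnsw = faiss.IndexHNSWFlat(dim, 32)  # 32 = graph connectivity (M)
hnsw.add(vectors)
approx_dist, approx_ids = hnsw.search(query, 5)

print("exact:", exact_ids[0], "approx:", approx_ids[0])
```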
Similarity Metrics
| Metric | Formula | Use Case |
|---|---|---|
| Cosine | A·B / (‖A‖‖B‖) — angle between vectors | Text, normalized embeddings |
| Euclidean | ‖A − B‖ — straight-line distance between points | General purpose |
| Dot Product | A·B — magnitude-sensitive similarity | Recommendations |
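A quick numpy sketch of the three metrics, to show how they relate (cosine is just the dot product of length-normalized vectors):

```python
import numpy as np

a = np.array([0.1, 0.5, 0.8])
b = np.array([0.2, 0.4, 0.9])

dot = np.dot(a, b)                                      # dot product: magnitude-sensitive
euclidean = np.linalg.norm(a - b)                       # straight-line distance
cosine = dot / (np.linalg.norm(a) * np.linalg.norm(b))  # angle only, ignores magnitude

print(f"dot={dot:.3f}  euclidean={euclidean:.3f}  cosine={cosine:.3f}")
# For unit-length (normalized) vectors, ranking by cosine, dot product,
# and Euclidean distance all produce the same ordering.
```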
Database Comparison
Overview
| Database | Type | Pricing | Best For |
|---|---|---|---|
| Pinecone | Managed | $$$ | Production, no ops |
| Weaviate | Open/Managed | Free-$$$ | Flexibility |
| Chroma | Open source | Free | Local, prototyping |
| Qdrant | Open/Managed | Free-$$ | Performance |
| Milvus | Open source | Free | Large scale |
| pgvector | Postgres extension | Free | Existing Postgres |
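For the pgvector row, the workflow is plain SQL inside an existing Postgres database. A minimal sketch via psycopg2 (the connection string, table, and 3-dimensional vectors are placeholders for illustration):

```python
# pip install psycopg2-binary  (assumes Postgres with the pgvector extension available)
import psycopg2

conn = psycopg2.connect("dbname=mydb user=me")  # placeholder connection string
cur = conn.cursor()

cur.execute("CREATE EXTENSION IF NOT EXISTS vector;")
cur.execute("CREATE TABLE IF NOT EXISTS docs (id bigserial PRIMARY KEY, title text, embedding vector(3));")
cur.execute("INSERT INTO docs (title, embedding) VALUES (%s, %s::vector)",
            ("doc1", "[0.1, 0.2, 0.3]"))

# <-> is pgvector's L2-distance operator (<=> is cosine distance, <#> negative inner product).
cur.execute("SELECT title FROM docs ORDER BY embedding <-> %s::vector LIMIT 5",
            ("[0.1, 0.3, 0.2]",))
print(cur.fetchall())
conn.commit()
```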
Feature Comparison
| Feature | Pinecone | Weaviate | Chroma | Qdrant |
|---|---|---|---|---|
| Managed option | ✅ | ✅ | ❌ | ✅ |
| Self-hosted | ❌ | ✅ | ✅ | ✅ |
| Hybrid search | ✅ | ✅ | ✅ | ✅ |
| Filtering | ✅ | ✅ | ✅ | ✅ |
| Built-in embedding | ❌ | ✅ | ✅ | ❌ |
| Maximum vectors | 10B+ | 100B+ | Millions | Billions |
Detailed Analysis
Pinecone
- Pros: Fully managed, highly reliable, scales well
- Cons: Expensive, cloud-only, vendor lock-in
- Pricing: $0.096/1M vectors/month + operations
Weaviate
- Pros: Open source, GraphQL API, built-in vectorization
- Cons: More complex setup
- Pricing: Free (self-hosted), cloud starting $25/mo
Chroma
- Pros: Simple, great for development, free
- Cons: Not production-ready for large scale
- Pricing: Free (open source)
Qdrant
- Pros: Extremely fast, Rust-based, good cloud offering
- Cons: Smaller community
- Pricing: Free (self-hosted), cloud starting $25/mo
Implementation
Quick Start with Python
Pinecone:
```python
from pinecone import Pinecone

pc = Pinecone(api_key="...")
index = pc.Index("my-index")

# Upsert vectors
index.upsert(vectors=[
    {"id": "doc1", "values": [0.1, 0.2, ...], "metadata": {"title": "..."}},
])

# Query
results = index.query(vector=[0.1, 0.3, ...], top_k=5)
```
Weaviate:
```python
import weaviate

client = weaviate.Client("http://localhost:8080")

# Add data
client.data_object.create(
    data_object={"title": "..."},
    class_name="Document",
    vector=[0.1, 0.2, ...]
)

# Query
result = client.query.get("Document", ["title"]).with_near_vector({
    "vector": [0.1, 0.3, ...]
}).with_limit(5).do()
```
Chroma:
```python
import chromadb

client = chromadb.Client()
collection = client.create_collection("my-collection")

# Add documents (auto-embeds with default model)
collection.add(
    documents=["doc1 text", "doc2 text"],
    ids=["doc1", "doc2"]
)

# Query (auto-embeds query)
results = collection.query(
    query_texts=["search query"],
    n_results=5
)
```
Architecture Patterns
RAG with Vector DB
```
Document Ingestion:
  Documents → Chunking → Embedding → Vector DB

Query Time:
  Query → Embedding → Vector Search → Top-K Chunks
                                           ↓
                                  LLM + Context → Response
```
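A compact sketch of this pattern using Chroma (from the Quick Start above) as the vector store; the chunking is deliberately naive, and answer_with_llm is a placeholder for whichever LLM client you use:

```python
import chromadb

client = chromadb.Client()
collection = client.get_or_create_collection("knowledge-base")

# Ingestion: fixed-size chunking, then store (Chroma embeds automatically).
def chunk(text, size=500):
    return [text[i:i + size] for i in range(0, len(text), size)]

doc = "..."  # your source document
pieces = chunk(doc)
collection.add(documents=pieces, ids=[f"chunk-{i}" for i in range(len(pieces))])

# Query time: retrieve top-k chunks and hand them to the LLM as context.
question = "What does the document say about X?"
hits = collection.query(query_texts=[question], n_results=3)
context = "\n\n".join(hits["documents"][0])

prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
# response = answer_with_llm(prompt)  # placeholder: call your LLM of choice here
```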
Hybrid Search
Combine vector + keyword search:
Query: "Apple financial report 2024"
Vector Search:
→ Semantically similar finance documents
Keyword Search:
→ Documents containing "Apple" and "2024"
Fusion:
→ Combine scores for best results
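One common way to fuse the two result lists is Reciprocal Rank Fusion (RRF). A minimal sketch (the input rankings here are made up for illustration):

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Combine ranked result lists; k=60 is the conventional damping constant."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

vector_hits = ["doc3", "doc1", "doc7"]   # hypothetical vector-search ranking
keyword_hits = ["doc1", "doc9", "doc3"]  # hypothetical keyword-search ranking
print(reciprocal_rank_fusion([vector_hits, keyword_hits]))
# doc1 and doc3 rise to the top because both searches agree on them
```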
Performance Optimization
Tips
| Optimization | Impact |
|---|---|
| Batch operations | Up to ~10x throughput vs. one-at-a-time upserts/queries |
| Dimension reduction | Smaller vectors, faster queries (some accuracy loss) |
| Filtering strategy | Restricts the search space with metadata filters |
| Index tuning | Trade accuracy for speed (e.g., HNSW ef, IVF nprobe) |
| Caching | Serves repeated queries without recomputation |
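For the batching tip, a generic sketch (reusing the Pinecone-style index from the Quick Start; the batch size of 100 is an arbitrary choice):

```python
def batched_upsert(index, vectors, batch_size=100):
    """Upsert in chunks instead of one call per vector; far fewer network round trips."""
    for start in range(0, len(vectors), batch_size):
        index.upsert(vectors=vectors[start:start + batch_size])

# Example usage (embed() is a placeholder for your embedding call):
# vectors = [{"id": f"doc{i}", "values": embed(text)} for i, text in enumerate(texts)]
# batched_upsert(index, vectors)
```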
Scaling Considerations
| Scale | Recommendation |
|---|---|
| < 100K vectors | Any solution works |
| 100K - 10M | Managed or self-hosted |
| 10M - 1B | Scaled managed solution |
| > 1B | Purpose-built infrastructure |
Selection Guide
By Use Case
| Use Case | Recommendation |
|---|---|
| Rapid prototyping | Chroma |
| Production RAG | Pinecone or Weaviate Cloud |
| Maximum control | Weaviate or Qdrant self-hosted |
| Existing Postgres | pgvector |
| Massive scale | Milvus or Pinecone |
By Team
| Team Type | Recommendation |
|---|---|
| Solo developer | Chroma or Supabase (pgvector) |
| Small startup | Weaviate Cloud or Qdrant Cloud |
| Enterprise | Pinecone or Weaviate Enterprise |
| Budget-conscious | Self-hosted Weaviate/Qdrant |
Future Trends
What's Coming
- Multimodal vectors: Images, video, audio unified
- Graph + vector: Combined capabilities
- Edge deployment: On-device vector search
- Auto-optimization: Self-tuning indexes
- Serverless: True pay-per-query
"Vector databases are becoming essential infrastructure for AI applications. Just as traditional databases store structured data, vector databases store understanding."