Skip to content

Vector Database for Face Search

Definition

Vector databases enable fast similarity search across millions of face embeddings — powering 1:N face deduplication in eKYC.


Technology Options

Database Type Search Speed (1M vectors) Scaling
FAISS Library (Facebook) < 5ms In-memory, sharding manual
Milvus Purpose-built vector DB < 10ms Distributed, cloud-native
Pinecone Managed service < 20ms Fully managed
Qdrant Open-source vector DB < 10ms Distributed
Weaviate Vector DB + object storage < 15ms Managed or self-hosted
pgvector PostgreSQL extension < 50ms PostgreSQL scaling

Index Types

Index How It Works Best For
IVF (Inverted File) Cluster vectors, search nearest clusters 1-10M vectors
HNSW Hierarchical graph-based search Low-latency, < 10M
IVF + PQ IVF with product quantization compression 10M-1B vectors
DiskANN Disk-based index > 1B vectors, cost-optimized

Key Takeaways

Summary

  • FAISS (IVF+PQ) is the standard for high-performance face search — < 5ms for millions of vectors
  • Milvus is the best managed vector database for production eKYC deduplication
  • Index choice depends on scale: HNSW (< 10M, lowest latency) vs IVF+PQ (> 10M, memory-efficient)
  • At Aadhaar scale (1.4B), custom sharding across multiple FAISS/Milvus clusters is needed