\ In today's AI-driven world, search is not just a feature; it is the core of how we interact with information. But have you ever searched for a concept, only to end up frustrated because the results match your exact keywords while missing their actual meaning? For example, a search for "tips for new dog owners" might miss a great article titled "A Guide to Your First Canine Companion." This is the classic limitation of traditional keyword search.
The solution isn't to abandon keywords but to enhance them. Hybrid Search is a modern technique that delivers the best of both worlds: the precision of keyword matching and the contextual understanding of modern AI.
This article will walk you through not just the what and why, but the how, with a complete, hands-on implementation using the open-source vector database Milvus.
\
Imagine you are searching for “fast running shoes” on an e-commerce site. A traditional search will instantly list products whose names match “fast”, “running”, and “shoes”, but it will miss products labeled “sneakers” or described as “swift”, “quick”, or “athletic footwear”.
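To make that miss concrete, here is a minimal, purely illustrative sketch of keyword scoring by word overlap (the function name and toy data are ours, not from any library):

```python
# Hypothetical sketch: why exact keyword matching misses synonyms.
# Scores each product title by counting how many query words it contains.
def keyword_score(query, title):
    query_words = set(query.lower().split())
    title_words = set(title.lower().split())
    return len(query_words & title_words)

query = "fast running shoes"
titles = [
    "fast running shoes for men",   # exact wording
    "swift athletic sneakers",      # same meaning, zero word overlap
]
scores = [keyword_score(query, t) for t in titles]
print(scores)  # → [3, 0]: the synonym-only product is invisible to keyword search
```

The second product is exactly what the shopper wants, yet it scores zero because not a single query word appears in its title.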
Hybrid search doesn't force you to choose between lexical and semantic search. It brings them together, creating a search experience that is both precise and context-aware, and that delivers far more relevant results.
\
Before we start building, let us gather our tools:
```shell
pip install pymilvus
```

Every database needs a blueprint for the data it stores. In Milvus, this is called a schema. For hybrid search, our blueprint needs to specify fields for our text, its dense (semantic) vector, and its sparse (lexical) vector.
```python
from pymilvus import Collection, FieldSchema, CollectionSchema, DataType, connections

# Connect to Milvus instance (set host as needed)
connections.connect("default", host='localhost', port='19530')

# 1. Define Fields
id_field = FieldSchema(name="id", dtype=DataType.INT64, is_primary=True, auto_id=True)
text_field = FieldSchema(name="text", dtype=DataType.VARCHAR, max_length=2048)
# Dense vector field (e.g., 768 dimensions for BGE models)
dense_vector_field = FieldSchema(name="dense_vector", dtype=DataType.FLOAT_VECTOR, dim=768)
# Sparse vector field (for Splade/BM25-style sparse representations)
sparse_vector_field = FieldSchema(name="sparse_vector", dtype=DataType.SPARSE_FLOAT_VECTOR)

# 2. Define the Schema
schema = CollectionSchema(
    fields=[id_field, text_field, dense_vector_field, sparse_vector_field],
    description="Collection for hybrid search implementation"
)

# 3. Create the Collection
collection_name = "hybrid_search_articles"
collection = Collection(name=collection_name, schema=schema)
print(f"Collection '{collection_name}' created successfully.")
```
\
If a schema is a blueprint, an index is the super-fast table of contents. To get optimal performance, we need to tell Milvus how to organize our different vector types.
```python
# Create index for the dense vector field
dense_index_params = {
    "index_type": "AUTOINDEX",
    "metric_type": "COSINE",  # Common metric for semantic search
    "params": {}
}
collection.create_index("dense_vector", dense_index_params)

# Create index for the sparse vector field
sparse_index_params = {
    "index_type": "SPARSE_INVERTED_INDEX",
    "metric_type": "IP",  # Inner Product is standard for sparse vectors
    "params": {}
}
collection.create_index("sparse_vector", sparse_index_params)

print("Indexes created for dense and sparse fields.")
```
\
Now we can populate our collection with data. We will take our text documents, use our embedding model to generate both dense and sparse vectors for each, and insert them into Milvus.
The following code uses a mock function to generate vectors. In a real-world application, you would replace this with calls to your actual AI model.
```python
# This is for demo purposes only. You must generate these vectors
# using your specific AI models.
import random
import numpy as np

def generate_mock_embeddings(texts):
    # In a real app, replace with calls to your model endpoint
    dense = [np.random.rand(768).tolist() for _ in texts]
    # Sparse vectors are dictionary representations of indices/values
    sparse = [{random.randint(0, 5000): random.random() for _ in range(10)} for _ in texts]
    return dense, sparse

# ----------------------------
texts = [
    "Milvus is a vector database.",
    "Hybrid search is powerful.",
    "Semantic search uses AI.",
    "Keyword search is traditional.",
]
dense_vecs, sparse_vecs = generate_mock_embeddings(texts)

data_to_insert = [
    {"text": t, "dense_vector": d, "sparse_vector": s}
    for t, d, s in zip(texts, dense_vecs, sparse_vecs)
]

collection.insert(data_to_insert)
collection.load()  # Load collection into memory for searching
print(f"Inserted {len(data_to_insert)} records and loaded collection.")
```
\
This is where the magic happens. We will take a user query, generate both dense and sparse vectors for it (aka inference), and then ask Milvus to perform two searches in parallel. Milvus then uses a reranker to intelligently fuse the two sets of results into a single, highly relevant list.
The most common reranker is Reciprocal Rank Fusion (RRF), which smartly combines the rankings from both searches without needing complex manual tuning.
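To see what RRF actually computes, here is a minimal, self-contained sketch of the formula it applies: each document's fused score is the sum, over the result lists it appears in, of 1 / (k + rank). The constant k = 60 is the value commonly used in the RRF literature (an assumption here; the function name and toy rankings are ours for illustration):

```python
# Minimal sketch of Reciprocal Rank Fusion (RRF):
# score(doc) = sum over result lists of 1 / (k + rank_in_that_list)
def rrf_fuse(rankings, k=60):
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first
    return sorted(scores, key=scores.get, reverse=True)

dense_top = ["doc_a", "doc_b", "doc_c"]    # ranking from semantic search
sparse_top = ["doc_b", "doc_d", "doc_a"]   # ranking from keyword search
print(rrf_fuse([dense_top, sparse_top]))   # → ['doc_b', 'doc_a', 'doc_d', 'doc_c']
```

Notice that doc_b wins: appearing near the top of both lists beats being first in only one, which is exactly the behavior that makes RRF a robust default.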
```python
from pymilvus import AnnSearchRequest, RRFRanker, WeightedRanker

# Generate query vectors the same way we generated the data vectors
query_text = "What is a vector database?"
# Use your models to get these vectors:
query_dense_vector, query_sparse_vector = generate_mock_embeddings([query_text])

# 1. Define the Dense Search Request
req_dense = AnnSearchRequest(
    data=query_dense_vector,   # Your query vector(s)
    anns_field="dense_vector",
    param={"metric_type": "COSINE", "params": {"nprobe": 10}},
    limit=10                   # Get top 10 from dense search
)

# 2. Define the Sparse Search Request
req_sparse = AnnSearchRequest(
    data=query_sparse_vector,  # Your query sparse vector(s)
    anns_field="sparse_vector",
    param={"metric_type": "IP", "params": {}},
    limit=10                   # Get top 10 from sparse search
)

# 3. Define the Reranker: RRF dynamically fuses the two rankings
rerank = RRFRanker()
# Optional: use WeightedRanker to explicitly bias towards semantic (0.7, 0.3)
# rerank = WeightedRanker(0.7, 0.3)

# 4. Execute the Hybrid Search
results = collection.hybrid_search(
    reqs=[req_dense, req_sparse],
    rerank=rerank,
    limit=5,                   # Final number of results to return
    output_fields=["text"]
)

# 5. Process and display results
print("\nHybrid Search Results:")
for hit in results[0]:  # results[0] because we provided one query vector
    print(f"ID: {hit.id} | Score (RRF): {hit.distance:.4f} | Text: {hit.entity.get('text')}")
```
\
Implementing the code is just the beginning. Building a truly exceptional search experience also takes ongoing tuning and evaluation.
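A common way to guide that tuning is to measure retrieval quality offline with recall@k against a small hand-labeled query set, then compare hybrid search to each method alone. A minimal sketch (the function name and toy data below are hypothetical, not part of any library):

```python
# Hypothetical sketch: offline evaluation with recall@k. For each query you
# hand-label the relevant document IDs, run your search, and check what
# fraction of the relevant docs appear in the top-k results.
def recall_at_k(retrieved, relevant, k):
    hits = len(set(retrieved[:k]) & set(relevant))
    return hits / len(relevant)

# Labeled set: query -> relevant doc IDs (toy data for illustration)
labels = {"vector database": [1, 4], "hybrid search": [2]}
# Pretend these ID lists came from collection.hybrid_search(...)
retrieved = {"vector database": [4, 7, 1], "hybrid search": [5, 2, 9]}

avg = sum(recall_at_k(retrieved[q], rel, k=3) for q, rel in labels.items()) / len(labels)
print(f"mean recall@3 = {avg:.2f}")  # → mean recall@3 = 1.00
```

Running the same harness with dense-only, sparse-only, and hybrid requests tells you concretely how much the fusion is buying you on your own data.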
By combining the strengths of lexical and semantic search, you can build an intelligent, intuitive, and highly effective search solution that understands user intent, not just keywords. You now have the blueprint and the code to implement it yourself. Happy building!
\


