Retrieval Module
Retrieval utilities for LLM applications.
This module provides comprehensive retrieval tools for RAG (Retrieval-Augmented Generation):
- Query Processing:
rewrite_query() - Rewrite queries for better retrieval
expand_query() - Expand queries into multiple variations
generate_sub_queries() - Break complex queries into sub-queries
- Search Methods:
keyword_search() - BM25-like keyword search
semantic_search() - Embedding-based semantic search
hybrid_search() - Combined keyword + semantic search
- Re-ranking:
rerank_results() - Re-rank results by relevance, recency, popularity, diversity
reciprocal_rank_fusion() - Combine multiple result lists
diversify_results() - Apply MMR for result diversity
- Context Management:
compress_context() - Compress results to fit token limits
filter_results() - Filter by score, metadata, deduplication
- Formatting:
format_results() - Format results for display
results_to_context() - Convert results to LLM context string
- Data Classes:
Document - Represents a document with content and metadata (from core.types)
SearchResult - Represents a ranked search result
HybridSearchConfig - Configuration for hybrid search
FilterConfig - Configuration for result filtering
- Submodules:
query - Query processing utilities
search - Search methods (keyword, semantic, hybrid)
reranking - Re-ranking and fusion utilities
context - Context compression and filtering
formatting - Result formatting utilities
structures - Data structures and configuration classes
- class kerb.retrieval.Document(content, metadata=<factory>, id=None, source=None, format=DocumentFormat.UNKNOWN, score=0.0, page_content=None)
Bases: object
Universal document representation across the toolkit.
Consolidates the Document classes from document/ and retrieval/ packages to provide a single, consistent document representation.
- content
The text content of the document
- metadata
Additional metadata about the document
- id
Optional unique identifier for the document
- source
Optional source path or URL where document was loaded from
- format
Document format (defaults to UNKNOWN)
- score
Relevance score (used in retrieval contexts, defaults to 0.0)
- page_content
Optional list of content per page (for multi-page documents)
Examples
>>> # Simple document
>>> doc = Document(content="Hello, world!")

>>> # Document with metadata
>>> doc = Document(
...     content="Important document",
...     metadata={"author": "John", "created": "2025-01-01"},
...     source="doc.txt"
... )

>>> # Retrieval result with score
>>> doc = Document(
...     id="doc_123",
...     content="Relevant content",
...     score=0.95
... )
- format: DocumentFormat = 'unknown'
- __init__(content, metadata=<factory>, id=None, source=None, format=DocumentFormat.UNKNOWN, score=0.0, page_content=None)
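The `metadata=<factory>` default in the signature above is how Python renders a dataclass field declared with a default factory. The sketch below assumes Document is a dataclass (inferred only from that signature) and uses a hypothetical stand-in class, `DocSketch`, to show why a factory matters: each instance gets its own metadata dict rather than sharing one mutable default.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class DocSketch:
    """Hypothetical stand-in for Document; not the kerb class itself."""

    content: str
    # default_factory builds a fresh dict per instance; this is what
    # appears as "<factory>" in the generated __init__ signature.
    metadata: Dict[str, Any] = field(default_factory=dict)
    id: Optional[str] = None
    score: float = 0.0


first = DocSketch(content="first")
second = DocSketch(content="second")

# Mutating one instance's metadata leaves the other untouched.
first.metadata["author"] = "John"
```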
- class kerb.retrieval.SearchResult(document, score, rank, method='unknown')
Bases: object
Represents a search result with relevance information.
- __init__(document, score, rank, method='unknown')
- class kerb.retrieval.HybridSearchConfig(top_k=10, keyword_weight=0.5, semantic_weight=0.5, fusion_method='weighted')
Bases: object
Configuration for hybrid search operations.
- top_k
Number of top results to return
- keyword_weight
Weight for keyword scores (0-1)
- semantic_weight
Weight for semantic scores (0-1)
- fusion_method
Fusion method (FusionMethod enum or string)
- __init__(top_k=10, keyword_weight=0.5, semantic_weight=0.5, fusion_method='weighted')
- class kerb.retrieval.FilterConfig(min_score=None, max_results=None, metadata_filter=None, dedup_threshold=0.9)
Bases: object
Configuration for result filtering operations.
- min_score
Minimum score threshold
- max_results
Maximum number of results
- metadata_filter
Filter by metadata fields
- dedup_threshold
Similarity threshold for deduplication (0-1)
- __init__(min_score=None, max_results=None, metadata_filter=None, dedup_threshold=0.9)
- kerb.retrieval.rewrite_query(query, style='clear', max_length=None)
Rewrite a query for better retrieval.
- Parameters:
- Returns:
Rewritten query
- Return type:
Examples
>>> from kerb.core.enums import QueryStyle
>>> rewritten = rewrite_query("python async", style=QueryStyle.DETAILED)
- kerb.retrieval.expand_query(query, expansions=None, method='synonyms')
Expand a query into multiple variations for broader retrieval.
- Parameters:
- Returns:
List of query variations
- Return type:
Examples
>>> from kerb.core.enums import ExpansionMethod
>>> queries = expand_query("machine learning", method=ExpansionMethod.SYNONYMS)
- kerb.retrieval.generate_sub_queries(query, max_queries=3)
Generate sub-queries from a complex query for step-by-step retrieval.
- Parameters:
- Returns:
List of sub-queries
- Return type:
Example
>>> generate_sub_queries("How to implement authentication in a Python FastAPI app?")
["What is authentication?", "How to use FastAPI?", "Python authentication methods"]
- kerb.retrieval.keyword_search(query, documents, top_k=10, field='content')
Perform keyword-based search using BM25-like scoring.
- Parameters:
- Returns:
Ranked search results
- Return type:
Example
>>> docs = [Document(id="1", content="Python is great"), ...]
>>> results = keyword_search("python programming", docs)
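For readers unfamiliar with the "BM25-like scoring" mentioned above, the following is a minimal, self-contained sketch of standard Okapi BM25 over whitespace-tokenized strings. It is not kerb's implementation: the function name `bm25_scores`, the tokenizer, and the `k1`/`b` defaults are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document string against the query with Okapi BM25.

    Hypothetical helper: whitespace tokenization and the k1/b defaults
    are illustrative assumptions, not values taken from kerb.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs if n_docs else 0.0
    # Document frequency: how many documents contain each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))

    scores = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in term_freq:
                continue
            # Rare terms get a higher inverse document frequency.
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            # Length normalization penalizes long documents.
            norm = term_freq[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * term_freq[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


docs = ["python is great", "java is verbose", "python python async programming"]
scores = bm25_scores("python programming", docs)
# Document indices ordered from best to worst match.
ranked = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
```

The third document ranks first here because it matches both query terms, and the document with no matching term scores zero.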
- kerb.retrieval.semantic_search(query_embedding, documents, document_embeddings, top_k=10, similarity_metric='cosine')
Perform semantic search using embeddings.
- Parameters:
- Returns:
Ranked search results
- Return type:
Example
>>> from kerb.embedding import embed
>>> query_emb = embed("python programming")
>>> doc_embs = [embed(doc.content) for doc in docs]
>>> results = semantic_search(query_emb, docs, doc_embs)
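The default `similarity_metric='cosine'` can be illustrated without any embedding model. The sketch below ranks toy two-dimensional vectors by cosine similarity; `cosine_similarity` and `rank_by_cosine` are hypothetical helpers written for this example, and the vectors are made up.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # Treat zero vectors as dissimilar to everything.
    return dot / (norm_a * norm_b)


def rank_by_cosine(query_embedding, document_embeddings, top_k=10):
    """Return (document_index, similarity) pairs, most similar first."""
    scored = [
        (index, cosine_similarity(query_embedding, embedding))
        for index, embedding in enumerate(document_embeddings)
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


query_emb = [1.0, 0.0]
doc_embs = [[0.0, 1.0], [2.0, 0.0], [1.0, 1.0]]
top = rank_by_cosine(query_emb, doc_embs, top_k=2)
```

Cosine similarity ignores vector length, so the second document (`[2.0, 0.0]`) is a perfect match for the query even though its magnitude differs.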
- kerb.retrieval.hybrid_search(query, query_embedding, documents, document_embeddings, top_k=10, keyword_weight=0.5, semantic_weight=0.5, fusion_method='weighted', config=None)
Perform hybrid search combining keyword and semantic search.
- Parameters:
query (str) – Search query text
query_embedding (List[float]) – Embedding vector of the query
document_embeddings (List[List[float]]) – Embedding vectors for documents
top_k (int) – Number of top results to return (ignored if config is provided)
keyword_weight (float) – Weight for keyword scores (ignored if config is provided)
semantic_weight (float) – Weight for semantic scores (ignored if config is provided)
fusion_method (Union[FusionMethod, str]) – Fusion method (ignored if config is provided)
config (Optional[HybridSearchConfig]) – HybridSearchConfig object with all parameters (recommended)
- Returns:
Ranked search results
- Return type:
Examples
>>> # Using config object (recommended)
>>> from kerb.retrieval import HybridSearchConfig
>>> from kerb.core.enums import FusionMethod
>>> config = HybridSearchConfig(
...     top_k=10,
...     keyword_weight=0.4,
...     semantic_weight=0.6,
...     fusion_method=FusionMethod.RRF
... )
>>> results = hybrid_search(
...     query="python async",
...     query_embedding=embed("python async"),
...     documents=docs,
...     document_embeddings=doc_embs,
...     config=config
... )

>>> # Using individual parameters (backward compatible)
>>> results = hybrid_search(
...     query="python async",
...     query_embedding=embed("python async"),
...     documents=docs,
...     document_embeddings=doc_embs,
...     keyword_weight=0.4,
...     semantic_weight=0.6
... )
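One plausible reading of the default `fusion_method='weighted'` is a weighted sum of normalized keyword and semantic scores. The sketch below shows that idea on plain dictionaries. The min-max normalization step and the helper names `min_max_normalize` and `weighted_fusion` are assumptions; the reference does not say how kerb normalizes or combines scores.

```python
def min_max_normalize(scores):
    """Rescale a {doc_id: score} mapping to the 0-1 range."""
    if not scores:
        return {}
    low, high = min(scores.values()), max(scores.values())
    if high == low:
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (score - low) / (high - low) for doc_id, score in scores.items()}


def weighted_fusion(keyword_scores, semantic_scores,
                    keyword_weight=0.5, semantic_weight=0.5, top_k=10):
    """Combine two score maps into one ranked list of (doc_id, score)."""
    keyword = min_max_normalize(keyword_scores)
    semantic = min_max_normalize(semantic_scores)
    fused = {}
    for doc_id in set(keyword) | set(semantic):
        # A document missing from one list contributes 0 for that signal.
        fused[doc_id] = (keyword_weight * keyword.get(doc_id, 0.0)
                         + semantic_weight * semantic.get(doc_id, 0.0))
    # Sort by descending score, breaking ties by id for determinism.
    return sorted(fused.items(), key=lambda pair: (-pair[1], pair[0]))[:top_k]


keyword_scores = {"a": 2.0, "b": 1.0, "c": 0.0}
semantic_scores = {"a": 0.2, "b": 0.9, "d": 0.6}
fused = weighted_fusion(keyword_scores, semantic_scores,
                        keyword_weight=0.4, semantic_weight=0.6)
```

Normalizing first matters because raw keyword scores and embedding similarities live on different scales; without it, one signal can dominate regardless of the weights.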
- kerb.retrieval.rerank_results(query, results, method='relevance', top_k=None, scorer=None)
Re-rank search results using additional signals.
- Parameters:
query (str) – The search query
results (List[SearchResult]) – Initial search results
method (Union[RerankMethod, str]) – Re-ranking method (RerankMethod enum or string: "relevance", "diversity", "mmr", "cross_encoder", "llm")
top_k (Optional[int]) – Number of top results to return after re-ranking
scorer (Optional[Callable[[str, Document], float]]) – Custom scoring function for method="custom"
- Returns:
Re-ranked search results
- Return type:
Examples
>>> from kerb.core.enums import RerankMethod
>>> results = keyword_search("python", docs)
>>> reranked = rerank_results("python", results, method=RerankMethod.MMR)
- kerb.retrieval.reciprocal_rank_fusion(result_lists, k=60, top_k=None)
Combine multiple result lists using Reciprocal Rank Fusion.
- Parameters:
- Returns:
Fused and ranked results
- Return type:
Example
>>> results1 = keyword_search("python", docs)
>>> results2 = semantic_search(embed("python"), docs, embeddings)
>>> fused = reciprocal_rank_fusion([results1, results2])
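Reciprocal Rank Fusion scores each document as the sum of 1 / (k + rank) over every list in which it appears, so it needs only ranks, not comparable scores. The sketch below applies that formula to plain lists of document ids. The helper name `rrf` is hypothetical, and whether kerb breaks ties the same way is not stated in this reference.

```python
from collections import defaultdict


def rrf(result_lists, k=60, top_k=None):
    """Fuse ranked lists of doc ids with Reciprocal Rank Fusion."""
    scores = defaultdict(float)
    for results in result_lists:
        # Ranks are 1-based: the top result contributes 1 / (k + 1).
        for rank, doc_id in enumerate(results, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending fused score, breaking ties by id.
    fused = sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))
    return fused if top_k is None else fused[:top_k]


keyword_ranking = ["a", "b", "c"]
semantic_ranking = ["b", "c", "d"]
fused = rrf([keyword_ranking, semantic_ranking])
```

Documents that appear in both lists ("b" and "c") outrank "a", even though "a" was first in the keyword list. A larger `k` flattens the difference between adjacent ranks.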
- kerb.retrieval.diversify_results(results, max_results=10, diversity_factor=0.5)
Diversify results using Maximal Marginal Relevance (MMR).
- Parameters:
results (List[SearchResult]) – Search results to diversify
max_results (int) – Number of results to return
diversity_factor (float) – Balance between relevance (0) and diversity (1)
- Returns:
Diversified results
- Return type:
Example
>>> results = semantic_search(query_emb, docs, embeddings, top_k=50)
>>> diverse = diversify_results(results, max_results=10, diversity_factor=0.7)
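MMR selects results greedily, trading each candidate's relevance against its similarity to results already chosen. The sketch below maps `diversity_factor` onto that trade-off as described above (0 for pure relevance, 1 for pure diversity). It assumes each candidate carries an embedding and compares candidates by cosine similarity; how kerb measures similarity between results is not stated here, and `mmr_select` is a hypothetical helper.

```python
import math


def cosine(a, b):
    """Cosine similarity, returning 0.0 for zero-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr_select(candidates, max_results=10, diversity_factor=0.5):
    """Greedy MMR over (doc_id, relevance, embedding) tuples."""
    remaining = list(candidates)
    selected = []

    def mmr_score(item):
        _, relevance, embedding = item
        # Redundancy: similarity to the closest already-selected result.
        redundancy = max(
            (cosine(embedding, chosen[2]) for chosen in selected), default=0.0
        )
        return (1 - diversity_factor) * relevance - diversity_factor * redundancy

    while remaining and len(selected) < max_results:
        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return [doc_id for doc_id, _, _ in selected]


candidates = [
    ("a", 0.95, [1.0, 0.0]),
    ("b", 0.90, [0.99, 0.1]),  # Near-duplicate of "a".
    ("c", 0.70, [0.0, 1.0]),   # Different topic.
]
diverse = mmr_select(candidates, max_results=2, diversity_factor=0.7)
relevance_only = mmr_select(candidates, max_results=2, diversity_factor=0.0)
```

With `diversity_factor=0.7` the near-duplicate "b" is skipped in favor of "c"; with `0.0` the selection falls back to plain relevance order.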
- kerb.retrieval.compress_context(query, results, max_tokens=2000, strategy='top_k')
Compress retrieved context to fit within token limits.
- Parameters:
query (str) – The search query
results (List[SearchResult]) – Search results to compress
max_tokens (int) – Maximum number of tokens (approximate)
strategy (Union[CompressionStrategy, str]) – Compression strategy (CompressionStrategy enum or string: "top_k", "summarize", "filter", "truncate")
- Returns:
Compressed results
- Return type:
Examples
>>> from kerb.core.enums import CompressionStrategy
>>> compressed = compress_context(query, results, max_tokens=1000, strategy=CompressionStrategy.TOP_K)
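A "top_k" strategy under an approximate token budget can be pictured as keeping the best-ranked results until the budget runs out. The sketch below does exactly that on plain strings. The four-characters-per-token estimate and the helpers `estimate_tokens` and `compress_top_k` are assumptions for illustration; kerb's own token counting is not described in this reference.

```python
def estimate_tokens(text):
    """Rough token estimate: about 4 characters per token (an assumption)."""
    return max(1, len(text) // 4)


def compress_top_k(ranked_texts, max_tokens=2000):
    """Keep results in rank order until the token budget is exhausted."""
    kept = []
    used = 0
    for text in ranked_texts:
        cost = estimate_tokens(text)
        if used + cost > max_tokens:
            break  # Stop at the first result that would exceed the budget.
        kept.append(text)
        used += cost
    return kept


# Three results of 40 characters each, about 10 estimated tokens apiece.
ranked_texts = ["a" * 40, "b" * 40, "c" * 40]
kept = compress_top_k(ranked_texts, max_tokens=25)
```

Stopping at the first overflow preserves rank order; a variant could instead skip the oversized result and keep trying smaller ones further down the list.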
- kerb.retrieval.filter_results(results, min_score=None, max_results=None, metadata_filter=None, dedup_threshold=0.9, config=None)
Filter search results based on various criteria.
- Parameters:
results (List[SearchResult]) – Search results to filter
min_score (Optional[float]) – Minimum score threshold (ignored if config is provided)
max_results (Optional[int]) – Maximum number of results (ignored if config is provided)
metadata_filter (Optional[Dict[str, Any]]) – Filter by metadata fields (ignored if config is provided)
dedup_threshold (float) – Similarity threshold for deduplication (ignored if config is provided)
config (Optional[FilterConfig]) – FilterConfig object with all parameters (recommended)
- Returns:
Filtered results
- Return type:
Examples
>>> # Using config object (recommended)
>>> from kerb.retrieval import FilterConfig
>>> config = FilterConfig(
...     min_score=0.5,
...     max_results=10,
...     metadata_filter={"category": "tech"},
...     dedup_threshold=0.9
... )
>>> filtered = filter_results(results, config=config)

>>> # Using individual parameters (backward compatible)
>>> filtered = filter_results(
...     results,
...     min_score=0.5,
...     max_results=10,
...     metadata_filter={"category": "tech"}
... )
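The four filtering criteria can be sketched as a single pass over ranked results. In the example below, results are plain dictionaries and near-duplicates are detected with Jaccard similarity over word sets. Both choices, along with the helper names `jaccard` and `filter_hits`, are assumptions: the reference does not say which similarity measure `dedup_threshold` applies to.

```python
def jaccard(a, b):
    """Word-set overlap between two strings, in the 0-1 range."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a and not words_b:
        return 1.0
    return len(words_a & words_b) / len(words_a | words_b)


def filter_hits(hits, min_score=None, max_results=None,
                metadata_filter=None, dedup_threshold=0.9):
    """Filter dicts with "content", "score", "metadata" keys (best first)."""
    kept = []
    for hit in hits:
        if max_results is not None and len(kept) >= max_results:
            break
        if min_score is not None and hit["score"] < min_score:
            continue
        # Every key in metadata_filter must match exactly.
        if metadata_filter and any(
            hit["metadata"].get(key) != value
            for key, value in metadata_filter.items()
        ):
            continue
        # Drop results too similar to one that was already kept.
        if any(jaccard(hit["content"], other["content"]) >= dedup_threshold
               for other in kept):
            continue
        kept.append(hit)
    return kept


hits = [
    {"content": "python async io guide", "score": 0.9, "metadata": {"category": "tech"}},
    {"content": "python async io guide", "score": 0.8, "metadata": {"category": "tech"}},
    {"content": "gardening tips", "score": 0.7, "metadata": {"category": "home"}},
    {"content": "rust ownership basics", "score": 0.3, "metadata": {"category": "tech"}},
]
filtered = filter_hits(hits, min_score=0.5, metadata_filter={"category": "tech"})
```

Only the first hit survives: the second is an exact duplicate, the third fails the metadata filter, and the fourth falls below `min_score`.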
- kerb.retrieval.format_results(results, format_style='simple', include_metadata=False)
Format search results for display.
- Parameters:
results (List[SearchResult]) – Search results to format
format_style (str) – "simple", "detailed", or "json"
include_metadata (bool) – Whether to include document metadata
- Returns:
Formatted results
- Return type:
Example
>>> results = keyword_search("python", docs)
>>> print(format_results(results, format_style="detailed"))
- kerb.retrieval.results_to_context(results, separator='\n\n---\n\n', include_source=True)
Convert search results to a context string for LLM prompts.
- Parameters:
results – Search results to convert
separator – Separator between documents
include_source – Whether to include document IDs
- Returns:
Formatted context string
- Return type:
str
Example
>>> results = hybrid_search(query, query_emb, docs, embeddings)
>>> context = results_to_context(results)
>>> prompt = f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
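The conversion to a context string amounts to joining document texts with the separator and optionally labeling each with its id. The sketch below shows this on plain dictionaries. The bracketed `[id]` label format and the helper name `to_context` are assumptions; the exact layout kerb produces is not shown in this reference.

```python
def to_context(hits, separator="\n\n---\n\n", include_source=True):
    """Join hit contents into one string, optionally labeled by id."""
    parts = []
    for hit in hits:
        if include_source:
            # Hypothetical label format: the id in brackets on its own line.
            parts.append(f"[{hit['id']}]\n{hit['content']}")
        else:
            parts.append(hit["content"])
    return separator.join(parts)


hits = [
    {"id": "doc_1", "content": "Python supports async IO."},
    {"id": "doc_2", "content": "FastAPI builds on Starlette."},
]
context = to_context(hits)

# The context string is then embedded in a prompt template.
query = "How does FastAPI handle async requests?"
prompt = f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
```

Keeping source ids in the context lets the model's answer be traced back to specific documents, at the cost of a few extra tokens per result.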
RAG and vector search utilities for semantic retrieval.