Retrieval Module

Retrieval utilities for LLM applications.

This module provides comprehensive retrieval tools for RAG (Retrieval-Augmented Generation):

Query Processing:
rewrite_query() - Rewrite queries for better retrieval
expand_query() - Expand queries into multiple variations
generate_sub_queries() - Break complex queries into sub-queries

Search Methods:
keyword_search() - BM25-like keyword search
semantic_search() - Embedding-based semantic search
hybrid_search() - Combined keyword + semantic search

Re-ranking:
rerank_results() - Re-rank results by relevance, recency, popularity, diversity
reciprocal_rank_fusion() - Combine multiple result lists
diversify_results() - Apply MMR for result diversity

Context Management:
compress_context() - Compress results to fit token limits
filter_results() - Filter by score, metadata, deduplication

Formatting:
format_results() - Format results for display
results_to_context() - Convert results to LLM context string

Data Classes:
Document - Represents a document with content and metadata (from core.types)
SearchResult - Represents a ranked search result
HybridSearchConfig - Configuration for hybrid search
FilterConfig - Configuration for result filtering

Submodules:
query - Query processing utilities
search - Search methods (keyword, semantic, hybrid)
reranking - Re-ranking and fusion utilities
context - Context compression and filtering
formatting - Result formatting utilities
structures - Data structures and configuration classes

class kerb.retrieval.Document(content, metadata=<factory>, id=None, source=None, format=DocumentFormat.UNKNOWN, score=0.0, page_content=None)

Bases: object

Universal document representation across the toolkit.

Consolidates the Document classes from document/ and retrieval/ packages to provide a single, consistent document representation.

content: The text content of the document

metadata: Additional metadata about the document

id: Optional unique identifier for the document

source: Optional source path or URL where document was loaded from

format: Document format (defaults to UNKNOWN)

score: Relevance score (used in retrieval contexts, defaults to 0.0)

page_content: Optional list of content per page (for multi-page documents)

Examples

>>> # Simple document
>>> doc = Document(content="Hello, world!")

>>> # Document with metadata
>>> doc = Document(
...     content="Important document",
...     metadata={"author": "John", "created": "2025-01-01"},
...     source="doc.txt"
... )

>>> # Retrieval result with score
>>> doc = Document(
...     id="doc_123",
...     content="Relevant content",
...     score=0.95
... )

content: str

metadata: Dict[str, Any]

id: str | None = None

source: str | None = None

format: DocumentFormat = 'unknown'

score: float = 0.0

page_content: List[str] | None = None

__len__()

Return the length of the document content.

Return type: int

to_dict()

Convert document to dictionary.

Return type: Dict[str, Any]
Returns: Dictionary representation of the document

classmethod from_dict(data)

Create document from dictionary.

Parameters: data (Dict[str, Any]) – Dictionary with document data
Return type: Document
Returns: New Document instance

__repr__()

String representation of the document.

Return type: str

__init__(content, metadata=<factory>, id=None, source=None, format=DocumentFormat.UNKNOWN, score=0.0, page_content=None)
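The field list and the to_dict() / from_dict() pair above suggest a plain dataclass. The following is a minimal sketch of that shape, not the library's actual class: DocSketch is a hypothetical name, and the format and page_content fields are omitted for brevity.

```python
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, Optional


@dataclass
class DocSketch:
    """Hypothetical stand-in for Document, reduced to a few fields."""

    content: str
    metadata: Dict[str, Any] = field(default_factory=dict)
    id: Optional[str] = None
    source: Optional[str] = None
    score: float = 0.0

    def __len__(self) -> int:
        # Length of the text content, as described for Document.__len__().
        return len(self.content)

    def to_dict(self) -> Dict[str, Any]:
        return asdict(self)

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "DocSketch":
        return cls(**data)


doc = DocSketch(content="Hello, world!", metadata={"author": "John"})
restored = DocSketch.from_dict(doc.to_dict())
```

One consequence of defining __len__ this way is that an instance with empty content is falsy, so an explicit "is None" check is safer than a bare truthiness test.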

class kerb.retrieval.SearchResult(document, score, rank, method='unknown')

Bases: object

Represents a search result with relevance information.

document: Document

score: float

rank: int

method: str = 'unknown'

__init__(document, score, rank, method='unknown')

class kerb.retrieval.HybridSearchConfig(top_k=10, keyword_weight=0.5, semantic_weight=0.5, fusion_method='weighted')

Bases: object

Configuration for hybrid search operations.

top_k: Number of top results to return

keyword_weight: Weight for keyword scores (0-1)

semantic_weight: Weight for semantic scores (0-1)

fusion_method: Fusion method (FusionMethod enum or string)

top_k: int = 10

keyword_weight: float = 0.5

semantic_weight: float = 0.5

fusion_method: FusionMethod | str = 'weighted'

__init__(top_k=10, keyword_weight=0.5, semantic_weight=0.5, fusion_method='weighted')

class kerb.retrieval.FilterConfig(min_score=None, max_results=None, metadata_filter=None, dedup_threshold=0.9)

Bases: object

Configuration for result filtering operations.

min_score: Minimum score threshold

max_results: Maximum number of results

metadata_filter: Filter by metadata fields

dedup_threshold: Similarity threshold for deduplication (0-1)

min_score: float | None = None

max_results: int | None = None

metadata_filter: Dict[str, Any] | None = None

dedup_threshold: float = 0.9

__init__(min_score=None, max_results=None, metadata_filter=None, dedup_threshold=0.9)

kerb.retrieval.rewrite_query(query, style='clear', max_length=None)

Rewrite a query for better retrieval.

Parameters:

query (str) – The original query text
style (Union[QueryStyle, str]) – Rewriting style (QueryStyle enum or string: “clear”, “detailed”, “concise”, “keyword”, “natural”)
max_length (Optional[int]) – Maximum length of rewritten query

Returns:

Rewritten query

Return type:

str

Examples

>>> from kerb.core.enums import QueryStyle
>>> rewritten = rewrite_query("python async", style=QueryStyle.DETAILED)

kerb.retrieval.expand_query(query, expansions=None, method='synonyms')

Expand a query into multiple variations for broader retrieval.

Parameters:

query (str) – The original query text
expansions (Optional[List[str]]) – Custom expansion terms to add
method (Union[ExpansionMethod, str]) – Expansion method (ExpansionMethod enum or string: “synonyms”, “related_terms”, “llm”, “embeddings”)

Returns:

List of query variations

Return type:

List[str]

Examples

>>> from kerb.core.enums import ExpansionMethod
>>> queries = expand_query("machine learning", method=ExpansionMethod.SYNONYMS)

kerb.retrieval.generate_sub_queries(query, max_queries=3)

Generate sub-queries from a complex query for step-by-step retrieval.

Parameters:

query (str) – The original complex query
max_queries (int) – Maximum number of sub-queries to generate

Returns:

List of sub-queries

Return type:

List[str]

Example

>>> generate_sub_queries("How to implement authentication in a Python FastAPI app?")
["What is authentication?", "How to use FastAPI?", "Python authentication methods"]

kerb.retrieval.keyword_search(query, documents, top_k=10, field='content')

Perform keyword-based search using BM25-like scoring.

Parameters:

query (str) – Search query
documents (List[Document]) – List of documents to search
top_k (int) – Number of top results to return
field (str) – Document field to search (“content” or metadata key)

Returns:

Ranked search results

Return type:

List[SearchResult]

Example

>>> docs = [Document(id="1", content="Python is great"), ...]
>>> results = keyword_search("python programming", docs)
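The exact scoring used by keyword_search() is only described as "BM25-like". As an illustration of what that family of scoring looks like, here is a self-contained sketch of the standard Okapi BM25 formula over whitespace tokens. The function name bm25_scores, the k1 and b defaults, and the tokenization are assumptions and may differ from the library's implementation.

```python
import math
from collections import Counter


def bm25_scores(query, texts, k1=1.5, b=0.75):
    """Score each text against the query with the Okapi BM25 formula."""
    tokenized = [text.lower().split() for text in texts]
    n_docs = len(tokenized)
    avg_len = sum(len(toks) for toks in tokenized) / n_docs if n_docs else 0.0
    # Document frequency: how many texts contain each term at least once.
    doc_freq = Counter(term for toks in tokenized for term in set(toks))
    query_terms = set(query.lower().split())

    scores = []
    for toks in tokenized:
        term_freq = Counter(toks)
        score = 0.0
        for term in query_terms:
            tf = term_freq.get(term, 0)
            if tf == 0:
                continue
            df = doc_freq[term]
            idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
            # Length normalization: longer-than-average texts are penalized.
            norm = tf + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf * (k1 + 1) / norm
        scores.append(score)
    return scores


texts = ["Python is great", "Java is verbose", "Python programming with async"]
scores = bm25_scores("python programming", texts)
# Indices of texts, best match first.
ranked = sorted(range(len(texts)), key=lambda i: scores[i], reverse=True)
```

The third text matches both query terms and ranks first; the second matches neither and scores zero.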

kerb.retrieval.semantic_search(query_embedding, documents, document_embeddings, top_k=10, similarity_metric='cosine')

Perform semantic search using embeddings.

Parameters:

query_embedding (List[float]) – Embedding vector of the query
documents (List[Document]) – List of documents
document_embeddings (List[List[float]]) – Embedding vectors for documents (same order as documents)
top_k (int) – Number of top results to return
similarity_metric (str) – “cosine”, “dot”, or “euclidean”

Returns:

Ranked search results

Return type:

List[SearchResult]

Example

>>> from kerb.embedding import embed
>>> query_emb = embed("python programming")
>>> doc_embs = [embed(doc.content) for doc in docs]
>>> results = semantic_search(query_emb, docs, doc_embs)
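To make the three similarity_metric options concrete, the sketch below ranks plain Python vectors by cosine, dot-product, or Euclidean similarity. The helper names are hypothetical, and negating the Euclidean distance so that "larger is better" holds for every metric is an assumption about how such a function would typically be written, not a statement about kerb's internals.

```python
import math


def similarity(a, b, metric="cosine"):
    """Return a similarity where larger always means more similar."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    if metric == "dot":
        return dot
    if metric == "cosine":
        norm_a = math.sqrt(sum(x * x for x in a))
        norm_b = math.sqrt(sum(y * y for y in b))
        return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0
    if metric == "euclidean":
        # Assumption: negate the distance so that sorting descending works.
        return -math.dist(a, b)
    raise ValueError(f"unknown metric: {metric}")


def rank_by_embedding(query_embedding, document_embeddings, top_k=10, metric="cosine"):
    """Return (index, similarity) pairs for the top_k closest embeddings."""
    scored = [
        (index, similarity(query_embedding, emb, metric))
        for index, emb in enumerate(document_embeddings)
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


doc_embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
top = rank_by_embedding([1.0, 0.0], doc_embeddings, top_k=2)
```

The returned indices refer to positions in document_embeddings, which is why the embeddings must be in the same order as the documents.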

kerb.retrieval.hybrid_search(query, query_embedding, documents, document_embeddings, top_k=10, keyword_weight=0.5, semantic_weight=0.5, fusion_method='weighted', config=None)

Perform hybrid search combining keyword and semantic search.

Parameters:

query (str) – Search query text
query_embedding (List[float]) – Embedding vector of the query
documents (List[Document]) – List of documents
document_embeddings (List[List[float]]) – Embedding vectors for documents
top_k (int) – Number of top results to return (ignored if config is provided)
keyword_weight (float) – Weight for keyword scores (ignored if config is provided)
semantic_weight (float) – Weight for semantic scores (ignored if config is provided)
fusion_method (Union[FusionMethod, str]) – Fusion method (ignored if config is provided)
config (Optional[HybridSearchConfig]) – HybridSearchConfig object with all parameters (recommended)

Returns:

Ranked search results

Return type:

List[SearchResult]

Examples

>>> # Using config object (recommended)
>>> from kerb.retrieval import HybridSearchConfig
>>> from kerb.core.enums import FusionMethod
>>> config = HybridSearchConfig(
...     top_k=10,
...     keyword_weight=0.4,
...     semantic_weight=0.6,
...     fusion_method=FusionMethod.RRF
... )
>>> results = hybrid_search(
...     query="python async",
...     query_embedding=embed("python async"),
...     documents=docs,
...     document_embeddings=doc_embs,
...     config=config
... )

>>> # Using individual parameters (backward compatible)
>>> results = hybrid_search(
...     query="python async",
...     query_embedding=embed("python async"),
...     documents=docs,
...     document_embeddings=doc_embs,
...     keyword_weight=0.4,
...     semantic_weight=0.6
... )
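The "weighted" fusion method is not spelled out in this reference. One common way to combine keyword and semantic scores is to min-max normalize each score set and take a weighted sum, sketched below. The function names and the normalization step are assumptions; the library may normalize differently or not at all.

```python
def min_max(scores):
    """Rescale a {doc_id: score} mapping into the range [0, 1]."""
    if not scores:
        return {}
    low, high = min(scores.values()), max(scores.values())
    if high == low:
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (value - low) / (high - low) for doc_id, value in scores.items()}


def weighted_fusion(keyword_scores, semantic_scores,
                    keyword_weight=0.5, semantic_weight=0.5, top_k=10):
    """Combine two score mappings; a missing score counts as 0."""
    kw = min_max(keyword_scores)
    sem = min_max(semantic_scores)
    fused = {
        doc_id: keyword_weight * kw.get(doc_id, 0.0)
        + semantic_weight * sem.get(doc_id, 0.0)
        for doc_id in set(kw) | set(sem)
    }
    # Sort by fused score, breaking ties by id for a stable order.
    return sorted(fused.items(), key=lambda pair: (-pair[1], pair[0]))[:top_k]


keyword_scores = {"a": 2.0, "b": 1.0, "c": 0.0}
semantic_scores = {"a": 0.2, "b": 0.9, "d": 0.6}
fused = weighted_fusion(keyword_scores, semantic_scores,
                        keyword_weight=0.4, semantic_weight=0.6)
```

Normalizing first matters because raw BM25 scores and cosine similarities live on different scales; without it, one signal can dominate regardless of the weights.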

kerb.retrieval.rerank_results(query, results, method='relevance', top_k=None, scorer=None)

Re-rank search results using additional signals.

Parameters:

query (str) – The search query
results (List[SearchResult]) – Initial search results
method (Union[RerankMethod, str]) – Re-ranking method (RerankMethod enum or string: “relevance”, “diversity”, “mmr”, “cross_encoder”, “llm”)
top_k (Optional[int]) – Number of top results to return after re-ranking
scorer (Optional[Callable[[str, Document], float]]) – Custom scoring function for method=“custom”

Returns:

Re-ranked search results

Return type:

List[SearchResult]

Examples

>>> from kerb.core.enums import RerankMethod
>>> results = keyword_search("python", docs)
>>> reranked = rerank_results("python", results, method=RerankMethod.MMR)

kerb.retrieval.reciprocal_rank_fusion(result_lists, k=60, top_k=None)

Combine multiple result lists using Reciprocal Rank Fusion.

Parameters:

result_lists (List[List[SearchResult]]) – Multiple lists of search results to fuse
k (int) – RRF constant (typically 60)
top_k (Optional[int]) – Number of top results to return

Returns:

Fused and ranked results

Return type:

List[SearchResult]

Example

>>> results1 = keyword_search("python", docs)
>>> results2 = semantic_search(embed("python"), docs, embeddings)
>>> fused = reciprocal_rank_fusion([results1, results2])
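Reciprocal Rank Fusion scores each document as the sum of 1 / (k + rank) over every list in which it appears, using 1-based ranks. The sketch below shows that formula on plain lists of document ids; the name rrf and the tie-breaking by id are illustrative choices, not taken from the library.

```python
from collections import defaultdict


def rrf(ranked_lists, k=60, top_k=None):
    """Fuse ranked lists of ids; returns (doc_id, score) pairs, best first."""
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    fused = sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))
    return fused if top_k is None else fused[:top_k]


keyword_ranking = ["a", "b", "c"]
semantic_ranking = ["b", "c", "d"]
fused = rrf([keyword_ranking, semantic_ranking])
```

Because only ranks are used, the input lists do not need comparable score scales. Documents that appear in both lists ("b" and "c" here) outrank those that top only one.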

kerb.retrieval.diversify_results(results, max_results=10, diversity_factor=0.5)

Diversify results using Maximal Marginal Relevance (MMR).

Parameters:

results (List[SearchResult]) – Search results to diversify
max_results (int) – Number of results to return
diversity_factor (float) – Balance between relevance (0) and diversity (1)

Returns:

Diversified results

Return type:

List[SearchResult]

Example

>>> results = semantic_search(query_emb, docs, embeddings, top_k=50)
>>> diverse = diversify_results(results, max_results=10, diversity_factor=0.7)
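Maximal Marginal Relevance picks results greedily, trading each candidate's relevance against its similarity to results already chosen. The sketch below follows the parameter convention stated above (0 favors relevance, 1 favors diversity). Using Jaccard token overlap as the similarity measure, and the names jaccard and mmr_select, are assumptions made to keep the example self-contained.

```python
def jaccard(a, b):
    """Token-set overlap between two strings, in [0, 1]."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    union = tokens_a | tokens_b
    return len(tokens_a & tokens_b) / len(union) if union else 0.0


def mmr_select(candidates, max_results=10, diversity_factor=0.5):
    """Greedy MMR over (text, relevance) pairs."""
    remaining = list(candidates)
    selected = []
    while remaining and len(selected) < max_results:

        def mmr_score(item):
            text, relevance = item
            # Redundancy: similarity to the closest already-selected text.
            redundancy = max((jaccard(text, chosen[0]) for chosen in selected), default=0.0)
            return (1 - diversity_factor) * relevance - diversity_factor * redundancy

        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return selected


candidates = [
    ("python async tutorial", 0.9),
    ("python async tutorial guide", 0.85),
    ("rust ownership rules", 0.6),
]
picked = mmr_select(candidates, max_results=2, diversity_factor=0.7)
```

With diversity_factor=0.7 the near-duplicate second candidate is skipped in favor of the less relevant but distinct third one; with diversity_factor=0 the selection reduces to a plain top-2 by relevance.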

kerb.retrieval.compress_context(query, results, max_tokens=2000, strategy='top_k')

Compress retrieved context to fit within token limits.

Parameters:

query (str) – The search query
results (List[SearchResult]) – Search results to compress
max_tokens (int) – Maximum number of tokens (approximate)
strategy (Union[CompressionStrategy, str]) – Compression strategy (CompressionStrategy enum or string: “top_k”, “summarize”, “filter”, “truncate”)

Returns:

Compressed results

Return type:

List[SearchResult]

Examples

>>> from kerb.core.enums import CompressionStrategy
>>> compressed = compress_context(query, results, max_tokens=1000, strategy=CompressionStrategy.TOP_K)
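The max_tokens limit is described as approximate. A simple way to implement a "top_k"-style budget is to walk the results best-first and stop once the estimated token count would exceed the limit, as sketched below. The four-characters-per-token heuristic and both function names are assumptions; the library may estimate tokens differently.

```python
def estimate_tokens(text):
    # Rough heuristic (assumption): about four characters per token.
    return max(1, len(text) // 4)


def keep_within_budget(texts, max_tokens=2000):
    """Keep texts in order until the estimated token budget is used up."""
    kept, used = [], 0
    for text in texts:  # assumed to be sorted best-first
        cost = estimate_tokens(text)
        if used + cost > max_tokens:
            break
        kept.append(text)
        used += cost
    return kept


ranked_texts = ["a" * 40, "b" * 40, "c" * 40]
kept = keep_within_budget(ranked_texts, max_tokens=25)
```

Each text here costs an estimated 10 tokens, so a budget of 25 keeps the first two and drops the third.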

kerb.retrieval.filter_results(results, min_score=None, max_results=None, metadata_filter=None, dedup_threshold=0.9, config=None)

Filter search results based on various criteria.

Parameters:

results (List[SearchResult]) – Search results to filter
min_score (Optional[float]) – Minimum score threshold (ignored if config is provided)
max_results (Optional[int]) – Maximum number of results (ignored if config is provided)
metadata_filter (Optional[Dict[str, Any]]) – Filter by metadata fields (ignored if config is provided)
dedup_threshold (float) – Similarity threshold for deduplication (ignored if config is provided)
config (Optional[FilterConfig]) – FilterConfig object with all parameters (recommended)

Returns:

Filtered results

Return type:

List[SearchResult]

Examples

>>> # Using config object (recommended)
>>> from kerb.retrieval import FilterConfig
>>> config = FilterConfig(
...     min_score=0.5,
...     max_results=10,
...     metadata_filter={"category": "tech"},
...     dedup_threshold=0.9
... )
>>> filtered = filter_results(results, config=config)

>>> # Using individual parameters (backward compatible)
>>> filtered = filter_results(
...     results,
...     min_score=0.5,
...     max_results=10,
...     metadata_filter={"category": "tech"}
... )
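The four filtering criteria can be pictured as a single pass over the results. The sketch below applies a score threshold, an exact-match metadata filter, near-duplicate removal, and finally a result cap. The order of these steps, the use of Jaccard token overlap for deduplication, and the dictionary-based hit format are all assumptions made for illustration.

```python
def jaccard(a, b):
    """Token-set overlap between two strings, in [0, 1]."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    union = tokens_a | tokens_b
    return len(tokens_a & tokens_b) / len(union) if union else 0.0


def filter_hits(hits, min_score=None, max_results=None,
                metadata_filter=None, dedup_threshold=0.9):
    """hits: dicts with 'content', 'score', 'metadata', sorted best-first."""
    kept = []
    for hit in hits:
        if min_score is not None and hit["score"] < min_score:
            continue
        if metadata_filter and any(
            hit["metadata"].get(key) != value for key, value in metadata_filter.items()
        ):
            continue
        # Drop a hit that is too similar to one we have already kept.
        if any(jaccard(hit["content"], other["content"]) >= dedup_threshold for other in kept):
            continue
        kept.append(hit)
    return kept if max_results is None else kept[:max_results]


hits = [
    {"content": "python async guide", "score": 0.9, "metadata": {"category": "tech"}},
    {"content": "python async guide", "score": 0.8, "metadata": {"category": "tech"}},
    {"content": "sourdough starter", "score": 0.7, "metadata": {"category": "food"}},
    {"content": "rust borrow checker", "score": 0.3, "metadata": {"category": "tech"}},
]
filtered = filter_hits(hits, min_score=0.5, metadata_filter={"category": "tech"})
```

Because the input is sorted best-first, deduplication keeps the higher-scoring copy of each near-duplicate pair.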

kerb.retrieval.format_results(results, format_style='simple', include_metadata=False)

Format search results for display.

Parameters:

results (List[SearchResult]) – Search results to format
format_style (str) – “simple”, “detailed”, or “json”
include_metadata (bool) – Whether to include document metadata

Returns:

Formatted results

Return type:

str

Example

>>> results = keyword_search("python", docs)
>>> print(format_results(results, format_style="detailed"))

kerb.retrieval.results_to_context(results, separator='\\n\\n---\\n\\n', include_source=True)

Convert search results to a context string for LLM prompts.

Parameters:

results (List[SearchResult]) – Search results to convert
separator (str) – Separator between documents
include_source (bool) – Whether to include document IDs

Returns:

Formatted context string

Return type:

str

Example

>>> results = hybrid_search(query, query_emb, docs, embeddings)
>>> context = results_to_context(results)
>>> prompt = f"Context:\n\n{context}\n\nQuestion: {query}\nAnswer:"
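For readers who want to see the resulting prompt shape end to end, here is a self-contained sketch that joins (id, content) pairs with a separator and places the result in a prompt. The build_context name and the "[id]" label format are hypothetical; the actual layout produced by results_to_context() is not specified in this reference.

```python
def build_context(hits, separator="\n\n---\n\n", include_source=True):
    """Join (doc_id, content) pairs into one context string."""
    parts = []
    for doc_id, content in hits:
        # Hypothetical label format: the id in brackets on its own line.
        parts.append(f"[{doc_id}]\n{content}" if include_source else content)
    return separator.join(parts)


hits = [
    ("doc_1", "Python supports async/await."),
    ("doc_2", "FastAPI builds on Starlette."),
]
context = build_context(hits)
query = "How does FastAPI handle async?"
prompt = f"Context:\n\n{context}\n\nQuestion: {query}\nAnswer:"
```

Keeping a visible separator between documents helps the model tell where one source ends and the next begins.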
