Retrieval Module

Retrieval utilities for LLM applications.

This module provides comprehensive retrieval tools for RAG (Retrieval-Augmented Generation):

Query Processing:

  • rewrite_query() - Rewrite queries for better retrieval
  • expand_query() - Expand queries into multiple variations
  • generate_sub_queries() - Break complex queries into sub-queries

Search Methods:

  • keyword_search() - BM25-like keyword search
  • semantic_search() - Embedding-based semantic search
  • hybrid_search() - Combined keyword + semantic search

Re-ranking:

  • rerank_results() - Re-rank results by relevance, recency, popularity, diversity
  • reciprocal_rank_fusion() - Combine multiple result lists
  • diversify_results() - Apply MMR for result diversity

Context Management:

  • compress_context() - Compress results to fit token limits
  • filter_results() - Filter by score, metadata, deduplication

Formatting:

  • format_results() - Format results for display
  • results_to_context() - Convert results to LLM context string

Data Classes:

  • Document - Represents a document with content and metadata (from core.types)
  • SearchResult - Represents a ranked search result
  • HybridSearchConfig - Configuration for hybrid search
  • FilterConfig - Configuration for result filtering

Submodules:

  • query - Query processing utilities
  • search - Search methods (keyword, semantic, hybrid)
  • reranking - Re-ranking and fusion utilities
  • context - Context compression and filtering
  • formatting - Result formatting utilities
  • structures - Data structures and configuration classes

class kerb.retrieval.Document(content, metadata=<factory>, id=None, source=None, format=DocumentFormat.UNKNOWN, score=0.0, page_content=None)

Bases: object

Universal document representation across the toolkit.

Consolidates the Document classes from document/ and retrieval/ packages to provide a single, consistent document representation.

content

The text content of the document

metadata

Additional metadata about the document

id

Optional unique identifier for the document

source

Optional source path or URL where document was loaded from

format

Document format (defaults to UNKNOWN)

score

Relevance score (used in retrieval contexts, defaults to 0.0)

page_content

Optional list of content per page (for multi-page documents)

Examples

>>> # Simple document
>>> doc = Document(content="Hello, world!")
>>> # Document with metadata
>>> doc = Document(
...     content="Important document",
...     metadata={"author": "John", "created": "2025-01-01"},
...     source="doc.txt"
... )
>>> # Retrieval result with score
>>> doc = Document(
...     id="doc_123",
...     content="Relevant content",
...     score=0.95
... )
content: str
metadata: Dict[str, Any]
id: str | None = None
source: str | None = None
format: DocumentFormat = 'unknown'
score: float = 0.0
page_content: List[str] | None = None
__len__()

Return the length of the document content.

Return type:

int

to_dict()

Convert document to dictionary.

Return type:

Dict[str, Any]

Returns:

Dictionary representation of the document

classmethod from_dict(data)

Create document from dictionary.

Parameters:

data (Dict[str, Any]) – Dictionary with document data

Return type:

Document

Returns:

New Document instance

__repr__()

String representation of the document.

Return type:

str

__init__(content, metadata=<factory>, id=None, source=None, format=DocumentFormat.UNKNOWN, score=0.0, page_content=None)
class kerb.retrieval.SearchResult(document, score, rank, method='unknown')

Bases: object

Represents a search result with relevance information.

document: Document
score: float
rank: int
method: str = 'unknown'
__init__(document, score, rank, method='unknown')
class kerb.retrieval.HybridSearchConfig(top_k=10, keyword_weight=0.5, semantic_weight=0.5, fusion_method='weighted')

Bases: object

Configuration for hybrid search operations.

top_k

Number of top results to return

keyword_weight

Weight for keyword scores (0-1)

semantic_weight

Weight for semantic scores (0-1)

fusion_method

Fusion method (FusionMethod enum or string)

top_k: int = 10
keyword_weight: float = 0.5
semantic_weight: float = 0.5
fusion_method: FusionMethod | str = 'weighted'
__init__(top_k=10, keyword_weight=0.5, semantic_weight=0.5, fusion_method='weighted')
class kerb.retrieval.FilterConfig(min_score=None, max_results=None, metadata_filter=None, dedup_threshold=0.9)

Bases: object

Configuration for result filtering operations.

min_score

Minimum score threshold

max_results

Maximum number of results

metadata_filter

Filter by metadata fields

dedup_threshold

Similarity threshold for deduplication (0-1)

min_score: float | None = None
max_results: int | None = None
metadata_filter: Dict[str, Any] | None = None
dedup_threshold: float = 0.9
__init__(min_score=None, max_results=None, metadata_filter=None, dedup_threshold=0.9)
kerb.retrieval.rewrite_query(query, style='clear', max_length=None)

Rewrite a query for better retrieval.

Parameters:
  • query (str) – The original query text

  • style (Union[QueryStyle, str]) – Rewriting style (QueryStyle enum or string: “clear”, “detailed”, “concise”, “keyword”, “natural”)

  • max_length (Optional[int]) – Maximum length of rewritten query

Returns:

Rewritten query

Return type:

str

Examples

>>> from kerb.core.enums import QueryStyle
>>> rewritten = rewrite_query("python async", style=QueryStyle.DETAILED)
kerb.retrieval.expand_query(query, expansions=None, method='synonyms')

Expand a query into multiple variations for broader retrieval.

Parameters:
  • query (str) – The original query text

  • expansions (Optional[List[str]]) – Custom expansion terms to add

  • method (Union[ExpansionMethod, str]) – Expansion method (ExpansionMethod enum or string: “synonyms”, “related_terms”, “llm”, “embeddings”)

Returns:

List of query variations

Return type:

List[str]

Examples

>>> from kerb.core.enums import ExpansionMethod
>>> queries = expand_query("machine learning", method=ExpansionMethod.SYNONYMS)
kerb.retrieval.generate_sub_queries(query, max_queries=3)

Generate sub-queries from a complex query for step-by-step retrieval.

Parameters:
  • query (str) – The original complex query

  • max_queries (int) – Maximum number of sub-queries to generate

Returns:

List of sub-queries

Return type:

List[str]

Example

>>> generate_sub_queries("How to implement authentication in a Python FastAPI app?")
["What is authentication?", "How to use FastAPI?", "Python authentication methods"]

kerb.retrieval.keyword_search(query, documents[, top_k, field])

Perform keyword-based search using BM25-like scoring.

Parameters:
  • query (str) – Search query

  • documents (List[Document]) – List of documents to search

  • top_k (int) – Number of top results to return

  • field (str) – Document field to search (“content” or metadata key)

Returns:

Ranked search results

Return type:

List[SearchResult]

Example

>>> docs = [Document(id="1", content="Python is great"), ...]
>>> results = keyword_search("python programming", docs)
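
For orientation, here is a minimal standalone sketch of Okapi BM25, the scoring family that "BM25-like" refers to. It is not kerb's implementation: the function name bm25_scores, the whitespace tokenizer, and the k1 and b values are assumptions chosen for illustration, and the library may tokenize and weight terms differently.

```python
import math
from collections import Counter


def bm25_scores(query, corpus, k1=1.5, b=0.75):
    """Score each text in `corpus` against `query` with Okapi BM25 (hypothetical helper)."""
    docs = [text.lower().split() for text in corpus]
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs if n_docs else 0.0
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            # Rare terms get a higher inverse document frequency.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturates via k1 and is normalized by document length via b.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


corpus = ["Python is great", "Java is verbose", "Python async programming in Python"]
scores = bm25_scores("python programming", corpus)
# Indices of the corpus entries, best match first.
ranked = sorted(range(len(corpus)), key=lambda i: scores[i], reverse=True)
```

The third text ranks first because it matches both query terms; the second text matches neither and scores zero.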

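kerb.retrieval.semantic_search(query_embedding, documents, document_embeddings[, top_k, similarity_metric])

Perform semantic search using embeddings.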

Parameters:
  • query_embedding (List[float]) – Embedding vector of the query

  • documents (List[Document]) – List of documents

  • document_embeddings (List[List[float]]) – Embedding vectors for documents (same order as documents)

  • top_k (int) – Number of top results to return

  • similarity_metric (str) – “cosine”, “dot”, or “euclidean”

Returns:

Ranked search results

Return type:

List[SearchResult]

Example

>>> from kerb.embedding import embed
>>> query_emb = embed("python programming")
>>> doc_embs = [embed(doc.content) for doc in docs]
>>> results = semantic_search(query_emb, docs, doc_embs)
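
To make the "cosine" metric concrete, the following standalone sketch ranks document vectors by cosine similarity to a query vector. The helper names cosine_similarity and rank_by_cosine are hypothetical and the toy two-dimensional vectors stand in for real embeddings; this illustrates the technique and is not the library's code.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_cosine(query_embedding, document_embeddings, top_k=3):
    """Return (document_index, similarity) pairs, most similar first."""
    scored = [
        (i, cosine_similarity(query_embedding, emb))
        for i, emb in enumerate(document_embeddings)
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


query_vec = [1.0, 0.0]
doc_vecs = [[0.0, 1.0], [1.0, 1.0], [2.0, 0.0]]
top = rank_by_cosine(query_vec, doc_vecs, top_k=2)
```

Cosine similarity ignores vector length, so the document vector [2.0, 0.0] is a perfect match for the query [1.0, 0.0] even though their magnitudes differ.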

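kerb.retrieval.hybrid_search(query, query_embedding, documents, document_embeddings[, top_k, keyword_weight, semantic_weight, fusion_method, config])

Perform hybrid search combining keyword and semantic search.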

Parameters:
  • query (str) – Search query text

  • query_embedding (List[float]) – Embedding vector of the query

  • documents (List[Document]) – List of documents

  • document_embeddings (List[List[float]]) – Embedding vectors for documents

  • top_k (int) – Number of top results to return (ignored if config is provided)

  • keyword_weight (float) – Weight for keyword scores (ignored if config is provided)

  • semantic_weight (float) – Weight for semantic scores (ignored if config is provided)

  • fusion_method (Union[FusionMethod, str]) – Fusion method (ignored if config is provided)

  • config (Optional[HybridSearchConfig]) – HybridSearchConfig object with all parameters (recommended)

Returns:

Ranked search results

Return type:

List[SearchResult]

Examples

>>> # Using config object (recommended)
>>> from kerb.retrieval import HybridSearchConfig
>>> from kerb.core.enums import FusionMethod
>>> config = HybridSearchConfig(
...     top_k=10,
...     keyword_weight=0.4,
...     semantic_weight=0.6,
...     fusion_method=FusionMethod.RRF
... )
>>> results = hybrid_search(
...     query="python async",
...     query_embedding=embed("python async"),
...     documents=docs,
...     document_embeddings=doc_embs,
...     config=config
... )
>>> # Using individual parameters (backward compatible)
>>> results = hybrid_search(
...     query="python async",
...     query_embedding=embed("python async"),
...     documents=docs,
...     document_embeddings=doc_embs,
...     keyword_weight=0.4,
...     semantic_weight=0.6
... )
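
One plausible reading of weighted fusion is sketched below: each score set is min-max normalized to the range 0 to 1, then combined as keyword_weight * keyword + semantic_weight * semantic. The normalization step, the helper names min_max and weighted_fusion, and the treatment of a document missing from one list (it contributes 0) are all assumptions; kerb may combine scores differently.

```python
def min_max(scores):
    """Rescale a {doc_id: score} mapping to the range 0..1."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def weighted_fusion(keyword_scores, semantic_scores,
                    keyword_weight=0.5, semantic_weight=0.5):
    """Combine two score mappings into one ranking, best first."""
    kw = min_max(keyword_scores) if keyword_scores else {}
    sem = min_max(semantic_scores) if semantic_scores else {}
    fused = {}
    for doc_id in set(kw) | set(sem):
        # A document absent from one list contributes 0 for that signal.
        fused[doc_id] = (keyword_weight * kw.get(doc_id, 0.0)
                         + semantic_weight * sem.get(doc_id, 0.0))
    return sorted(fused.items(), key=lambda pair: pair[1], reverse=True)


keyword_scores = {"a": 2.0, "b": 1.0, "c": 0.0}
semantic_scores = {"a": 0.2, "b": 0.9, "d": 0.6}
fused = weighted_fusion(keyword_scores, semantic_scores,
                        keyword_weight=0.4, semantic_weight=0.6)
```

Normalizing first matters because raw keyword scores and embedding similarities live on different scales; without it the weights would not mean what they appear to mean.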
kerb.retrieval.rerank_results(query, results, method='relevance', top_k=None, scorer=None)

Re-rank search results using additional signals.

Parameters:
  • query (str) – The search query

  • results (List[SearchResult]) – Initial search results

  • method (Union[RerankMethod, str]) – Re-ranking method (RerankMethod enum or string: “relevance”, “diversity”, “mmr”, “cross_encoder”, “llm”)

  • top_k (Optional[int]) – Number of top results to return after re-ranking

  • scorer (Optional[Callable[[str, Document], float]]) – Custom scoring function for method="custom"

Returns:

Re-ranked search results

Return type:

List[SearchResult]

Examples

>>> from kerb.core.enums import RerankMethod
>>> results = keyword_search("python", docs)
>>> reranked = rerank_results("python", results, method=RerankMethod.MMR)
kerb.retrieval.reciprocal_rank_fusion(result_lists, k=60, top_k=None)

Combine multiple result lists using Reciprocal Rank Fusion.

Parameters:
  • result_lists (List[List[SearchResult]]) – Multiple lists of search results to fuse

  • k (int) – RRF constant (typically 60)

  • top_k (Optional[int]) – Number of top results to return

Returns:

Fused and ranked results

Return type:

List[SearchResult]

Example

>>> results1 = keyword_search("python", docs)
>>> results2 = semantic_search(embed("python"), docs, embeddings)
>>> fused = reciprocal_rank_fusion([results1, results2])
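
The standard Reciprocal Rank Fusion formula gives each document the score sum(1 / (k + rank)) over every list in which it appears, with ranks starting at 1. The standalone sketch below applies that formula to plain lists of document IDs; the function name rrf and the use of bare IDs instead of SearchResult objects are simplifications for illustration, not the library's code.

```python
from collections import defaultdict


def rrf(ranked_lists, k=60, top_k=None):
    """Fuse several rankings of document IDs with Reciprocal Rank Fusion."""
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            # Higher positions (smaller rank) contribute more; k damps the difference.
            scores[doc_id] += 1.0 / (k + rank)
    fused = sorted(scores.items(), key=lambda pair: pair[1], reverse=True)
    return fused[:top_k] if top_k is not None else fused


keyword_ranking = ["a", "b", "c"]
semantic_ranking = ["b", "d", "a"]
fused = rrf([keyword_ranking, semantic_ranking])
```

Because only ranks are used, RRF needs no score normalization: "b" wins here by placing first and second, ahead of "a", which placed first and third.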
kerb.retrieval.diversify_results(results, max_results=10, diversity_factor=0.5)

Diversify results using Maximal Marginal Relevance (MMR).

Parameters:
  • results (List[SearchResult]) – Search results to diversify

  • max_results (int) – Number of results to return

  • diversity_factor (float) – Balance between relevance (0) and diversity (1)

Returns:

Diversified results

Return type:

List[SearchResult]

Example

>>> results = semantic_search(query_emb, docs, embeddings, top_k=50)
>>> diverse = diversify_results(results, max_results=10, diversity_factor=0.7)
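
The sketch below shows the greedy MMR selection loop in isolation. At each step it picks the candidate that maximizes (1 - diversity_factor) * relevance - diversity_factor * (highest similarity to anything already picked). The names jaccard and mmr_select are hypothetical, and word-overlap (Jaccard) similarity is an assumption standing in for whatever document similarity the library uses.

```python
def jaccard(a, b):
    """Word-overlap similarity between two texts, in the range 0..1."""
    set_a, set_b = set(a.lower().split()), set(b.lower().split())
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def mmr_select(candidates, max_results=10, diversity_factor=0.5):
    """Greedy MMR over (text, relevance) pairs; returns items in pick order."""
    remaining = list(candidates)
    selected = []
    while remaining and len(selected) < max_results:

        def mmr_score(item):
            text, relevance = item
            # Redundancy: similarity to the closest already-selected item.
            redundancy = max((jaccard(text, s[0]) for s in selected), default=0.0)
            return (1 - diversity_factor) * relevance - diversity_factor * redundancy

        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return selected


candidates = [
    ("python async tutorial", 0.95),
    ("python async tutorial guide", 0.90),
    ("rust ownership explained", 0.60),
]
picked = mmr_select(candidates, max_results=2, diversity_factor=0.7)
```

With diversity_factor=0.7 the near-duplicate second candidate is skipped in favor of the less relevant but distinct third one; with diversity_factor=0 the loop reduces to plain ranking by relevance.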
kerb.retrieval.compress_context(query, results, max_tokens=2000, strategy='top_k')

Compress retrieved context to fit within token limits.

Parameters:
  • query (str) – The search query

  • results (List[SearchResult]) – Search results to compress

  • max_tokens (int) – Maximum number of tokens (approximate)

  • strategy (Union[CompressionStrategy, str]) – Compression strategy (CompressionStrategy enum or string: “top_k”, “summarize”, “filter”, “truncate”)

Returns:

Compressed results

Return type:

List[SearchResult]

Examples

>>> from kerb.core.enums import CompressionStrategy
>>> compressed = compress_context(query, results, max_tokens=1000, strategy=CompressionStrategy.TOP_K)
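
As a rough illustration of what a "top_k" strategy under an approximate token budget could look like, the standalone sketch below keeps the highest-scoring texts until the budget is exhausted. Everything here is an assumption: the names estimate_tokens and keep_top_within_budget, the four-characters-per-token heuristic, and the choice to stop at the first result that does not fit. The library's own token estimate and cutoff rule may differ.

```python
def estimate_tokens(text):
    """Rough token estimate (assumed heuristic: about four characters per token)."""
    return max(1, len(text) // 4)


def keep_top_within_budget(results, max_tokens):
    """Keep the best-scoring (text, score) pairs while the token budget allows."""
    kept, used = [], 0
    for text, score in sorted(results, key=lambda r: r[1], reverse=True):
        cost = estimate_tokens(text)
        if used + cost > max_tokens:
            # Stop at the first result that would exceed the budget.
            break
        kept.append((text, score))
        used += cost
    return kept


results = [("a" * 40, 0.9), ("b" * 80, 0.8), ("c" * 40, 0.7)]
kept = keep_top_within_budget(results, max_tokens=30)
```

Here the first two results cost an estimated 10 and 20 tokens and exactly fill the budget of 30, so the third is dropped.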
kerb.retrieval.filter_results(results, min_score=None, max_results=None, metadata_filter=None, dedup_threshold=0.9, config=None)

Filter search results based on various criteria.

Parameters:
  • results (List[SearchResult]) – Search results to filter

  • min_score (Optional[float]) – Minimum score threshold (ignored if config is provided)

  • max_results (Optional[int]) – Maximum number of results (ignored if config is provided)

  • metadata_filter (Optional[Dict[str, Any]]) – Filter by metadata fields (ignored if config is provided)

  • dedup_threshold (float) – Similarity threshold for deduplication (ignored if config is provided)

  • config (Optional[FilterConfig]) – FilterConfig object with all parameters (recommended)

Returns:

Filtered results

Return type:

List[SearchResult]

Examples

>>> # Using config object (recommended)
>>> from kerb.retrieval import FilterConfig
>>> config = FilterConfig(
...     min_score=0.5,
...     max_results=10,
...     metadata_filter={"category": "tech"},
...     dedup_threshold=0.9
... )
>>> filtered = filter_results(results, config=config)
>>> # Using individual parameters (backward compatible)
>>> filtered = filter_results(
...     results,
...     min_score=0.5,
...     max_results=10,
...     metadata_filter={"category": "tech"}
... )
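
The sketch below shows how the four criteria could interact on plain dictionaries: score threshold, exact-match metadata filter, near-duplicate removal, then truncation. The function name filter_hits, the dict layout, the order of the steps, and the use of word-overlap (Jaccard) similarity for deduplication are assumptions made for illustration and are not taken from the library.

```python
def jaccard(a, b):
    """Word-overlap similarity between two texts, in the range 0..1."""
    set_a, set_b = set(a.lower().split()), set(b.lower().split())
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def filter_hits(hits, min_score=None, max_results=None,
                metadata_filter=None, dedup_threshold=0.9):
    """Filter hits (dicts with "content", "score", "metadata"), assumed sorted best first."""
    kept = []
    for hit in hits:
        if min_score is not None and hit["score"] < min_score:
            continue
        if metadata_filter and any(
            hit["metadata"].get(key) != value
            for key, value in metadata_filter.items()
        ):
            continue
        # Drop anything too similar to a result that was already kept.
        if any(jaccard(hit["content"], prev["content"]) >= dedup_threshold
               for prev in kept):
            continue
        kept.append(hit)
    return kept[:max_results] if max_results is not None else kept


hits = [
    {"content": "python async guide", "score": 0.9, "metadata": {"category": "tech"}},
    {"content": "python async guide", "score": 0.8, "metadata": {"category": "tech"}},
    {"content": "gardening tips", "score": 0.7, "metadata": {"category": "home"}},
    {"content": "rust borrow checker", "score": 0.3, "metadata": {"category": "tech"}},
]
filtered = filter_hits(hits, min_score=0.5, metadata_filter={"category": "tech"})
```

Only the first hit survives: the second is an exact duplicate, the third fails the metadata filter, and the fourth falls below min_score.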
kerb.retrieval.format_results(results, format_style='simple', include_metadata=False)

Format search results for display.

Parameters:
  • results (List[SearchResult]) – Search results to format

  • format_style (str) – “simple”, “detailed”, or “json”

  • include_metadata (bool) – Whether to include document metadata

Returns:

Formatted results

Return type:

str

Example

>>> results = keyword_search("python", docs)
>>> print(format_results(results, format_style="detailed"))
kerb.retrieval.results_to_context(results, separator='\\n\\n---\\n\\n', include_source=True)

Convert search results to a context string for LLM prompts.

Parameters:
  • results (List[SearchResult]) – Search results to convert

  • separator (str) – Separator between documents

  • include_source (bool) – Whether to include document IDs

Returns:

Formatted context string

Return type:

str

Example

>>> results = hybrid_search(query, query_emb, docs, embeddings)
>>> context = results_to_context(results)
>>> prompt = f"Context:\n\n{context}\n\nQuestion: {query}\nAnswer:"
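
To show the shape of the resulting string, here is a standalone sketch that joins ranked texts with a separator and drops the result into a prompt. The helper name to_context, the (doc_id, text) input pairs, and the "[doc_id] text" layout used when include_source is true are assumptions; the library's actual output format is not documented here.

```python
def to_context(results, separator="\n\n---\n\n", include_source=True):
    """Join ranked (doc_id, text) pairs into a single context string."""
    parts = []
    for doc_id, text in results:
        # Prefix each chunk with its ID so the model can cite where it came from.
        parts.append(f"[{doc_id}] {text}" if include_source else text)
    return separator.join(parts)


results = [
    ("doc_1", "Python supports async/await."),
    ("doc_2", "asyncio runs an event loop."),
]
context = to_context(results)
query = "How does async work in Python?"
prompt = f"Context:\n\n{context}\n\nQuestion: {query}\nAnswer:"
```

Keeping source IDs in the context costs a few tokens per chunk but lets the answer be traced back to specific documents.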
