Context Module
Context management utilities for LLM applications.
This module provides comprehensive tools for managing LLM context windows:
- Data Classes:
  - ContextItem - Represents a single item in context
  - ContextWindow - Represents a managed context window
  - CompressionResult - Result of context compression
  - TruncationStrategy - Enum for truncation strategies
  - CompressionMethod - Enum for compression methods
- Token Management:
  - For token counting, use the tokenizer module:
    from kerb.tokenizer import count_tokens, batch_count_tokens
- Context Window Management:
  - create_context_window() - Create managed context window
  - truncate_context_window() - Truncate window to fit token limits
- Sliding Window Utilities:
  - create_sliding_window() - Create sliding windows over items
  - create_token_sliding_window() - Create token-based sliding windows
  - create_adaptive_window() - Create adaptive window balancing recency and priority
- Context Compression:
  - compress_context() - Compress context to target token count
  - auto_compress_window() - Automatically compress window items
- Priority Management:
  - assign_priorities() - Assign priorities using custom function
  - priority_by_recency() - Assign priorities based on recency
  - priority_by_relevance() - Assign priorities based on query relevance
  - priority_by_diversity() - Assign priorities to maximize diversity
- Context Optimization:
  - deduplicate_context() - Remove duplicate or similar items
  - reorder_context() - Reorder items using specified strategy
  - merge_context_windows() - Merge multiple windows into one
  - optimize_context_for_query() - Optimize window for specific query
- Context Formatting:
  - format_context_window() - Format window for LLM consumption
  - context_to_messages() - Convert window to chat message format
  - extract_context_summary() - Extract summary of window contents
- class kerb.context.ContextItem(content, priority=1.0, token_count=None, metadata=<factory>, timestamp=None, item_type='text')[source]
Bases: object
Represents a single item in the context window.
- __init__(content, priority=1.0, token_count=None, metadata=<factory>, timestamp=None, item_type='text')
- class kerb.context.ContextWindow(items=<factory>, max_tokens=None, current_tokens=0, strategy=TruncationStrategy.LAST, metadata=<factory>)[source]
Bases: object
Represents a managed context window.
- items: list[ContextItem]
- strategy: TruncationStrategy = 'last'
- __init__(items=<factory>, max_tokens=None, current_tokens=0, strategy=TruncationStrategy.LAST, metadata=<factory>)
- class kerb.context.CompressionResult(compressed_content, original_tokens, compressed_tokens, compression_ratio, method, metadata=<factory>)[source]
Bases: object
Result of context compression.
- method: CompressionMethod
- __init__(compressed_content, original_tokens, compressed_tokens, compression_ratio, method, metadata=<factory>)
- class kerb.context.TruncationStrategy(*values)[source]
Bases: Enum
Strategies for truncating context when exceeding limits.
- FIRST = 'first'
- LAST = 'last'
- MIDDLE = 'middle'
- PRIORITY = 'priority'
- SEMANTIC = 'semantic'
- class kerb.context.CompressionMethod(*values)[source]
Bases: Enum
Methods for compressing context.
- SUMMARIZE = 'summarize'
- EXTRACT_KEY_INFO = 'extract_key_info'
- REMOVE_REDUNDANCY = 'remove_redundancy'
- ABBREVIATE = 'abbreviate'
- HYBRID = 'hybrid'
- kerb.context.create_context_window(items, max_tokens=None, strategy=TruncationStrategy.LAST, token_estimator=None)[source]
Create a managed context window from items.
- Parameters:
items (Union[List[str], List[ContextItem]]) – List of strings or ContextItem objects
max_tokens (Optional[int]) – Maximum tokens allowed in window
strategy (TruncationStrategy) – Truncation strategy if limit exceeded
token_estimator (Optional[Callable[[str], int]]) – Custom token estimation function (defaults to count_tokens from tokenizer)
- Returns:
Managed context window
- Return type:
Example
>>> window = create_context_window(["Hello", "World"], max_tokens=1000)
>>> print(window.current_tokens)
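The token_estimator parameter is documented as a Callable[[str], int]. A minimal sketch of such a callable follows; the name whitespace_token_estimator and the one-token-per-word rule are assumptions made for illustration, not part of kerb, and real tokenizers will give different counts.

```python
def whitespace_token_estimator(text: str) -> int:
    """Hypothetical estimator: count one token per whitespace-separated word."""
    return len(text.split())


# Estimate a total the way a budget check might, without calling kerb.
texts = ["Hello", "World of context windows"]
total_tokens = sum(whitespace_token_estimator(t) for t in texts)
print(total_tokens)
```

Given the signature above, a callable of this shape could presumably be passed as create_context_window(texts, max_tokens=1000, token_estimator=whitespace_token_estimator).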
- kerb.context.truncate_context_window(window, max_tokens, strategy=TruncationStrategy.LAST)[source]
Truncate context window to fit within token limit.
- Parameters:
window (ContextWindow) – Context window to truncate
max_tokens (int) – Maximum tokens allowed
strategy (TruncationStrategy) – Truncation strategy to use
- Returns:
Truncated context window
- Return type:
Example
>>> window = truncate_context_window(window, max_tokens=500)
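The documentation does not spell out how each TruncationStrategy selects items. The sketch below shows one plausible reading of priority-based truncation: keep the highest-priority items that fit the budget, then restore their original order. The Item class and truncate_by_priority function are hypothetical stand-ins, not kerb's implementation.

```python
from dataclasses import dataclass


@dataclass
class Item:
    """Hypothetical stand-in for ContextItem with only the fields used here."""
    content: str
    priority: float
    token_count: int


def truncate_by_priority(items, max_tokens):
    """Keep the highest-priority items that fit, then restore original order."""
    ranked = sorted(range(len(items)), key=lambda i: items[i].priority, reverse=True)
    kept, used = [], 0
    for i in ranked:
        if used + items[i].token_count <= max_tokens:
            kept.append(i)
            used += items[i].token_count
    return [items[i] for i in sorted(kept)]


items = [
    Item("intro", 0.2, 40),
    Item("key fact", 0.9, 30),
    Item("aside", 0.1, 50),
    Item("question", 1.0, 20),
]
kept = truncate_by_priority(items, max_tokens=60)
print([item.content for item in kept])
```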
- kerb.context.compress_context(content, target_tokens, method=CompressionMethod.SUMMARIZE, model='gpt-4o-mini')[source]
Compress context to target token count.
- Parameters:
content (str) – Content to compress
target_tokens (int) – Target token count
method (CompressionMethod) – Compression method to use
model (str) – Model for token estimation (not used with tokenizer module, kept for backward compatibility)
- Returns:
Compression result with metrics
- Return type:
Example
>>> result = compress_context(long_text, target_tokens=500)
>>> print(f"Compressed to {result.compression_ratio:.1%}")
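To make the metrics in CompressionResult concrete, here is a rough sketch of a redundancy-removal pass with the token counts computed before and after. Everything in it is an assumption for illustration: the helper names are hypothetical, tokens are approximated by whitespace-separated words, and compression_ratio is taken to mean compressed size divided by original size, which the documentation does not confirm.

```python
def remove_repeated_sentences(text: str) -> str:
    """Drop sentences that repeat an earlier one (case-insensitive)."""
    seen = set()
    kept = []
    for sentence in text.split(". "):
        cleaned = sentence.strip().rstrip(".")
        key = cleaned.lower()
        if key and key not in seen:
            seen.add(key)
            kept.append(cleaned)
    return ". ".join(kept) + "." if kept else ""


def count_words(text: str) -> int:
    """Hypothetical token estimate: one token per whitespace-separated word."""
    return len(text.split())


original = "The cache is warm. The cache is warm. Requests are fast."
compressed = remove_repeated_sentences(original)
original_tokens = count_words(original)
compressed_tokens = count_words(compressed)
# Assumption: ratio = compressed size / original size (lower means more compression).
compression_ratio = compressed_tokens / original_tokens
print(f"Compressed to {compression_ratio:.1%}")
```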
- kerb.context.auto_compress_window(window, target_ratio=0.7, method=CompressionMethod.SUMMARIZE)[source]
Automatically compress context window items.
- Parameters:
window (ContextWindow) – Context window to compress
target_ratio (float) – Target compression ratio (0-1)
method (CompressionMethod) – Compression method to use
- Returns:
Window with compressed items
- Return type:
Example
>>> compressed_window = auto_compress_window(window, target_ratio=0.7)
- kerb.context.create_sliding_window(items, window_size, step_size=None)[source]
Create sliding windows over context items.
- Parameters:
items (List[ContextItem]) – List of context items
window_size (int) – Number of items per window
step_size (Optional[int]) – Step size between windows (defaults to window_size)
- Returns:
List of sliding windows
- Return type:
Example
>>> windows = create_sliding_window(items, window_size=3, step_size=1)
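The interaction between window_size and step_size is easiest to see on a plain list. The following is a minimal sketch of the general technique, not kerb's code; the function name sliding_windows is hypothetical, and stopping once a window reaches the end of the list (so the final window may be shorter) is an assumption.

```python
def sliding_windows(items, window_size, step_size=None):
    """Return consecutive slices of window_size items, advancing by step_size."""
    if window_size <= 0:
        raise ValueError("window_size must be positive")
    step = window_size if step_size is None else step_size
    if step <= 0:
        raise ValueError("step_size must be positive")
    windows = []
    for start in range(0, len(items), step):
        windows.append(items[start:start + window_size])
        # Stop once a window has reached the end of the list.
        if start + window_size >= len(items):
            break
    return windows


letters = ["a", "b", "c", "d", "e"]
overlapping = sliding_windows(letters, window_size=3, step_size=1)
disjoint = sliding_windows(letters, window_size=3)
print(overlapping)
print(disjoint)
```

With step_size=1 each window shares two items with its neighbor; with the default step the windows do not overlap.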
- kerb.context.create_token_sliding_window(items, max_tokens, overlap_tokens=0)[source]
Create sliding windows based on token limits.
- Parameters:
items (List[ContextItem]) – List of context items
max_tokens (int) – Maximum tokens per window
overlap_tokens (int) – Number of tokens to overlap between windows
- Returns:
List of token-based sliding windows
- Return type:
Example
>>> windows = create_token_sliding_window(items, max_tokens=500, overlap_tokens=50)
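A token-based window differs from an item-based one in that the window boundary depends on accumulated token counts. The sketch below illustrates one way this could work, using (text, token_count) pairs in place of ContextItem objects. The function name and the rule for carrying whole trailing items into the next window are assumptions, not kerb's documented behavior.

```python
def token_sliding_windows(items, max_tokens, overlap_tokens=0):
    """Group (text, token_count) pairs into windows of at most max_tokens.

    Trailing items worth up to overlap_tokens are repeated at the start of
    the next window. A single item larger than max_tokens still gets its own
    window, so that window exceeds the limit.
    """
    windows = []
    current, used = [], 0
    for item in items:
        tokens = item[1]
        if current and used + tokens > max_tokens:
            windows.append(current)
            # Carry whole trailing items while they fit in the overlap budget.
            carried, carried_tokens = [], 0
            for previous in reversed(current):
                if carried_tokens + previous[1] > overlap_tokens:
                    break
                carried.insert(0, previous)
                carried_tokens += previous[1]
            # Drop carried items from the front if they would crowd out the new item.
            while carried and carried_tokens + tokens > max_tokens:
                carried_tokens -= carried.pop(0)[1]
            current, used = carried, carried_tokens
        current.append(item)
        used += tokens
    if current:
        windows.append(current)
    return windows


pairs = [("a", 200), ("b", 200), ("c", 50), ("d", 300), ("e", 100)]
windows = token_sliding_windows(pairs, max_tokens=500, overlap_tokens=50)
print([[text for text, _ in window] for window in windows])
```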
- kerb.context.create_adaptive_window(items, max_tokens, recency_weight=0.5, priority_weight=0.5)[source]
Create adaptive window balancing recency and priority.
- Parameters:
items (List[ContextItem]) – List of context items
max_tokens (int) – Maximum tokens allowed
recency_weight (float) – Weight for recency (0-1)
priority_weight (float) – Weight for priority (0-1)
- Returns:
Adaptively selected context window
- Return type:
Example
>>> window = create_adaptive_window(items, max_tokens=1000)
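One way to picture the balance between recency_weight and priority_weight is a weighted score per item followed by greedy selection within the token budget. The sketch below assumes that recency is the item's normalized position (oldest = 0, newest = 1) and that priorities lie in 0-1; both are assumptions, and adaptive_select is a hypothetical name rather than kerb's implementation.

```python
def adaptive_select(items, max_tokens, recency_weight=0.5, priority_weight=0.5):
    """Select (content, priority, token_count) tuples, given oldest first."""
    n = len(items)

    def score(index):
        # Assumption: recency is the normalized position in the list.
        recency = index / (n - 1) if n > 1 else 1.0
        return recency_weight * recency + priority_weight * items[index][1]

    ranked = sorted(range(n), key=score, reverse=True)
    kept, used = [], 0
    for index in ranked:
        tokens = items[index][2]
        if used + tokens <= max_tokens:
            kept.append(index)
            used += tokens
    # Return the survivors in their original (chronological) order.
    return [items[index] for index in sorted(kept)]


history = [
    ("old important", 1.0, 50),
    ("old trivial", 0.1, 50),
    ("recent trivial", 0.2, 50),
    ("latest", 0.5, 50),
]
selected = adaptive_select(history, max_tokens=100)
print([content for content, _, _ in selected])
```

Here the oldest item survives because its priority outweighs its age, while the newest survives on recency.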
- kerb.context.deduplicate_context(items, similarity_threshold=0.9)[source]
Remove duplicate or highly similar context items.
- Parameters:
items (List[ContextItem]) – List of context items
similarity_threshold (float) – Threshold for considering items duplicates (0-1)
- Returns:
Deduplicated items
- Return type:
Example
>>> unique_items = deduplicate_context(items, similarity_threshold=0.85)
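The similarity measure behind similarity_threshold is not documented here. As an illustration of threshold-based deduplication in general, the sketch below uses difflib.SequenceMatcher from the standard library on plain strings; the choice of measure and the function name deduplicate_texts are assumptions, not kerb's method.

```python
from difflib import SequenceMatcher


def deduplicate_texts(texts, similarity_threshold=0.9):
    """Keep a text only if it is less similar than the threshold to every kept text."""
    unique = []
    for text in texts:
        is_duplicate = any(
            SequenceMatcher(None, text.lower(), kept.lower()).ratio() >= similarity_threshold
            for kept in unique
        )
        if not is_duplicate:
            unique.append(text)
    return unique


texts = [
    "The cat sat on the mat.",
    "The cat sat on the mat!",
    "Dogs bark loudly.",
]
unique_texts = deduplicate_texts(texts, similarity_threshold=0.9)
print(unique_texts)
```

The first occurrence wins, so earlier items take precedence over later near-duplicates.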
- kerb.context.reorder_context(items, strategy='chronological')[source]
Reorder context items using specified strategy.
- Parameters:
items (List[ContextItem]) – List of context items
strategy (Union[ReorderStrategy, str]) – Reordering strategy (ReorderStrategy enum or string: “chronological”, “priority”, “relevance”, “alternating”)
- Returns:
Reordered items
- Return type:
Examples
>>> from kerb.core.enums import ReorderStrategy
>>> reordered = reorder_context(items, strategy=ReorderStrategy.PRIORITY)
- kerb.context.merge_context_windows(windows, max_tokens=None, deduplication=True)[source]
Merge multiple context windows into one.
- Parameters:
windows (List[ContextWindow]) – List of context windows to merge
max_tokens (Optional[int]) – Maximum tokens for merged window
deduplication (bool) – Whether to deduplicate items
- Returns:
Merged context window
- Return type:
Example
>>> merged = merge_context_windows([window1, window2], max_tokens=2000)
- kerb.context.optimize_context_for_query(window, query, max_tokens, relevance_weight=0.7, diversity_weight=0.3)[source]
Optimize context window for a specific query.
- Parameters:
window (ContextWindow) – Context window to optimize
query (str) – Query to optimize for
max_tokens (int) – Maximum tokens allowed
relevance_weight (float) – Weight for relevance scoring
diversity_weight (float) – Weight for diversity scoring
- Returns:
Optimized context window
- Return type:
Example
>>> optimized = optimize_context_for_query(window, "What is AI?", max_tokens=1000)
- kerb.context.format_context_window(window, format_template=None, include_metadata=False)[source]
Format context window for LLM consumption.
- Parameters:
window (ContextWindow) – Context window to format
include_metadata (bool) – Whether to include item metadata
- Returns:
Formatted context string
- Return type:
Example
>>> formatted = format_context_window(window)
- kerb.context.context_to_messages(window, system_prefix=None)[source]
Convert context window to chat message format.
- Parameters:
window (ContextWindow) – Context window to convert
system_prefix (Optional[str]) – Optional system message prefix
- Returns:
List of message dictionaries
- Return type:
Example
>>> messages = context_to_messages(window, system_prefix="You are a helpful assistant.")
- kerb.context.extract_context_summary(window)[source]
Extract summary of context window contents.
- Parameters:
window (ContextWindow) – Context window to summarize
- Returns:
Summary of context window
- Return type:
Example
>>> summary = extract_context_summary(window)
>>> print(summary)
- kerb.context.assign_priorities(items, priority_fn)[source]
Assign priorities to context items using custom function.
- Parameters:
items (List[ContextItem]) – List of context items
priority_fn (Callable[[ContextItem], float]) – Function that takes ContextItem and returns priority score
- Returns:
Items with updated priorities
- Return type:
Example
>>> items = assign_priorities(items, lambda x: len(x.content) / 100)
- kerb.context.priority_by_recency(items)[source]
Assign priorities based on recency (newer = higher priority).
- Parameters:
items (List[ContextItem]) – List of context items
- Returns:
Items with recency-based priorities
- Return type:
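The entry above states only that newer items receive higher priority. A minimal sketch of one such mapping follows, assuming numeric timestamps and a linear scale from 0.0 (oldest) to 1.0 (newest); the function name recency_priorities and the scaling are hypothetical, not taken from kerb.

```python
def recency_priorities(timestamps):
    """Map timestamps linearly onto 0.0 (oldest) through 1.0 (newest)."""
    if not timestamps:
        return []
    oldest, newest = min(timestamps), max(timestamps)
    span = newest - oldest
    if span == 0:
        # All items are equally recent.
        return [1.0] * len(timestamps)
    return [(t - oldest) / span for t in timestamps]


priorities = recency_priorities([100.0, 150.0, 200.0])
print(priorities)
```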
- kerb.context.priority_by_relevance(items, query, relevance_fn=None)[source]
Assign priorities based on relevance to query.
- Parameters:
- Returns:
Items with relevance-based priorities
- Return type:
Example
>>> items = priority_by_relevance(items, "machine learning")
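The parameters of relevance_fn are not listed above, so its expected signature is unknown. Purely as an illustration of query-relevance scoring, the sketch below computes the fraction of query words found in a text; keyword_overlap and its (text, query) argument order are assumptions and may not match what priority_by_relevance expects.

```python
import re


def keyword_overlap(text: str, query: str) -> float:
    """Fraction of distinct query words that also appear in the text."""
    query_words = set(re.findall(r"\w+", query.lower()))
    if not query_words:
        return 0.0
    text_words = set(re.findall(r"\w+", text.lower()))
    return len(query_words & text_words) / len(query_words)


docs = [
    "Machine learning models learn from data.",
    "Cooking pasta takes ten minutes.",
    "Deep learning is a subfield.",
]
scores = [keyword_overlap(doc, "machine learning") for doc in docs]
print(scores)
```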
- kerb.context.priority_by_diversity(items, similarity_fn=None)[source]
Assign priorities to maximize diversity (MMR-style).
- Parameters:
- Returns:
Items with diversity-based priorities
- Return type:
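MMR (maximal marginal relevance) normally trades relevance against redundancy with already selected items. The sketch below keeps only the redundancy term, which is one reading of "MMR-style" diversity: items are picked greedily, and each receives priority 1 minus its highest similarity to anything picked before it. The Jaccard word similarity and the function names are assumptions, not kerb's default similarity_fn.

```python
def jaccard(a: str, b: str) -> float:
    """Word-level Jaccard similarity between two texts."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a and not words_b:
        return 1.0
    return len(words_a & words_b) / len(words_a | words_b)


def redundancy(index, selected, texts):
    """Highest similarity between texts[index] and any already selected text."""
    return max(jaccard(texts[index], texts[j]) for j in selected)


def diversity_priorities(texts):
    """Greedy diversity-only selection; returns one priority per input text."""
    if not texts:
        return []
    priorities = [0.0] * len(texts)
    selected = [0]  # Seed with the first text.
    priorities[0] = 1.0
    remaining = set(range(1, len(texts)))
    while remaining:
        # Pick the least redundant candidate; ties go to the lowest index.
        best = min(sorted(remaining), key=lambda i: redundancy(i, selected, texts))
        priorities[best] = 1.0 - redundancy(best, selected, texts)
        selected.append(best)
        remaining.remove(best)
    return priorities


texts = ["red apple pie", "red apple tart", "blue ocean wave"]
diversity = diversity_priorities(texts)
print(diversity)
```

The second text shares half its words with the first, so it ends up with a lower priority than the unrelated third text.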
Context window management and token budget tracking.