Context Module

Context management utilities for LLM applications.

This module provides comprehensive tools for managing LLM context windows:

Data Classes:

ContextItem - Represents a single item in context ContextWindow - Represents a managed context window CompressionResult - Result of context compression TruncationStrategy - Enum for truncation strategies CompressionMethod - Enum for compression methods

Token Management:
For token counting, use the tokenizer module:

from kerb.tokenizer import count_tokens, batch_count_tokens

Context Window Management:

create_context_window() - Create managed context window truncate_context_window() - Truncate window to fit token limits

Sliding Window Utilities:

create_sliding_window() - Create sliding windows over items create_token_sliding_window() - Create token-based sliding windows create_adaptive_window() - Create adaptive window balancing recency and priority

Context Compression:

compress_context() - Compress context to target token count auto_compress_window() - Automatically compress window items

Priority Management:

assign_priorities() - Assign priorities using custom function priority_by_recency() - Assign priorities based on recency priority_by_relevance() - Assign priorities based on query relevance priority_by_diversity() - Assign priorities to maximize diversity

Context Optimization:

deduplicate_context() - Remove duplicate or similar items reorder_context() - Reorder items using specified strategy merge_context_windows() - Merge multiple windows into one optimize_context_for_query() - Optimize window for specific query

Context Formatting:

format_context_window() - Format window for LLM consumption context_to_messages() - Convert window to chat message format extract_context_summary() - Extract summary of window contents

class kerb.context.ContextItem(content, priority=1.0, token_count=None, metadata=<factory>, timestamp=None, item_type='text')[source]

Bases: object

Represents a single item in the context window.

content: str
priority: float = 1.0
token_count: int | None = None
metadata: Dict[str, Any]
timestamp: float | None = None
item_type: str = 'text'
to_dict()[source]

Convert to dictionary.

Return type:

Dict[str, Any]

classmethod from_dict(data)[source]

Create from dictionary.

Return type:

ContextItem

__lt__(other)[source]

Compare by priority (for heap operations).

Return type:

bool

__init__(content, priority=1.0, token_count=None, metadata=<factory>, timestamp=None, item_type='text')
class kerb.context.ContextWindow(items=<factory>, max_tokens=None, current_tokens=0, strategy=TruncationStrategy.LAST, metadata=<factory>)[source]

Bases: object

Represents a managed context window.

items: list[ContextItem]
max_tokens: int | None = None
current_tokens: int = 0
strategy: TruncationStrategy = 'last'
metadata: Dict[str, Any]
add_item(item)[source]

Add item to context window.

Return type:

None

get_content()[source]

Get concatenated content from all items.

Return type:

str

to_dict()[source]

Convert to dictionary.

Return type:

Dict[str, Any]

__init__(items=<factory>, max_tokens=None, current_tokens=0, strategy=TruncationStrategy.LAST, metadata=<factory>)
class kerb.context.CompressionResult(compressed_content, original_tokens, compressed_tokens, compression_ratio, method, metadata=<factory>)[source]

Bases: object

Result of context compression.

compressed_content: str
original_tokens: int
compressed_tokens: int
compression_ratio: float
method: CompressionMethod
metadata: Dict[str, Any]
__init__(compressed_content, original_tokens, compressed_tokens, compression_ratio, method, metadata=<factory>)
class kerb.context.TruncationStrategy(*values)[source]

Bases: Enum

Strategies for truncating context when exceeding limits.

FIRST = 'first'
LAST = 'last'
MIDDLE = 'middle'
PRIORITY = 'priority'
SEMANTIC = 'semantic'
class kerb.context.CompressionMethod(*values)[source]

Bases: Enum

Methods for compressing context.

SUMMARIZE = 'summarize'
EXTRACT_KEY_INFO = 'extract_key_info'
REMOVE_REDUNDANCY = 'remove_redundancy'
ABBREVIATE = 'abbreviate'
HYBRID = 'hybrid'
kerb.context.create_context_window(items, max_tokens=None, strategy=TruncationStrategy.LAST, token_estimator=None)[source]

Create a managed context window from items.

Parameters:
Returns:

Managed context window

Return type:

ContextWindow

Example

>>> window = create_context_window(["Hello", "World"], max_tokens=1000)
>>> print(window.current_tokens)
kerb.context.truncate_context_window(window, max_tokens, strategy=TruncationStrategy.LAST)[source]

Truncate context window to fit within token limit.

Parameters:
Returns:

Truncated context window

Return type:

ContextWindow

Example

>>> window = truncate_context_window(window, max_tokens=500)
kerb.context.compress_context(content, target_tokens, method=CompressionMethod.SUMMARIZE, model='gpt-4o-mini')[source]

Compress context to target token count.

Parameters:
  • content (str) – Content to compress

  • target_tokens (int) – Target token count

  • method (CompressionMethod) – Compression method to use

  • model (str) – Model for token estimation (not used with tokenizer module, kept for backward compatibility)

Returns:

Compression result with metrics

Return type:

CompressionResult

Example

>>> result = compress_context(long_text, target_tokens=500)
>>> print(f"Compressed to {result.compression_ratio:.1%}")
kerb.context.auto_compress_window(window, target_ratio=0.7, method=CompressionMethod.SUMMARIZE)[source]

Automatically compress context window items.

Parameters:
Returns:

Window with compressed items

Return type:

ContextWindow

Example

>>> compressed_window = auto_compress_window(window, target_ratio=0.7)
kerb.context.create_sliding_window(items, window_size, step_size=None)[source]

Create sliding windows over context items.

Parameters:
  • items (List[ContextItem]) – List of context items

  • window_size (int) – Number of items per window

  • step_size (Optional[int]) – Step size between windows (defaults to window_size)

Returns:

List of sliding windows

Return type:

List[ContextWindow]

Example

>>> windows = create_sliding_window(items, window_size=3, step_size=1)
kerb.context.create_token_sliding_window(items, max_tokens, overlap_tokens=0)[source]

Create sliding windows based on token limits.

Parameters:
  • items (List[ContextItem]) – List of context items

  • max_tokens (int) – Maximum tokens per window

  • overlap_tokens (int) – Number of tokens to overlap between windows

Returns:

List of token-based sliding windows

Return type:

List[ContextWindow]

Example

>>> windows = create_token_sliding_window(items, max_tokens=500, overlap_tokens=50)
kerb.context.create_adaptive_window(items, max_tokens, recency_weight=0.5, priority_weight=0.5)[source]

Create adaptive window balancing recency and priority.

Parameters:
  • items (List[ContextItem]) – List of context items

  • max_tokens (int) – Maximum tokens allowed

  • recency_weight (float) – Weight for recency (0-1)

  • priority_weight (float) – Weight for priority (0-1)

Returns:

Adaptively selected context window

Return type:

ContextWindow

Example

>>> window = create_adaptive_window(items, max_tokens=1000)
kerb.context.deduplicate_context(items, similarity_threshold=0.9)[source]

Remove duplicate or highly similar context items.

Parameters:
  • items (List[ContextItem]) – List of context items

  • similarity_threshold (float) – Threshold for considering items duplicates (0-1)

Returns:

Deduplicated items

Return type:

List[ContextItem]

Example

>>> unique_items = deduplicate_context(items, similarity_threshold=0.85)
kerb.context.reorder_context(items, strategy='chronological')[source]

Reorder context items using specified strategy.

Parameters:
  • items (List[ContextItem]) – List of context items

  • strategy (Union[ReorderStrategy, str]) – Reordering strategy (ReorderStrategy enum or string: “chronological”, “priority”, “relevance”, “alternating”)

Returns:

Reordered items

Return type:

List[ContextItem]

Examples

>>> from kerb.core.enums import ReorderStrategy
>>> reordered = reorder_context(items, strategy=ReorderStrategy.PRIORITY)
kerb.context.merge_context_windows(windows, max_tokens=None, deduplication=True)[source]

Merge multiple context windows into one.

Parameters:
  • windows (List[ContextWindow]) – List of context windows to merge

  • max_tokens (Optional[int]) – Maximum tokens for merged window

  • deduplication (bool) – Whether to deduplicate items

Returns:

Merged context window

Return type:

ContextWindow

Example

>>> merged = merge_context_windows([window1, window2], max_tokens=2000)
kerb.context.optimize_context_for_query(window, query, max_tokens, relevance_weight=0.7, diversity_weight=0.3)[source]

Optimize context window for a specific query.

Parameters:
  • window (ContextWindow) – Context window to optimize

  • query (str) – Query to optimize for

  • max_tokens (int) – Maximum tokens allowed

  • relevance_weight (float) – Weight for relevance scoring

  • diversity_weight (float) – Weight for diversity scoring

Returns:

Optimized context window

Return type:

ContextWindow

Example

>>> optimized = optimize_context_for_query(window, "What is AI?", max_tokens=1000)
kerb.context.format_context_window(window, format_template=None, include_metadata=False)[source]

Format context window for LLM consumption.

Parameters:
  • window (ContextWindow) – Context window to format

  • format_template (Optional[str]) – Custom format template

  • include_metadata (bool) – Whether to include item metadata

Returns:

Formatted context string

Return type:

str

Example

>>> formatted = format_context_window(window)
kerb.context.context_to_messages(window, system_prefix=None)[source]

Convert context window to chat message format.

Parameters:
Returns:

List of message dictionaries

Return type:

List[Dict[str, str]]

Example

>>> messages = context_to_messages(window, system_prefix="You are a helpful assistant.")
kerb.context.extract_context_summary(window)[source]

Extract summary of context window contents.

Parameters:

window (ContextWindow) – Context window to summarize

Returns:

Summary of context window

Return type:

str

Example

>>> summary = extract_context_summary(window)
>>> print(summary)
kerb.context.assign_priorities(items, priority_fn)[source]

Assign priorities to context items using custom function.

Parameters:
Returns:

Items with updated priorities

Return type:

List[ContextItem]

Example

>>> items = assign_priorities(items, lambda x: len(x.content) / 100)
kerb.context.priority_by_recency(items)[source]

Assign priorities based on recency (newer = higher priority).

Parameters:

items (List[ContextItem]) – List of context items

Returns:

Items with recency-based priorities

Return type:

List[ContextItem]

kerb.context.priority_by_relevance(items, query, relevance_fn=None)[source]

Assign priorities based on relevance to query.

Parameters:
Returns:

Items with relevance-based priorities

Return type:

List[ContextItem]

Example

>>> items = priority_by_relevance(items, "machine learning")
kerb.context.priority_by_diversity(items, similarity_fn=None)[source]

Assign priorities to maximize diversity (MMR-style).

Parameters:
Returns:

Items with diversity-based priorities

Return type:

List[ContextItem]

Context window management and token budget tracking.