Core Module

Core types and shared data structures for the kerb library.

This package contains shared types used across multiple packages to eliminate duplication and ensure consistency.

Import Structure:

## Top-level imports (most common):

```python
from kerb.core import Document, Message
from kerb.core import ChainStrategy, CompressionStrategy
```

## Submodule imports (for organized access):

```python
from kerb.core import types, enums
# Then: types.DocumentFormat, enums.RerankMethod
```

## Direct submodule access (for less common items):

```python
from kerb.core.types import DocumentFormat, MessageRole
from kerb.core.enums import ReorderStrategy, ParseMode
```

class kerb.core.Document(content, metadata=<factory>, id=None, source=None, format=DocumentFormat.UNKNOWN, score=0.0, page_content=None)

Bases: object

Universal document representation across the toolkit.

Consolidates the Document classes from document/ and retrieval/ packages to provide a single, consistent document representation.

content

The text content of the document

metadata

Additional metadata about the document

id

Optional unique identifier for the document

source

Optional source path or URL where document was loaded from

format

Document format (defaults to UNKNOWN)

score

Relevance score (used in retrieval contexts, defaults to 0.0)

page_content

Optional list of content per page (for multi-page documents)

Examples

>>> # Simple document
>>> doc = Document(content="Hello, world!")
>>> # Document with metadata
>>> doc = Document(
...     content="Important document",
...     metadata={"author": "John", "created": "2025-01-01"},
...     source="doc.txt"
... )
>>> # Retrieval result with score
>>> doc = Document(
...     id="doc_123",
...     content="Relevant content",
...     score=0.95
... )
content: str
metadata: Dict[str, Any]
id: str | None = None
source: str | None = None
format: DocumentFormat = 'unknown'
score: float = 0.0
page_content: List[str] | None = None
__len__()

Return the length of the document content.

Return type:

int

to_dict()

Convert document to dictionary.

Return type:

Dict[str, Any]

Returns:

Dictionary representation of the document

classmethod from_dict(data)

Create document from dictionary.

Parameters:

data (Dict[str, Any]) – Dictionary with document data

Return type:

Document

Returns:

New Document instance
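
The round trip between `to_dict()` and `from_dict()` can be sketched with a stand-in dataclass. `SketchDocument` below is hypothetical: it mirrors the documented field names but is not kerb's implementation. It leaves out the `format` field, and it assumes the dictionary keys match the field names exactly.

```python
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class SketchDocument:
    """Hypothetical stand-in mirroring the documented Document fields."""

    content: str
    metadata: Dict[str, Any] = field(default_factory=dict)
    id: Optional[str] = None
    source: Optional[str] = None
    score: float = 0.0
    page_content: Optional[List[str]] = None

    def __len__(self) -> int:
        # Length of the text content, as the documented __len__ describes.
        return len(self.content)

    def to_dict(self) -> Dict[str, Any]:
        # asdict() copies every field into a plain dictionary.
        return asdict(self)

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "SketchDocument":
        # Assumption: keys in `data` match the field names one to one.
        return cls(**data)


doc = SketchDocument(content="Relevant content", id="doc_123", score=0.95)
restored = SketchDocument.from_dict(doc.to_dict())
```

Because dataclasses compare field by field, `restored == doc` holds after the round trip, which is a convenient property to check when persisting documents as JSON.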

__repr__()

String representation of the document.

Return type:

str

__init__(content, metadata=<factory>, id=None, source=None, format=DocumentFormat.UNKNOWN, score=0.0, page_content=None)
class kerb.core.Message(role, content, timestamp=None, metadata=<factory>, name=None, function_call=None, tool_calls=None)

Bases: object

Universal message representation for conversations.

Consolidates the Message classes from generation/ and memory/ packages to provide a single, consistent message representation.

role

The role of the message sender (system, user, assistant, etc.)

content

The message content

timestamp

Optional ISO format timestamp (auto-generated if not provided)

metadata

Additional metadata about the message

name

Optional name for the message sender (used in function calling)

function_call

Optional function call information (legacy)

tool_calls

Optional list of tool calls

Examples

>>> # Simple user message
>>> msg = Message(role="user", content="Hello!")
>>> # System message with enum role
>>> msg = Message(
...     role=MessageRole.SYSTEM,
...     content="You are a helpful assistant"
... )
>>> # Message with metadata
>>> msg = Message(
...     role="assistant",
...     content="Here's the answer",
...     metadata={"model": "gpt-4o", "tokens": 150}
... )
role: MessageRole | str
content: str
timestamp: str | None = None
metadata: Dict[str, Any]
name: str | None = None
function_call: Dict[str, Any] | None = None
tool_calls: List[Dict[str, Any]] | None = None
__post_init__()

Auto-generate timestamp if not provided.
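
A minimal sketch of how a dataclass `__post_init__` hook can fill in a missing timestamp. `SketchMessage` is a hypothetical stand-in with only three of the documented fields, and the choice of UTC is an assumption; the library may use local time or a different precision.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class SketchMessage:
    """Hypothetical stand-in for the timestamp behaviour of Message."""

    role: str
    content: str
    timestamp: Optional[str] = None

    def __post_init__(self) -> None:
        # Fill in an ISO 8601 timestamp only when the caller gave none.
        # Assumption: UTC is used here for illustration.
        if self.timestamp is None:
            self.timestamp = datetime.now(timezone.utc).isoformat()


auto = SketchMessage(role="user", content="Hello!")
fixed = SketchMessage(
    role="user",
    content="Hello!",
    timestamp="2025-01-01T00:00:00+00:00",
)
```

An explicitly supplied timestamp is left untouched, so messages restored from storage keep their original creation time.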

to_dict()

Convert message to dictionary format.

Return type:

Dict[str, Any]

Returns:

Dictionary representation suitable for API calls

classmethod from_dict(data)

Create message from dictionary.

Parameters:

data (Dict[str, Any]) – Dictionary with message data

Return type:

Message

Returns:

New Message instance
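
Since `role` accepts either a `MessageRole` member or a plain string, a dictionary meant for API calls has to normalize it. The sketch below shows one way to do that. `SketchRole`, its string values, and `message_to_dict` are all hypothetical, and dropping unset optional fields is an assumption about what "suitable for API calls" means, not confirmed library behaviour.

```python
from enum import Enum
from typing import Any, Dict, Optional, Union


class SketchRole(Enum):
    """Hypothetical stand-in for MessageRole; the values are assumed."""

    SYSTEM = "system"
    USER = "user"
    ASSISTANT = "assistant"


def message_to_dict(
    role: Union[SketchRole, str],
    content: str,
    name: Optional[str] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: build a plain dict with a string role."""
    # Enum members are reduced to their string value; strings pass through.
    role_value = role.value if isinstance(role, SketchRole) else role
    payload: Dict[str, Any] = {"role": role_value, "content": content}
    # Assumption: optional fields are omitted when unset.
    if name is not None:
        payload["name"] = name
    return payload


payload = message_to_dict(SketchRole.SYSTEM, "You are a helpful assistant")
named = message_to_dict("user", "Hi", name="alice")
```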

__repr__()

String representation of the message.

Return type:

str

__init__(role, content, timestamp=None, metadata=<factory>, name=None, function_call=None, tool_calls=None)
class kerb.core.ChainStrategy(*values)

Bases: Enum

Strategy for executing chain steps.

SEQUENTIAL = 'sequential'
PARALLEL = 'parallel'
CONDITIONAL = 'conditional'
DYNAMIC = 'dynamic'
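
Because each member carries a string value, a strategy can be looked up from configuration text with the standard `Enum` call syntax. The sketch below redefines the enum locally so it runs without kerb installed. `parse_strategy` and its fallback to `SEQUENTIAL` are hypothetical choices, not part of the library.

```python
from enum import Enum


class ChainStrategy(Enum):
    """Local stand-in copying the documented members."""

    SEQUENTIAL = "sequential"
    PARALLEL = "parallel"
    CONDITIONAL = "conditional"
    DYNAMIC = "dynamic"


def parse_strategy(raw: str) -> ChainStrategy:
    """Hypothetical helper: map a config string to a member."""
    try:
        # Calling an Enum class with a value returns the matching member.
        return ChainStrategy(raw.strip().lower())
    except ValueError:
        # Assumption: unknown values fall back to sequential execution.
        return ChainStrategy.SEQUENTIAL


chosen = parse_strategy("Parallel")
fallback = parse_strategy("unknown-mode")
```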
class kerb.core.CompressionStrategy(*values)

Bases: Enum

Strategy for compressing context.

TOP_K = 'top_k'
SUMMARIZE = 'summarize'
FILTER = 'filter'
TRUNCATE = 'truncate'
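
An enum like this is typically used to dispatch between implementations. The sketch below illustrates that pattern for two of the members. The `compress` function is hypothetical, and its readings of `TOP_K` (keep the highest-scoring items) and `TRUNCATE` (keep the first items in order) are assumptions. `SUMMARIZE` and `FILTER` are left unimplemented.

```python
from enum import Enum
from typing import List, Tuple


class CompressionStrategy(Enum):
    """Local stand-in copying the documented members."""

    TOP_K = "top_k"
    SUMMARIZE = "summarize"
    FILTER = "filter"
    TRUNCATE = "truncate"


def compress(
    items: List[Tuple[str, float]],
    strategy: CompressionStrategy,
    limit: int,
) -> List[str]:
    """Hypothetical dispatcher over (text, score) pairs."""
    if strategy is CompressionStrategy.TOP_K:
        # Assumed meaning: keep the `limit` highest-scoring texts.
        ranked = sorted(items, key=lambda pair: pair[1], reverse=True)
        return [text for text, _ in ranked[:limit]]
    if strategy is CompressionStrategy.TRUNCATE:
        # Assumed meaning: keep the first `limit` texts in original order.
        return [text for text, _ in items[:limit]]
    raise NotImplementedError(f"{strategy.value} is not covered by this sketch")


items = [("a", 0.2), ("b", 0.9), ("c", 0.5)]
top = compress(items, CompressionStrategy.TOP_K, limit=2)
head = compress(items, CompressionStrategy.TRUNCATE, limit=2)
```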
class kerb.core.ChunkingStrategy(*values)

Bases: Enum

Strategy for chunking text.

SIMPLE = 'simple'
RECURSIVE = 'recursive'
SEMANTIC = 'semantic'
SENTENCE = 'sentence'
PARAGRAPH = 'paragraph'
TOKEN = 'token'
SLIDING_WINDOW = 'sliding_window'
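
To make the difference between two of these members concrete, the sketch below contrasts fixed-size chunks with overlapping windows. `chunk_text` is hypothetical. It works on characters rather than tokens, and its readings of `SIMPLE` (no overlap) and `SLIDING_WINDOW` (caller-chosen overlap) are assumptions rather than the library's definitions.

```python
from enum import Enum
from typing import List


class ChunkingStrategy(Enum):
    """Local stand-in with two of the documented members."""

    SIMPLE = "simple"
    SLIDING_WINDOW = "sliding_window"


def chunk_text(
    text: str,
    strategy: ChunkingStrategy,
    size: int,
    overlap: int = 0,
) -> List[str]:
    """Hypothetical character-based chunker."""
    if strategy is ChunkingStrategy.SIMPLE:
        # Assumed meaning: consecutive chunks share no characters.
        overlap = 0
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("size must be positive and overlap smaller than size")
    if not text:
        return []
    step = size - overlap
    # Stop once the remaining tail is already covered by the previous window.
    last_start = max(len(text) - overlap, 1)
    return [text[i:i + size] for i in range(0, last_start, step)]


simple = chunk_text("abcdefghij", ChunkingStrategy.SIMPLE, size=4)
windows = chunk_text("abcdefghij", ChunkingStrategy.SLIDING_WINDOW, size=4, overlap=2)
```

With an overlap of 2, each window repeats the last two characters of the previous one, which trades extra storage for context continuity across chunk boundaries.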

The core module provides shared types, interfaces, and base classes used across all Kerb modules.