Embedding Module
Embedding utilities for converting text to vector representations.
This module provides flexible embedding generation with multiple model options:
- Local/hash-based (no dependencies, for testing)
- Sentence Transformers (local ML models, high quality)
- OpenAI API (cloud-based, highest quality)
- Usage Examples:
    # Common usage - core functions
    from kerb.embedding import embed, embed_batch

    vec = embed("Hello world")
    vecs = embed_batch(["Hello", "World"])

    # Provider-specific usage
    from kerb.embedding.providers import OpenAIEmbedder, LocalEmbedder
    from kerb.embedding.providers import SentenceTransformerEmbedder

    embedder = OpenAIEmbedder(model_name="text-embedding-3-large")
    vec = embedder.embed("Hello")

    # Utilities
    from kerb.embedding.utils import cosine_similarity, euclidean_distance

    similarity = cosine_similarity(vec1, vec2)
- class kerb.embedding.EmbeddingModel(*values)
Bases: Enum
Enum for embedding models.
For custom models not listed here, use a plain string instead.
- LOCAL = 'local'
- ALL_MINILM_L6_V2 = 'all-MiniLM-L6-v2'
- ALL_MINILM_L12_V2 = 'all-MiniLM-L12-v2'
- ALL_MPNET_BASE_V2 = 'all-mpnet-base-v2'
- PARAPHRASE_MINILM_L6_V2 = 'paraphrase-MiniLM-L6-v2'
- PARAPHRASE_MPNET_BASE_V2 = 'paraphrase-mpnet-base-v2'
- TEXT_EMBEDDING_3_SMALL = 'text-embedding-3-small'
- TEXT_EMBEDDING_3_LARGE = 'text-embedding-3-large'
- TEXT_EMBEDDING_ADA_002 = 'text-embedding-ada-002'
- class kerb.embedding.ModelBackend(*values)
Bases: Enum
Enum for embedding backends.
- LOCAL = 'local'
- SENTENCE_TRANSFORMERS = 'sentence_transformers'
- OPENAI = 'openai'
- kerb.embedding.embed(text, model=EmbeddingModel.LOCAL, dimensions=384, api_key=None, **kwargs)
Generate an embedding vector for text.
- Parameters:
text (str) – The text to embed
model (Union[str, EmbeddingModel]) – Model to use:
    - EmbeddingModel.LOCAL - Hash-based (default, no dependencies)
    - EmbeddingModel.ALL_MINILM_L6_V2 - Sentence Transformers (384 dim)
    - EmbeddingModel.ALL_MPNET_BASE_V2 - Sentence Transformers (768 dim)
    - EmbeddingModel.TEXT_EMBEDDING_3_SMALL - OpenAI (1536 dim)
    - EmbeddingModel.TEXT_EMBEDDING_3_LARGE - OpenAI (3072 dim)
    - Or use a string for custom models: "custom-model-name"
dimensions (int) – Dimension for local embeddings (default: 384)
api_key (Optional[str]) – OpenAI API key (or set OPENAI_API_KEY env var)
**kwargs – Additional model-specific parameters
- Returns:
Embedding vector (normalized to unit length)
- Return type:
Examples
    # Using enum (recommended for known models)
    vec = embed("Hello, world!")
    vec = embed("Hello", model=EmbeddingModel.ALL_MINILM_L6_V2)
    vec = embed("Hello", model=EmbeddingModel.TEXT_EMBEDDING_3_SMALL, api_key="sk-…")

    # Using string for custom models
    vec = embed("Hello", model="my-custom-sentence-transformer")
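The return value is described as normalized to unit length. A minimal sketch of what that L2 normalization means, in plain Python; `l2_normalize` is a hypothetical helper written for illustration and is not part of kerb:

```python
import math

def l2_normalize(vector):
    """Scale a vector so that its Euclidean norm is 1 (hypothetical helper)."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        # A zero vector has no direction; return a copy unchanged.
        return list(vector)
    return [x / norm for x in vector]

unit = l2_normalize([3.0, 4.0])
print(unit)  # [0.6, 0.8]
```

For unit-length vectors the dot product equals the cosine similarity, which is one common reason to normalize embeddings.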
- kerb.embedding.embed_batch(texts, model=EmbeddingModel.LOCAL, dimensions=384, batch_size=32, api_key=None, **kwargs)
Generate embeddings for multiple texts efficiently.
- Parameters:
- Returns:
List of embedding vectors
- Return type:
Examples
    # Using enum
    embeddings = embed_batch(["doc1", "doc2", "doc3"])
    embeddings = embed_batch(docs, model=EmbeddingModel.ALL_MINILM_L6_V2)
    embeddings = embed_batch(docs, model=EmbeddingModel.TEXT_EMBEDDING_3_SMALL)

    # Using string for custom models
    embeddings = embed_batch(docs, model="custom-model")
- async kerb.embedding.embed_async(text, model=EmbeddingModel.TEXT_EMBEDDING_3_SMALL, api_key=None, **kwargs)
Generate embedding asynchronously (wrapper for API-based models).
- Parameters:
text (str) – Text to embed
model (Union[str, EmbeddingModel]) – Embedding model to use
**kwargs – Additional model parameters
- Returns:
Embedding vector
- Return type:
Note
Native async is currently supported only for OpenAI models. Local models run their synchronous code in a thread pool.
Examples
>>> import asyncio
>>> embedding = asyncio.run(embed_async("Hello world"))
- async kerb.embedding.embed_batch_async(texts, model=EmbeddingModel.TEXT_EMBEDDING_3_SMALL, api_key=None, batch_size=100, max_concurrent=5, **kwargs)
Generate embeddings for multiple texts asynchronously.
- Parameters:
model (Union[str, EmbeddingModel]) – Embedding model to use
batch_size (int) – Number of texts per API call
max_concurrent (int) – Maximum concurrent requests (for API models)
**kwargs – Additional model parameters
- Returns:
List of embedding vectors
- Return type:
Examples
>>> import asyncio
>>> texts = ["Hello", "World", "AI"]
>>> embeddings = asyncio.run(embed_batch_async(texts))
- kerb.embedding.embed_batch_stream(texts, model=EmbeddingModel.LOCAL, batch_size=32, api_key=None, **kwargs)
Stream embeddings for large datasets (memory efficient).
Yields embeddings one at a time instead of loading all into memory. Useful for processing very large datasets.
- Parameters:
- Yields:
Tuple[int, List[float]] – (index, embedding) pairs
Examples
>>> texts = ["text1", "text2", ...]  # Large list
>>> for idx, embedding in embed_batch_stream(texts, batch_size=100):
...     # Process embedding immediately
...     print(f"Processed {idx}")
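The streaming pattern described above can be sketched with a plain generator. This is an illustration of the idea, not kerb's implementation; `stream_in_batches` and `fake_embed_many` are hypothetical names, and the fake embedder only stands in for a real batch call:

```python
def stream_in_batches(texts, embed_many, batch_size=32):
    """Yield (index, embedding) pairs, embedding one batch at a time."""
    for start in range(0, len(texts), batch_size):
        batch = texts[start:start + batch_size]
        # Only the embeddings of the current batch are held in memory.
        for offset, vector in enumerate(embed_many(batch)):
            yield start + offset, vector

def fake_embed_many(batch):
    # Stand-in for a real batch embedder: one 1-dimensional vector per text.
    return [[float(len(text))] for text in batch]

pairs = list(stream_in_batches(["a", "bb", "ccc"], fake_embed_many, batch_size=2))
print(pairs)  # [(0, [1.0]), (1, [2.0]), (2, [3.0])]
```

Because the generator is lazy, a consumer that writes each vector to disk or a database never needs the full result list in memory.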
- async kerb.embedding.embed_batch_stream_async(texts, model=EmbeddingModel.TEXT_EMBEDDING_3_SMALL, batch_size=100, api_key=None, max_concurrent=5, **kwargs)
Stream embeddings asynchronously for large datasets.
- Parameters:
- Yields:
Tuple[int, List[float]] – (index, embedding) pairs
Examples
>>> import asyncio
>>> async def process():
...     texts = ["text1", "text2", ...]
...     async for idx, embedding in embed_batch_stream_async(texts):
...         print(f"Processed {idx}")
>>> asyncio.run(process())
- class kerb.embedding.LocalEmbedder(dimensions=384)
Bases: object
Local hash-based embedder.
This is a simple, deterministic embedding that requires no external models. Suitable for testing, prototyping, or when you don’t need semantic quality.
- Parameters:
dimensions (int) – Embedding dimension (default: 384)
Examples
    embedder = LocalEmbedder(dimensions=512)
    vec = embedder.embed("Hello world")
    vecs = embedder.embed_batch(["Hello", "World"])
- __init__(dimensions=384)
Initialize the local embedder.
- Parameters:
dimensions (int) – Embedding dimension
- class kerb.embedding.OpenAIEmbedder(model_name='text-embedding-3-small', api_key=None)
Bases: object
OpenAI embedding provider.
Requires: pip install openai
- Parameters:
Examples:
    embedder = OpenAIEmbedder(model_name="text-embedding-3-large")
    vec = embedder.embed("Hello world")
    vecs = embedder.embed_batch(["Hello", "World"])

    # Async usage
    import asyncio

    async def main():
        vec = await embedder.embed_async("Hello")

    asyncio.run(main())
- __init__(model_name='text-embedding-3-small', api_key=None)
Initialize the OpenAI embedder.
- class kerb.embedding.SentenceTransformerEmbedder(model_name='all-MiniLM-L6-v2')
Bases: object
Sentence Transformers embedding provider (runs locally).
Requires: pip install sentence-transformers
- Parameters:
model_name (str) – Model name (default: "all-MiniLM-L6-v2")
Examples
    embedder = SentenceTransformerEmbedder(model_name="all-mpnet-base-v2")
    vec = embedder.embed("Hello world")
    vecs = embedder.embed_batch(["Hello", "World"])
- __init__(model_name='all-MiniLM-L6-v2')
Initialize the Sentence Transformer embedder.
- Parameters:
model_name (str) – Model name
- kerb.embedding.local_embed(text, dimensions=384)
Generate embedding using local hash-based method.
This is a simple, deterministic embedding that requires no external models. Suitable for testing, prototyping, or when you don’t need semantic quality.
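To make "hash-based" concrete, here is one possible scheme (feature hashing over whitespace tokens). The exact method kerb uses is not documented here, so treat `hash_embed`, the SHA-256 choice, and the tokenization as assumptions made purely for illustration:

```python
import hashlib
import math

def hash_embed(text, dimensions=384):
    """Deterministic feature-hashing embedding (illustrative only)."""
    vector = [0.0] * dimensions
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        # The first 4 bytes pick the bucket, the next byte picks the sign.
        bucket = int.from_bytes(digest[:4], "big") % dimensions
        sign = 1.0 if digest[4] % 2 == 0 else -1.0
        vector[bucket] += sign
    # Normalize to unit length unless every bucket is zero.
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm else vector

a = hash_embed("Hello world")
b = hash_embed("Hello world")
print(len(a), a == b)  # 384 True
```

A scheme like this captures token overlap but not meaning: "car" and "automobile" land in unrelated buckets, which is why such embeddings suit tests and prototypes rather than semantic search.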
- kerb.embedding.openai_embed(text, model_name='text-embedding-3-small', api_key=None, **kwargs)
Generate embedding using OpenAI API.
Requires: pip install openai
- Parameters:
- Returns:
Embedding vector
- Return type:
Popular models:
“text-embedding-3-small” (1536 dim, cost-effective)
“text-embedding-3-large” (3072 dim, highest quality)
“text-embedding-ada-002” (1536 dim, legacy)
- kerb.embedding.openai_embed_batch(texts, model_name='text-embedding-3-small', api_key=None, batch_size=100, **kwargs)
Generate embeddings for multiple texts using OpenAI API.
Processes texts in batches to stay within API limits.
- Parameters:
- Returns:
List of embedding vectors
- Return type:
- async kerb.embedding.openai_embed_async(text, model_name='text-embedding-3-small', api_key=None, **kwargs)
Generate embedding using OpenAI API asynchronously.
Requires: pip install openai
- Parameters:
- Returns:
Embedding vector
- Return type:
Examples
>>> import asyncio
>>> embedding = asyncio.run(openai_embed_async("Hello world"))
- async kerb.embedding.openai_embed_batch_async(texts, model_name='text-embedding-3-small', api_key=None, batch_size=100, max_concurrent=5, **kwargs)
Generate embeddings for multiple texts using OpenAI API asynchronously.
Processes texts in batches with concurrent requests for improved performance.
- Parameters:
model_name (str) – OpenAI model name (default: "text-embedding-3-small")
api_key (Optional[str]) – OpenAI API key (or set OPENAI_API_KEY env var)
batch_size (int) – Number of texts per API call (max 2048 for OpenAI)
max_concurrent (int) – Maximum concurrent API requests
**kwargs – Additional API parameters
- Returns:
List of embedding vectors
- Return type:
Examples
>>> import asyncio
>>> texts = ["Hello", "World", "AI"]
>>> embeddings = asyncio.run(openai_embed_batch_async(texts))
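How `batch_size` and `max_concurrent` can interact is sketched below with `asyncio.Semaphore`. This is an assumed design, not the library's code; `embed_batches_concurrently` and `fake_request` are hypothetical, and the fake request replaces a real API call:

```python
import asyncio

async def embed_batches_concurrently(texts, embed_many, batch_size=100, max_concurrent=5):
    """Split texts into batches and embed them with bounded concurrency."""
    semaphore = asyncio.Semaphore(max_concurrent)
    batches = [texts[i:i + batch_size] for i in range(0, len(texts), batch_size)]

    async def run_one(batch):
        async with semaphore:  # at most max_concurrent requests in flight
            return await embed_many(batch)

    # gather preserves input order, so results line up with texts.
    results = await asyncio.gather(*(run_one(batch) for batch in batches))
    return [vector for batch_vectors in results for vector in batch_vectors]

async def fake_request(batch):
    # Stand-in for one API request; returns one vector per text.
    await asyncio.sleep(0)
    return [[float(len(text))] for text in batch]

vectors = asyncio.run(
    embed_batches_concurrently(["a", "bb", "ccc"], fake_request, batch_size=2, max_concurrent=2)
)
print(vectors)  # [[1.0], [2.0], [3.0]]
```

Larger batches mean fewer requests, while the semaphore caps how many of those requests run at once, which helps stay under provider rate limits.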
- kerb.embedding.sentence_transformer_embed(text, model_name='all-MiniLM-L6-v2', **kwargs)
Generate embedding using Sentence Transformers (local ML model).
Requires: pip install sentence-transformers
- Parameters:
- Returns:
Embedding vector
- Return type:
- Popular models:
“all-MiniLM-L6-v2” (384 dim, fast)
“all-mpnet-base-v2” (768 dim, quality)
“all-MiniLM-L12-v2” (384 dim, balanced)
- kerb.embedding.sentence_transformer_embed_batch(texts, model_name='all-MiniLM-L6-v2', batch_size=32, **kwargs)
Generate embeddings for multiple texts using Sentence Transformers.
More efficient than calling sentence_transformer_embed repeatedly.
- kerb.embedding.cosine_similarity(vector1, vector2)
Calculate cosine similarity between two vectors.
- Parameters:
- Returns:
Cosine similarity score between -1 and 1 (1 = same direction, 0 = orthogonal, -1 = opposite direction)
- Return type:
Examples
    from kerb.embedding import embed, cosine_similarity

    sim = cosine_similarity(embed("hello"), embed("hi"))
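For reference, cosine similarity is the dot product divided by the product of the two norms. A small pure-Python sketch follows; `cosine` is a hypothetical stand-in, and returning 0.0 for a zero vector is an assumption rather than documented kerb behavior:

```python
import math

def cosine(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # assumption: treat a zero vector as dissimilar
    return dot / (norm_u * norm_v)

print(cosine([1.0, 0.0], [0.0, 1.0]))   # 0.0
print(cosine([1.0, 2.0], [2.0, 4.0]))   # same direction, close to 1.0
print(cosine([1.0, 0.0], [-1.0, 0.0]))  # -1.0
```

Note that the score depends only on direction: scaling either vector leaves it unchanged.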
- kerb.embedding.euclidean_distance(vector1, vector2)
Calculate Euclidean (L2) distance between two vectors.
- kerb.embedding.manhattan_distance(vector1, vector2)
Calculate Manhattan (L1) distance between two vectors.
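The two distances differ only in how per-coordinate differences are combined. A short sketch with hypothetical helper names (`euclidean`, `manhattan`), independent of kerb:

```python
import math

def euclidean(u, v):
    """L2 distance: square root of the summed squared differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def manhattan(u, v):
    """L1 distance: sum of the absolute differences."""
    return sum(abs(a - b) for a, b in zip(u, v))

p, q = [0.0, 0.0], [3.0, 4.0]
print(euclidean(p, q))  # 5.0
print(manhattan(p, q))  # 7.0
```

Unlike cosine similarity, both are distances: smaller values mean the vectors are closer.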
- kerb.embedding.batch_similarity(query_vector, vectors, metric='cosine')
Calculate similarity between a query vector and multiple vectors.
- Parameters:
- Returns:
Similarity/distance scores
- Return type:
Examples
    from kerb.embedding import embed, embed_batch, batch_similarity

    query = embed("search query")
    docs = embed_batch(["doc1", "doc2", "doc3"])
    scores = batch_similarity(query, docs, metric="cosine")
- kerb.embedding.top_k_similar(query_vector, vectors, k=5, metric='cosine', return_scores=False)
Find top-k most similar vectors to a query vector.
- Parameters:
- Returns:
Top-k indices (or index-score pairs)
- Return type:
Examples
    from kerb.embedding import embed, embed_batch, top_k_similar

    query = embed("search query")
    docs = embed_batch(["doc1", "doc2", "doc3"])
    indices = top_k_similar(query, docs, k=2)

    # Or with scores
    results = top_k_similar(query, docs, k=2, return_scores=True)
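Top-k retrieval amounts to scoring every vector against the query and keeping the best k. The sketch below uses cosine similarity only; `top_k` and `cosine` are hypothetical helpers, and the (index, score) pair layout is an assumption based on the return description above:

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norms = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norms if norms else 0.0

def top_k(query, vectors, k=5, return_scores=False):
    """Indices of the k vectors most similar to query, best first."""
    scores = [cosine(query, v) for v in vectors]
    # sorted() is stable, so tied scores keep their original order.
    order = sorted(range(len(vectors)), key=lambda i: scores[i], reverse=True)[:k]
    if return_scores:
        return [(i, scores[i]) for i in order]
    return order

docs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
print(top_k([1.0, 0.0], docs, k=2))  # [0, 2]
```

A full sort costs O(n log n); for very large collections a heap or an approximate nearest-neighbor index is the usual next step.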
- kerb.embedding.mean_pooling(vectors)
Calculate the mean of multiple vectors (centroid).
Useful for averaging embeddings of multiple texts.
- Parameters:
- Returns:
Mean vector
- Return type:
Examples
    from kerb.embedding import embed_batch, mean_pooling

    # Average embeddings of multiple sentences
    sentences = ["First sentence.", "Second sentence.", "Third sentence."]
    embeddings = embed_batch(sentences)
    avg_embedding = mean_pooling(embeddings)
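The centroid is simply the element-wise average. A minimal sketch, where `mean_pool` is a hypothetical helper and not the library function:

```python
def mean_pool(vectors):
    """Element-wise mean of equal-length vectors (their centroid)."""
    if not vectors:
        raise ValueError("need at least one vector")
    count = len(vectors)
    # zip(*vectors) groups the values of each dimension together.
    return [sum(column) / count for column in zip(*vectors)]

print(mean_pool([[1.0, 2.0], [3.0, 6.0]]))  # [2.0, 4.0]
```

The mean of unit-length vectors is generally shorter than unit length, so re-normalizing may be needed before comparing it with other embeddings.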
- kerb.embedding.weighted_mean_pooling(vectors, weights)
Calculate weighted mean of multiple vectors.
- Parameters:
- Returns:
Weighted mean vector
- Return type:
Examples
    from kerb.embedding import embed_batch, weighted_mean_pooling

    embeddings = embed_batch(["important", "less important"])
    weighted_avg = weighted_mean_pooling(embeddings, weights=[0.8, 0.2])
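A weighted mean scales each vector by its weight before averaging. In this sketch the result is divided by the sum of the weights, which is an assumption; whether kerb normalizes weights this way is not stated. `weighted_mean_pool` is a hypothetical name:

```python
def weighted_mean_pool(vectors, weights):
    """Element-wise weighted mean, dividing by the total weight."""
    if len(vectors) != len(weights):
        raise ValueError("need exactly one weight per vector")
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    return [
        sum(w * x for w, x in zip(weights, column)) / total
        for column in zip(*vectors)
    ]

print(weighted_mean_pool([[1.0, 0.0], [0.0, 1.0]], weights=[3.0, 1.0]))  # [0.75, 0.25]
```

With equal weights this reduces to the plain mean.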
- kerb.embedding.max_pooling(vectors)
Apply max pooling across multiple vectors (element-wise maximum).
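Element-wise maximum means each output dimension takes the largest value seen in that dimension. A short sketch with a hypothetical `max_pool` helper:

```python
def max_pool(vectors):
    """Element-wise maximum across equal-length vectors."""
    if not vectors:
        raise ValueError("need at least one vector")
    return [max(column) for column in zip(*vectors)]

print(max_pool([[1.0, 5.0, -2.0], [4.0, 2.0, -3.0]]))  # [4.0, 5.0, -2.0]
```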
- kerb.embedding.pairwise_similarities(vectors, metric='cosine')
Calculate pairwise similarities between all vectors.
Returns a similarity matrix where element [i][j] is the similarity between vectors[i] and vectors[j].
- Parameters:
- Returns:
N x N similarity matrix
- Return type:
Examples
    from kerb.embedding import embed_batch, pairwise_similarities

    docs = embed_batch(["doc1", "doc2", "doc3"])
    sim_matrix = pairwise_similarities(docs)
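The matrix layout can be illustrated in a few lines. This sketch covers the cosine metric only and uses hypothetical helpers (`pairwise`, `cosine`); it exploits symmetry to compute each pair once:

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norms = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norms if norms else 0.0

def pairwise(vectors):
    """N x N matrix where entry [i][j] is cosine(vectors[i], vectors[j])."""
    n = len(vectors)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            # Cosine similarity is symmetric, so fill both halves at once.
            matrix[i][j] = matrix[j][i] = cosine(vectors[i], vectors[j])
    return matrix

m = pairwise([[1.0, 0.0], [0.0, 1.0], [2.0, 0.0]])
print(m[0])  # [1.0, 0.0, 1.0]
```

The work grows quadratically with the number of vectors, so this approach suits small to medium collections.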
- kerb.embedding.cluster_embeddings(vectors, threshold=0.8)
Simple clustering of embeddings based on similarity threshold.
Groups embeddings that are similar above the threshold.
- Parameters:
- Returns:
List of clusters (each cluster is a list of indices)
- Return type:
Examples
    from kerb.embedding import embed_batch, cluster_embeddings

    docs = embed_batch(["doc1", "doc2 similar to 1", "doc3 different"])
    clusters = cluster_embeddings(docs, threshold=0.7)
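One simple way to realize threshold clustering is a greedy pass: each vector joins the first cluster whose first member is similar enough, otherwise it starts a new cluster. The algorithm kerb actually uses is not described here, so this is only one plausible reading; `threshold_clusters` and `cosine` are hypothetical helpers:

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norms = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norms if norms else 0.0

def threshold_clusters(vectors, threshold=0.8):
    """Greedy clustering: join the first cluster whose seed is similar enough."""
    clusters = []
    for index, vector in enumerate(vectors):
        for cluster in clusters:
            seed = vectors[cluster[0]]  # first member represents the cluster
            if cosine(vector, seed) >= threshold:
                cluster.append(index)
                break
        else:
            # No existing cluster matched, so start a new one.
            clusters.append([index])
    return clusters

points = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
print(threshold_clusters(points, threshold=0.8))  # [[0, 1], [2]]
```

Greedy assignment depends on input order, so the same vectors in a different order can produce different clusters; a higher threshold yields more, smaller clusters.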
Embedding generation and similarity search helpers.