RAG Pipeline

Vector store, embedding providers, and graph-expanded semantic search.

RAG (Retrieval-Augmented Generation) subsystem for code-aware context (M6-deferred).

Provides vector store abstraction, embedding providers, and a search pipeline that expands results using the project’s existing GraphStore.

Decision: D-017

Abstract vector store interface with Chroma backend.

chromadb is a core dependency. Chroma operates in two modes:

  • Embedded PersistentClient: no server needed, local persistence

  • HttpClient: connects to the Chroma Docker service (docker compose up -d)

Follows the D-014 pattern (abstract base + factory function). Decision: D-017

class curate_ipsum.rag.vector_store.VectorDocument(id, text, embedding=None, metadata=<factory>)[source]

Bases: object

A document stored in the vector store.

Parameters:
id: str
text: str
embedding: list[float] | None = None
metadata: dict[str, Any]

class curate_ipsum.rag.vector_store.VectorSearchResult(id, text, score, metadata=<factory>)[source]

Bases: object

A single search result from the vector store.

Parameters:
id: str
text: str
score: float
metadata: dict[str, Any]

class curate_ipsum.rag.vector_store.VectorStore[source]

Bases: ABC

Abstract interface for vector storage and similarity search.

Implementations persist document embeddings and support approximate nearest-neighbor queries.

abstractmethod add(documents)[source]

Add documents to the store. Upserts on matching IDs.

Parameters:

documents (list[VectorDocument])

Return type:

None

abstractmethod search(query_embedding, top_k=10, filter_metadata=None)[source]

Search for similar documents by embedding vector.

Parameters:
  • query_embedding

  • top_k (default: 10)

  • filter_metadata (default: None)

Return type:

list[VectorSearchResult]

abstractmethod delete(ids)[source]

Delete documents by ID.

Parameters:

ids (list[str])

Return type:

None

abstractmethod count()[source]

Return the number of documents in the store.

Return type:

int

close()[source]

Release resources. Override if needed.

Return type:

None
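To make the interface concrete, the following is a minimal in-memory sketch of a VectorStore implementation. It is not part of curate_ipsum: the dataclasses and the abstract base are local stand-ins that mirror the signatures documented above, InMemoryVectorStore is a hypothetical name, and scoring by cosine similarity (higher is better) is an assumption, since the documentation does not say how score is defined.

```python
from __future__ import annotations

import math
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Any


@dataclass
class VectorDocument:  # stand-in mirroring the documented fields
    id: str
    text: str
    embedding: list[float] | None = None
    metadata: dict[str, Any] = field(default_factory=dict)


@dataclass
class VectorSearchResult:  # stand-in mirroring the documented fields
    id: str
    text: str
    score: float
    metadata: dict[str, Any] = field(default_factory=dict)


class VectorStore(ABC):  # stand-in for the documented abstract interface
    @abstractmethod
    def add(self, documents: list[VectorDocument]) -> None: ...

    @abstractmethod
    def search(self, query_embedding, top_k=10, filter_metadata=None) -> list[VectorSearchResult]: ...

    @abstractmethod
    def delete(self, ids: list[str]) -> None: ...

    @abstractmethod
    def count(self) -> int: ...

    def close(self) -> None:
        """Release resources. Nothing to release by default."""


def _cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class InMemoryVectorStore(VectorStore):
    """Hypothetical brute-force store; exact search, not approximate."""

    def __init__(self) -> None:
        self._docs: dict[str, VectorDocument] = {}

    def add(self, documents):
        # Keying by ID gives the documented upsert behaviour.
        for doc in documents:
            self._docs[doc.id] = doc

    def search(self, query_embedding, top_k=10, filter_metadata=None):
        hits = []
        for doc in self._docs.values():
            if doc.embedding is None:
                continue
            # Assumption: filter_metadata means exact match on every given key.
            if filter_metadata and any(
                doc.metadata.get(k) != v for k, v in filter_metadata.items()
            ):
                continue
            score = _cosine(query_embedding, doc.embedding)
            hits.append(VectorSearchResult(doc.id, doc.text, score, doc.metadata))
        hits.sort(key=lambda r: r.score, reverse=True)
        return hits[:top_k]

    def delete(self, ids):
        for doc_id in ids:
            self._docs.pop(doc_id, None)  # unknown IDs are ignored

    def count(self):
        return len(self._docs)
```

A store like this can be handy in unit tests where starting Chroma is undesirable; whether the real backends ignore unknown IDs on delete is not stated in this reference.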

class curate_ipsum.rag.vector_store.ChromaVectorStore(collection_name='code_nodes', persist_directory=None, chroma_host=None, chroma_port=8000)[source]

Bases: VectorStore

Chroma-based vector store (core dependency).

Operates in two modes:

  • Embedded (PersistentClient): no server, local persistence

  • Client/server (HttpClient): connects to the docker compose chroma service

Set CHROMA_HOST env var to connect to a remote Chroma instance, otherwise defaults to embedded mode.

Decision: D-017

Parameters:
  • collection_name (str)

  • persist_directory (str | None)

  • chroma_host (str | None)

  • chroma_port (int)
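The choice between the two modes can be sketched as a small pure function. This is an illustration only: ChromaMode and resolve_chroma_mode are hypothetical names, and the precedence shown (an explicit chroma_host argument wins over the CHROMA_HOST environment variable) is an assumption that this reference does not confirm.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ChromaMode:
    """Hypothetical descriptor of which Chroma client to construct."""

    kind: str  # "http" or "embedded"
    host: Optional[str] = None
    port: Optional[int] = None
    persist_directory: Optional[str] = None


def resolve_chroma_mode(
    persist_directory: Optional[str] = None,
    chroma_host: Optional[str] = None,
    chroma_port: int = 8000,
    env: Optional[Mapping[str, str]] = None,
) -> ChromaMode:
    # env is injectable so the logic can be tested without touching os.environ.
    env = os.environ if env is None else env
    # Assumption: an explicit argument takes precedence over CHROMA_HOST.
    host = chroma_host or env.get("CHROMA_HOST")
    if host:
        return ChromaMode("http", host=host, port=chroma_port)
    # No host configured anywhere: fall back to embedded, local persistence.
    return ChromaMode("embedded", persist_directory=persist_directory)
```

In a real backend the "http" branch would presumably lead to chromadb.HttpClient(host=..., port=...) and the "embedded" branch to chromadb.PersistentClient(path=...).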

add(documents)[source]

Add documents to the store. Upserts on matching IDs.

Parameters:

documents (list[VectorDocument])

Return type:

None

search(query_embedding, top_k=10, filter_metadata=None)[source]

Search for similar documents by embedding vector.

Parameters:
  • query_embedding

  • top_k (default: 10)

  • filter_metadata (default: None)

Return type:

list[VectorSearchResult]

delete(ids)[source]

Delete documents by ID.

Parameters:

ids (list[str])

Return type:

None

count()[source]

Return the number of documents in the store.

Return type:

int

curate_ipsum.rag.vector_store.build_vector_store(backend='chroma', **kwargs)[source]

Factory: create a VectorStore of the requested type.

Parameters:
  • backend (str) – "chroma" (only supported backend currently)

  • **kwargs (Any) – Backend-specific configuration

Returns:

VectorStore instance

Raises:

ValueError – Unknown backend

Return type:

VectorStore
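The abstract base + factory pattern described here can be sketched with a small registry. The classes below are stand-ins rather than the library's own, and the _BACKENDS registry and the exact error message are assumptions; only the function name, its signature, and the ValueError on an unknown backend come from this reference.

```python
from __future__ import annotations

from typing import Any, Callable


class VectorStore:  # stand-in for the abstract base
    pass


class ChromaVectorStore(VectorStore):  # stand-in that only records its options
    def __init__(self, collection_name: str = "code_nodes", **kwargs: Any) -> None:
        self.collection_name = collection_name
        self.options = kwargs


# Hypothetical registry: backend name -> constructor.
_BACKENDS: dict[str, Callable[..., VectorStore]] = {
    "chroma": ChromaVectorStore,
}


def build_vector_store(backend: str = "chroma", **kwargs: Any) -> VectorStore:
    """Create a VectorStore of the requested type, or raise ValueError."""
    try:
        constructor = _BACKENDS[backend]
    except KeyError:
        supported = ", ".join(sorted(_BACKENDS))
        raise ValueError(
            f"Unknown vector store backend {backend!r}; supported: {supported}"
        ) from None
    # Backend-specific configuration is passed straight through.
    return constructor(**kwargs)
```

A registry keeps the factory body unchanged when a second backend is added; a plain if/else chain would serve equally well while only one backend exists.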

Embedding provider abstraction with local sentence-transformers backend.

sentence-transformers is a core dependency (not optional), and all-MiniLM-L6-v2 is the default model. Alternative or larger models can be installed via the [embeddings-gpu] or [embeddings-large] extras.

Decision: D-017

class curate_ipsum.rag.embedding_provider.EmbeddingProvider[source]

Bases: ABC

Abstract interface for text → embedding vector conversion.

abstractmethod embed(texts)[source]

Convert a batch of texts into embedding vectors.

Returns a list of float vectors, one per input text.

Parameters:

texts (list[str])

Return type:

list[list[float]]

abstractmethod dimension()[source]

Return the embedding dimensionality.

Return type:

int

class curate_ipsum.rag.embedding_provider.LocalEmbeddingProvider(model_name='all-MiniLM-L6-v2')[source]

Bases: EmbeddingProvider

Local embedding via sentence-transformers.

Default model: all-MiniLM-L6-v2 (384 dimensions, fast, good for code). Install [embeddings-gpu] for GPU acceleration or [embeddings-large] for InstructorEmbedding support.

Parameters:

model_name (str)

embed(texts)[source]

Convert a batch of texts into embedding vectors.

Returns a list of float vectors, one per input text.

Parameters:

texts (list[str])

Return type:

list[list[float]]

dimension()[source]

Return the embedding dimensionality.

Return type:

int

class curate_ipsum.rag.embedding_provider.MockEmbeddingProvider(dim=384)[source]

Bases: EmbeddingProvider

Mock embedding provider for testing. Returns fixed-length zero vectors.

Parameters:

dim (int)

embed(texts)[source]

Convert a batch of texts into embedding vectors.

Returns a list of float vectors, one per input text.

Parameters:

texts (list[str])

Return type:

list[list[float]]

dimension()[source]

Return the embedding dimensionality.

Return type:

int
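The provider contract (one vector per input text, each of length dimension()) can be illustrated with a stand-in for the mock provider. The classes below are local sketches that mirror the documented signatures, and check_provider_contract is a hypothetical helper that does not exist in curate_ipsum.

```python
from __future__ import annotations

from abc import ABC, abstractmethod


class EmbeddingProvider(ABC):  # stand-in for the documented interface
    @abstractmethod
    def embed(self, texts: list[str]) -> list[list[float]]: ...

    @abstractmethod
    def dimension(self) -> int: ...


class MockEmbeddingProvider(EmbeddingProvider):
    """Stand-in mock: fixed-length zero vectors, as described above."""

    def __init__(self, dim: int = 384) -> None:
        self._dim = dim

    def embed(self, texts):
        # A fresh list per text, so callers can mutate one vector safely.
        return [[0.0] * self._dim for _ in texts]

    def dimension(self):
        return self._dim


def check_provider_contract(provider: EmbeddingProvider, texts: list[str]) -> bool:
    """Hypothetical helper: True if embed() returns one vector per text,
    each with exactly provider.dimension() components."""
    vectors = provider.embed(texts)
    return len(vectors) == len(texts) and all(
        len(v) == provider.dimension() for v in vectors
    )
```

Because every mock vector is all zeros, similarity scores computed from them carry no ranking information; tests that depend on result order would need a different test double.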

RAG search pipeline with graph-expanded retrieval.

Vector top-k → graph expansion (callers/callees via GraphStore) → rerank by combined score → pack into LLM context.

Decision: D-017

class curate_ipsum.rag.search.RAGConfig(vector_top_k=20, expansion_hops=1, caller_decay=0.7, callee_decay=0.8, max_context_tokens=4000, project_id='default')[source]

Bases: object

Configuration for the RAG search pipeline.

Parameters:
  • vector_top_k (int)

  • expansion_hops (int)

  • caller_decay (float)

  • callee_decay (float)

  • max_context_tokens (int)

  • project_id (str)

vector_top_k: int = 20
expansion_hops: int = 1
caller_decay: float = 0.7
callee_decay: float = 0.8
max_context_tokens: int = 4000
project_id: str = 'default'

class curate_ipsum.rag.search.RAGResult(node_id, text, score, source='vector', metadata=<factory>)[source]

Bases: object

A single result from the RAG pipeline.

Parameters:
node_id: str
text: str
score: float
source: str = 'vector'
metadata: dict[str, Any]

class curate_ipsum.rag.search.RAGPipeline(vector_store, embedding_provider, graph_store=None, config=None)[source]

Bases: object

Code-aware retrieval pipeline.

Combines vector similarity search with graph-based expansion using the project’s existing GraphStore (D-014) for caller/callee relationships.

Usage:

pipeline = RAGPipeline(
    vector_store=chroma_store,
    embedding_provider=local_embedder,
    graph_store=sqlite_graph_store,  # optional
)
results = pipeline.search("function that validates input")

Parameters:
  • vector_store

  • embedding_provider

  • graph_store (default: None)

  • config (default: None)

search(query)[source]

Search for code relevant to the query.

  1. Embed query

  2. Vector search for top-k

  3. Graph-expand results (callers + callees)

  4. Deduplicate and rerank

Parameters:

query (str)

Return type:

list[RAGResult]
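Steps 3 and 4 can be sketched as follows. The scoring rule is an assumption, not the documented behaviour: a neighbor inherits its parent's score multiplied by caller_decay or callee_decay, and deduplication keeps the highest score per node. Hit, expand_and_rerank, the "caller"/"callee" source labels, and the plain dicts standing in for GraphStore lookups are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Hit:  # simplified stand-in for RAGResult
    node_id: str
    score: float
    source: str = "vector"


def expand_and_rerank(
    hits: list[Hit],
    callers: dict[str, list[str]],  # node -> nodes that call it
    callees: dict[str, list[str]],  # node -> nodes it calls
    caller_decay: float = 0.7,
    callee_decay: float = 0.8,
    hops: int = 1,
) -> list[Hit]:
    # Deduplicate the vector hits first, keeping the best score per node.
    best: dict[str, Hit] = {}
    for hit in hits:
        if hit.node_id not in best or hit.score > best[hit.node_id].score:
            best[hit.node_id] = hit

    frontier = list(best.values())
    for _ in range(hops):
        next_frontier = []
        for hit in frontier:
            neighbours = [(n, caller_decay, "caller") for n in callers.get(hit.node_id, [])]
            neighbours += [(n, callee_decay, "callee") for n in callees.get(hit.node_id, [])]
            for node_id, decay, source in neighbours:
                # Assumed rule: neighbor score = parent score * decay.
                candidate = Hit(node_id, hit.score * decay, source)
                current = best.get(node_id)
                if current is None or candidate.score > current.score:
                    best[node_id] = candidate
                    next_frontier.append(candidate)
        frontier = next_frontier

    # Rerank: highest score first; ties keep first-seen order.
    return sorted(best.values(), key=lambda h: h.score, reverse=True)
```

With the default decays, a callee of a hit scoring 0.9 receives roughly 0.72 and a caller roughly 0.63, so under this assumed rule a strong hit's neighbors can outrank weaker direct vector matches.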

pack_context(results, max_tokens=None)[source]

Pack RAG results into a single context string for LLM prompt injection.

Respects the token budget (estimated at 4 chars per token).

Parameters:
  • results

  • max_tokens (default: None)

Return type:

str
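A minimal sketch of budgeted packing under the 4-characters-per-token estimate. The header format, the blank-line separator, rounding up to whole tokens, and stopping at the first result that would overflow the budget are all assumptions; the function below takes (node_id, text) pairs rather than RAGResult objects to stay self-contained.

```python
from __future__ import annotations

CHARS_PER_TOKEN = 4  # the estimate stated in the documentation


def estimate_tokens(text: str) -> int:
    # Ceiling division, so a partial token still counts against the budget.
    return (len(text) + CHARS_PER_TOKEN - 1) // CHARS_PER_TOKEN


def pack_context(results: list[tuple[str, str]], max_tokens: int = 4000) -> str:
    """Concatenate (node_id, text) pairs, best first, within the token budget."""
    packed = ""
    for node_id, text in results:
        block = f"# {node_id}\n{text}"  # hypothetical header format
        candidate = block if not packed else packed + "\n\n" + block
        # Measure the whole candidate so separators count against the budget.
        if estimate_tokens(candidate) > max_tokens:
            break  # assumption: stop at the first block that does not fit
        packed = candidate
    return packed
```

Stopping at the first overflow preserves rank order; skipping the oversized block and trying smaller ones would fill the budget more completely at the cost of omitting a higher-ranked result.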