How Olga works

How Olga composes every answer from her system prompt, your organization's content, and current state.

Phase 1

This is part of what RunOlga delivers in Phase 1.

Every answer Olga gives is composed from three sources: who she is, what your organization has recorded, and what is true right now. She does not rely on a general memory of the world — she reads from your instance and grounds each response in what she finds.

The three layers

Olga's responses are assembled from three stores. Each plays a distinct role, and together they give a Care Manager the same thoughtful partner that Ray gets.

Layer	What it holds	How it varies
Layer 1 — RunOlga Store	Her system prompt: her personality and behavior.	Universal. The same across every instance.
Layer 2 — Instance Store	Your organization's content — Priorities, Issues, Processes, Metrics, Tasks, Seats.	Different for every subscriber.
Live Context Store	Current operational state.	Updates in near-real-time as people interact with RunOlga.

Layer 1 is who Olga is — defined once, in her system prompt, and shared by every organization on RunOlga. Layer 2 is what your organization is: the Vision, Seats, Priorities, and the rest of your recorded content. The Live Context Store is what is happening now — the state that changes as a leader adds a Priority, logs an Issue, or completes a Task.
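One way to picture the three stores is as three small data structures. This is a sketch for illustration only: the class names, fields, and the `stores_for` helper are hypothetical and are not RunOlga's actual schema. It shows just the property described above: one shared Layer 1 object, and a separate Layer 2 and Live Context Store for each instance.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class RunOlgaStore:
    """Layer 1: universal. One system prompt shared by every instance."""
    system_prompt: str


@dataclass
class InstanceStore:
    """Layer 2: one organization's recorded content, keyed by component name."""
    instance_id: str
    records: dict[str, list[dict]] = field(default_factory=dict)


@dataclass
class LiveContextStore:
    """Current operational state; changes as people interact with the product."""
    instance_id: str
    state: dict[str, object] = field(default_factory=dict)


# Hypothetical: a single shared Layer 1 object (placeholder prompt text).
SHARED_LAYER_1 = RunOlgaStore(system_prompt="(system prompt text)")


def stores_for(instance_id: str) -> tuple[RunOlgaStore, InstanceStore, LiveContextStore]:
    """Return the shared Layer 1 plus fresh per-instance stores."""
    return SHARED_LAYER_1, InstanceStore(instance_id), LiveContextStore(instance_id)


org_a = stores_for("org-a")
org_b = stores_for("org-b")
```

In this sketch `org_a[0]` and `org_b[0]` are the same object, while the other two stores are distinct for each organization.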

How a single answer is composed

For each message you send, Olga runs the same pipeline:

1. Receive user message.
2. Load system prompt (Layer 1).
3. Retrieve relevant Layer 2 content.
4. Retrieve relevant Live Context data.
5. Assemble context: system prompt + retrieved content + conversation history + user message.
6. Call the LLM.
7. Display response.
8. Log retrieval references for provenance.

The order matters. Olga starts from who she is, gathers what is relevant from your organization and its current state, brings in the conversation so far, and only then answers. The final step — logging which records she read — is what makes every response traceable afterward.
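The pipeline can be sketched as a single function. This is a minimal illustration under stated assumptions, not RunOlga's implementation: `compose_answer`, `Retrieved`, `Answer`, and the injected callables are hypothetical names, and the model call is a stub supplied by the caller.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Retrieved:
    store: str       # hypothetical labels: "instance" or "live_context"
    record_id: str
    text: str


@dataclass
class Answer:
    text: str
    provenance: list[tuple[str, str]] = field(default_factory=list)


def compose_answer(
    user_message: str,
    history: list[str],
    load_system_prompt: Callable[[], str],
    retrieve_instance: Callable[[str], list[Retrieved]],
    retrieve_live: Callable[[str], list[Retrieved]],
    call_llm: Callable[[str], str],
    provenance_log: list[dict],
) -> Answer:
    # Load system prompt (Layer 1).
    system_prompt = load_system_prompt()
    # Retrieve relevant Layer 2 content, then relevant Live Context data.
    retrieved = [*retrieve_instance(user_message), *retrieve_live(user_message)]
    # Assemble context: system prompt + retrieved content + history + user message.
    parts = [system_prompt, *(r.text for r in retrieved), *history, user_message]
    context = "\n\n".join(parts)
    # Call the LLM (stubbed by the caller in this sketch).
    text = call_llm(context)
    # Log retrieval references for provenance.
    refs = [(r.store, r.record_id) for r in retrieved]
    provenance_log.append({"message": user_message, "refs": refs})
    # Displaying the response is left to the caller.
    return Answer(text=text, provenance=refs)


log: list[dict] = []
answer = compose_answer(
    "What are our Priorities?",
    ["earlier message"],
    load_system_prompt=lambda: "SYSTEM",
    retrieve_instance=lambda m: [Retrieved("instance", "priority-1", "Priority: ship v1")],
    retrieve_live=lambda m: [Retrieved("live_context", "task-9", "Task 9 completed")],
    call_llm=lambda ctx: "context has %d parts" % len(ctx.split("\n\n")),
    provenance_log=log,
)
```

Passing the retrieval and model functions in as arguments keeps the order of the steps visible and lets each one be swapped for a stub.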

Retrieval in Phase 1

In Phase 1, retrieval is deliberately direct and simple:

Olga runs standard database queries against Layer 2. She reads your recorded content directly.
There is no vector embedding or semantic similarity search; those arrive in Phase 2.
There is no pattern recognition across history — that arrives in Phase 3. Olga can list your Issues; she does not yet identify recurring themes across them.

Keeping retrieval direct is part of why Olga feels fast: a simple question runs as a simple query, with no embedding step in between.

Every query is automatically scoped to your role-based access control (RBAC) permissions. Olga reads only what you are allowed to see, and that limit is enforced at the application layer rather than left to the prompt. See Permissions and isolation for how scoping and instance isolation work.
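To make "a standard database query, scoped at the application layer" concrete, here is a small sketch using Python's built-in `sqlite3` module. The table layout, the `fetch_issues` function, and the idea that permissions reduce to a list of allowed Seats are all assumptions made for the example; the source does not describe RunOlga's schema or permission model at this level.

```python
import sqlite3


def fetch_issues(conn: sqlite3.Connection, instance_id: str, allowed_seats: list[str]):
    """Return (id, title) rows the caller may see; the scope lives in the query itself."""
    if not allowed_seats:
        return []  # no permissions means no rows, regardless of what was asked
    placeholders = ",".join("?" for _ in allowed_seats)
    sql = (
        "SELECT id, title FROM issues "
        "WHERE instance_id = ? AND seat IN (" + placeholders + ") "
        "ORDER BY id"
    )
    # Values are bound as parameters, never formatted into the SQL string.
    return conn.execute(sql, [instance_id, *allowed_seats]).fetchall()


# Hypothetical sample data: two organizations sharing one table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE issues (id INTEGER PRIMARY KEY, instance_id TEXT, seat TEXT, title TEXT)"
)
conn.executemany(
    "INSERT INTO issues (instance_id, seat, title) VALUES (?, ?, ?)",
    [
        ("org-a", "care", "Intake backlog"),
        ("org-a", "finance", "Late invoices"),
        ("org-b", "care", "Another organization's issue"),
    ],
)

visible = fetch_issues(conn, "org-a", ["care"])
```

Because the instance and permission filters are part of the SQL the application builds, nothing in the user's message or the prompt can widen what is returned.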

Conversation state

Olga keeps a conversation going so you do not have to repeat yourself:

History is per-user and per-session, and it persists across page navigation.
When you move to a different component — say from Priorities to Issues — Olga's context updates to reflect where you now are.
She can reference earlier messages in the same conversation.
Your conversation history is private to you. It is not visible to other users in the same instance.

So if you ask Olga to draft a Priority, then navigate to your Tasks, she still knows what you were working on — and the next person in your organization never sees that exchange.
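The behavior above can be modeled with a store keyed by user and session. This is an illustrative sketch only: `ConversationStore` and its methods are hypothetical names, and the in-memory dictionaries stand in for whatever persistence RunOlga actually uses.

```python
from collections import defaultdict


class ConversationStore:
    """History keyed by (user_id, session_id); navigation changes context, not history."""

    def __init__(self) -> None:
        self._history: dict[tuple[str, str], list[tuple[str, str]]] = defaultdict(list)
        self._component: dict[tuple[str, str], str] = {}

    def append(self, user_id: str, session_id: str, role: str, text: str) -> None:
        self._history[(user_id, session_id)].append((role, text))

    def navigate(self, user_id: str, session_id: str, component: str) -> None:
        # Only the current component changes; earlier messages are kept.
        self._component[(user_id, session_id)] = component

    def history(self, user_id: str, session_id: str) -> list[tuple[str, str]]:
        # Lookup is by the caller's own key, so other users' history is unreachable.
        return list(self._history.get((user_id, session_id), []))

    def component(self, user_id: str, session_id: str):
        return self._component.get((user_id, session_id))


store = ConversationStore()
store.navigate("user-1", "session-1", "Priorities")
store.append("user-1", "session-1", "user", "Draft a Priority for Q3 hiring")
store.navigate("user-1", "session-1", "Tasks")
```

After the second `navigate` call, the current component is `"Tasks"` and the draft request is still in the history for `user-1`, while a lookup for any other user returns an empty list.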

Performance targets

Phase 1 retrieval is kept direct so that Olga feels immediate when a leader needs her for an urgent decision.

Query type	Target
Simple factual query (Instance Store or Live Context Store)	Under 2 seconds
Methodology query (RunOlga Store)	Under 2 seconds
Multi-store composition query	Under 4 seconds

A responsive mode keeps latency feeling immediate when a decision can't wait.
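The targets in the table can be written down as a small latency budget and checked around any call. This is a sketch, not RunOlga's monitoring code: the query-type keys and the `timed` helper are hypothetical, and only the 2-second and 4-second figures come from the table.

```python
import time

# Targets from the table, in seconds. The key names are illustrative.
TARGET_SECONDS = {
    "simple_factual": 2.0,   # Instance Store or Live Context Store
    "methodology": 2.0,      # RunOlga Store
    "multi_store": 4.0,      # multi-store composition
}


def timed(query_type: str, fn, *args, **kwargs):
    """Run fn and report (result, elapsed_seconds, within_target)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    # "Under N seconds" is read here as strictly less than N.
    return result, elapsed, elapsed < TARGET_SECONDS[query_type]


result, elapsed, within_target = timed("simple_factual", lambda: "ok")
```

`time.perf_counter` is used because it is a monotonic clock intended for measuring short durations.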

Provenance

Every response is traceable to its source: which store it came from, and which records Olga accessed to answer. The final step of the pipeline — logging retrieval references — is what makes this possible. If Olga tells you something about your organization, you can see where it came from. For how grounding and confirmation-before-writes work together, see Grounding and confirmation.
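A provenance log of this kind might look like the following sketch. The names `RetrievalRef` and `ProvenanceLog`, the response identifiers, and the store labels are assumptions for illustration; the source states only that each response records which store it came from and which records were accessed.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class RetrievalRef:
    store: str       # hypothetical labels: "runolga", "instance", "live_context"
    record_id: str


class ProvenanceLog:
    """Maps a response id to the records that were read to produce it."""

    def __init__(self) -> None:
        self._entries: dict[str, dict] = {}

    def record(self, response_id: str, refs: list[RetrievalRef]) -> None:
        self._entries[response_id] = {
            "logged_at": datetime.now(timezone.utc),
            "refs": tuple(refs),
        }

    def sources(self, response_id: str) -> list[RetrievalRef]:
        """Which records were accessed for this response."""
        return list(self._entries[response_id]["refs"])

    def stores(self, response_id: str) -> list[str]:
        """Which stores the response drew on, de-duplicated and sorted."""
        return sorted({ref.store for ref in self.sources(response_id)})


provenance = ProvenanceLog()
provenance.record(
    "resp-1",
    [RetrievalRef("instance", "issue-12"), RetrievalRef("live_context", "task-3")],
)
```

Looking up `"resp-1"` afterward returns both the individual records and the set of stores involved, which is the information a "where did this come from" view would need.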