Zero Data Retention — what Neumannic does and does not store
The plain-language explanation of our data retention posture: what is kept, what is discarded, and the binding that prevents the policy from being relaxed without a founder-level, audit-logged override.
What zero data retention means at Neumannic
When you send a prompt to Neumannic, that prompt is processed by an underlying model, you receive an answer, and the conversation moves forward. What we call zero data retention is the promise about everything that happens around that exchange: the prompt itself, the intermediate computation, the response text, the model provider's internal logs — all of it is treated as ephemeral by default. We do not let providers keep your prompts. We do not cache outputs beyond the active session in any form that survives a server restart. We do not retain operational telemetry that could be replayed to reconstruct what you asked.
We call this binding our zero-data-retention invariant, tracked internally as INVARIANT-29. The word invariant is deliberate: it means the property holds across every tier, every feature, every code path. It is enforced in three places — in the code itself, in the configuration surface that workspace admins see, and in the runtime resolver that picks providers for each request. A change that would relax this binding is structurally rejected at all three layers. Even a workspace administrator with the highest privileges in their workspace cannot turn it off; only a founder-superadmin scope with an audit-logged typed confirmation can, and that confirmation is recorded permanently with the operator's identity and the specific reason.
Zero data retention composes with another binding we hold strictly: we are not locked to any one model vendor. That means we route your requests to whichever provider's zero-data-retention posture best fits the task. If one provider would require us to relax retention in exchange for a feature we wanted, we use a different provider — the policy is binding, the vendor list is not.
What we do NOT store
The following categories of data are explicitly outside our retention window. If you are a compliance reviewer evaluating whether to bring Neumannic into a regulated workflow, this is the list you can verify against our admin posture page.
- Prompt and chat content beyond the active conversation window. When you close a conversation, the prompts and responses are not archived for our internal use. Conversation history that you explicitly choose to keep within the product is governed by your account settings; nothing about that retention path involves the model providers themselves.
- Document drafts beyond active editing windows. While you are writing in a document surface, drafts are buffered to support collaborative editing and crash recovery. Once the editing window closes, the buffered state remains encrypted at rest and is used only to restore your editing session.
- Cached prompt prefixes beyond one hour. Where we use prompt-prefix caching to reduce cost and latency, the longest time-to-live we permit is one hour. Most caches are far shorter: the typical lifetime is five to ten minutes in volatile memory. We never opt into the twenty-four-hour cache tier offered by certain providers, because that opt-in forfeits the zero-data-retention posture across the affected requests.
- Image-generation prompts after generation. For image-generation requests, the prompt and the generated image are returned to your session and not retained server-side by the image provider. We use vendors whose privacy posture supports the per-request retention-off header, and we set it for every request.
- Memory entries you delete. If you delete an item from your memory, that item is removed from the storage layer, from the search index, and from any derived embeddings. Deletion is a full cascade, not a soft archive. You can verify this by exporting your memory both before and after a deletion — the exported state will reflect the change.
What we DO store, and why
The counterpart to the no-store list above is honest disclosure of what is retained, and why. The categories below are the only data we keep, the reason we keep them, and the underlying integration that handles them so technical reviewers can do their own diligence.
- Workspace and organization metadata. The identifiers, names, roles, and billing details required to run your subscription and route you to the right workspace. This is held in Stripe (for billing) and Clerk (for authentication and session management). We do not pass any of this to the model providers.
- Document state for collaborative editing. When you are co-editing a document, the in-progress state is held in our collaboration layer so other editors see your changes live. The state is encrypted at rest. When the document is closed, the encrypted state stays at rest until the document is reopened. We do not read it; the encryption key path is workspace-scoped.
- Memory entries you explicitly saved or opted in to. If memory is enabled for your workspace, items that you have either explicitly saved or that we extracted from conversations (with your workspace's per-workspace opt-in) are retained until you delete them. The underlying storage is our managed vector store; the integrity layer signs every item to protect against tampering.
- Audit logs. We keep an append-only record of administrative actions — who promoted whom to admin, who changed which configuration, who triggered which destructive operation. These are operational logs, not user content. The retention window for audit logs is longer than for any other category because compliance posture requires it.
The no-24-hour-cache promise
Several model providers offer a twenty-four-hour cache tier that trades retention for speed and cost. We do not use it. Not for any tier. Not for any feature. Not as an opt-in. This is documented internally as Mandate 12, and it was ratified with a verbatim phrasing that is worth quoting because it captures the policy precisely:
“Opting in for the 24-hour cache means we will forfeit zero data retention, so that's not an option for any of the providers.”
In practical terms, this means several provider configurations are structurally off the table for us:
- OpenAI's twenty-four-hour cache opt-in is forbidden in our codebase — our continuous integration rejects any change that would introduce it.
- OpenAI's gpt-5.5-and-newer model family, which currently requires the longer-retention cache mode by default, is not used in any retention-binding integration point until OpenAI ships a zero-data-retention-eligible variant.
- Google Gemini's explicit cached-content API parameter is forbidden in our codebase for the same reason: it is architecturally incompatible with our retention posture.
- Providers without a documented zero-data-retention program — at time of writing, Replicate, DeepSeek, Qwen, and DashScope — are on our internal blocklist for retention-binding routes.
What we use instead: Anthropic's short and medium prompt caches (five minutes and one hour, both eligible under their retention program); OpenAI's default in-memory mode for gpt-4o, gpt-4o-mini, and gpt-5.1, which is volatile-memory only and lasts roughly five to ten minutes; and Together AI and Fireworks AI for always-on caching, where the cache itself is volatile-memory only. The full per-provider posture is available to workspace admins via the admin posture page linked at the end of this document.
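As a rough illustration, the provider rules in this section could be expressed as a single eligibility check. This is a minimal sketch under stated assumptions: `ProviderPosture`, `is_retention_eligible`, and the field names are hypothetical and are not taken from Neumannic's codebase. Only the blocklisted provider names and the one-hour ceiling come from this page.

```python
from dataclasses import dataclass

# Providers this page lists as lacking a documented zero-data-retention program.
BLOCKLIST = frozenset({"replicate", "deepseek", "qwen", "dashscope"})

# One-hour ceiling on cache lifetime, as stated on this page.
MAX_CACHE_TTL_SECONDS = 3600


@dataclass(frozen=True)
class ProviderPosture:
    """Hypothetical summary of one provider configuration."""
    name: str
    has_zdr_program: bool   # provider documents a zero-data-retention program
    cache_ttl_seconds: int  # longest cache lifetime this configuration would use


def is_retention_eligible(posture: ProviderPosture) -> bool:
    """Return True only if the configuration keeps the retention posture intact."""
    if posture.name.lower() in BLOCKLIST:
        return False
    if not posture.has_zdr_program:
        return False
    # A twenty-four-hour cache (86400 s) fails this check by construction.
    return posture.cache_ttl_seconds <= MAX_CACHE_TTL_SECONDS
```

For example, `is_retention_eligible(ProviderPosture("openai", True, 86400))` is rejected purely because of the cache lifetime, which matches the rule that the policy is binding while the vendor list is not.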
How our system makes zero data retention structurally impossible to break
The policy above is enforced three different ways, in three different places. We designed it this way intentionally: if any one layer fails, the other two still catch the violation. A reviewer evaluating us as a vendor can verify each layer independently, and the layers are mutually reinforcing.
Code level
A continuous integration script reviews every proposed code change and blocks anything that would introduce a pattern known to relax the retention posture. A developer cannot merge a change that contradicts the policy without an explicit founder-superadmin override that is audit-logged. The most common cases this catches are accidentally pulling in a twenty-four-hour cache parameter from copy-pasted vendor documentation, or hardcoding a forbidden vendor as a default.
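To make the code-level check more concrete, here is a minimal sketch of a scanner that flags forbidden patterns in the lines a change adds. The rule names and regular expressions are placeholders chosen for illustration. They are not the actual script and not confirmed vendor parameter names, which are documented at /help/security.

```python
import re
from typing import NamedTuple


class Violation(NamedTuple):
    line_number: int
    rule: str
    text: str


# Placeholder rules for illustration only; the real rule set is not shown here.
RULES = {
    "24h-cache-opt-in": re.compile(r"""cache_retention\s*[=:]\s*["']24h["']"""),
    "explicit-cached-content": re.compile(r"\bcached_content\s*="),
}


def scan_added_lines(added_lines: list[str]) -> list[Violation]:
    """Return one Violation per (line, rule) match among the added lines."""
    violations = []
    for number, line in enumerate(added_lines, start=1):
        for rule, pattern in RULES.items():
            if pattern.search(line):
                violations.append(Violation(number, rule, line.strip()))
    return violations
```

A CI job built on this idea would fail the build whenever `scan_added_lines` returns a non-empty list, leaving the audit-logged override as the only way through.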
Configuration level
The workspace-admin panel for provider and cache configuration greys out the options that would relax retention. Even a workspace admin with the highest privileges in their workspace cannot turn the retention binding off: the option is not selectable, the toggle is disabled, and the explanation is shown inline at the disabled control. The configuration store itself rejects non-compliant values at write time as a second-layer defense.
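The write-time rejection described above can be sketched as a store that validates before it persists. This is an assumption-laden illustration: `CachePolicyStore` and `RetentionPolicyError` are hypothetical names, and the only rule modeled is the one-hour cache ceiling stated elsewhere on this page.

```python
MAX_CACHE_TTL_SECONDS = 3600  # one-hour ceiling stated on this page


class RetentionPolicyError(ValueError):
    """Raised when a value would relax the retention posture."""


class CachePolicyStore:
    """Hypothetical in-memory stand-in for the configuration store."""

    def __init__(self) -> None:
        self._ttl_by_workspace: dict[str, int] = {}

    def write(self, workspace_id: str, ttl_seconds: int) -> None:
        # Validate first, so a rejected write never touches stored state.
        if isinstance(ttl_seconds, bool) or not isinstance(ttl_seconds, int):
            raise TypeError("ttl_seconds must be an integer number of seconds")
        if not 0 <= ttl_seconds <= MAX_CACHE_TTL_SECONDS:
            raise RetentionPolicyError(
                f"cache TTL {ttl_seconds}s is outside 0..{MAX_CACHE_TTL_SECONDS}s"
            )
        self._ttl_by_workspace[workspace_id] = ttl_seconds

    def read(self, workspace_id: str) -> int:
        # Unconfigured workspaces fall back to the safe default: no cache.
        return self._ttl_by_workspace.get(workspace_id, 0)
```

Because validation happens inside `write`, a request that bypasses the greyed-out control in the admin panel would still be refused, and the previously stored value would remain in place.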
Runtime level
When a system component requests a cache time-to-live that no eligible provider supports, the runtime resolver substitutes the safe default — no cache — rather than degrading the retention posture. The request still succeeds. You never see a “retention was bypassed because the cache was too long” error because the system structurally prevents that path. The gap is recorded in operational metrics so we can surface it to admins, but the retention posture itself is preserved by construction.
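The fallback behavior described above might look roughly like the following. This is a sketch, not the production resolver: the provider names, the table of supported lifetimes, and the metric name `cache_ttl_unsupported` are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional

# Hypothetical eligible providers and the cache lifetimes (seconds) each offers.
ELIGIBLE_CACHE_TTLS = {
    "provider-a": (300, 3600),
    "provider-b": (300,),
}

# Stand-in for operational metrics; counts requests that fell back to no cache.
metrics = Counter()


@dataclass(frozen=True)
class CacheDecision:
    provider: Optional[str]  # None means the request runs without a cache
    ttl_seconds: int
    fell_back: bool          # True when the requested TTL was unsupported


def resolve_cache(requested_ttl_seconds: int) -> CacheDecision:
    """Pick a provider for the requested TTL, or fall back to no cache."""
    if requested_ttl_seconds <= 0:
        return CacheDecision(None, 0, False)  # caller asked for no cache
    for provider, ttls in ELIGIBLE_CACHE_TTLS.items():
        if requested_ttl_seconds in ttls:
            return CacheDecision(provider, requested_ttl_seconds, False)
    # No eligible provider supports this TTL: record the gap, keep the posture.
    metrics["cache_ttl_unsupported"] += 1
    return CacheDecision(None, 0, True)
```

Note that the unsupported case returns a normal decision instead of raising, which reflects the statement that the request still succeeds while the gap is surfaced through metrics.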
Memory — opt-in only, deletion is a full cascade
Memory in Neumannic, the feature that lets the product carry forward context across sessions, is off by default and opt-in at every workspace tier that supports it. On your first interaction with a surface that would write to memory, you are shown a full-screen confirmation that explains what memory does and asks for your explicit consent. You can decline. You can also turn memory off later, from your account settings.
When you opt out, the deletion is a full cascade. Items stored in the memory layer are removed; the derived embeddings used by retrieval are removed; the audit-log entries that record what memory was active for which session are not removed (they remain as operational record), but they no longer reference any retrievable content. This is the GDPR Article 17 right-to-erasure path applied end-to-end to the memory feature.
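A minimal sketch of the cascade described above follows, assuming a toy in-memory layer. `MemoryLayer` and its attributes are hypothetical; the real storage is a managed vector store, and this only illustrates the order of removals and the content-free audit record.

```python
class MemoryLayer:
    """Toy stand-in for the memory storage, search index, and embeddings."""

    def __init__(self) -> None:
        self.items: dict[str, str] = {}
        self.embeddings: dict[str, list[float]] = {}
        self.search_index: dict[str, set[str]] = {}  # token -> item ids
        self.audit_log: list[dict[str, str]] = []    # append-only, no content

    def save(self, item_id: str, text: str, embedding: list[float]) -> None:
        self.items[item_id] = text
        self.embeddings[item_id] = embedding
        for token in text.lower().split():
            self.search_index.setdefault(token, set()).add(item_id)

    def delete(self, item_id: str) -> bool:
        """Remove the item everywhere; keep only a content-free audit record."""
        existed = self.items.pop(item_id, None) is not None
        self.embeddings.pop(item_id, None)
        for token in list(self.search_index):
            self.search_index[token].discard(item_id)
            if not self.search_index[token]:
                del self.search_index[token]
        self.audit_log.append({"action": "memory_deleted", "item_id": item_id})
        return existed

    def export(self) -> dict[str, str]:
        return dict(self.items)
```

Comparing `export()` before and after a `delete` call mirrors the verification step this page suggests: the deleted item appears in the first export and is absent from the second.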
Memory is also protected against tampering: anything written to the memory layer is signed at write time, and the signature is verified at read time. Items inferred from external sources (URLs, uploaded documents) go through a review queue before they are trusted, so a poisoned source cannot quietly slip facts into your workspace context. The full description of how memory works lives at /help/memory; the per-user memory controls live at /settings/memory.
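The sign-at-write, verify-at-read pattern mentioned above can be sketched with Python's standard `hmac` module. The choice of HMAC-SHA256 and the function names `sign_item` and `verify_item` are assumptions made for illustration; this page does not specify which signature scheme the integrity layer uses.

```python
import hashlib
import hmac


def sign_item(key: bytes, content: str) -> str:
    """Compute a hex HMAC-SHA256 tag over the item content at write time."""
    return hmac.new(key, content.encode("utf-8"), hashlib.sha256).hexdigest()


def verify_item(key: bytes, content: str, signature: str) -> bool:
    """Recompute the tag at read time and compare in constant time."""
    expected = sign_item(key, content)
    return hmac.compare_digest(expected, signature)
```

In this sketch, an item whose content was altered after it was written fails `verify_item`, so a reader could refuse to trust it instead of silently using tampered context.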
Who to contact and how to verify
For questions about privacy or data handling, the canonical contact is privacy@neumannic.com. Workspace admins who want to verify that the live provider and retention posture match what is written here can use the in-app admin panel at /admin/llm-provider-configs/cache-policy/, which surfaces the per-tier, per-integration-point posture in real time. The under-the-hood explanation of the enforcement layers (the script names, the configuration schema, the metrics tables) is available at /help/security for technical reviewers who want more depth than this page provides.
A customer reading this page should know exactly what we do not store, what we do store and why, who to contact with questions, and where to verify the live posture matches the written policy.