SedaSoft has spent twenty-seven years building robust infrastructure for information management - with common structure, accountability and provenance built in from the ground up. The result is a production AI platform that routes seven of eight RAG pipeline stages through local neural networks, calling a cloud LLM only when it is genuinely needed. The Efficiency Engine - the cross-system layer that enforces token discipline, carbon accountability and latency constraints before any prompt reaches the LLM - is available as a standalone component.
89.2% - LoComo conversational memory
100% - Unanswerable accuracy (4 benchmarks)
62% - Token reduction vs standard RAG
0.4 g - CO₂ saved per query
All verified via public harnesses. Full benchmark results →
Across four independent benchmarks - LifeBench, BEAM, LoComo, and MemSim - SiteEngine AI achieved 100% accuracy on unanswerable queries. Every time the relevant fact was missing from memory, the system correctly refused to answer instead of hallucinating.
This is not a coincidence. It is a design consequence of how the memory layer frames injected context - as background, not fact. Production agents that refuse when uncertain are recoverable. Agents that confabulate wrong memories are not.
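To make that framing concrete, here is a minimal Go sketch of the idea - injected memory presented as background hints with an explicit refusal instruction. The wording, function name, and sample memory are illustrative assumptions, not SedaSoft's actual prompt template.

```go
package main

import (
	"fmt"
	"strings"
)

// framePrompt injects retrieved memory as background, not asserted fact,
// and instructs the model to refuse rather than guess. The wording and
// function name are illustrative, not SedaSoft's actual template.
func framePrompt(memories []string, question string) string {
	var b strings.Builder
	b.WriteString("Background notes (may be incomplete or stale; treat as hints, not facts):\n")
	for _, m := range memories {
		b.WriteString("- " + m + "\n")
	}
	b.WriteString("\nQuestion: " + question + "\n")
	b.WriteString("If the background does not contain the answer, reply exactly: ")
	b.WriteString("\"I don't have that information.\" Do not guess.\n")
	return b.String()
}

func main() {
	fmt.Print(framePrompt(
		[]string{"User's cat is named Miso."},
		"What is the user's dog called?",
	))
}
```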
See the full benchmark results →
100% - Unanswerable accuracy
SedaSoft is an independent innovation laboratory. The principles it builds on - structure before scale, provenance before generation, accountability before deployment - predate the current AI moment by twenty-seven years. The EU AI Act has caught up to those principles. The rest of the industry is catching up now. More about sedasoft →
Existing benchmarks measure energy at the model layer. SedaSoft measures it at the application layer - where architectural decisions multiply or eliminate whatever the model saved.
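As a rough illustration of what application-layer accounting looks like, the Go sketch below converts a per-query token count into grams of CO₂. Both constants are assumptions for the sketch - inference energy per token and grid carbon intensity vary widely by model, datacentre, and region - not SedaSoft's measured figures.

```go
package main

import "fmt"

// Both constants are assumptions for this sketch, not SedaSoft's measured
// figures: inference energy per token and grid carbon intensity vary
// widely by model, datacentre, and region.
const (
	joulesPerToken = 1.0   // assumed inference energy per token (J)
	gramsCO2PerKWh = 400.0 // assumed grid carbon intensity (gCO2/kWh)
)

// gramsCO2 converts one query's token count into grams of CO2.
func gramsCO2(tokens int) float64 {
	kwh := float64(tokens) * joulesPerToken / 3.6e6 // joules -> kWh
	return kwh * gramsCO2PerKWh
}

func main() {
	baseline, routed := 4000, 1520 // 62% fewer tokens than the baseline
	fmt.Printf("baseline: %.3f g  routed: %.3f g  saved: %.3f g per query\n",
		gramsCO2(baseline), gramsCO2(routed), gramsCO2(baseline)-gramsCO2(routed))
}
```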
The Efficiency Engine uses local ONNX neural networks for classification, routing, embedding and compression. The LLM API is called only for the final generation step. The result: 62% fewer tokens per query (measured across four academic RAG datasets), 75% fewer API calls, and a methodology aligned with EU AI Act energy reporting requirements from August 2026.
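The shape of that split might look like the following Go sketch: everything before generation runs locally, and only the final stage touches the cloud LLM. The stage names beyond classification, routing, embedding, and compression (retrieve, rerank, assemble) are illustrative assumptions, not the platform's actual stage list.

```go
package main

import "fmt"

// Stage is one step of the RAG pipeline. Only the shape matters here:
// everything before generation runs locally, and only the final stage
// touches the cloud LLM.
type Stage struct {
	Name  string
	Local bool
	Run   func(in string) string
}

func run(query string, stages []Stage) string {
	s := query
	for i, st := range stages {
		target := "local ONNX"
		if !st.Local {
			target = "cloud LLM"
		}
		fmt.Printf("stage %d/%d %-9s -> %s\n", i+1, len(stages), st.Name, target)
		s = st.Run(s)
	}
	return s
}

func main() {
	local := func(in string) string { return in } // stand-in for an ONNX model call
	stages := []Stage{
		{"classify", true, local}, {"route", true, local}, {"embed", true, local},
		{"retrieve", true, local}, {"rerank", true, local}, {"compress", true, local},
		{"assemble", true, local},
		{"generate", false, func(in string) string { return "answer for: " + in }},
	}
	fmt.Println(run("What changed in Q3?", stages))
}
```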
The Efficiency Engine is available as a standalone component - independently of the full SiteEngine AI platform. If token cost, carbon accountability, or AI efficiency at the application layer is your immediate challenge, we would welcome a conversation.
62% - Average token reduction vs standard RAG
75% - Fewer cloud API calls
7 of 8 - Pipeline stages run locally
Per query - Carbon measurement at application layer
Each thesis documents one pillar of our production infrastructure - its architecture, methodology, and empirical results - written in the style of academic research and benchmarked against external, reproducible datasets.
Multi-tenant RAG platform with cognitive AI, PAD emotional model, Ebbinghaus memory, and the first production implementation of Communication Accommodation Theory.
Read thesis →
Cross-system architecture for cost-aware, carbon-reduced AI. First per-query carbon accounting framework at the application layer. Self-regulating health gating.
Read thesis →
Multi-format document ingestion engine. Content-aware chunking across 12 formats. Benchmarked against HotpotQA, SQuAD, FinQA, and three other public datasets.
Read thesis →
Atomic document staging and promotion for production RAG. Transactional ingestion with rollback, audit trails, and expert promotion. Evaluated on 35,000+ document sections.
Read thesis →
An information management platform that anticipated declarative configuration, page inheritance, and content-type abstraction by more than a decade. Still running on the same architecture principles 27 years on.
Read thesis →
All five theses with abstracts, key contributions, and reading links.
The SiteEngine AI cognitive system is grounded in established psychological models rather than engineered from intuition. The PAD emotional model governs emotional state. Ebbinghaus forgetting curves govern memory. Communication Accommodation Theory governs how the system adapts to individual users over time.
PAD Emotional Model
Pleasure, Arousal, Dominance - text-based emotional state modelling. No biometrics. EU AI Act compliant by design.
Ebbinghaus Memory Decay
Forgetting curves fade outdated context naturally. The system remembers what matters and releases what has become stale (a minimal decay sketch follows this list).
Communication Accommodation
First production implementation of Communication Accommodation Theory in an AI system. The platform adjusts register, pace, and depth to each user over time.
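A minimal Go sketch of the Ebbinghaus idea, assuming the simple exponential form R = exp(-t/S). The stability and keep threshold here are illustrative, not the platform's tuned parameters.

```go
package main

import (
	"fmt"
	"math"
	"time"
)

// retention implements the simple Ebbinghaus form R = exp(-t/S), where t is
// the time since a memory was last reinforced and S its stability. The
// values used in main are illustrative, not the platform's tuned parameters.
func retention(age, stability time.Duration) float64 {
	return math.Exp(-age.Hours() / stability.Hours())
}

func main() {
	stability := 72 * time.Hour
	for _, age := range []time.Duration{time.Hour, 24 * time.Hour, 14 * 24 * time.Hour} {
		r := retention(age, stability)
		fmt.Printf("age %8s  retention %.2f  keep=%v\n", age, r, r > 0.35)
	}
}
```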
SiteEngine AI is approximately 60,000 lines of Go - not Python, not a framework assembled from libraries. Compiled binaries. Multi-tenant isolation. 17-stage hybrid pipeline. The platform is built the way infrastructure should be built: for reliability first, performance second, and convenience last.
17-stage hybrid pipeline
Within the eight-stage RAG flow, local ONNX neural networks handle stages 1-7; stage 8 - generation - calls the LLM. Everything else stays inside your perimeter.
Dgraph knowledge graph
Entity-relationship mapping across the full document corpus. Cross-document connections that vector search alone cannot find.
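For flavour, here is a hedged sketch of the kind of two-hop traversal a knowledge graph enables, written against Dgraph's Go client (dgo). The predicate names (entity.name, mentions, doc.title) are assumptions for illustration; SiteEngine's actual schema is not published here.

```go
package main

import (
	"context"
	"fmt"
	"log"

	"github.com/dgraph-io/dgo/v2"
	"github.com/dgraph-io/dgo/v2/protos/api"
	"google.golang.org/grpc"
)

func main() {
	conn, err := grpc.Dial("localhost:9080", grpc.WithInsecure())
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()
	dg := dgo.NewDgraphClient(api.NewDgraphClient(conn))

	// Two-hop traversal: entity -> documents that mention it -> other
	// entities in those documents. The predicates are assumed for this
	// sketch; they are not SiteEngine's published schema.
	q := `query related($name: string) {
	  related(func: eq(entity.name, $name)) {
	    entity.name
	    ~mentions {
	      doc.title
	      mentions { entity.name }
	    }
	  }
	}`
	resp, err := dg.NewReadOnlyTxn().QueryWithVars(context.Background(), q,
		map[string]string{"$name": "Efficiency Engine"})
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(string(resp.Json))
}
```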
80+ MCP tools for Claude Desktop
Full Model Context Protocol integration, giving Claude Desktop direct access to the SiteEngine AI platform and knowledge graph.
The /clear problem is real. Every time you start a new Claude Code session or hit /clear mid-session, you lose the context you spent the last hour building.
Hydrate is a thin integration layer that captures a project's context automatically and injects it into your next session - in Claude Code, VS Code Copilot, Mistral or any MCP client. It is the first tool to make memory portable across sessions, tools, and models; it delivers aggressive token compression; and it runs on the same SiteEngine AI infrastructure that scored 89.2% on LoComo.
In Claude Code, three small binaries ride on the existing hook system. A Stop hook captures context before /clear. A prompt hook enriches every turn with relevant memory. A /hydrate command restores context selectively after clearing. All three are stateless, removable, and designed to be deleted when they are no longer needed.
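A minimal sketch of what such a Stop-hook binary could look like, assuming the documented Claude Code hook input (a JSON payload on stdin carrying a session ID and a transcript path). The snapshot location and naming below are assumptions, not Hydrate's actual behaviour.

```go
package main

import (
	"encoding/json"
	"io"
	"log"
	"os"
	"path/filepath"
)

// hookInput mirrors the JSON payload Claude Code passes to hooks on stdin.
// Only the two fields this sketch needs are decoded.
type hookInput struct {
	SessionID      string `json:"session_id"`
	TranscriptPath string `json:"transcript_path"`
}

func main() {
	var in hookInput
	if err := json.NewDecoder(os.Stdin).Decode(&in); err != nil {
		log.Fatal(err)
	}

	src, err := os.Open(in.TranscriptPath)
	if err != nil {
		log.Fatal(err)
	}
	defer src.Close()

	// Snapshot the transcript before /clear wipes the session. The snapshot
	// location and naming are assumptions, not Hydrate's actual behaviour.
	dst, err := os.Create(filepath.Join(os.TempDir(), "hydrate-"+in.SessionID+".jsonl"))
	if err != nil {
		log.Fatal(err)
	}
	defer dst.Close()

	if _, err := io.Copy(dst, src); err != nil {
		log.Fatal(err)
	}
}
```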
"Get Opus-quality responses at Haiku-level token cost. Move context between models. Three thin hooks. All removable."
Full - summary_32 + all facts + preferences
Economy - summary_16 + top 5 facts
Turbo - summary_16 + top 2 facts
Turbo mode makes Opus cheaper per turn than Haiku at full context.
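Expressed as data, the three modes above might look like the Go sketch below. The struct and field names are assumptions for illustration; only the summary_32/summary_16 identifiers come from the table itself.

```go
package main

import "fmt"

// Mode mirrors the three hydration presets in the table above. The struct
// and field names are assumptions for illustration. TopFacts == -1 means
// "all facts".
type Mode struct {
	Summary     string
	TopFacts    int
	Preferences bool
}

var modes = map[string]Mode{
	"full":    {Summary: "summary_32", TopFacts: -1, Preferences: true},
	"economy": {Summary: "summary_16", TopFacts: 5},
	"turbo":   {Summary: "summary_16", TopFacts: 2},
}

func main() {
	for name, m := range modes {
		fmt.Printf("%-8s %s, facts=%d, prefs=%v\n", name, m.Summary, m.TopFacts, m.Preferences)
	}
}
```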
SedaSoft is interested in research collaboration, academic partnership and conversations with organisations working on adjacent problems in AI efficiency, responsible deployment, or cognitive systems architecture.