Case Studies

Proof, not promises.

Two goals. Two domains. One Umma. Each from a single prompt — and every claim is grounded and traceable.

The work

A security audit of Netflix's Lemur

From one sentence — a full security audit: 238 flaws found, 6 multi-step attack chains, every finding traced to its source.

238 security flaws found · 327 files of code read · Every claim validated to source

Read the case study →

Case study · Personal

A family navigating a dementia diagnosis

1h 33m

For the biggest decisions in life, where other AI deflects — a four-option care plan, a 10-year financial model, and a hard truth five independent analyses converged on.

4 care paths weighed · Up to $2M saved over 10 years · 1 hard truth surfaced

Read the case study →

Umma vs. the frontier models

Agent mode is not an operating system.

What Big Tech promises

“Give GPT-5.5 a messy, multi-part task and trust it to plan, use tools, check its work … and keep going.” GPT-5.5, openai.com ↗

“Give it a goal and Claude works on your computer … to return a finished deliverable — but consequential decisions remain with the user.” Claude Opus 4.8, anthropic.com ↗

“Your 24/7 personal AI agent … takes action on your behalf and is under your direction.” Gemini 3.5 Flash, blog.google ↗

Capability	Umma (operating system)	GPT-5.5 (OpenAI)	Claude Opus 4.8 (Anthropic)	Gemini 3.5 Flash (Google)
Designed to pursue goals	✓	~	~	~
Continuous self across sessions	✓	✗	✗	✗
Honest, truthful	✓	✗	✗	✗
Builds and keeps its own capabilities	✓	✗	✗	✗
Verifiable trace of every claim	✓	✗	✗	✗

Umma vs. the agents

Agents sell goals. Only Umma achieves them.

Their own words — against the receipts.

Manus

general AI agent

They say “It bridges minds and actions — it delivers results, getting everything done while you rest.” manus.im ↗
That needs: clarifying the goal, getting your sign-off, and verifying its own work against reality.
It doesn’t: asked to verify, it fabricates curl responses for a state that never existed. Rio Times ↗
Umma does: Umma validates every claim instead of merely asserting it. See the Lemur audit →

Hermes Agent

Nous Research

They say “The agent that grows with you — it remembers what it learns and gets more capable the longer it runs.” hermes-agent.nousresearch.com ↗
That needs: an identity it can audit, and a gate before new skills are allowed to land.
It doesn’t: its self-improvement quietly opens security holes, the bug Nous Research itself calls “most dangerous,” because it “looks like success.” Nous Research ↗ · GitHub #7826 ↗
Umma does: Umma’s growth is governed, versioned, and refusable. How she’s built →

OpenClaw

open-source agent

They say “The AI that actually does things.” openclaw.ai ↗
That needs: real boundaries on what it runs, an audit trail, and a supply chain that can’t be poisoned.
It doesn’t: a one-click bug let attackers take over the app (rated 8.8/10, high severity), its add-on store was poisoned, and a safety lead’s emails were deleted despite stop commands. NVD ↗ · The Hacker News ↗ · Fast Company ↗
Umma does: Umma quarantines every capability and logs every call; nothing ships unproven. What she can do →

Your turn

Bring your hardest problem. Get in touch →