Stop Gambling Your Enterprise Logic on Single-LLM Hallucinations.
An open-source Multi-LLM Orchestration API. Quorum routes your prompt to 8 models in parallel (Claude, GPT, Gemini, Llama, DeepSeek, and more via NVIDIA AI Foundation), scores semantic agreement via cosine similarity on embeddings, and returns the top-weighted answer with a full audit trail. Apache 2.0, with an optional patent-pending HSP gate. BYOK.
Why One Model Isn't Enough
If you ship with AI, you already feel these three problems. Quorum names them and routes around them.
Single-Model Bias
Claude says one thing, GPT another, Gemini a third. You pick one and pretend it's right. Quorum fans the prompt out to all 8, scores semantic agreement via cosine similarity on embeddings, and shows you exactly where they diverge. That divergence is the signal worth knowing.
Vendor Lock-in & Outages
Anthropic has an incident, Claude API goes down, your app is dead. Quorum keeps 8 backends in rotation — BYOK for paid providers, NVIDIA AI Foundation free tier for 6 OSS models. One vendor fails, the rest carry the consensus.
Cost Blow-up
A naive 8-model fan-out costs ~£0.02/query. Quorum's MoE router (functional in v0.1.x) picks the 2–4 best models per query class, so a real consensus run on the NVIDIA free tier costs ~$0.000001. We log the price next to every answer.
How Quorum Works
Three stages, async fan-out, full audit trail. Two lines of Python.
Intelligent Router
Prompt analyzed.
Parallel Execution
Isolated processing.
Semantic Consensus
Cosine similarity on embeddings — paraphrases count.
# pip install quorum-ai
import asyncio

from quorum import consensus

async def main():
    # Fan out to every configured provider, score agreement
    result = await consensus("Draft the SEC filing.")
    print(result.answer)
    print(f"confidence: {result.confidence:.0%}")
    print(f"cost: ${result.total_cost_usd:.6f}")

# consensus() is awaited, so it has to run inside an event loop
asyncio.run(main())

# Or self-host: docker run -p 8080:8080 quorum-ai
# Or BYOK CLI: quorum ask "..." --all
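The parallel-execution stage, and the claim that one failing vendor does not take the run down, can be illustrated with plain asyncio. This is a minimal sketch, not Quorum's actual implementation: `call_provider`, `fan_out`, and the provider names are hypothetical stand-ins, and the failure is simulated.

```python
import asyncio

async def call_provider(name: str, prompt: str) -> str:
    """Hypothetical stand-in for a real provider client (assumption)."""
    await asyncio.sleep(0)  # yield to the event loop, as a network call would
    if name == "provider_down":
        raise ConnectionError(f"{name} is unavailable")  # simulated outage
    return f"{name} answer to: {prompt}"

async def fan_out(prompt: str, providers: list[str], timeout_s: float = 10.0) -> dict[str, str]:
    """Query every provider concurrently and keep only the successful replies."""
    async def guarded(name: str) -> str:
        # Each call gets its own timeout so a slow provider cannot stall the rest.
        return await asyncio.wait_for(call_provider(name, prompt), timeout_s)

    # return_exceptions=True turns failures into values instead of aborting the batch.
    results = await asyncio.gather(*(guarded(p) for p in providers), return_exceptions=True)
    return {
        name: reply
        for name, reply in zip(providers, results)
        if not isinstance(reply, BaseException)
    }

answers = asyncio.run(fan_out("ping", ["provider_a", "provider_down", "provider_b"]))
print(sorted(answers))
```

The surviving replies are what a consensus step would then score; the failed provider is simply absent from the result.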
The Monster in Two Pictures
Real models. Real numbers. No 90% token-savings fantasy.

One prompt → 8 frontier and OSS LLMs in parallel → semantic agreement scoring → top-weighted answer. Numbers from a live run today.

v0.1.5: 10 of 13 evolution loops functional (memory, router, RLHF, A/B, synthetic data, Hebbian, meta-learner, competition, self-prompting, adversarial probing). The remaining 3 are research-grade scaffold — and we say so in the README.
What Ships in v0.1.x
Apache 2.0 core, BYOK any backend. Honest scorecard — open-source means you can verify every claim against the repo.
Vectorized Semantic Consensus
We don't just check for matching words. Quorum embeds each model's output and compares the vectors using cosine similarity, so answers have to agree in meaning, not just in wording, before one is returned to your users.
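As a rough illustration of the scoring idea, the sketch below computes pairwise cosine similarity between embedding vectors and ranks each model by its mean similarity to the others. The aggregation rule (mean similarity), the function names, and the toy 3-dimensional vectors are assumptions made for the example; the page does not specify how Quorum weights answers.

```python
import math

def cosine(u, v):
    """Cosine similarity: dot(u, v) / (|u| * |v|), or 0.0 for a zero vector."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def agreement_scores(embeddings):
    """Mean cosine similarity of each model's embedding to every other model's."""
    scores = {}
    for name, vec in embeddings.items():
        others = [cosine(vec, other) for n, other in embeddings.items() if n != name]
        scores[name] = sum(others) / len(others) if others else 0.0
    return scores

# Toy vectors, invented for illustration; real embeddings have hundreds of dimensions.
embeddings = {
    "model_a": [1.0, 0.0, 0.0],
    "model_b": [0.9, 0.1, 0.0],
    "model_c": [0.0, 1.0, 0.0],
}
scores = agreement_scores(embeddings)
best = max(scores, key=scores.get)
print(best, scores)
```

Here `model_c` points in a different direction from the other two, so it gets the lowest agreement score; a low score like that is the kind of divergence worth surfacing.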
HSP Gate (Patent-Pending)
Patent-pending Hybrid Sovereign Protocol (PCT/US26/11908). For high-stakes evolution actions (promoting a checkpoint, A/B-test push, federated update), the gate routes the decision through an async human-approval webhook (HSP_GATE_WEBHOOK env var) before the function executes. Optional, off by default — fully Apache 2.0 without it.
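One way to picture an opt-in approval gate is a decorator that checks `HSP_GATE_WEBHOOK` before running the wrapped coroutine. This is a sketch under stated assumptions, not the HSP implementation: `hsp_gated`, `request_approval`, `deny_all`, and `promote_checkpoint` are hypothetical names, and the approval call is injected as a function so the example makes no network request.

```python
import asyncio
import functools
import os

def hsp_gated(request_approval):
    """Wrap an async action so it waits for approval when the gate is enabled.

    request_approval is a hypothetical async callable (webhook_url, action_name)
    that returns True or False; a real one would call the webhook.
    """
    def decorator(func):
        @functools.wraps(func)
        async def wrapper(*args, **kwargs):
            webhook = os.environ.get("HSP_GATE_WEBHOOK")
            if webhook:  # gate is off by default: unset means run directly
                approved = await request_approval(webhook, func.__name__)
                if not approved:
                    raise PermissionError(f"{func.__name__} was not approved")
            return await func(*args, **kwargs)
        return wrapper
    return decorator

async def deny_all(webhook_url, action_name):
    """Stand-in approver that rejects everything (for demonstration only)."""
    return False

@hsp_gated(deny_all)
async def promote_checkpoint(checkpoint_id):
    return f"promoted {checkpoint_id}"

if __name__ == "__main__":
    # With HSP_GATE_WEBHOOK unset this runs directly; with it set, deny_all blocks it.
    print(asyncio.run(promote_checkpoint("ckpt-1")))
```

Reading the environment variable at call time rather than import time means the gate can be switched on without redeploying the code.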
EU AI Act Evidence Helper
Every query generates a SHA-256 hash-chained PDF certificate via reportlab, recording every model that ran along with its weight, raw response, cost, and latency. Designed as evidence for Articles 12 (record-keeping) and 13 (transparency) of the EU AI Act (enforcement starts 2026-08-02). A helper, not a substitute for a professional compliance audit.
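The hash-chaining idea can be sketched with the standard library alone: each record stores the hash of the previous one, so editing any earlier record breaks every later link. The record layout, field names, and `GENESIS` value below are assumptions for illustration, and PDF rendering is left out entirely.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder hash for the first record

def _digest(prev_hash, payload):
    """SHA-256 over a canonical JSON encoding of the link and its payload."""
    body = json.dumps({"prev_hash": prev_hash, "payload": payload}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()

def append_record(chain, payload):
    """Append a record whose hash covers both its payload and the previous hash."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    record = {"prev_hash": prev_hash, "payload": payload, "hash": _digest(prev_hash, payload)}
    chain.append(record)
    return record

def verify_chain(chain):
    """Recompute every hash and check that each record points at its predecessor."""
    prev_hash = GENESIS
    for record in chain:
        if record["prev_hash"] != prev_hash:
            return False
        if _digest(record["prev_hash"], record["payload"]) != record["hash"]:
            return False
        prev_hash = record["hash"]
    return True

# Made-up audit entries for two models
chain = []
append_record(chain, {"model": "model_a", "weight": 0.6, "cost_usd": 0.0, "latency_ms": 120})
append_record(chain, {"model": "model_b", "weight": 0.4, "cost_usd": 0.0, "latency_ms": 340})
print(verify_chain(chain))
```

Serializing with `sort_keys=True` keeps the hash stable regardless of dictionary key order, which matters if records are re-serialized before verification.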
Transparent Pricing. Bulletproof Logic.
Choose the right license for your operation.
Self-host or use sandbox.
- Self-hosted: unlimited (Apache 2.0)
- Hosted sandbox: 100 queries/mo
- All 8 providers, BYOK
For solo devs & indie hackers.
- 5,000 hosted queries/month
- All 8 providers, MoE router
- BYOK supported (your provider keys)
- All currently-functional evolution loops
Dedicated nodes & EU compliance.
- Unlimited requests
- On-Premise / VPC Deployment
- Dedicated BYOK Management
- Custom Model Fine-tuning
The Economics of Consensus
Absolute transparency. Infinite scalability. Zero markup.
BYOK — Your Provider Keys
Zero token markup. Quorum never proxies your API keys; the orchestration fee covers the engine only.
MoE Router (Functional)
Per-query-class routing picks 2–4 of 8 providers. Top-K threshold today; multi-armed bandit on roadmap.
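A top-K threshold router of the kind described can be sketched in a few lines: rank providers by a per-class score, keep those above a threshold, and clamp the selection to between 2 and 4. The scores, class names, provider names, and the 0.5 threshold are all invented for the example; how Quorum actually derives its scores is not stated here.

```python
def route_top_k(scores, threshold=0.5, k_min=2, k_max=4):
    """Pick between k_min and k_max providers, best score first."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    picked = [name for name in ranked if scores[name] >= threshold][:k_max]
    if len(picked) < k_min:
        # Too few clear the threshold: fall back to the top k_min overall
        picked = ranked[:k_min]
    return picked

# Hypothetical per-query-class scores for 8 providers
CLASS_SCORES = {
    "code": {"p1": 0.9, "p2": 0.8, "p3": 0.7, "p4": 0.6, "p5": 0.55, "p6": 0.2, "p7": 0.1, "p8": 0.05},
    "legal": {"p1": 0.3, "p2": 0.6, "p3": 0.4, "p4": 0.1, "p5": 0.1, "p6": 0.1, "p7": 0.1, "p8": 0.1},
}

code_route = route_top_k(CLASS_SCORES["code"])
legal_route = route_top_k(CLASS_SCORES["legal"])
print(code_route, legal_route)
```

The lower bound keeps at least two models in play so there is always something to compare, while the upper bound caps cost per query.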
Distillation Loop (Roadmap)
Cheap models learn from expensive consensus. Skeleton in v0.1.x; functional target v1.0.
{
  "billing_mode": "BYOK",
  "orchestration_fee": "£49/mo",
  "providers": [
    "$ANTHROPIC_API_KEY",
    "$OPENAI_API_KEY",
    "$GEMINI_API_KEY",
    "$NVIDIA_API_KEY",       // 6 OSS models free
    "$REPLICATE_API_TOKEN",
    "ollama://localhost"     // self-hosted Llama
  ],
  "router": "moe_top_k",
  "observed_cost_per_query": "$0.000001",
  "hsp_gate": false          // opt-in
}