Patent Pending PCT/US26/11908 Secure API

Stop Gambling Your Enterprise Logic on Single-LLM Hallucinations.

An open-source Multi-LLM Orchestration API. Quorum routes your prompt to 8 models in parallel (Claude, GPT, Gemini, Llama, DeepSeek, and more via NVIDIA AI Foundation), scores semantic agreement via cosine similarity on embeddings, and returns the top-weighted answer with a full audit trail. Apache 2.0 + HSP patent. BYOK.

NEW: VS Code Extension v0.1.0 is now live.
Architected for precision with Google Cloud, Anthropic, OpenAI, and Meta.

Why One Model Isn't Enough

If you ship with AI, you already feel these three problems. Quorum names them and routes around them.

Single-Model Bias

Claude says one thing, GPT another, Gemini a third. You pick one and pretend it's right. Quorum fans the prompt out to all 8, scores semantic agreement via cosine similarity on embeddings, and shows you exactly where they diverge. That divergence is the signal worth knowing.

Vendor Lock-in & Outages

Anthropic has an incident, Claude API goes down, your app is dead. Quorum keeps 8 backends in rotation — BYOK for paid providers, NVIDIA AI Foundation free tier for 6 OSS models. One vendor fails, the rest carry the consensus.
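The failover idea can be sketched with plain asyncio. This is a minimal illustration, not Quorum's implementation: `call_provider`, `fan_out`, and the provider names are hypothetical stand-ins, and the simulated outage replaces a real SDK call.

```python
import asyncio

async def call_provider(name, prompt):
    # Hypothetical stand-in for a real provider SDK call.
    if name == "provider_down":
        raise ConnectionError(f"{name} is unavailable")
    await asyncio.sleep(0)  # yield control, as a real network call would
    return f"{name}: answer to {prompt!r}"

async def fan_out(providers, prompt):
    # return_exceptions=True keeps one failing backend from cancelling the rest.
    results = await asyncio.gather(
        *(call_provider(p, prompt) for p in providers),
        return_exceptions=True,
    )
    answers, failures = {}, {}
    for provider, result in zip(providers, results):
        if isinstance(result, BaseException):
            failures[provider] = result
        else:
            answers[provider] = result
    return answers, failures

answers, failures = asyncio.run(
    fan_out(["provider_a", "provider_down", "provider_b"], "ping")
)
print(sorted(answers), sorted(failures))
```

Only the surviving answers would go on to agreement scoring; the failures can be logged in the audit trail.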

Cost Blow-up

A naive 8-model fan-out costs ~£0.02/query. Quorum's MoE router (functional in v0.1.x) picks the 2–4 best models per query class so a real consensus run costs ~$0.000001 with NVIDIA free tier. We log the price next to every answer.

How Quorum Works

Three stages, async fan-out, full audit trail. Two lines of Python.

Intelligent Router

Prompt analyzed.

Parallel Execution

Isolated processing.

Semantic Consensus

Cosine similarity on embeddings — paraphrases count.

# pip install quorum-ai
import asyncio
from quorum import consensus

async def main():
    # Fan out to every configured provider, score agreement
    result = await consensus("Draft the SEC filing.")
    print(result.answer)
    print(f"confidence: {result.confidence:.0%}")
    print(f"cost: ${result.total_cost_usd:.6f}")

# `await` needs a running event loop outside a notebook or async REPL
asyncio.run(main())

# Or self-host: docker run -p 8080:8080 quorum-ai
# Or BYOK CLI: quorum ask "..." --all

The Monster in Two Pictures

Real models. Real numbers. No 90% token-savings fantasy.

One prompt fanning out to 8 LLMs (Claude Sonnet 4.6, GPT-5, Gemini Flash, Grok-4, Llama 3.3 70B, Mistral Large, Command R+, DeepSeek V4) into a consensus core. Live output: confidence 87%, cost $0.011, latency 9.5s.

One prompt → 8 frontier and OSS LLMs in parallel → semantic agreement scoring → top-weighted answer. Numbers from a live run today.

Evolution loops in v0.1.5: 10 functional, 5 of them promoted via designs produced by Quorum consensus itself (including Hebbian, Meta-learner, and Competition/ELO). Patent-pending HSP PCT/US26/11908.

v0.1.5: 10 of 13 evolution loops functional (memory, router, RLHF, A/B, synthetic data, Hebbian, meta-learner, competition, self-prompting, adversarial probing). The remaining 3 are research-grade scaffold — and we say so in the README.

What Ships in v0.1.x

Apache 2.0 core, BYOK any backend. Honest scorecard — open-source means you can verify every claim against the repo.

Vectorized Semantic Consensus

We don't just check for matching words. Quorum compares vector embeddings of each model's output using Cosine Similarity to ensure deep, semantic agreement before returning a binding answer to your users.
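As a rough illustration of agreement scoring, the sketch below ranks each answer by its mean cosine similarity to the other answers. The embedding vectors are made-up toy values, and `cosine` and `pick_consensus` are hypothetical names; Quorum's actual weighting may differ.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def pick_consensus(embeddings):
    """Return (model, score) for the answer that agrees most with the others."""
    if len(embeddings) < 2:
        raise ValueError("need at least two answers to measure agreement")
    scores = {}
    for name, vec in embeddings.items():
        others = [cosine(vec, v) for n, v in embeddings.items() if n != name]
        scores[name] = sum(others) / len(others)
    winner = max(scores, key=scores.get)
    return winner, scores[winner]

# Toy embeddings: model_a and model_b roughly agree, model_c is the outlier.
embeddings = {
    "model_a": [0.9, 0.1, 0.0],
    "model_b": [0.8, 0.2, 0.1],
    "model_c": [0.0, 0.1, 0.9],
}
winner, confidence = pick_consensus(embeddings)
print(winner, round(confidence, 3))
```

Because the comparison runs on embeddings rather than tokens, two paraphrased answers land close together even when they share few words.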

HSP Gate (Patent-Pending)

Patent-pending Hybrid Sovereign Protocol (PCT/US26/11908). For high-stakes evolution actions (promoting a checkpoint, A/B-test push, federated update), the gate routes the decision through an async human-approval webhook (HSP_GATE_WEBHOOK env var) before the function executes. Optional, off by default — fully Apache 2.0 without it.
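One way to picture an opt-in approval gate is an async decorator that checks the `HSP_GATE_WEBHOOK` environment variable before running the wrapped function. This is a sketch under stated assumptions, not the patented protocol: `hsp_gate`, `deny_all`, and `promote_checkpoint` are hypothetical names, and the approver is an injected coroutine standing in for a real HTTP call to the webhook.

```python
import asyncio
import functools
import os

def hsp_gate(approver):
    """Decorator: when HSP_GATE_WEBHOOK is set, await approval before running."""
    def decorate(func):
        @functools.wraps(func)
        async def wrapper(*args, **kwargs):
            webhook = os.environ.get("HSP_GATE_WEBHOOK")
            if webhook:  # opt-in: an unset variable means the gate is off
                approved = await approver(webhook, func.__name__)
                if not approved:
                    raise PermissionError(f"{func.__name__} rejected by approver")
            return await func(*args, **kwargs)
        return wrapper
    return decorate

async def deny_all(webhook_url, action):
    # Stand-in for posting to webhook_url and waiting for a human decision.
    return False

@hsp_gate(deny_all)
async def promote_checkpoint(name):
    # Hypothetical high-stakes action.
    return f"promoted {name}"
```

With the variable unset, `asyncio.run(promote_checkpoint("ckpt-1"))` runs directly; with it set, this always-deny approver blocks the call with `PermissionError`.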

EU AI Act Evidence Helper

Every query generates a SHA-256 hash-chained PDF certificate via reportlab that records every model that ran, its weight, raw response, cost, and latency. Designed as evidence for Articles 12 (record-keeping) and 13 (transparency) of the EU AI Act, whose enforcement starts 2026-08-02. A helper, not a substitute for a professional compliance audit.
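The hash-chaining part can be sketched with the standard library alone. The record fields and values below are illustrative, and `entry_hash`, `append`, and `verify` are hypothetical helpers; rendering the chain into a PDF with reportlab is left out.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry

def entry_hash(prev_hash, record):
    # Hash the previous entry's hash together with a canonical JSON encoding.
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("ascii") + payload).hexdigest()

def append(chain, record):
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"record": record, "prev": prev, "hash": entry_hash(prev, record)})

def verify(chain):
    """Recompute every link; any edited record breaks the chain from there on."""
    prev = GENESIS
    for entry in chain:
        if entry["prev"] != prev or entry["hash"] != entry_hash(prev, entry["record"]):
            return False
        prev = entry["hash"]
    return True

chain = []
append(chain, {"model": "model_a", "weight": 0.6, "cost_usd": 0.004, "latency_s": 2.1})
append(chain, {"model": "model_b", "weight": 0.4, "cost_usd": 0.002, "latency_s": 3.4})
print(verify(chain))
```

Each entry commits to the one before it, so changing any logged weight or response after the fact makes `verify` fail.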

Transparent Pricing. Bulletproof Logic.

Choose the right license for your operation.

Free / Open Source
£0/mo

Self-host or use sandbox.

  • Self-hosted: unlimited (Apache 2.0)
  • Hosted sandbox: 100 queries/mo
  • All 8 providers, BYOK
Clone on GitHub

Pro (Most Popular)
£49/mo

For solo devs & indie hackers.

  • 5,000 hosted queries/month
  • All 8 providers, MoE router
  • BYOK supported (your provider keys)
  • All currently-functional evolution loops
Start Pro Trial

Enterprise
Custom

Dedicated nodes & EU compliance.

  • Unlimited requests
  • On-Premise / VPC Deployment
  • Dedicated BYOK Management
  • Custom Model Fine-tuning
Contact Sales

The Economics of Consensus

Absolute transparency. Infinite scalability. Zero markup.

BYOK — Your Provider Keys

Zero token markup. Quorum never proxies your API keys; the orchestration fee covers the engine only.

MoE Router (Functional)

Per-query-class routing picks 2–4 of 8 providers. Top-K threshold today; multi-armed bandit on roadmap.
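A Top-K threshold router of this kind might look roughly like the following. The per-class scores, the 0.5 threshold, and the `route` function are assumptions made for illustration; only the 2–4 provider range comes from the description above.

```python
def route(scores_by_class, query_class, k_min=2, k_max=4, threshold=0.5):
    """Pick between k_min and k_max providers for a query class.

    Providers scoring at or above the threshold are taken best-first, capped
    at k_max; if too few qualify, fall back to the k_min best overall.
    """
    scores = scores_by_class[query_class]
    ranked = sorted(scores, key=scores.get, reverse=True)
    chosen = [p for p in ranked if scores[p] >= threshold][:k_max]
    if len(chosen) < k_min:
        chosen = ranked[:k_min]
    return chosen

# Hypothetical historical quality scores per query class.
scores_by_class = {
    "code": {"a": 0.9, "b": 0.8, "c": 0.7, "d": 0.6, "e": 0.55, "f": 0.3},
    "legal": {"a": 0.4, "b": 0.45, "c": 0.2},
}
print(route(scores_by_class, "code"))
print(route(scores_by_class, "legal"))
```

The `k_min` fallback keeps at least two answers in play, since agreement cannot be scored on a single response.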

Distillation Loop (Roadmap)

Cheap models learn from expensive consensus. Skeleton in v0.1.x; functional target v1.0.

quorum.config.json
{
  "billing_mode": "BYOK",
  "orchestration_fee": "£49/mo",
  "providers": [
    "$ANTHROPIC_API_KEY",
    "$OPENAI_API_KEY",
    "$GEMINI_API_KEY",
    "$NVIDIA_API_KEY", // 6 OSS models free
    "$REPLICATE_API_TOKEN",
    "ollama://localhost" // self-hosted Llama
  ],
  "router": "moe_top_k",
  "observed_cost_per_query": "$0.000001",
  "hsp_gate": false // opt-in
}