Completions

Govern swarm · Governance-decision-router agent · Multi-dimensional threshold-routing skill · Build pillar · Published August 3, 2026

How to build multi-dimensional threshold routing for AI-decision pipelines

Every AI-drafted output carries multiple confidence signals at once — brand voice, sentiment, factual confidence, claim substantiation, reach magnitude, compliance confidence, PII / PHI leak risk, hallucination, prompt-injection, jailbreak, toxicity, plus dozens of per-anchor regulatory-confidence scores. Single-threshold guardrail vendors (Guardrails AI + NeMo Guardrails + Llama Guard + Lakera Guard + Robust Intelligence + Aporia + CalypsoAI + Protect AI + Garak + Patronus AI + Arthur Shield + Fiddler AI) apply one cutoff per rule and ignore the composition. The Score + Compose + Gate + Audit skill bundle on the governance-decision-router agent composes the N-dimension threshold under explicit calibration discipline (Platt scaling + isotonic regression + temperature scaling + quantile mapping + conformal prediction) with multiple-comparisons correction, and hands off to the five-destination routing sibling skill that picks the terminal destination. Real regulatory anchors preserved in every per-output audit record: EU AI Act Article 14 + Annex III + Article 22 + 50 + FTC Endorsement Guides + Fake Review Rule + per-vertical overlays + ECOA + GDPR Article 22 + NIST AI RMF.

The 4-skill bundle on the governance-decision-router agent

Score

N-dimension per-output scoring across the standing 40+ dimensions (brand voice + sentiment + factual confidence + claim substantiation + reach magnitude + compliance confidence + empathy / de-escalation + PII / PHI leak risk + hallucination + LLM self-disclosure + prompt-injection + jailbreak + toxicity + profanity + discrimination + harassment + violence + self-harm + sexual / graphic + hate symbol + legal threat + PR crisis trigger + per-regulatory-anchor confidence). Grounded in the runtime brand-voice gate sibling skill + claims-allowlist sibling skill + multi-LLM ensemble agreement under per-vendor zero-retention. Vendor surface includes Guardrails AI + NeMo Guardrails + Llama Guard + Lakera Guard + Robust Intelligence + Aporia + CalypsoAI + Protect AI + Garak + Patronus AI + Arthur Shield + Fiddler AI + OpenAI Moderation + Perspective API + Hive + Galileo + DeepEval + Ragas + TruLens + Phoenix + UpTrain + Inspect AI + Promptfoo + Confident AI. Per-dimension confidence tier + explainability trace written into Audit.
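As a rough illustration of what one entry in the per-output scoring vector might look like, here is a minimal Python sketch. The `DimensionScore` shape, the field names, and the tier cutoffs (0.9 and 0.6) are assumptions made for this example; the source does not specify a record format. The sketch assumes each score is a value in [0, 1] where higher means more acceptable on that dimension.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class DimensionScore:
    """One scored dimension for one output (hypothetical record shape)."""
    dimension: str   # e.g. "toxicity" or "brand_voice"
    value: float     # assumed to lie in [0, 1], higher = more acceptable
    source: str      # which scorer produced the value
    trace: str       # short explainability note carried into Audit

    def confidence_tier(self) -> str:
        """Bucket the score into a coarse tier; the cutoffs are illustrative."""
        if self.value >= 0.9:
            return "high"
        if self.value >= 0.6:
            return "medium"
        return "low"


def score_vector(scores: Iterable[DimensionScore]) -> Dict[str, DimensionScore]:
    """Index scores by dimension name and reject duplicate dimensions."""
    vector: Dict[str, DimensionScore] = {}
    for score in scores:
        if score.dimension in vector:
            raise ValueError(f"duplicate dimension: {score.dimension}")
        vector[score.dimension] = score
    return vector
```

Keeping the record immutable (`frozen=True`) fits the idea that the same snapshot is later written to Audit unchanged.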

Compose

Four coordinated subsystems. Threshold definition (per-dimension value + direction + confidence interval + per-derivation differentiation + version + PR-style review + multi-stakeholder approval across brand-lead + legal + compliance + franchise-owner + DMO-director + CMO + fractional-CMO + general-counsel + outside-counsel). Per-dimension calibration (Platt scaling + isotonic regression + temperature scaling + quantile mapping + conformal prediction Vovk-Shafer-Gammerman + Mondrian conformal + adaptive conformal + ECE + reliability diagram + Brier score). Aggregation semantics (AND + OR + NAND + NOR + XOR + K-of-N + weighted-sum + min + max + percentile + Bayesian posterior + Dempster-Shafer evidence combination + fuzzy logic + Borda-Condorcet rank + multi-LLM-as-judge). Multiple-comparisons correction (Bonferroni + Holm-Bonferroni + Benjamini-Hochberg FDR + Benjamini-Yekutieli + Tukey HSD + Šidák) when N dimensions × composition would otherwise inflate per-output false-positive rate. Canary rollout + shadow mode + rollback. Composed routing decision hands off to the five-destination routing sibling skill and the FBC override-learning sibling skill.
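A few of the aggregation semantics named above can be sketched in Python as follows. This is an illustration only: the function names are hypothetical, the inputs are assumed to be per-dimension pass/fail flags or scores in [0, 1], and the more involved semantics (Bayesian posterior, Dempster-Shafer, Borda-Condorcet) are not covered.

```python
from typing import Mapping


def aggregate_and(passed: Mapping[str, bool]) -> bool:
    """AND: every dimension must pass its own threshold."""
    return all(passed.values())


def aggregate_or(passed: Mapping[str, bool]) -> bool:
    """OR: at least one dimension must pass."""
    return any(passed.values())


def aggregate_k_of_n(passed: Mapping[str, bool], k: int) -> bool:
    """K-of-N: at least k of the N dimensions must pass."""
    return sum(passed.values()) >= k


def aggregate_weighted_sum(scores: Mapping[str, float],
                           weights: Mapping[str, float],
                           cutoff: float) -> bool:
    """Weighted-sum: the weight-normalised score must reach the cutoff."""
    total_weight = sum(weights[name] for name in scores)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    combined = sum(scores[name] * weights[name] for name in scores) / total_weight
    return combined >= cutoff
```

For example, with flags `{"brand_voice": True, "toxicity": True, "factual": False}`, AND fails, OR passes, and 2-of-3 passes, which shows how much the choice of semantic alone changes the composed decision.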

Gate

Five anchors before commit. Calibration discipline (Platt + isotonic + temperature + quantile mapping + conformal prediction + ECE + reliability diagram + Brier score) + multiple-comparisons correction across N-dimension composition. EU AI Act Article 14 human oversight (the composed routing decision determines whether human review fires) + Article 13 + 15 + 22 + 26 + 50 + Annex III + Article 9 + 10 + 11 + 12 + NIST AI RMF + ISO 42001 + 23894 + 5338. FTC Section 5 + Pfizer 1972 + Endorsement Guides 16 CFR Part 255 + Fake Review Rule 16 CFR Part 465 + MARS + Made-in-USA + Green Guides + Negative-Option + Lanham + per-vertical (HIPAA + FINRA 2210 + SEC Reg S-K + FDA 21 CFR Part 11 + DSCSA + CPSC CPSIA + EPA FIFRA + DEA + CFPB UDAAP + FDD Item 19 per FTC Franchise Rule 16 CFR 436). ECOA Regulation B + Fair Housing + GLBA + COPPA + CCPA + CPRA opt-out + 17-state + GDPR Article 22 + LGPD + DPDP + PIPEDA + CASL. Sarbanes-Oxley 302 / 404 / 906 + PCAOB AS 2201 + SEC Regulation G + SEC C&DI Q100 / 101 / 102 + SEC Reg S-K Item 303 + AICPA non-GAAP + PCAOB AS 2410. Policy-as-code via OPA Rego + AWS Cedar + Casbin + Cerbos + Oso + Styra DAS + Permit.io.
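To make "anchors before commit" concrete, here is a minimal Python sketch of a gate that runs a list of anchor checks and only commits when all of them pass. It is not how any of the named policy engines work; the two checks, the 0.05 ECE budget, and the field names `ece` and `ai_disclosure_stamp` are hypothetical stand-ins. The three verdicts mirror the pass / fail / route-to-counsel distribution the page reports on.

```python
from typing import Callable, Dict, List, Tuple

Check = Callable[[Dict], str]  # returns "pass", "fail" or "route-to-counsel"


def calibration_anchor(output: Dict) -> str:
    # Hypothetical rule: fail when measured ECE exceeds an illustrative 0.05 budget.
    return "pass" if output.get("ece", 1.0) <= 0.05 else "fail"


def disclosure_anchor(output: Dict) -> str:
    # Hypothetical rule: a missing AI-disclosure stamp is sent to counsel.
    return "pass" if output.get("ai_disclosure_stamp") else "route-to-counsel"


ANCHORS: List[Tuple[str, Check]] = [
    ("calibration", calibration_anchor),
    ("disclosure", disclosure_anchor),
]


def run_gate(output: Dict, anchors: List[Tuple[str, Check]]):
    """Evaluate every anchor; any fail blocks, any counsel verdict defers."""
    results = [(name, check(output)) for name, check in anchors]
    verdicts = {verdict for _, verdict in results}
    if "fail" in verdicts:
        decision = "block"
    elif "route-to-counsel" in verdicts:
        decision = "route-to-counsel"
    else:
        decision = "commit"
    return decision, results
```

Every anchor is evaluated even after a failure, so the full per-anchor result list is available for the audit record.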

Audit

Per-output WORM record: routing-decision ID + per-banner + per-location + per-AI-agent + per-output + 40+ dimension snapshot + per-threshold snapshot + per-aggregation snapshot + per-multiple-comparisons-correction record + composed routing decision + priority + precedence + escalation timer + canary rollout stage + per-anchor Gate decision with evidence + per-vendor LLM zero-retention verification + per-handoff record to the five-destination routing sibling skill and the FBC override-learning sibling skill. Storage: AWS S3 Object Lock + Azure Blob immutable + Google Cloud Storage Bucket Lock + Wasabi WORM. Retention stacks (longest applicable wins): 7-year FTC + 7-year IRS + 7-year FDD + per-state franchise + 7-year HIPAA + 7-year SOX 802 + 6-year SEC + 5-year PCAOB + GDPR Article 30 + EU AI Act Article 12 + SOC 2 CC7 / CC8. End-to-end replay rewinds every stage.
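The "longest applicable wins" retention rule reduces to taking a maximum over the regimes that apply to a given output. A minimal Python sketch, using only the year figures quoted above (they are copied from this page, not independently verified) and leaving out the regimes listed without a fixed number of years:

```python
from typing import Dict, Iterable, Tuple

# Year figures as stated in the text above; regimes without a stated
# duration (per-state franchise, GDPR Article 30, and so on) are omitted.
RETENTION_YEARS: Dict[str, int] = {
    "FTC": 7, "IRS": 7, "FDD": 7, "HIPAA": 7, "SOX 802": 7, "SEC": 6, "PCAOB": 5,
}


def longest_retention(applicable_regimes: Iterable[str]) -> Tuple[str, int]:
    """Return the governing regime and its retention window in years."""
    known = {regime: RETENTION_YEARS[regime] for regime in applicable_regimes}
    if not known:
        raise ValueError("at least one retention regime must apply")
    regime = max(known, key=known.get)  # ties resolve to the first regime listed
    return regime, known[regime]
```

For an output touched only by the SEC and PCAOB windows, `longest_retention(["PCAOB", "SEC"])` returns `("SEC", 6)`.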

The real vendor ecosystem this sits above

Guardrails + safety + evaluation

AI safety: Guardrails AI + NeMo Guardrails + Llama Guard + Lakera Guard + Robust Intelligence + Aporia + CalypsoAI + Protect AI + Garak. Moderation and shielding: OpenAI Moderation + Perspective API + Hive + Galileo + Patronus AI + Arthur Shield + Fiddler AI. Evaluation: DeepEval + Ragas + TruLens + Phoenix + UpTrain + Inspect AI + Promptfoo + Confident AI, backing Score's per-dimension scoring.

LLM ensemble + observability + numerical

OpenAI + Anthropic + Google + Mistral + Cohere + Meta + AWS Bedrock + Azure OpenAI + Vertex AI LLM providers under per-vendor zero-retention. LangSmith + Weights & Biases + Arize + WhyLabs + Helicone + Langfuse + PromptLayer + Galileo observability. scikit-learn + PyTorch + JAX + statsmodels numerical stack backs the Platt / isotonic / temperature / quantile-mapping / conformal-prediction calibration pipeline and the multiple-comparisons correction routines.
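One of the correction routines mentioned above, Holm-Bonferroni, is small enough to sketch in plain Python. The sketch assumes each dimension's flag can be expressed as a p-value against a "this output is acceptable" null hypothesis, which is a modelling assumption of this example rather than something the page specifies. In practice a library routine such as `multipletests(..., method="holm")` from statsmodels would normally be used instead of hand-rolled code.

```python
from typing import List, Sequence


def holm_bonferroni(p_values: Sequence[float], alpha: float = 0.05) -> List[bool]:
    """Return, per input position, whether that hypothesis is rejected.

    Step-down procedure: sort p-values ascending and compare the i-th
    smallest (0-based rank i) against alpha / (m - i); stop at the first
    p-value that fails its comparison.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, index in enumerate(order):
        if p_values[index] <= alpha / (m - rank):
            reject[index] = True
        else:
            break  # every larger p-value is retained as well
    return reject
```

With p-values `[0.01, 0.04, 0.03, 0.005]` and `alpha=0.05`, only the first and last positions are flagged; an uncorrected 0.05 cutoff would have flagged all four.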

Policy-as-code + WORM + sibling skills

OPA Rego + AWS Cedar + Casbin + Cerbos + Oso + Styra DAS + Permit.io policy-as-code expresses every Gate rule including the calibration anchor and the EU AI Act Article 14 / 22 / 50 + per-vertical + ECOA + Reg G stack. AWS S3 Object Lock + Azure Blob immutable + Google Cloud Storage Bucket Lock + Wasabi compliance WORM holds the per-output audit substrate. Sibling skills on the governance-decision-router agent: five-destination routing (downstream destination picker); borderline routing; FBC override-learning; routing-audit-trail; nested-autonomy profile-inheritance; marketing-AI-autonomy-profile-configuration.

The 6-workstream reporting cycle

Numeric uplift commitments are not made up-front. The engagement ships a pre-engagement baseline across six workstreams; the cycle tracks delta against that baseline. Reporting is the substrate, not the promise.

  1. Score coverage. Per-dimension scoring coverage across the 40+ standing dimensions; per-vendor guardrail + safety + evaluation vendor uptime; per-vendor LLM zero-retention verification per ensemble call; sibling-skill grounding completeness (brand-voice + claims-allowlist + forbidden-phrase).
  2. Compose quality. Per-dimension calibration (Platt + isotonic + temperature + quantile mapping + conformal prediction) ECE + reliability diagram + Brier score conformance; aggregation choice distribution across AND / OR / NAND / NOR / XOR / K-of-N / weighted-sum / Bayesian posterior / Dempster-Shafer / fuzzy logic / Borda-Condorcet / multi-LLM-as-judge; multiple-comparisons correction adherence; threshold recalibration cadence; canary rollout latency.
  3. Gate quality. Per-anchor evaluation completeness (calibration + EU AI Act + FTC + per-vertical + ECOA + GDPR Article 22 + SOX + Reg G); per-anchor pass / fail / route-to-counsel distribution; remediation-loop turnaround.
  4. Audit quality. Per-output WORM record completeness; retention-window coverage (longest of 7-year FTC + 7-year IRS + 7-year FDD + per-state franchise + 7-year HIPAA + 7-year SOX 802 + 6-year SEC + 5-year PCAOB + GDPR Article 30 + EU AI Act Article 12 + SOC 2 CC7 / CC8); end-to-end replay success rate.
  5. Compliance posture. Article 14 human-oversight coverage via composed routing distribution; Article 50 disclosure stamp coverage; Annex III high-risk classification detection; per-vertical overlay coverage; CCPA opt-out + GDPR Article 22 transparency stamp coverage; SOX / Reg G reconciliation when threshold-routing affects public-co disclosure.
  6. Audit-trail completeness. Per-anchor regulatory citation completeness; sibling-handoff pointer completeness into the governance-decision-router bundle (five-destination routing + borderline routing + FBC override-learning + routing-audit-trail + nested-autonomy profile-inheritance + marketing-AI-autonomy-profile-configuration + content-approval-workflow + multi-stakeholder-approval-routing + tiered-content-filtering + auto-publish-threshold-gating + severity-routing + ai-routing-decision-audit-trail + multi-stream severity routing + versioned-history regulatory-defense).
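The Compose-quality workstream above tracks expected calibration error (ECE) and Brier score per dimension. A minimal Python sketch of both metrics follows, assuming binary labels (1 = the flagged condition was really present) and scores interpreted as probabilities of the positive class. Equal-width binning with 10 bins is a common default chosen here for illustration, not a setting taken from the page.

```python
from typing import List, Sequence, Tuple


def brier_score(probs: Sequence[float], labels: Sequence[int]) -> float:
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(probs)


def expected_calibration_error(probs: Sequence[float],
                               labels: Sequence[int],
                               n_bins: int = 10) -> float:
    """Weighted average gap between mean confidence and observed frequency."""
    bins: List[List[Tuple[float, int]]] = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        index = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[index].append((p, y))
    total = len(probs)
    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue
        mean_confidence = sum(p for p, _ in bucket) / len(bucket)
        observed_rate = sum(y for _, y in bucket) / len(bucket)
        ece += (len(bucket) / total) * abs(observed_rate - mean_confidence)
    return ece
```

A reliability diagram is the same per-bin comparison plotted rather than averaged, so the bin loop above could also feed that chart.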

Frequently asked questions

What is multi-dimensional threshold routing for AI-decision pipelines — and why does one cutoff per rule break at the N-dimension grain?

Every AI-drafted output (marketing claim, creative, product description, review response, social post, ad copy, email, SMS, push, landing-page copy, blog post, FAQ, GBP post, call-transcript summary) carries multiple confidence signals at once. Brand-voice score, sentiment, factual confidence, claim substantiation, reach magnitude, compliance confidence, empathy / de-escalation tone, PII / PHI leak risk, hallucination detection, prompt-injection detection, jailbreak detection, toxicity, profanity, discrimination, harassment, violence, self-harm, sexual / graphic content, hate symbol, legal threat, PR crisis trigger, and dozens of regulatory-anchor confidence scores (FTC substantiation, FTC MARS, FTC AI disclosure, FTC Endorsement Guides, FTC Made-in-USA, FTC Fake Review Rule, CFPB UDAAP, FDA 21 CFR Part 101, FDA DSHEA, HIPAA, FINRA Rule 2210, SEC Reg S-K, EU AI Act Article 50, NIST AI RMF). Single-threshold guardrail vendors apply one cutoff per rule and ignore the composition. The four-skill bundle on the governance-decision-router agent — Score, Compose, Gate, Audit — sits above the per-vendor guardrail surface (Guardrails AI + NeMo Guardrails + Llama Guard + Lakera Guard + Robust Intelligence + Aporia + CalypsoAI + Protect AI + Garak + OpenAI Moderation + Perspective API + Hive + Galileo + Patronus AI + Arthur Shield + Fiddler AI) and composes the N-dimension threshold under explicit calibration discipline with multiple-comparisons correction, then hands off to the five-destination routing sibling skill that picks the terminal destination.

Why do Guardrails AI + NeMo Guardrails + Llama Guard + Lakera Guard + Robust Intelligence + Aporia + CalypsoAI + Protect AI break at multi-dimensional threshold-routing scale?

Each guardrail vendor ships a per-tenant per-rule flat-threshold primitive — one cutoff, binary pass / fail, applied uniformly across all outputs. None composes multiple dimensions under explicit aggregation semantics (AND + OR + NAND + NOR + XOR + K-of-N + weighted-sum + min + max + percentile + Bayesian posterior + Dempster-Shafer evidence combination + fuzzy-logic + Borda-Condorcet rank aggregation + multi-LLM-as-judge). None applies calibration discipline per dimension (Platt scaling + isotonic regression + temperature scaling + quantile mapping + conformal prediction). None applies multiple-comparisons correction when N dimensions are composed — naively combining N raw per-dimension thresholds inflates the per-output false-positive rate. None differentiates per-banner, per-location, per-vertical, per-jurisdiction, per-language, per-audience, per-channel, per-time-of-day, per-day-of-week, per-seasonality thresholds. None enforces the regulatory anchor stack (EU AI Act Article 14 + 22 + 26 + 50 + Annex III + FTC Endorsement Guides + Fake Review Rule + per-vertical overlays + ECOA + GDPR Article 22 + CCPA opt-out + SOX 302 / 404 / 906 + Reg G non-GAAP) before the composed routing decision commits. None writes a WORM record of every dimension snapshot + aggregation choice + threshold derivation. The four-skill bundle Score + Compose + Gate + Audit sits above the per-vendor guardrail surface — it does not replace it. Score scores each dimension. Compose applies aggregation + calibration + multiple-comparisons correction. Gate enforces the per-anchor stack. Audit writes the per-output WORM record and hands off to the five-destination routing sibling skill for the terminal decision.
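The false-positive inflation mentioned above can be worked through numerically. The Python sketch below assumes the N per-dimension checks are statistically independent and that a single flagged dimension is enough to hold the output (an OR-style composition). Both are simplifying assumptions for illustration; real dimensions are usually correlated.

```python
def family_wise_false_positive_rate(alpha: float, n: int) -> float:
    """Chance that at least one of n independent checks fires on a clean output."""
    return 1.0 - (1.0 - alpha) ** n


def bonferroni_alpha(target: float, n: int) -> float:
    """Per-dimension level that keeps the family-wise rate at or below target."""
    return target / n


def sidak_alpha(target: float, n: int) -> float:
    """Per-dimension level that hits the target exactly under independence."""
    return 1.0 - (1.0 - target) ** (1.0 / n)


if __name__ == "__main__":
    # 40 dimensions, each with a 5% false-positive rate: roughly 87% of
    # clean outputs would be flagged by at least one dimension.
    print(round(family_wise_false_positive_rate(0.05, 40), 3))
    # Per-dimension levels that bring the family-wise rate back to 5%.
    print(bonferroni_alpha(0.05, 40), sidak_alpha(0.05, 40))
```

Bonferroni divides the target by N and is slightly conservative; Šidák solves the independence formula exactly, so its per-dimension level is marginally larger.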

What does Score do — N-dimension per-output scoring grounded in guardrail + safety + evaluation vendors?

Score runs the per-dimension scoring across the standing 40+ dimensions: brand-voice score (grounded in the runtime brand-voice gate sibling skill); sentiment + emotion + sarcasm + aggression / empathy; factual confidence + claim substantiation (grounded in the claims-allowlist sibling skill + named-entity disambiguation + numeric-claim validation + LLM self-consistency + multi-LLM ensemble agreement across OpenAI GPT-4o + Anthropic Claude Opus + Claude Sonnet + Claude Haiku + Google Gemini Pro 2 + Mistral Large 2 + Cohere Command R+ + Meta Llama 3 70B + Qwen 2 + DeepSeek V3 under per-vendor zero-retention); reach magnitude + impression projection + audience size + per-platform amplification + virality coefficient; compliance confidence (per-anchor pre-evaluation across the regulatory stack); empathy / de-escalation tone; PII / PHI leak risk detection; hallucination detection; LLM self-disclosure detection; prompt-injection detection; jailbreak detection; toxicity + profanity + discrimination + harassment + violence + self-harm + sexual / graphic content + hate symbol + legal threat + PR crisis trigger. Vendor surface includes Guardrails AI + NeMo Guardrails + Llama Guard + Lakera Guard + Robust Intelligence + Aporia + CalypsoAI + Protect AI + Garak + OpenAI Moderation + Perspective API + Hive + Galileo + Patronus AI + Arthur Shield + Fiddler AI; evaluation backed by DeepEval + Ragas + TruLens + Phoenix + UpTrain + Inspect AI + Promptfoo + Confident AI; observability via LangSmith + Weights & Biases + Arize + WhyLabs + Helicone + Langfuse + PromptLayer. Per-dimension confidence tier + explainability trace written into Audit.
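Multi-LLM ensemble agreement, one of the grounding signals listed above, can be reduced to a majority label plus the share of models that voted for it. The Python sketch below is a simplified illustration: the model keys and verdict labels are placeholders, no vendor API is called, and a production version would presumably weight models or handle abstentions.

```python
from collections import Counter
from typing import Mapping, Tuple


def ensemble_agreement(verdicts: Mapping[str, str]) -> Tuple[str, float]:
    """Return the majority verdict and the fraction of models that gave it.

    `verdicts` maps a model identifier to its verdict on one claim,
    for example "supported" or "unsupported".
    """
    if not verdicts:
        raise ValueError("at least one model verdict is required")
    counts = Counter(verdicts.values())
    label, votes = counts.most_common(1)[0]
    return label, votes / len(verdicts)
```

With three placeholder models voting supported, supported, unsupported, the result is `("supported", 2/3)`, and the agreement fraction can then be carried as the factual-confidence dimension's score.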

What does Compose do — aggregation semantics + per-dimension calibration + multiple-comparisons correction + threshold composition + canary rollout?

Compose applies four coordinated subsystems. Threshold definition: per-dimension threshold value + direction (greater-than, greater-than-or-equal, less-than, less-than-or-equal, within-range, outside-range) + threshold confidence interval + per-derivation differentiation (static, per-banner, per-location, per-location-cluster, per-vertical, per-jurisdiction, per-language, per-audience, per-channel, per-time-of-day, per-day-of-week, per-seasonality, per-event-context) + threshold version + PR-style review + multi-stakeholder approval (brand-lead + legal + compliance + franchise-owner + DMO-director + CMO + fractional-CMO + general-counsel + outside-counsel). Per-dimension calibration: Platt scaling + isotonic regression + temperature scaling + quantile mapping + conformal prediction (Vovk-Shafer-Gammerman framework + Mondrian conformal + adaptive conformal); expected calibration error (ECE) + reliability diagram + Brier score per dimension. Aggregation semantics: AND / OR / NAND / NOR / XOR / K-of-N (2-of-3, 3-of-5, N-of-M) / weighted-sum / min / max / percentile / Bayesian posterior aggregation / Dempster-Shafer evidence combination / fuzzy logic / Borda-Condorcet rank aggregation / multi-LLM-as-judge aggregation. Composed routing decision (per-output composed routing decision + priority + precedence — most-restrictive-dimension-wins, most-permissive-dimension-wins, weighted-priority, multi-stakeholder tiebreak) with escalation timer + fallback action. Multiple-comparisons correction (Bonferroni + Holm-Bonferroni + Benjamini-Hochberg FDR + Benjamini-Yekutieli + Tukey HSD + Šidák) applied when N dimensions × threshold composition would otherwise inflate the per-output false-positive rate. Canary rollout (1% + 5% + 10% + 25% + 50% + 100%) + shadow mode + rollback. 
The Compose decision hands off to the five-destination routing sibling skill for the terminal destination (auto-publish + batch-review + send-to-FBC + escalate-to-team-lead + reject-with-feedback) and to the FBC override-learning sibling skill that feeds threshold recalibration.
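Two pieces of the Compose description lend themselves to a short Python sketch: a threshold definition carrying one of the six directions, and a deterministic canary assignment for staged rollout. The `Threshold` fields, the short direction codes, and the hash-bucket approach to canary selection are all assumptions made for this example; the source names the concepts but not their implementation.

```python
import hashlib
import operator
from dataclasses import dataclass
from typing import Optional

_COMPARATORS = {"gt": operator.gt, "ge": operator.ge,
                "lt": operator.lt, "le": operator.le}
_RANGE_DIRECTIONS = ("within", "outside")


@dataclass(frozen=True)
class Threshold:
    """Hypothetical threshold definition for a single dimension."""
    dimension: str
    direction: str                  # gt, ge, lt, le, within, outside
    value: float                    # cutoff, or lower bound for range directions
    upper: Optional[float] = None   # upper bound, range directions only
    version: str = "v1"

    def passes(self, score: float) -> bool:
        if self.direction in _COMPARATORS:
            return _COMPARATORS[self.direction](score, self.value)
        if self.direction not in _RANGE_DIRECTIONS:
            raise ValueError(f"unknown direction: {self.direction}")
        if self.upper is None:
            raise ValueError("range directions need an upper bound")
        inside = self.value <= score <= self.upper
        return inside if self.direction == "within" else not inside


def in_canary(output_id: str, percent: int) -> bool:
    """Stable assignment: the same output ID always lands in the same bucket."""
    digest = hashlib.sha256(output_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100
    return bucket < percent
```

Because the bucket depends only on the ID, an output that is in the 5% stage stays in scope at 10%, 25%, and beyond, which keeps the staged comparison consistent as the rollout widens.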

What does Gate do — EU AI Act Article 14 + Annex III + Article 22 + 50 + FTC + per-vertical + ECOA + GDPR Article 22 + SOX + Reg G anchors?

Gate evaluates five anchors before the composed routing decision commits. Anchor 1 (the most operationally distinctive, because naive composition of per-dimension thresholds produces miscalibrated decisions): calibration discipline (Platt scaling + isotonic regression + temperature scaling + quantile mapping + conformal prediction Vovk-Shafer-Gammerman + Mondrian conformal + adaptive conformal); per-dimension expected calibration error (ECE) + reliability diagram + Brier score; multiple-comparisons correction (Bonferroni + Holm-Bonferroni + Benjamini-Hochberg FDR + Benjamini-Yekutieli + Tukey HSD + Šidák) across N-dimension composition. Anchor 2 (AI governance; pairs with the five-destination routing sibling skill): EU AI Act Article 14 human oversight (the composed routing decision is precisely the architecture that determines whether human review fires) + Article 13 transparency to deployers + Article 15 accuracy + robustness + cybersecurity + Article 22 authorised representatives of providers + Article 26 deployer obligations + Article 50 transparency for AI-generated content + Annex III high-risk classification + Article 9 risk-management system + Article 10 data governance + Article 11 technical documentation + Article 12 record-keeping; NIST AI Risk Management Framework Govern + Map + Measure + Manage; ISO 42001 + ISO 23894 + ISO 5338.
Anchor 3 (FTC + per-vertical regulatory overlays): FTC Section 5 + Pfizer 1972 substantiation + FTC Endorsement Guides 16 CFR Part 255 (2023 AI-content) + FTC Fake Review Rule 16 CFR Part 465 (October 2024) + FTC MARS + Made-in-USA Labeling Rule + Green Guides + Health Products Compliance Guide + Negative-Option Rule + Lanham Act 15 USC 1125(a); HIPAA when healthcare; FINRA Rule 2210 + SEC Reg S-K when financial services; FDA 21 CFR Part 11 + DSCSA + 21 CFR Part 101 + DSHEA + medical device when FDA-regulated; CPSC CPSIA + EPA FIFRA + DEA + CFPB UDAAP when relevant; FDD Item 19 per FTC Franchise Rule 16 CFR 436 + 15-state franchise registration when franchise. Anchor 4 (anti-discrimination + privacy + automated-decisioning): ECOA Regulation B disparate-impact + Fair Housing Act + GLBA Safeguards Rule + COPPA 15 USC 6501; CCPA + CPRA right to opt out of automated decision-making + 17-state comprehensive privacy + GDPR Article 22 automated decisions + LGPD + DPDP + PIPEDA + CASL. Anchor 5 (financial-reporting discipline when threshold routing affects public-co or PE-sponsor disclosures): Sarbanes-Oxley Section 302 + 404 + 906 + PCAOB AS 2201 + SEC Regulation G non-GAAP + SEC C&DI Q100 / 101 / 102 + SEC Reg S-K Item 303 + AICPA non-GAAP + PCAOB AS 2410; per-vendor LLM zero-retention verified per call. Policy-as-code expression via OPA Rego + AWS Cedar + Casbin + Cerbos + Oso + Styra DAS + Permit.io.
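Anchor 1 lists conformal prediction among the calibration methods. As one possible reading of how that could feed the human-review decision, here is a minimal split-conformal sketch in Python. It assumes a held-out calibration set of nonconformity scores (1 minus the probability assigned to the true label) and treats any prediction set that is not a single label as ambiguous. The routing rule at the end is this example's assumption, not something the page states.

```python
import math
from typing import Mapping, Sequence, Set


def conformal_quantile(calibration_scores: Sequence[float], alpha: float) -> float:
    """Finite-sample quantile: the ceil((n + 1) * (1 - alpha))-th smallest score."""
    n = len(calibration_scores)
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        return float("inf")  # too few calibration points for this alpha
    return sorted(calibration_scores)[rank - 1]


def prediction_set(label_probs: Mapping[str, float], qhat: float) -> Set[str]:
    """Keep every label whose nonconformity score (1 - p) is within qhat."""
    return {label for label, p in label_probs.items() if 1 - p <= qhat}


def needs_human_review(label_probs: Mapping[str, float], qhat: float) -> bool:
    """Assumed rule: anything other than exactly one plausible label is ambiguous."""
    return len(prediction_set(label_probs, qhat)) != 1
```

Under the usual exchangeability assumption, sets built this way contain the true label with probability at least 1 - alpha, which is what makes the set size a defensible ambiguity signal.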

What does Audit do — per-output WORM record + end-to-end replay + handoff to five-destination routing + FBC override-learning?

Audit writes a per-output WORM record at every Compose decision: per-routing-decision ID + per-banner pointer + per-location pointer + per-AI-agent pointer + per-output pointer + per-dimension snapshot (the full 40+ dimension scoring vector + per-dimension confidence tier + per-dimension explainability trace) + per-threshold snapshot (value + direction + confidence interval + derivation + calibration method + version + PR-style review evidence + multi-stakeholder approval evidence) + per-aggregation snapshot (which aggregation semantic + composition trace + Borda-Condorcet rank evidence + Dempster-Shafer belief mass + multi-LLM-as-judge evidence) + per-multiple-comparisons-correction record (Bonferroni / Holm / Benjamini-Hochberg / Šidák choice + adjusted significance level + family-wise error rate target + FDR target) + composed routing decision + priority + precedence + escalation timer + fallback action + canary rollout stage + shadow mode evidence + rollback pointer + per-anchor Gate decision with evidence (EU AI Act Article 14 + Annex III + Article 22 + 50 + FTC + per-vertical + ECOA + GDPR Article 22 + SOX / Reg G) + per-vendor LLM zero-retention verification + handoff record to the five-destination routing sibling skill (auto-publish + batch-review + send-to-FBC + escalate-to-team-lead + reject-with-feedback) and to the FBC override-learning sibling skill (override reason + threshold-recalibration target + cross-output / cross-banner / cross-location correlation). Storage on AWS S3 Object Lock + Azure Blob immutable + Google Cloud Storage Bucket Lock + Wasabi compliance WORM. Retention stacks (longest applicable wins): 7-year FTC substantiation + 7-year IRS + 7-year FDD + per-state franchise registration + 7-year HIPAA medical record + 7-year SOX Section 802 + 6-year SEC + 5-year PCAOB + GDPR Article 30 + EU AI Act Article 12 + SOC 2 CC7 / CC8. End-to-end replay rewinds Score + Compose + Gate + handoff with confidence tier and explainability at every stage. 
Sibling handoffs flow into the parent multi-dimensional-threshold-routing commercial pillar, the five-destination routing sibling build-pillar, the borderline routing sibling build-pillar, the FBC override-learning sibling build-pillar, the content-approval-workflow + multi-stakeholder-approval-routing + tiered-content-filtering + auto-publish-threshold-gating + severity-routing + ai-routing-decision-audit-trail commercial siblings, the routing-audit-trail sibling build-pillar, the multi-stream severity routing sibling build-pillar, the versioned-history regulatory-defense sibling, the nested-autonomy profile-inheritance sibling, and the marketing-AI-autonomy-profile-configuration sibling.
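The page relies on object-storage WORM features for immutability. As a complementary illustration only, the Python sketch below shows how per-output records could additionally be sealed with a content digest chained to the previous record, so that end-to-end replay can detect an altered or missing entry. The hash chain, the field names, and the all-zero genesis value are assumptions of this example and are not described in the source.

```python
import hashlib
import json
from typing import Dict, List

GENESIS_DIGEST = "0" * 64  # placeholder digest for the first record in a chain


def _digest(previous_digest: str, record: Dict) -> str:
    # Canonical JSON (sorted keys, fixed separators) so equal records hash equally.
    body = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((previous_digest + body).encode("utf-8")).hexdigest()


def seal_record(record: Dict, previous_digest: str = GENESIS_DIGEST) -> Dict:
    """Wrap a routing-decision record with its chained digest."""
    return {
        "record": record,
        "previous_digest": previous_digest,
        "digest": _digest(previous_digest, record),
    }


def verify_chain(sealed: List[Dict]) -> bool:
    """Recompute every digest in order; False on any mismatch or broken link."""
    previous = GENESIS_DIGEST
    for entry in sealed:
        if entry["previous_digest"] != previous:
            return False
        if entry["digest"] != _digest(previous, entry["record"]):
            return False
        previous = entry["digest"]
    return True
```

A replay job could call `verify_chain` before rewinding Score, Compose, and Gate for a given output, and refuse to proceed if the chain does not verify.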

Engage Completions on the governance-decision-router bundle

The Score + Compose + Gate + Audit four-skill bundle ships as the orchestration layer above your existing guardrail + safety + evaluation + LLM ensemble surface. Calibration discipline + multiple-comparisons correction + EU AI Act Article 14 + Annex III + Article 22 + 50 + FTC Endorsement Guides + Fake Review Rule + per-vertical overlays + ECOA + GDPR Article 22 + NIST AI RMF anchors are preserved in every per-output audit record. Tier 1 AI Readiness Assessment scopes the bundle in two to three weeks; Tier 3 Fractional CMO with AI Swarm operates the bundle end-to-end.