Data-layer swarm · Vertical-Compliance-Overlay-Manager Agent · Llm-semantic-compliance-scoring skill · Build pillar · Published August 16, 2026
How to build a marketing-content LLM-as-judge semantic compliance scorer
This guide explains how to architect the llm-semantic-compliance-scoring skill on the compliance-overlay-manager agent end to end, at multi-vertical marketing-content LLM-as-judge scale. The unit of work is one content piece under one banner in one portfolio. For each unit the architecture resolves a per-canonical-eval-dimension-pointer, a per-canonical-multi-judge-ensemble-spec, a per-canonical-scoring-aggregation-spec, a per-canonical-routing-spec, a per-canonical-compliance-gate-spec, and a per-canonical-audit-trail, all of which roll up into a per-portfolio-audit-trail.
What you will build
- Per-portfolio per-banner per-content-piece per-canonical-eval-dimension-pointer across 45+ judgment aspects — brand-voice spec adherence + claims allowlist + forbidden phrase library + per-vertical compliance overlay (HIPAA + FINRA + FDA + DEA + cannabis + alcohol + tobacco + lottery + FCRA + ECOA + Fair Housing + GLBA) + FTC substantiation + MARS + AI-disclosure + endorsement + fake-review rule of 2024 + Made-in-USA + Green Guides + Health Products Compliance Guide + factual accuracy hallucination detection + coherence + fluency + tone match + cadence match + empathy + resolution effectiveness + CTA effectiveness + channel-format compliance + multi-language quality + WCAG plain-language readability + reading-level (Flesch-Kincaid + SMOG + Coleman-Liau + Gunning-Fog) + sentiment + politeness + bias detection (gender + race + religion + LGBTQ + disability + age + national origin) + disparate impact (ECOA + Fair Housing) + HIPAA-PHI leak + PCI scope leak + PII leak + prompt-injection trace + jailbreak trace + data-exfiltration attempt + RAG citation accuracy + RAG context faithfulness/relevance + RAG answer relevance + toxicity + profanity + spam pattern + persuasion dark pattern + manipulation + misinformation + disinformation + deepfake tampering.
- Per-canonical-multi-judge-ensemble-spec — per-18-LLM-judge-pool (GPT-4o + GPT-4-turbo + Claude Sonnet + Claude Opus + Claude Haiku + Gemini Pro + Gemini Flash + Mistral Large + Mistral Nemo + Cohere Command + Llama 3 70B + Llama 3 405B + Meta LlamaGuard + ShieldGemma + NVIDIA NeMo Guardrails + Azure AI Content Safety + AWS Bedrock Guardrails + Google Vertex AI Guardrails Aegis) + per-17-eval-pattern (pairwise comparison + single-answer grading + reference-based + reference-free + chain-of-thought + tree-of-thought + self-consistency + Constitutional AI + aspect-based + G-Eval + GEMBA-MQM + RAGAS 4-metric + DeepEval + LLM-as-jury) + per-12-prompt-engineering-technique (chain-of-thought + few-shot per-vertical golden + Constitutional + tree-of-thought + self-consistency sampling + multi-turn + step-back + generated knowledge + plan-and-solve + ReAct + Reflexion + self-refine) + per-eval-dataset (golden per-vertical + adversarial + held-out validation + time-decayed regression + LLM/prompt/tool version regression + drift detection) + per-8-bias-mitigation (position + verbosity + self-enhancement + anchoring + confirmation + cultural + length + style) + per-judge-confidence-tier.
- Per-canonical-scoring-aggregation-spec — per-aspect-weighted-sum + per-aspect-threshold-pass-fail + per-6-ensemble-aggregation (majority vote + weighted average + Borda count + Condorcet + Bayesian belief + Dempster-Shafer) + per-4-inter-judge-agreement (Cohen kappa + Krippendorff alpha + Fleiss kappa + Gwet AC1) + per-6-per-judge-calibration (Brier score + ECE + reliability diagram + Platt scaling + isotonic regression + temperature scaling) + per-5-confidence-interval (Wilson + Clopper-Pearson + Jeffreys + bootstrap percentile + Bayesian credible 95%) + per-judge-bias-correction + per-aggregation-confidence-tier + per-Shapley-LIME-counterfactual-anchor-per-aspect-trace.
- Per-canonical-routing-spec + per-canonical-compliance-gate-spec: per-routing-tier (auto-publish + rep review + manager review + compliance review + legal review + reject-redraft + emergency pause + five-destination handoff) + per-remediation-suggestion (specific rewrite + claim substitution + disclaimer addition + trademark substitution + reading-level simplification + tone shift + CTA rephrase + channel truncation + language translation) + per-multi-arm-bandit-UCB-Thompson + per-causal-uplift-CATE + per-FBC-feedback-loop + per-routing-confidence-tier + per-EU-AI-Act-Article-50-transparency-the-judge-must-disclose + per-EU-AI-Act-Article-13-14-15-high-risk-system + per-GDPR-Article-22-automated-decision-profiling + per-NIST-AI-RMF-GOVERN-MAP-MEASURE-MANAGE + per-ISO-42001-AI-management-system + per-ISO-27001-information-security + per-SOC-2-Type-II + per-NIST-Privacy-Framework + per-OECD-AI-Principles + per-G7-Hiroshima-Process-Code-of-Conduct + per-White-House-AI-Bill-of-Rights + per-Executive-Order-14110-AI + per-EEOC-AI-employment + per-FCRA-AI-credit + per-ECOA-Reg-B-disparate-impact + per-Fair-Housing-Act-disparate-impact + per-HIPAA-no-PHI-in-judgment-trace + per-PCI-no-CHD-in-judgment-trace + per-FTC-Section-5-unfair-deceptive + per-CFPB-UDAAP + per-FDA-21-CFR-Part-11-electronic-signature + per-CCPA-CPRA + per-GDPR-Article-6-7-17-22 + per-LGPD + per-DPDP + per-PIPEDA + per-Digital-Services-Act-Article-30 + per-Digital-Markets-Act + per-OPA-Cedar-Casbin-Cerbos-Oso-policy-as-code + per-compliance-confidence-tier.
- Per-canonical-cross-skill-handoff + per-canonical-audit-trail — per-handoff-to-30-sibling-skills + per-per-judgment-canonical-audit-record + per-immutable-WORM-storage + per-7-year-IRS-tax-retention + per-7-year-FTC-substantiation-retention + per-7-year-HIPAA-medical-record-retention + per-6-year-SEC-record-retention + per-3-year-FINRA-record-retention + per-FDA-21-CFR-Part-11-electronic-signature-retention.
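The per-judgment audit record and the immutable-storage requirement in the last item can be illustrated with a hash-chained log. This is a minimal sketch, not the skill's actual storage layer: the `AuditRecord` fields, the `append` and `verify` helpers, and the `RETENTION_YEARS` keys are hypothetical names, and the retention periods are simply the figures quoted in the list above. Real WORM guarantees come from the storage backend; the chain only makes tampering detectable.

```python
import hashlib
import json
from dataclasses import asdict, dataclass

# Retention periods (years) as quoted in the list above; the keys are hypothetical labels.
RETENTION_YEARS = {"irs_tax": 7, "ftc_substantiation": 7, "hipaa": 7, "sec": 6, "finra": 3}


@dataclass(frozen=True)
class AuditRecord:
    judgment_id: str
    content_piece: str
    routing_tier: str
    retention_class: str
    prev_hash: str  # digest of the previous record, "" for the first record

    def digest(self) -> str:
        # Canonical JSON (sorted keys) so the same record always hashes the same way.
        payload = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()


def append(chain, judgment_id, content_piece, routing_tier, retention_class):
    """Append a record that commits to the digest of the record before it."""
    if retention_class not in RETENTION_YEARS:
        raise ValueError(f"unknown retention class: {retention_class}")
    prev_hash = chain[-1].digest() if chain else ""
    record = AuditRecord(judgment_id, content_piece, routing_tier, retention_class, prev_hash)
    chain.append(record)
    return record


def verify(chain) -> bool:
    """Return True if every record still points at the digest of its predecessor."""
    expected_prev = ""
    for record in chain:
        if record.prev_hash != expected_prev:
            return False
        expected_prev = record.digest()
    return True
```

Editing any record other than the last one changes its digest and breaks the link held by its successor. Detecting a change to the final record additionally requires storing the head digest somewhere outside the chain.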
Why per-vendor-Braintrust-account-flat-eval-run breaks at multi-vertical marketing-content LLM-as-judge scale
Per-vendor-Braintrust-canonical-account-flat-eval-run ships a per-account per-flat-eval-run primitive: typically a prompt engineer wires Braintrust to their LLM application, runs a single GPT-4 judge against a fixed rubric, sees a pass/fail score in the dashboard, and ships. That primitive has:
- No per-canonical-eval-dimension taxonomy across the 45+ judgment aspects.
- No per-canonical-multi-judge-ensemble resolving Constitutional AI principles + G-Eval chain-of-thought + GEMBA-MQM + RAGAS (faithfulness + answer relevance + context precision + context recall) + DeepEval metrics + LLM-as-jury (majority vote + weighted average + Borda count + Condorcet) + Bayesian belief aggregation + inter-judge agreement (Cohen kappa + Krippendorff alpha + Fleiss kappa) + per-judge calibration (Brier + ECE + reliability diagram) + confidence intervals (Wilson + Clopper-Pearson + Jeffreys) + position-bias, verbosity-bias, and self-enhancement-bias correction.
- No per-content scoring aggregation with per-aspect-weighted-sum + per-aspect-threshold-pass-fail + multi-judge-ensemble Borda/Condorcet.
- No per-content routing with auto-publish tier + human-review tier + reject tier + remediation suggestion with specific rewrite.
- No per-judgment compliance gate with EU AI Act Article 50 (the judge itself must disclose) / NIST AI RMF GOVERN-MAP-MEASURE-MANAGE / ISO 42001 / ISO 27001 / SOC 2 Type II enforcement.
- No per-judgment audit trail with regulatory-defense retention.
LangSmith, W&B Weave, Humanloop, Promptfoo, Helicone, Galileo, Arize Phoenix, Patronus AI, Confident AI/DeepEval, MLflow LLM, Comet ML LLM, Logfire, OpenLLMetry, Langfuse, TruEra, WhyLabs, Fiddler AI, Robust Intelligence, Latitude, Vellum, PromptLayer, OpenAI Evals, Anthropic Eval, Google Vertex AI Eval, Azure AI Foundry Evaluations, AWS Bedrock Evaluations, and NVIDIA NeMo Evaluator ship the same kind of per-vendor native account-flat-eval-run primitive.
At 1-account-1-flat-eval-run scale, the per-account per-flat-eval-run primitive is enough. At multi-vertical marketing-content LLM-as-judge scale, the operator needs per-canonical-eval-dimension-pointer + per-canonical-multi-judge-ensemble-spec + per-canonical-scoring-aggregation-spec + per-canonical-routing-spec + per-canonical-compliance-gate-spec + per-canonical-audit-trail.
The judge-must-disclose-itself anchor under EU AI Act Article 50 is the operationally distinctive constraint: any judge-issued score that propagates downstream into customer-facing content makes the judge an Article-50 transparency-required system whose disclosure trace must be retained. Per-vendor account-flat-eval-run primitives produce a score without Article-50 disclosure framing. That makes them unfit for multi-vertical operators, where regulated-vertical content (healthcare + financial + cannabis + alcohol + tobacco + franchise FDD) triggers Article-50 enforcement exposure across every judgment the system issues.
The operator-side architecture layered above the per-vendor flat-eval-run primitive is per-canonical-eval-dimension-pointer + per-multi-judge-ensemble-spec + per-scoring-aggregation-spec + per-routing-spec + per-compliance-gate-spec + per-cross-skill-handoff + per-audit-trail + per-portfolio-audit-trail.
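Per-judge calibration is one of the pieces this section says a flat eval run leaves out. As a rough illustration, the sketch below computes two of the calibration metrics the guide names, the Brier score and expected calibration error (ECE), for a single judge. It assumes the judge emits a probability that a content piece passes and that a human label (1 = pass, 0 = fail) is available; the function names and the equal-width binning are illustrative choices, not something the guide prescribes.

```python
def brier_score(probs, outcomes):
    """Mean squared gap between predicted pass-probability and the 0/1 outcome."""
    if len(probs) != len(outcomes) or not probs:
        raise ValueError("probs and outcomes must be non-empty and the same length")
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def expected_calibration_error(probs, outcomes, n_bins=10):
    """Weighted average of |mean predicted probability - observed pass rate| per bin."""
    if len(probs) != len(outcomes) or not probs:
        raise ValueError("probs and outcomes must be non-empty and the same length")
    bins = [[] for _ in range(n_bins)]
    for p, o in zip(probs, outcomes):
        index = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[index].append((p, o))
    total = len(probs)
    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue
        mean_prob = sum(p for p, _ in bucket) / len(bucket)
        pass_rate = sum(o for _, o in bucket) / len(bucket)
        ece += (len(bucket) / total) * abs(mean_prob - pass_rate)
    return ece
```

A judge whose 0.9-confidence verdicts hold up only half the time shows a large gap in the top bin, which is the signal a recalibration step such as Platt or temperature scaling would then act on.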
What is in market today
Per-platform per-LLM-eval-platform-vendor
Braintrust, LangSmith, Weights & Biases Weave, Humanloop, Promptfoo, Helicone, Galileo, Arize Phoenix, Patronus AI, Confident AI/DeepEval, MLflow LLM, Comet ML LLM, Logfire, OpenLLMetry, Langfuse, TruEra, WhyLabs, Fiddler AI, Robust Intelligence, Latitude, Vellum, PromptLayer. Per-account per-flat-eval-run primitive. Per-canonical-eval-dimension-pointer-canonical-multi-judge-ensemble-canonical-scoring-aggregation-canonical-routing-canonical-compliance-gate-canonical-audit-trail is not the primitive.
Per-platform per-foundation-model-eval-vendor
OpenAI Evals, Anthropic Eval, Google Vertex AI Eval, Azure AI Foundry Evaluations, AWS Bedrock Evaluations, NVIDIA NeMo Evaluator, Mistral Le Chat Eval, Cohere Eval. Per-account per-flat-foundation-model-eval primitive (typically blind to per-content 18-LLM-judge-pool + 17-eval-pattern + 12-prompt-engineering-technique + 8-bias-mitigation + 6-aggregation-method + 4-inter-judge-agreement + 6-judge-calibration + 5-confidence-interval semantics). Per-canonical-per-content-judge-pool-canonical-per-content-eval-pattern-canonical-per-content-prompt-engineering-canonical-per-content-eval-dataset-canonical-per-content-bias-mitigation-canonical-per-content-judge-confidence-tier-canonical-per-content-judge-explainability is not the primitive.
Per-platform per-eval-framework-research
G-Eval (GPT-4 chain-of-thought), GEMBA-MQM (machine translation quality), RAGAS (RAG: faithfulness + answer relevance + context precision + context recall), DeepEval, MT-Bench, AlpacaEval, Chatbot Arena, LMSYS Arena. Per-account per-flat-research-benchmark primitive (typically blind to per-content per-aspect-weighted-sum + per-aspect-threshold-pass-fail + 6-ensemble-aggregation-method + Borda-count + Condorcet + Bayesian-belief + Dempster-Shafer + Shapley-LIME-counterfactual-anchor per-aspect-trace semantics). Per-canonical-per-content-per-aspect-weighted-sum-canonical-per-content-per-aspect-threshold-pass-fail-canonical-per-content-multi-judge-ensemble-aggregation-canonical-per-content-inter-judge-agreement-canonical-per-content-per-judge-calibration-canonical-per-content-confidence-interval-canonical-per-content-per-judge-bias-correction-canonical-per-content-aggregation-confidence-tier-canonical-per-content-aggregation-explainability is not the primitive.
Per-platform per-AI-GRC-vendor + per-CMP-vendor
Credo AI, Holistic AI, Trustible, Saidot, Anekanta AI, Mostly AI, IBM watsonx.governance, ServiceNow AI Control Tower, Sapien, Surge AI, Snorkel AI, OneTrust, TrustArc, Ketch, Securiti, Privacera, Skyflow, BigID, DataGrail, Transcend. Per-account per-flat-AI-governance-report or per-flat-consent primitive (typically blind to per-judgment EU AI Act Article 50 the-judge-must-disclose + Article 13/14/15 high-risk + GDPR Article 22 automated decision profiling + EEOC AI employment + FCRA AI credit + ECOA disparate impact + Fair Housing disparate impact + HIPAA no-PHI in judgment trace + PCI no-CHD in judgment trace + FDA 21 CFR Part 11 electronic-signature + Executive Order 14110 + G7 Hiroshima + White House AI Bill of Rights + OECD AI Principles + NIST AI RMF/Privacy Framework semantics). Per-canonical-per-judgment-EU-AI-Act-Article-50-canonical-per-judgment-EU-AI-Act-Article-13-14-15-canonical-per-judgment-GDPR-Article-22-canonical-per-judgment-NIST-AI-RMF-canonical-per-judgment-ISO-42001-canonical-per-judgment-ISO-27001-canonical-per-judgment-SOC-2-Type-II-canonical-per-judgment-OECD-AI-Principles-canonical-per-judgment-G7-Hiroshima-canonical-per-judgment-White-House-AI-Bill-of-Rights-canonical-per-judgment-Executive-Order-14110-canonical-per-judgment-EEOC-AI-employment-canonical-per-judgment-FCRA-AI-credit-canonical-per-judgment-ECOA-Reg-B-disparate-impact-canonical-per-judgment-Fair-Housing-Act-disparate-impact-canonical-per-judgment-HIPAA-no-PHI-canonical-per-judgment-PCI-no-CHD-canonical-per-judgment-FDA-21-CFR-Part-11 is not the primitive.
How the architecture is built
- Per-portfolio per-banner per-content-piece per-canonical-eval-dimension-pointer-substrate. Per-45-canonical-eval-dimension canonical-source.
- Per-portfolio per-canonical-multi-judge-ensemble-spec. Per-18-LLM-judge-pool + per-17-eval-pattern + per-12-prompt-engineering-technique + per-eval-dataset + per-8-bias-mitigation + per-judge-confidence-tier canonical-multi-judge.
- Per-portfolio per-canonical-scoring-aggregation-spec. Per-aspect-weighted-sum + per-aspect-threshold-pass-fail + per-6-ensemble-aggregation + per-4-inter-judge-agreement + per-6-per-judge-calibration + per-5-confidence-interval + per-judge-bias-correction + per-aggregation-confidence-tier + per-Shapley-LIME-counterfactual-anchor-per-aspect-trace canonical-scoring-aggregation.
- Per-portfolio per-canonical-routing-spec. Per-routing-tier + per-remediation-suggestion + per-multi-arm-bandit-UCB-Thompson + per-causal-uplift-CATE + per-FBC-feedback-loop + per-routing-confidence-tier canonical-routing.
- Per-portfolio per-canonical-compliance-gate-spec. Per-EU-AI-Act-Article-50 + per-EU-AI-Act-Article-13-14-15 + per-GDPR-Article-22 + per-NIST-AI-RMF + per-ISO-42001 + per-ISO-27001 + per-SOC-2-Type-II + per-NIST-Privacy-Framework + per-OECD-AI-Principles + per-G7-Hiroshima-Code-of-Conduct + per-White-House-AI-Bill-of-Rights + per-Executive-Order-14110 + per-EEOC-AI-employment + per-FCRA-AI-credit + per-ECOA-Reg-B-disparate-impact + per-Fair-Housing-Act-disparate-impact + per-HIPAA-no-PHI-in-judgment-trace + per-PCI-no-CHD-in-judgment-trace + per-FTC-Section-5 + per-CFPB-UDAAP + per-FDA-21-CFR-Part-11-electronic-signature + per-CCPA-CPRA + per-GDPR-Article-6-7-17-22 + per-LGPD + per-DPDP + per-PIPEDA + per-Digital-Services-Act-Article-30 + per-Digital-Markets-Act + per-OPA-Cedar-Casbin-Cerbos-Oso-policy-as-code canonical-compliance.
- Per-portfolio per-canonical-cross-skill-handoff. Per-handoff-to-30-sibling-skills canonical-handoff.
- Per-portfolio per-canonical-audit-trail + per-portfolio-audit-trail. Per-per-judgment-canonical-audit-record + per-immutable-WORM-storage + per-7-year-IRS-tax-retention + per-7-year-FTC-substantiation-retention + per-7-year-HIPAA-medical-record-retention + per-6-year-SEC-record-retention + per-3-year-FINRA-record-retention + per-FDA-21-CFR-Part-11-electronic-signature-retention canonical-audit.
- Per-portfolio per-compliance-overlay-manager-agent-canonical-bundle. Per-llm-semantic-compliance-scoring + per-pre-filter-deterministic-gates + per-per-vertical-overlay-authoring + per-per-jurisdiction-overlay-authoring + per-regulatory-change-monitoring + per-overlay-version-control + per-overlay-conflict-resolution-with-brand-spec + per-rule-extraction-from-source-docs canonical-bundle.
- Per-portfolio per-canonical-end-to-end-SLA. Per-eval-dimension-resolve-to-multi-judge-ensemble-to-scoring-aggregation-to-inter-judge-agreement-to-bias-correction-to-routing-tier-to-remediation-suggestion-to-compliance-gate-to-Article-50-disclosure-to-audit-trail-SLA canonical-end-to-end-SLA.
- Per-portfolio per-canonical-end-to-end-replay. Per-judge-pool-rewind + per-eval-pattern-rewind + per-scoring-aggregation-rewind + per-routing-rewind + per-compliance-gate-rewind + per-replay-confidence-tier + per-replay-explainability canonical-replay.
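The per-aspect-weighted-sum and per-aspect-threshold-pass-fail pieces of the scoring-aggregation item above can be sketched as follows. This is an illustration under stated assumptions: scores are normalised to [0, 1], and `AspectRule`, its `hard_gate` flag, and `score_content` are hypothetical names introduced here, not part of the skill's published interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AspectRule:
    weight: float            # relative importance of the aspect in the weighted sum
    threshold: float         # minimum acceptable score in [0, 1]
    hard_gate: bool = False  # if True, failing this aspect blocks regardless of the sum


def score_content(scores: dict[str, float], rules: dict[str, AspectRule]) -> dict:
    """Combine per-aspect scores into a weighted score plus a pass/fail list."""
    missing = sorted(set(rules) - set(scores))
    if missing:
        raise ValueError(f"no score for aspects: {missing}")
    total_weight = sum(rule.weight for rule in rules.values())
    if total_weight <= 0:
        raise ValueError("aspect weights must sum to a positive number")
    weighted = sum(rules[a].weight * scores[a] for a in rules) / total_weight
    failed = sorted(a for a in rules if scores[a] < rules[a].threshold)
    return {
        "weighted_score": weighted,
        "failed_aspects": failed,
        "hard_fail": any(rules[a].hard_gate for a in failed),
    }
```

Keeping the threshold check separate from the weighted sum means a strong tone-match score cannot average away a failed claims check, which is the reason for carrying both numbers into routing.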
Frequently asked questions
What is LLM-as-judge — and what is a marketing-content semantic compliance scorer at multi-vertical scale?
LLM-as-judge is the methodology where one or more LLMs evaluate the outputs of another LLM against a rubric — pairwise comparison (A vs B), single-answer grading, reference-based scoring (against gold standard), reference-free scoring, chain-of-thought evaluation, tree-of-thought, self-consistency, Constitutional AI principles, aspect-based scoring, G-Eval (GPT-4 chain-of-thought), GEMBA-MQM (machine translation quality), RAGAS (RAG evaluation: faithfulness + answer relevance + context precision + context recall), DeepEval metrics, LLM-as-jury (multiple judges + ensemble), with position-bias / verbosity-bias / self-enhancement-bias mitigation. A multi-vertical marketing-content semantic compliance scorer runs per-portfolio per-banner per-content-piece per-canonical-eval-dimension-pointer (per-brand-voice-spec-adherence + per-claims-allowlist + per-forbidden-phrase-library + per-per-vertical-compliance-overlay + per-FTC-substantiation + per-FTC-MARS + per-FTC-AI-disclosure + per-FTC-endorsement-guides + per-FTC-fake-review-rule-of-2024 + per-FTC-Made-in-USA + per-FTC-Green-Guides + per-FTC-Health-Products-Compliance-Guide + per-factual-accuracy-hallucination-detection + per-coherence + per-fluency + per-tone-match + per-cadence-match + per-empathy + per-resolution-effectiveness + per-CTA-effectiveness + per-channel-format-compliance + per-multi-language-quality + per-WCAG-plain-language-readability + per-reading-level-Flesch-Kincaid-SMOG-Coleman-Liau-Gunning-Fog + per-sentiment + per-politeness + per-bias-detection-gender-race-religion-LGBTQ-disability-age-national-origin + per-disparate-impact-ECOA-Fair-Housing + per-HIPAA-PHI-leak + per-PCI-scope-leak + per-PII-leak + per-prompt-injection-trace + per-jailbreak-trace + per-data-exfiltration-attempt + per-RAG-citation-accuracy + per-RAG-context-faithfulness + per-RAG-context-relevance + per-RAG-answer-relevance + per-toxicity + per-profanity + per-spam-pattern + per-persuasion-dark-pattern + per-manipulation + 
per-misinformation + per-disinformation + per-deepfake-tampering + per-canonical-eval-dimension) + per-canonical-multi-judge-ensemble-spec + per-canonical-scoring-aggregation-spec + per-canonical-routing-spec + per-canonical-compliance-gate-spec + per-canonical-audit-trail + per-portfolio-audit-trail.
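Pairwise comparison and position-bias mitigation, both named in the answer above, are often combined by asking the judge twice with the candidates swapped. The sketch below shows that idea only. The `Judge` callable is a hypothetical stand-in for whatever model call the ensemble uses, and treating an order-dependent verdict as a tie is one possible policy, not the only one.

```python
from typing import Callable

# Hypothetical judge signature: (first_text, second_text) -> "first" | "second" | "tie"
Judge = Callable[[str, str], str]


def debiased_pairwise(judge: Judge, a: str, b: str) -> str:
    """Compare A and B in both orders; return "A", "B", or "tie"."""
    forward = judge(a, b)  # A shown in the first position
    reverse = judge(b, a)  # B shown in the first position
    forward_winner = {"first": "A", "second": "B", "tie": "tie"}[forward]
    reverse_winner = {"first": "B", "second": "A", "tie": "tie"}[reverse]
    if forward_winner == reverse_winner:
        return forward_winner
    # The verdict flipped when the order flipped: treat it as position bias.
    return "tie"
```

A judge that always prefers whichever draft it sees first yields "tie" here, while a judge with a stable preference returns the same winner in both orders.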
How does per-content multi-judge-ensemble + per-content scoring-aggregation work?
Per-portfolio per-banner per-content-piece per-canonical-multi-judge-ensemble-spec runs per-portfolio per-canonical-per-content-judge-pool (per-GPT-4o + per-GPT-4-turbo + per-Claude-Sonnet + per-Claude-Opus + per-Claude-Haiku + per-Gemini-Pro + per-Gemini-Flash + per-Mistral-Large + per-Mistral-Nemo + per-Cohere-Command + per-Llama-3-70B + per-Llama-3-405B + per-Meta-LlamaGuard + per-ShieldGemma + per-NVIDIA-NeMo-Guardrails + per-Azure-AI-Content-Safety + per-AWS-Bedrock-Guardrails + per-Google-Vertex-AI-Guardrails-Aegis) + per-canonical-per-content-eval-pattern (per-pairwise-comparison-A-vs-B + per-single-answer-grading-rubric + per-reference-based-against-gold + per-reference-free + per-chain-of-thought + per-tree-of-thought + per-self-consistency + per-Constitutional-AI + per-aspect-based + per-G-Eval-GPT-4-chain-of-thought + per-GEMBA-MQM + per-RAGAS-faithfulness + per-RAGAS-answer-relevance + per-RAGAS-context-precision + per-RAGAS-context-recall + per-DeepEval-metric + per-LLM-as-jury) + per-canonical-per-content-prompt-engineering (per-chain-of-thought + per-few-shot-per-vertical-golden-example + per-Constitutional-principle + per-tree-of-thought + per-self-consistency-sampling + per-multi-turn-evaluation + per-step-back-prompting + per-generated-knowledge + per-plan-and-solve + per-ReAct-reason-act + per-Reflexion + per-self-refine) + per-canonical-per-content-eval-dataset (per-golden-example-per-vertical + per-adversarial-test-set-prompt-injection-jailbreak-edge-case + per-held-out-validation + per-time-decayed-regression-test + per-LLM-version-regression + per-prompt-template-regression + per-tool-version-regression + per-drift-detection) + per-canonical-per-content-bias-mitigation (per-position-bias + per-verbosity-bias + per-self-enhancement-bias + per-anchoring-bias + per-confirmation-bias + per-cultural-bias + per-length-bias-correction + per-style-bias-correction) + per-canonical-per-content-judge-confidence-tier + 
per-canonical-per-content-judge-explainability. Per-canonical-scoring-aggregation-spec runs per-portfolio per-canonical-per-content-per-aspect-weighted-sum + per-canonical-per-content-per-aspect-threshold-pass-fail + per-canonical-per-content-multi-judge-ensemble-aggregation (per-majority-vote + per-weighted-average + per-Borda-count + per-Condorcet + per-Bayesian-belief-aggregation + per-Dempster-Shafer-evidence-combination) + per-canonical-per-content-inter-judge-agreement (per-Cohen-kappa + per-Krippendorff-alpha + per-Fleiss-kappa + per-Gwet-AC1) + per-canonical-per-content-per-judge-calibration (per-Brier-score + per-ECE-expected-calibration-error + per-reliability-diagram + per-Platt-scaling + per-isotonic-regression + per-temperature-scaling) + per-canonical-per-content-confidence-interval (per-Wilson + per-Clopper-Pearson + per-Jeffreys + per-bootstrap-percentile + per-Bayesian-credible-interval-95-percent) + per-canonical-per-content-per-judge-bias-correction + per-canonical-per-content-aggregation-confidence-tier + per-canonical-per-content-aggregation-explainability (per-Shapley-attribution + per-LIME + per-counterfactual + per-anchor-explanation + per-per-aspect-trace-with-judge-rationale).
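Two of the methods listed in the answer above, Borda count for ensemble aggregation and Cohen's kappa for agreement between two judges, are small enough to show directly. This is a minimal sketch using the textbook definitions; the function names and the example variant labels are illustrative, and kappa as written covers exactly two judges (Fleiss kappa or Krippendorff alpha would be needed for more).

```python
from collections import Counter


def borda_count(rankings: list[list[str]]) -> dict[str, int]:
    """Each judge ranks candidates best-first; position i of n earns n - 1 - i points."""
    points: Counter = Counter()
    for ranking in rankings:
        n = len(ranking)
        for position, candidate in enumerate(ranking):
            points[candidate] += n - 1 - position
    return dict(points)


def cohen_kappa(labels_a: list[str], labels_b: list[str]) -> float:
    """Chance-corrected agreement between two judges labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and the same length")
    n = len(labels_a)
    observed = sum(x == y for x, y in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    # Agreement expected by chance, from each judge's own label frequencies.
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both judges used one identical label throughout
    return (observed - expected) / (1.0 - expected)
```

For example, three judges ranking content variants `v1`, `v2`, `v3` as `[v1, v2, v3]`, `[v2, v1, v3]`, and `[v1, v3, v2]` give `v1` the highest Borda total. A low kappa between two judges on the same batch is a reason to lower the aggregation confidence tier rather than trust the averaged score.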
What does per-content routing + per-judgment compliance-gate do?
Per-portfolio per-banner per-content-piece per-canonical-routing-spec runs per-portfolio per-canonical-per-content-routing-tier (per-auto-publish-tier-when-all-aspects-pass-and-confidence-high + per-rep-review-tier-when-borderline-confidence + per-manager-review-tier-when-claims-flag + per-compliance-review-tier-when-vertical-overlay-flag + per-legal-review-tier-when-FINRA-FDA-DEA-FTC-violation + per-reject-and-redraft-tier-when-policy-violation + per-emergency-pause-tier-when-crisis-pattern + per-five-destination-routing-handoff) + per-canonical-per-content-remediation-suggestion (per-specific-rewrite + per-claim-substitution + per-disclaimer-addition + per-trademark-substitution + per-reading-level-simplification + per-tone-shift + per-CTA-rephrase + per-channel-truncation + per-language-translation) + per-canonical-per-content-multi-arm-bandit (per-UCB + per-Thompson-Sampling + per-Epsilon-Greedy + per-LinUCB + per-Contextual-bandit + per-causal-uplift-CATE-T-S-X-DR-learner + per-CausalML + per-DoubleML + per-EconML) + per-canonical-per-content-FBC-feedback-loop (per-realized-vs-predicted-publish-rate + per-realized-vs-predicted-human-overrule-rate + per-realized-vs-predicted-violation-rate + per-realized-vs-predicted-CTR + per-realized-vs-predicted-conversion + per-pattern-learning + per-multi-arm-bandit-regret + per-recalibration) + per-canonical-per-content-routing-confidence-tier + per-canonical-per-content-routing-explainability. 
Per-canonical-compliance-gate-spec runs per-portfolio per-canonical-per-judgment-EU-AI-Act-Article-50-transparency (the judge itself must disclose its AI nature) + per-canonical-per-judgment-EU-AI-Act-Article-13-14-15-high-risk-system (LLM-as-judge in regulated verticals = high-risk under EU AI Act) + per-canonical-per-judgment-GDPR-Article-22-automated-decision-profiling (human review tier required when a solely automated judgment produces legal or similarly significant effects on natural persons) + per-canonical-per-judgment-NIST-AI-RMF-GOVERN-MAP-MEASURE-MANAGE + per-canonical-per-judgment-ISO-42001-AI-management-system + per-canonical-per-judgment-ISO-27001-information-security + per-canonical-per-judgment-SOC-2-Type-II + per-canonical-per-judgment-NIST-Privacy-Framework + per-canonical-per-judgment-OECD-AI-Principles + per-canonical-per-judgment-G7-Hiroshima-Process-Code-of-Conduct + per-canonical-per-judgment-White-House-AI-Bill-of-Rights + per-canonical-per-judgment-Executive-Order-14110-AI (rescinded January 2025) + per-canonical-per-judgment-EEOC-AI-employment + per-canonical-per-judgment-FCRA-AI-credit (when content adjacent to credit decisioning) + per-canonical-per-judgment-ECOA-Reg-B-disparate-impact + per-canonical-per-judgment-Fair-Housing-Act-disparate-impact + per-canonical-per-judgment-HIPAA-no-PHI-in-judgment-trace + per-canonical-per-judgment-PCI-no-CHD-in-judgment-trace + per-canonical-per-judgment-FTC-Section-5-unfair-deceptive + per-canonical-per-judgment-CFPB-UDAAP + per-canonical-per-judgment-FDA-21-CFR-Part-11-electronic-signature (when judgment artifacts retained as regulatory evidence) + per-canonical-per-judgment-CCPA-CPRA + per-canonical-per-judgment-GDPR-Article-6-7-17-22 + per-canonical-per-judgment-LGPD + per-canonical-per-judgment-DPDP + per-canonical-per-judgment-PIPEDA + per-canonical-per-judgment-Digital-Services-Act-Article-30 + per-canonical-per-judgment-Digital-Markets-Act + per-canonical-per-judgment-OPA-Rego-AWS-Cedar-Casbin-Cerbos-Oso-policy-as-code + per-canonical-per-judgment-compliance-confidence-tier + per-canonical-per-judgment-compliance-explainability. The judge-must-disclose-itself anchor under EU AI Act Article 50 is the operationally distinctive constraint: any judge-issued score that propagates downstream into customer-facing content makes the judge an Article-50 transparency-required system whose disclosure trace must be retained.
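The routing tiers described in this answer map naturally onto a small decision function. The sketch below follows the trigger conditions the answer gives for each tier, but the precedence order (most severe first), the `Judgment` field names, and the `HIGH_CONFIDENCE` cut-off are assumptions made for illustration, not values the guide specifies.

```python
from dataclasses import dataclass

HIGH_CONFIDENCE = 0.9  # hypothetical cut-off for the auto-publish tier


@dataclass(frozen=True)
class Judgment:
    all_aspects_pass: bool
    confidence: float
    claims_flag: bool = False
    vertical_overlay_flag: bool = False
    regulator_violation: bool = False  # FINRA / FDA / DEA / FTC violation detected
    policy_violation: bool = False
    crisis_pattern: bool = False


def route(judgment: Judgment) -> str:
    """Pick a routing tier; the checks run from most to least severe (assumed order)."""
    if judgment.crisis_pattern:
        return "emergency-pause"
    if judgment.regulator_violation:
        return "legal-review"
    if judgment.policy_violation:
        return "reject-and-redraft"
    if judgment.vertical_overlay_flag:
        return "compliance-review"
    if judgment.claims_flag:
        return "manager-review"
    if judgment.all_aspects_pass and judgment.confidence >= HIGH_CONFIDENCE:
        return "auto-publish"
    # Nothing flagged, but the result is not confident enough to publish unattended.
    return "rep-review"
```

Because auto-publish is the last positive branch, any flagged judgment reaches a human tier first, which is consistent with the human-review requirement the compliance gate places on judgments affecting natural persons.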
What does per-judgment cross-skill-handoff + per-compliance-overlay-manager-agent-canonical-bundle do?
Per-portfolio per-judgment per-canonical-per-judgment-cross-skill-handoff runs per-portfolio per-canonical-per-judgment-handoff-to-llm-semantic-compliance-scoring (parent commercial pillar at /llm-semantic-compliance-scoring) + per-canonical-per-judgment-handoff-to-compliance-overlay-manager (parent agent) + per-canonical-per-judgment-handoff-to-pre-filter-deterministic-gates-build-pillar (sibling build-pillar at /how-to-build-tiered-pre-filter-deterministic-gates-for-ai-content-compliance — this is the Tier-3-through-Tier-4 implementation that pre-filter Tier-1 + Tier-2 routes into) + per-canonical-per-judgment-handoff-to-tiered-content-filtering (parent commercial of pre-filter at /tiered-content-filtering) + per-canonical-per-judgment-handoff-to-marketing-compliance-software + per-canonical-per-judgment-handoff-to-compliance-checklist + per-canonical-per-judgment-handoff-to-per-sku-compliance-gate + per-canonical-per-judgment-handoff-to-channel-policy-validation + per-canonical-per-judgment-handoff-to-product-compliance + per-canonical-per-judgment-handoff-to-brand-voice-management (sibling skill on brand-spec-authoring agent) + per-canonical-per-judgment-handoff-to-forbidden-phrase-library + per-canonical-per-judgment-handoff-to-claims-allowlist-substantiation + per-canonical-per-judgment-handoff-to-voice-attribute-extraction + per-canonical-per-judgment-handoff-to-structured-spec-authoring + per-canonical-per-judgment-handoff-to-borderline-routing (sibling skill on governance-decision-router agent) + per-canonical-per-judgment-handoff-to-five-destination-routing + per-canonical-per-judgment-handoff-to-fbc-override-learning + per-canonical-per-judgment-handoff-to-multi-dimensional-threshold-routing + per-canonical-per-judgment-handoff-to-marketing-ai-autonomy-profile-configuration-build-pillar + per-canonical-per-judgment-handoff-to-cs-agent-assist-build-pillar (sibling build-pillar at /how-to-build-multi-location-review-response-agent-assist — every drafted 
review response runs through this semantic scorer) + per-canonical-per-judgment-handoff-to-event-tie-in-drafting-build-pillar (sibling build-pillar at /how-to-build-per-location-event-tie-in-drafting-at-multi-location-scale — every event tie-in copy passes through this scorer) + per-canonical-per-judgment-handoff-to-weather-seasonality-patterns-build-pillar + per-canonical-per-judgment-handoff-to-per-location-dynamic-content-build-pillar (sibling build-pillar at /how-to-build-per-location-dynamic-content-for-multi-location-communications) + per-canonical-per-judgment-handoff-to-per-location-rich-result-eligibility-scoring-build-pillar + per-canonical-per-judgment-handoff-to-per-sku-description-generation-build-pillar (sibling build-pillar at /how-to-build-sku-by-channel-bulk-description-orchestration-at-catalog-scale) + per-canonical-per-judgment-handoff-to-routing-audit-trail-build-pillar + per-canonical-per-judgment-handoff-to-versioned-customer-history-DSAR-build-pillar + per-canonical-per-judgment-handoff-to-versioned-history-regulatory-defense-build-pillar + per-canonical-per-judgment-handoff-to-master-record-build-pillar + per-canonical-per-judgment-handoff-to-per-jurisdiction-compliance-multi-state-franchise-build-pillar + per-canonical-per-judgment-handoff-to-jsonld-generation-build-pillar + per-canonical-per-judgment-handoff-to-multi-source-attribution-preserving-lead-ingestion-build-pillar + per-canonical-per-judgment-handoff-to-per-location-multi-model-attribution-build-pillar + per-canonical-per-judgment-handoff-to-per-location-per-cohort-two-sigma-anomaly-detection-build-pillar (sibling build-pillar at /how-to-build-per-location-per-cohort-two-sigma-anomaly-detection — judge-output drift gets z-scored against cohort baseline). 
The per-compliance-overlay-manager-agent-canonical-bundle integrates the llm-semantic-compliance-scoring skill with its sibling skills on the same compliance-overlay-manager agent:
- per-canonical-llm-semantic-compliance-scoring (this skill)
- per-canonical-pre-filter-deterministic-gates
- per-canonical-per-vertical-overlay-authoring
- per-canonical-per-jurisdiction-overlay-authoring
- per-canonical-regulatory-change-monitoring
- per-canonical-overlay-version-control
- per-canonical-overlay-conflict-resolution-with-brand-spec
- per-canonical-rule-extraction-from-source-docs
The per-canonical-end-to-end-SLA covers each content piece across the full stage chain, in order: eval-dimension resolve, multi-judge ensemble, scoring aggregation, inter-judge agreement, bias correction, routing tier, remediation suggestion, compliance gate, Article 50 disclosure, and audit trail.
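The stage order named in the end-to-end SLA can be sketched as a fixed pipeline that times each stage and checks the total against a budget. This is an illustrative sketch only: the source gives no SLA figures, so `budget_seconds` is a caller-supplied assumption, and `STAGES`, `run_pipeline`, and the handler signature are hypothetical names.

```python
import time
from typing import Callable

# Stage order taken from the end-to-end SLA description; the identifiers are hypothetical.
STAGES = [
    "eval_dimension_resolve",
    "multi_judge_ensemble",
    "scoring_aggregation",
    "inter_judge_agreement",
    "bias_correction",
    "routing_tier",
    "remediation_suggestion",
    "compliance_gate",
    "article_50_disclosure",
    "audit_trail",
]

Handler = Callable[[dict], dict]


def run_pipeline(content: str, handlers: dict[str, Handler],
                 budget_seconds: float) -> tuple[dict, dict[str, float], bool]:
    """Run every stage in order; return final state, per-stage timings, and budget status."""
    missing = [stage for stage in STAGES if stage not in handlers]
    if missing:
        raise ValueError(f"no handler registered for stages: {missing}")

    state: dict = {"content": content}
    timings: dict[str, float] = {}
    started = time.perf_counter()
    for stage in STAGES:
        stage_started = time.perf_counter()
        state = handlers[stage](state)  # each handler returns the updated state
        timings[stage] = time.perf_counter() - stage_started
    total = time.perf_counter() - started
    return state, timings, total <= budget_seconds


if __name__ == "__main__":
    def make_handler(name: str) -> Handler:
        # Placeholder handler that only records which stage ran.
        def handler(state: dict) -> dict:
            return {**state, "trace": state.get("trace", []) + [name]}
        return handler

    demo_handlers = {stage: make_handler(stage) for stage in STAGES}
    final_state, stage_timings, within_budget = run_pipeline("draft copy", demo_handlers, 5.0)
    print(final_state["trace"], within_budget)
```

Recording per-stage timings rather than only the total makes it possible to see which stage consumed the budget when the SLA is missed.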
What does per-judgment audit-trail + per-canonical-end-to-end-replay do?
The per-portfolio per-judgment per-canonical-audit-trail writes one per-portfolio per-canonical-per-judgment-canonical-audit-record (per-canonical-audit-record for short) for every judgment. Each record holds:
- Identifiers: per-judgment-ID + per-banner-pointer + per-content-piece-pointer.
- Eval-dimension and multi-judge-ensemble snapshots: per-canonical-eval-dimension-snapshot + per-judge-pool-snapshot + per-eval-pattern-snapshot + per-prompt-engineering-snapshot + per-eval-dataset-snapshot + per-bias-mitigation-snapshot + per-judge-confidence-tier-snapshot + per-judge-explainability-snapshot.
- Scoring-aggregation snapshots: per-per-aspect-weighted-sum-snapshot + per-per-aspect-threshold-pass-fail-snapshot + per-multi-judge-ensemble-aggregation-snapshot + per-Cohen-kappa-Krippendorff-alpha-Fleiss-kappa-Gwet-AC1-inter-judge-agreement-snapshot + per-judge-calibration-Brier-ECE-reliability-diagram-Platt-isotonic-temperature-snapshot + per-confidence-interval-Wilson-Clopper-Pearson-Jeffreys-bootstrap-Bayesian-95-percent-snapshot + per-judge-bias-correction-snapshot + per-aggregation-confidence-tier-snapshot + per-Shapley-LIME-counterfactual-anchor-per-aspect-trace-snapshot.
- Routing snapshots: per-routing-tier-snapshot + per-remediation-suggestion-snapshot + per-multi-arm-bandit-snapshot + per-causal-uplift-CATE-snapshot + per-FBC-feedback-loop-snapshot + per-routing-confidence-tier-snapshot.
- Compliance-gate snapshots: per-EU-AI-Act-Article-50-disclosure-snapshot + per-EU-AI-Act-Article-13-14-15-high-risk-snapshot + per-GDPR-Article-22-automated-decision-snapshot + per-NIST-AI-RMF-snapshot + per-ISO-42001-snapshot + per-ISO-27001-snapshot + per-SOC-2-Type-II-snapshot + per-NIST-Privacy-Framework-snapshot + per-OECD-AI-Principles-snapshot + per-G7-Hiroshima-Code-of-Conduct-snapshot + per-White-House-AI-Bill-of-Rights-snapshot + per-Executive-Order-14110-snapshot + per-EEOC-AI-employment-snapshot + per-FCRA-AI-credit-snapshot + per-ECOA-Reg-B-disparate-impact-snapshot + per-Fair-Housing-Act-disparate-impact-snapshot + per-HIPAA-no-PHI-in-judgment-trace-snapshot + per-PCI-no-CHD-in-judgment-trace-snapshot + per-FTC-Section-5-snapshot + per-CFPB-UDAAP-snapshot + per-FDA-21-CFR-Part-11-electronic-signature-snapshot + per-CCPA-CPRA-snapshot + per-GDPR-Article-6-7-17-22-snapshot + per-LGPD-snapshot + per-DPDP-snapshot + per-PIPEDA-snapshot + per-Digital-Services-Act-Article-30-snapshot + per-Digital-Markets-Act-snapshot + per-OPA-Cedar-Casbin-Cerbos-Oso-policy-snapshot + per-compliance-confidence-tier-snapshot.
Records are written to per-canonical-immutable-WORM-storage and kept under these retention schedules: per-canonical-7-year-IRS-tax-retention + per-canonical-7-year-FTC-substantiation-retention + per-canonical-7-year-HIPAA-medical-record-retention + per-canonical-6-year-SEC-record-retention + per-canonical-3-year-FINRA-record-retention + per-canonical-FDA-21-CFR-Part-11-electronic-signature-retention.
The per-canonical-end-to-end-replay reconstructs any past judgment from its stored snapshots. Per portfolio, it runs per-canonical-per-judgment-judge-pool-rewind + per-canonical-per-judgment-eval-pattern-rewind + per-canonical-per-judgment-scoring-aggregation-rewind + per-canonical-per-judgment-routing-rewind + per-canonical-per-judgment-compliance-gate-rewind, and reports per-canonical-per-judgment-replay-confidence-tier + per-canonical-per-judgment-replay-explainability.
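The audit-record and replay ideas above can be illustrated with a small append-only log. This is a sketch under stated assumptions, not the skill's storage layer: hash chaining is swapped in here as a stand-in for tamper evidence (the source specifies WORM storage, not hash chains), the record carries only a per-aspect weighted-sum snapshot, and `AuditLog`, `weighted_sum`, and `replay_matches` are hypothetical names.

```python
import hashlib
import json
import math

GENESIS_HASH = "0" * 64  # placeholder hash that precedes the first entry


def _entry_hash(record: dict, prev_hash: str) -> str:
    """Hash the previous entry's hash together with a canonical JSON form of the record."""
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


class AuditLog:
    """In-memory append-only log; each entry commits to the one before it."""

    def __init__(self) -> None:
        self.entries: list[dict] = []

    def append(self, record: dict) -> str:
        stored = json.loads(json.dumps(record))  # defensive copy; rejects non-JSON values
        prev_hash = self.entries[-1]["hash"] if self.entries else GENESIS_HASH
        entry = {"record": stored, "prev_hash": prev_hash,
                 "hash": _entry_hash(stored, prev_hash)}
        self.entries.append(entry)
        return entry["hash"]

    def verify(self) -> bool:
        """Recompute every hash in order; any edited record breaks the chain."""
        prev_hash = GENESIS_HASH
        for entry in self.entries:
            if entry["prev_hash"] != prev_hash:
                return False
            if entry["hash"] != _entry_hash(entry["record"], prev_hash):
                return False
            prev_hash = entry["hash"]
        return True


def weighted_sum(aspect_scores: dict[str, float], aspect_weights: dict[str, float]) -> float:
    """Per-aspect weighted sum over the aspects that have a weight."""
    return sum(aspect_scores[aspect] * weight for aspect, weight in aspect_weights.items())


def replay_matches(record: dict, tolerance: float = 1e-9) -> bool:
    """Scoring-aggregation rewind: recompute the score from the snapshot and compare."""
    recomputed = weighted_sum(record["aspect_scores"], record["aspect_weights"])
    return math.isclose(recomputed, record["weighted_sum"], abs_tol=tolerance)


if __name__ == "__main__":
    log = AuditLog()
    scores = {"claims": 0.9, "tone": 0.5}      # illustrative values only
    weights = {"claims": 0.75, "tone": 0.25}
    log.append({"judgment_id": "j-1", "aspect_scores": scores,
                "aspect_weights": weights, "weighted_sum": weighted_sum(scores, weights)})
    print(log.verify(), replay_matches(log.entries[0]["record"]))
```

Storing the inputs to aggregation alongside the result is what makes the rewind possible: replay recomputes from the snapshot instead of trusting the stored score.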
Engage the compliance-overlay-manager agent
The compliance-overlay-manager agent ships the per-portfolio per-banner per-content-piece per-canonical-eval-dimension-pointer + per-canonical-multi-judge-ensemble-spec + per-canonical-scoring-aggregation-spec + per-canonical-routing-spec + per-canonical-compliance-gate-spec + per-canonical-audit-trail + per-portfolio-audit-trail as the orchestration layer above your existing per-LLM-eval-platform-vendor + per-foundation-model-eval-vendor + per-eval-framework-research + per-AI-GRC-vendor + per-CMP-vendor primitives.
Related reading
- LLM semantic compliance scoring (parent commercial pillar — buyer-outcome framing)
- Tiered pre-filter deterministic gates for AI-content compliance (sibling build-pillar on the same compliance-overlay-manager agent — Tier-1 + Tier-2 pre-filters route the 10-30% gray zone into this LLM-as-judge Tier-3 + Tier-4 implementation)
- Marketing AI autonomy-profile configuration (sibling build-pillar — defines the upstream autonomy posture + routing thresholds that determine which judgments trigger auto-publish vs human review)