Govern-Output Swarm · Forbidden-Phrase-Library Agent · Multi-Brand-Forbidden-Phrase-Library Skill · Build pillar · Published October 7, 2026
How to build a multi-brand forbidden-phrase library for AI-output gating
A 4-skill bundle (Catalog + Match + Block + Replace) layered above the existing OpenAI + Anthropic + Google + Mistral + Cohere + Meta + AWS Bedrock + Azure OpenAI + Vertex AI LLM-provider substrate + the Pinecone + Weaviate + Qdrant + Chroma + Milvus + pgvector + Vespa + LanceDB RAG vector substrate + the LangSmith + Weights & Biases + Arize + WhyLabs + Helicone + Langfuse observability substrate + the Lakera Guard + Robust Intelligence + HiddenLayer + CalypsoAI + Protect AI + Garak AI-safety substrate + the OPA Rego + AWS Cedar + Casbin + Cerbos + Oso + Styra DAS + Permit.io policy-as-code substrate + the iManage + NetDocuments + Worldox + OpenText + DocuWare + M-Files + Box + SharePoint + Google Workspace document-management substrate. Anchored on Lanham Act 15 USC 1125(a) competitor-mark misuse + FTC Green Guides 16 CFR Part 260 + FTC Section 5 + FTC Endorsement Guides 16 CFR Part 255 + per-industry hard-prohibited language (FDA + FINRA Rule 2210 + CFPB UDAAP + EPA) + Title VII Civil Rights Act + ADA Title III + per-state civil rights statutes + per-banner brand-voice spec hard-prohibitions + CCPA + CPRA + state-comprehensive-privacy + GDPR + NIST AI RMF + ISO 42001 + EU AI Act.
The 4-skill bundle on the forbidden-phrase-library agent
The multi-brand forbidden-phrase library is one skill on the forbidden-phrase-library agent. The skill decomposes into four operationally distinct sub-skills, each with its own success criteria and its own handoff to the next.
1. Catalog
Operator-counsel-documented forbidden-phrase library in versioned registry. Per entry: phrase or phrase-pattern (exact string + regex + semantic embedding template); category (competitor-mark misuse + FTC Green Guides + FDA hard-prohibition + FINRA hard-prohibition + CFPB hard-prohibition + EPA hard-prohibition + Title VII + ADA inclusion-sensitive + per-banner brand-voice hard-prohibition); per-banner scope + per-jurisdiction scope + per-channel scope (paid + organic + email + SMS + GBP + listings + reviews + voice + in-store); document-management pointer (iManage + NetDocuments + Worldox + OpenText + DocuWare + M-Files + Box + SharePoint location of operator-counsel rationale + supporting case law + supporting enforcement history); operator-counsel sign-off; recommended replacement language.
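As a rough illustration of the per-entry fields listed above, a Catalog entry could be modeled as a small immutable record with a scope check. This is a minimal sketch: the class name, field names, category labels, and the `dms://` pointer format are all hypothetical, and the semantic embedding template is left out for brevity.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class CatalogEntry:
    """One forbidden-phrase entry; field names are illustrative, not a fixed schema."""
    entry_id: str
    exact_phrases: tuple[str, ...]      # exact strings to match
    regex_patterns: tuple[str, ...]     # regex variants of the phrase
    category: str                       # e.g. "ftc_green_guides" (hypothetical label)
    banners: frozenset[str]             # per-banner scope
    jurisdictions: frozenset[str]       # per-jurisdiction scope
    channels: frozenset[str]            # per-channel scope (paid, email, sms, ...)
    rationale_pointer: str              # document-management location of counsel rationale
    counsel_name: str                   # operator-counsel sign-off identity
    signed_off_on: date                 # operator-counsel sign-off date
    replacement: Optional[str] = None   # recommended replacement, if one exists

    def applies_to(self, banner: str, jurisdiction: str, channel: str) -> bool:
        """True when the entry is in scope for this banner, jurisdiction, and channel."""
        return (
            banner in self.banners
            and jurisdiction in self.jurisdictions
            and channel in self.channels
        )


# Illustrative values only; none of this is real Catalog data.
entry = CatalogEntry(
    entry_id="green-001",
    exact_phrases=("eco-friendly",),
    regex_patterns=(r"eco[\s-]*friendly",),
    category="ftc_green_guides",
    banners=frozenset({"banner_a"}),
    jurisdictions=frozenset({"US"}),
    channels=frozenset({"email", "paid"}),
    rationale_pointer="dms://placeholder/green-001",  # hypothetical pointer format
    counsel_name="Example Counsel",
    signed_off_on=date(2026, 1, 15),
)
```

Freezing the record means an edit has to produce a new entry rather than mutate an old one, which fits a versioned registry.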
2. Match
Per-output detection at every AI-output gate before publish. Layers in order: exact-string match against Catalog (fast filter for known competitor names + known hard-prohibited phrases); regex match (template variants + capitalization + punctuation + spacing); semantic-embedding similarity via Pinecone + Weaviate + Qdrant + Chroma + Milvus + pgvector + Vespa + LanceDB catching paraphrases of forbidden concepts; LLM-assisted Match under per-vendor zero-retention augments semantic-embedding for context-sensitive cases. LLM is NEVER sole gating mechanism: pattern + embedding + LLM ensemble feed Match decision. FALSE-NEGATIVE COST MUCH HIGHER than false-positive so Match DEFAULTS TO ESCALATION at borderline + routes edge cases to operator counsel.
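The first two layers (exact-string, then regex) need nothing beyond the standard library. The sketch below assumes a hypothetical in-memory `CATALOG` dictionary and made-up entry ids; the embedding and LLM layers are deliberately left out.

```python
import re

# Hypothetical mini-Catalog: entry id -> (exact phrases, regex patterns).
CATALOG = {
    "finra-001": (["guaranteed returns"], [r"guaranteed\s+returns?"]),
    "green-001": (["eco-friendly"], [r"eco[\s-]*friendly"]),
}


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial variants compare equal."""
    return re.sub(r"\s+", " ", text.lower()).strip()


def match_layers(output: str) -> list[tuple[str, str]]:
    """Return (entry_id, layer) hits; exact-string is tried first, then regex."""
    text = normalize(output)
    hits = []
    for entry_id, (phrases, patterns) in CATALOG.items():
        if any(normalize(p) in text for p in phrases):
            hits.append((entry_id, "exact"))
        elif any(re.search(p, text, flags=re.IGNORECASE) for p in patterns):
            hits.append((entry_id, "regex"))
    return hits


print(match_layers("Enjoy Guaranteed Returns today"))
print(match_layers("Our eco  friendly packaging"))
```

Recording which layer fired keeps the per-Match decision record specific about how a hit was found.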
3. Block
Any output matching Catalog entry routes to operator review + prevented from auto-publish. Blocked output surfaces specific forbidden phrase + Catalog category + operator-counsel rationale + recommended replacement where one exists. Operator counsel decides: rewrite per recommended replacement; rewrite from scratch; reject entirely (upstream agent regenerates); add new exception to Catalog (if counsel concludes match was incorrect and phrase is allowed in this specific context — exceptions documented per entry with counsel rationale).
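One way to picture the review packet and the four counsel decisions is a small record plus an enum. The names below (`BlockedOutput`, `CounselDecision`, `record_decision`) are hypothetical, and the validation rules are an assumption drawn from the paragraph above rather than a documented interface.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class CounselDecision(Enum):
    REWRITE_WITH_REPLACEMENT = "rewrite_with_replacement"
    REWRITE_FROM_SCRATCH = "rewrite_from_scratch"
    REJECT_AND_REGENERATE = "reject_and_regenerate"
    ADD_CATALOG_EXCEPTION = "add_catalog_exception"


@dataclass
class BlockedOutput:
    output_text: str
    forbidden_phrase: str                  # the specific phrase that matched
    category: str                          # Catalog category of the matched entry
    rationale_pointer: str                 # where the counsel rationale lives
    recommended_replacement: Optional[str] = None
    decision: Optional[CounselDecision] = None
    exception_rationale: Optional[str] = None


def record_decision(blocked: BlockedOutput, decision: CounselDecision,
                    exception_rationale: Optional[str] = None) -> BlockedOutput:
    """Attach counsel's decision, refusing combinations the workflow does not allow."""
    if decision is CounselDecision.ADD_CATALOG_EXCEPTION and not exception_rationale:
        raise ValueError("a Catalog exception requires documented counsel rationale")
    if (decision is CounselDecision.REWRITE_WITH_REPLACEMENT
            and blocked.recommended_replacement is None):
        raise ValueError("no recommended replacement exists for this entry")
    blocked.decision = decision
    blocked.exception_rationale = exception_rationale
    return blocked
```

Refusing an exception that has no rationale enforces the rule that exceptions are documented per entry at the point where the decision is captured.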
4. Replace
Automated corrective language for routine cases where Catalog entry has high-confidence recommended replacement (e.g., outdated terminology replaced by current inclusive language with no semantic loss). Replace NEVER auto-publishes; surfaces corrective draft to human reviewer who decides whether to accept. Auto-publish NEVER HAPPENS for content that originally matched a forbidden-phrase entry.
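A minimal sketch of Replace, assuming a simple case-insensitive substitution: the function only produces a draft, and the draft carries a review flag that the code never sets to anything but `True`. The names and the example phrase pair are illustrative.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class ReplacementDraft:
    original: str
    proposed: str
    requires_human_review: bool = True  # Replace only drafts; a reviewer decides


def draft_replacement(text: str, forbidden: str, replacement: str) -> ReplacementDraft:
    """Swap every case-insensitive occurrence of `forbidden` and return a draft."""
    pattern = re.compile(re.escape(forbidden), flags=re.IGNORECASE)
    # A function replacement keeps backslashes in `replacement` literal.
    proposed = pattern.sub(lambda _match: replacement, text)
    return ReplacementDraft(original=text, proposed=proposed)


draft = draft_replacement(
    "Handicapped parking is available.",
    "handicapped parking",
    "accessible parking",
)
print(draft.proposed)  # accessible parking is available.
```

The lost sentence-initial capital in the output is the kind of detail a human reviewer catches, which is one practical reason the draft is surfaced rather than published.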
The real ecosystem this skill sits above
LLM + RAG + observability substrate
OpenAI, Anthropic, Google, Mistral, Cohere, Meta, AWS Bedrock, Azure OpenAI, Vertex AI LLM providers under per-vendor zero-retention. Pinecone, Weaviate, Qdrant, Chroma, Milvus, pgvector, Vespa, LanceDB RAG vector for Catalog phrase-pattern embedding storage. LangSmith, Weights & Biases, Arize, WhyLabs, Helicone, Langfuse observability.
AI safety + policy-as-code substrate
Lakera Guard, Robust Intelligence, HiddenLayer, CalypsoAI, Protect AI, Garak AI safety for prompt-injection + hallucination defense layered alongside forbidden-phrase Match. OPA Rego, AWS Cedar, Casbin, Cerbos, Oso, Styra DAS, Permit.io policy-as-code for Block gating.
Document management substrate
iManage, NetDocuments, Worldox, OpenText, DocuWare, M-Files, Box, SharePoint, Google Workspace for operator-counsel rationale documentation + supporting case law + enforcement-history storage per Catalog entry. Each entry maintains a stable document pointer so counsel can update rationale without breaking the registry version pointer.
5-anchor compliance overlay
Anchor 1 — Lanham Act competitor-mark misuse + FTC Green Guides + per-industry hard-prohibited language (operationally distinctive)
Lanham Act 15 USC 1125(a) competitor-mark misuse covers phrases that incorporate a competitor trademark in a way that implies endorsement, comparison, or affiliation without authorization. FTC Green Guides 16 CFR Part 260 specify what general environmental claims require (eco-friendly + biodegradable + compostable + carbon-neutral + recycled content + non-toxic + ozone-friendly + renewable each carry specific substantiation requirements); unqualified general claims like green or environmentally friendly are presumed deceptive without context-specific substantiation. Per-industry hard-prohibited language: FDA prohibits unapproved drug claims (dietary supplement cannot claim to cure disease; cosmetic cannot claim to treat medical condition); FINRA Rule 2210 prohibits investment communications that suggest guaranteed returns; CFPB UDAAP prohibits consumer-finance dark patterns + bait-and-switch language; EPA prohibits environmental claims that have been specifically cited in enforcement. Operationally distinctive: these are hard-prohibition cases where context cannot rescue the phrase, distinct from the substantiation-allowlist frame.
Anchor 2 — Title VII + ADA Title III + per-state civil rights statutes
Title VII Civil Rights Act + ADA Title III + per-state civil rights statutes create exposure when AI output uses slurs + outdated terminology + stereotypes + disparaging language about protected classes (race + color + religion + sex + national origin + age + disability + genetic information). Operator-counsel maintains the inclusion-sensitive language portion of the Catalog with periodic refresh as standards evolve.
Anchor 3 — Per-banner brand-voice spec hard-prohibitions
Per-banner brand-voice spec hard-prohibitions are operator-defined per banner (phrases the brand will never use under any circumstance regardless of legal substantiation availability). These are operator-counsel-approved per banner via brand-voice-gate sibling handoff.
Anchor 4 — CCPA + CPRA + state-comprehensive-privacy + GDPR
Operator + customer data + audit-trail data is personal information under California Consumer Privacy Act + California Privacy Rights Act + 18 state-comprehensive-privacy statutes + GDPR in EU jurisdictions. DSAR overlay tagging preserves fulfillment evidence per record.
Anchor 5 — NIST AI RMF + ISO 42001 + EU AI Act + per-vendor LLM zero-retention
When AI-assisted Match is used (LLM-classified per-output against Catalog category list), NIST AI Risk Management Framework + ISO 42001 + applicable EU AI Act articles + per-vendor LLM zero-retention posture apply. LLM is NEVER sole gating mechanism — pattern + embedding + LLM ensemble feed Match decision; human reviewer remains authoritative authorization path.
6-workstream pre-engagement-baseline reporting cycle
Per-category coverage + per-Match accuracy are what the data shows after the workflow is built, not numbers Completions promises in advance.
- Catalog coverage. Per-category enumeration completeness, per-banner per-jurisdiction per-channel scope coverage, per-entry document-management pointer, operator-counsel sign-off completeness, Catalog registry version pointer freshness.
- Match quality. Per-output exact-string + regex + semantic-embedding + LLM-assisted match accuracy, per-match false-negative + false-positive route-to-counsel rate, borderline-default-to-escalation adherence.
- Block quality. Per-output unmatched-phrase hold, per-hold operator-review routing, per-hold counsel-decision capture, per-Catalog-exception documentation completeness, auto-publish-prevention completeness.
- Replace quality. Per-recommended-replacement confidence calibration, per-replacement human-reviewer accept rate, per-replacement semantic-loss minimization, auto-publish-prevention for replaced content.
- 5-anchor compliance posture freshness. Lanham Act 15 USC 1125(a) + FTC Green Guides 16 CFR Part 260 + FTC Section 5 + FTC Endorsement Guides + per-industry hard-prohibited language (FDA + FINRA Rule 2210 + CFPB UDAAP + EPA) + Title VII + ADA Title III + per-state civil rights statutes + per-banner brand-voice spec hard-prohibition + CCPA + CPRA + state-comprehensive-privacy + GDPR + NIST AI RMF + ISO 42001 + EU AI Act + per-vendor LLM zero-retention posture.
- Audit-trail completeness. Per-Catalog entry record, per-Match decision record, per-Block decision record, per-Replace record.
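To make the reporting idea concrete, the sketch below derives a few Match-quality counts from per-Match decision records. The record shape (`outcome`, `counsel_verdict`) and the sample rows are invented for illustration; they are not real engagement data.

```python
from collections import Counter

# Hypothetical audit records: one dict per Match decision, with the later
# counsel verdict where the output went to review (None when it did not).
records = [
    {"outcome": "block", "counsel_verdict": "confirmed"},
    {"outcome": "block", "counsel_verdict": "false_positive"},
    {"outcome": "escalate", "counsel_verdict": "confirmed"},
    {"outcome": "pass", "counsel_verdict": None},
]


def summarize(rows):
    """Count decisions by outcome and compute the reviewed false-positive rate."""
    outcomes = Counter(r["outcome"] for r in rows)
    reviewed = [r for r in rows if r["counsel_verdict"] is not None]
    false_positives = sum(r["counsel_verdict"] == "false_positive" for r in reviewed)
    return {
        "total": len(rows),
        "by_outcome": dict(outcomes),
        "reviewed": len(reviewed),
        "false_positive_rate": false_positives / len(reviewed) if reviewed else None,
    }


print(summarize(records))
```

False negatives cannot be read off these records alone, since a missed phrase never reaches counsel; measuring them would need a separate sampled review of passed outputs.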
Frequently asked questions
What does a multi-brand forbidden-phrase library for AI-output gating actually solve?
A claims-allowlist (sibling skill) defines what claims the operator IS willing to make with substantiation. A forbidden-phrase library defines what phrases the operator IS NEVER willing to publish regardless of substantiation: competitor trademarks the operator cannot use under the Lanham Act; environmental claims that fail FTC Green Guides (16 CFR Part 260) without specific substantiation; per-industry hard-prohibited language (FDA prohibitions on drug + device + dietary-supplement claims that no substantiation can rescue; FINRA Rule 2210 prohibitions on investment-grade communications; CFPB UDAAP prohibitions on consumer-finance dark patterns; EPA prohibitions on misleading green-marketing); inclusion-sensitive language that creates Title VII or ADA exposure (outdated terminology + slurs + stereotypes); per-banner brand-voice spec hard-prohibitions (each banner has phrases the brand will never use regardless of context). The two skills complement: allowlist gates IN substantiated claims; forbidden-phrase library gates OUT hard-prohibited content. Together they bracket AI-output safety.
Why is Lanham Act + FTC Green Guides + per-industry hard-prohibitions + Title VII + ADA + per-banner brand-voice the operationally distinctive frame?
The forbidden-phrase frame is operationally distinct from the substantiation frame because no substantiation file can rescue a phrase the operator cannot publish at all. Lanham Act 15 USC 1125(a) competitor-mark misuse covers phrases that incorporate a competitor trademark in a way that implies endorsement, comparison, or affiliation without authorization. FTC Green Guides (16 CFR Part 260) specify what general environmental claims require ("eco-friendly" + "biodegradable" + "compostable" + "carbon-neutral" + "recycled content" + "non-toxic" + "ozone-friendly" + "renewable" each carry specific substantiation requirements); unqualified general claims like "green" or "environmentally friendly" are presumed deceptive without context-specific substantiation. Per-industry hard-prohibited language: FDA prohibits unapproved drug claims (a dietary supplement cannot claim to "cure" disease; a cosmetic cannot claim to "treat" a medical condition); FINRA Rule 2210 prohibits investment communications that suggest guaranteed returns; CFPB UDAAP prohibits consumer-finance dark patterns + bait-and-switch language; EPA prohibits environmental claims that have been specifically cited in enforcement. Title VII + ADA + per-state civil rights statutes create exposure when AI output uses slurs + outdated terminology + stereotypes + disparaging language about protected classes. Per-banner brand-voice spec hard-prohibitions are operator-defined per banner (phrases the brand will never use under any circumstance). Operationally distinctive — these are hard-prohibition cases where context cannot rescue the phrase.
How does the Catalog skill enumerate the operator-defined forbidden-phrase library?
The Catalog sub-skill builds the operator-counsel-documented forbidden-phrase library in a versioned registry. Per entry: phrase or phrase-pattern (exact string + regular expression + semantic embedding template); category (competitor-mark misuse + FTC Green Guides green-marketing violation + FDA hard-prohibited claim + FINRA hard-prohibited investment claim + CFPB hard-prohibited consumer-finance claim + EPA hard-prohibited environmental claim + Title VII + ADA inclusion-sensitive + per-banner brand-voice hard-prohibition); per-banner scope (which banners apply this rule); per-jurisdiction scope (which jurisdictions apply); per-channel scope (paid + organic + email + SMS + GBP + listings + reviews + voice + in-store); document-management pointer (iManage + NetDocuments + Worldox + OpenText + DocuWare + M-Files + Box + SharePoint location of operator-counsel rationale + supporting case law + supporting enforcement history); operator-counsel sign-off (counsel identity + sign-off date + comments); recommended replacement language where a sanitized version of the underlying intent exists. The Catalog registry version pointer is captured per edit so the Match sub-skill can reconstruct which forbidden phrases applied at a given decision time.
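The version-pointer idea can be sketched as an append-only list of timestamped snapshots, so that a past Match decision can be replayed against the entries in force at that moment. `CatalogRegistry` and its methods are hypothetical names; a real registry would presumably sit on a database or a version-control system.

```python
from bisect import bisect_right
from datetime import datetime


class CatalogRegistry:
    """Append-only (timestamp, snapshot) versions; a sketch, not a product API."""

    def __init__(self):
        self._timestamps = []
        self._snapshots = []

    def publish(self, at: datetime, entries: dict) -> int:
        """Store a new version and return its 1-based version number."""
        if self._timestamps and at <= self._timestamps[-1]:
            raise ValueError("versions must be published in increasing time order")
        self._timestamps.append(at)
        self._snapshots.append(dict(entries))  # copy so later edits cannot alter history
        return len(self._snapshots)

    def version_at(self, when: datetime) -> int:
        """Version in force at `when`; 0 means nothing had been published yet."""
        return bisect_right(self._timestamps, when)

    def entries_at(self, when: datetime) -> dict:
        """Entries that applied at `when`, for reconstructing a past decision."""
        version = self.version_at(when)
        return dict(self._snapshots[version - 1]) if version else {}


registry = CatalogRegistry()
registry.publish(datetime(2026, 1, 1), {"green-001": "eco-friendly"})
registry.publish(datetime(2026, 2, 1), {"green-001": "eco-friendly",
                                        "finra-001": "guaranteed returns"})
print(registry.entries_at(datetime(2026, 1, 15)))  # {'green-001': 'eco-friendly'}
```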
How does the Match skill detect forbidden-phrase usage in AI-generated output?
Match runs at every AI-output gate before publish. Per-output detection layers in order: exact-string match against Catalog entries (fast filter for known competitor names + known hard-prohibited phrases); regular-expression match (template variants + capitalization + punctuation + spacing variants); semantic-embedding similarity against Catalog phrase-pattern embeddings (Pinecone + Weaviate + Qdrant + Chroma + Milvus + pgvector + Vespa + LanceDB) catching paraphrases of forbidden concepts; LLM-assisted Match (LLM-classified per-output against Catalog category list under per-vendor zero-retention) augments semantic-embedding match for context-sensitive cases. The LLM is never the sole gating mechanism — pattern + embedding + LLM ensemble votes feed the Match decision. False-negative cost (publishing forbidden phrase) is much higher than false-positive cost (flagging a non-forbidden phrase) so Match defaults to escalation at the borderline + routes edge cases to operator counsel rather than auto-passing.
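The ensemble and the escalate-at-the-borderline default can be sketched as a small decision function. The thresholds (0.90 and 0.60) are placeholders rather than recommended values, the three inputs are assumed to come from the layers described above, and `cosine_similarity` stands in for whatever the chosen vector store returns.

```python
import math
from enum import Enum


class MatchOutcome(Enum):
    PASS = "pass"
    ESCALATE = "escalate"   # route to operator counsel
    BLOCK = "block"


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def decide(pattern_hit: bool, embedding_similarity: float, llm_flagged: bool,
           block_threshold: float = 0.90, pass_threshold: float = 0.60) -> MatchOutcome:
    """Combine the layers; anything short of a clear pass goes to a human."""
    if pattern_hit or embedding_similarity >= block_threshold:
        return MatchOutcome.BLOCK
    # The LLM can raise a concern, but it can neither block nor clear on its own.
    if llm_flagged or embedding_similarity >= pass_threshold:
        return MatchOutcome.ESCALATE
    return MatchOutcome.PASS


print(decide(pattern_hit=False, embedding_similarity=0.72, llm_flagged=False))
```

In this arrangement an LLM flag alone leads only to escalation, and an LLM "all clear" cannot override a pattern or embedding hit.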
How do Block and Replace prevent publication and offer corrective output?
Block routes any output that matches a Catalog entry to operator review and prevents auto-publish. The blocked output surfaces the specific forbidden phrase + the Catalog category + the operator-counsel rationale + the recommended replacement where one exists. Operator counsel decides whether to: rewrite the content per the recommended replacement; rewrite from scratch; reject the content entirely (in which case the upstream agent regenerates); add a new exception to the Catalog (if the counsel concludes the match was incorrect and the phrase is allowed in this specific context — exceptions are documented per entry with counsel rationale). Replace optionally provides automated corrective language for routine cases where a Catalog entry has a high-confidence recommended replacement (e.g., outdated terminology replaced by current inclusive language with no semantic loss). Replace never auto-publishes; it surfaces the corrective draft to the human reviewer who decides whether to accept. Auto-publish never happens for content that originally matched a forbidden-phrase entry.
How does Completions report on this without fabricating KPI commitments?
Pre-engagement baseline is established in the first 30 days. Reporting cycles cover the six workstreams: Catalog coverage (per-category enumeration completeness + per-banner per-jurisdiction per-channel scope coverage + per-entry document-management pointer + operator-counsel sign-off completeness + Catalog registry version pointer freshness), Match quality (per-output exact-string + regex + semantic-embedding + LLM-assisted match accuracy + per-match false-negative + false-positive route-to-counsel rate + borderline-default-to-escalation adherence), Block quality (per-output unmatched-phrase hold + per-hold operator-review routing + per-hold counsel-decision capture + per-Catalog-exception documentation completeness + auto-publish-prevention completeness), Replace quality (per-recommended-replacement confidence calibration + per-replacement human-reviewer accept rate + per-replacement semantic-loss minimization + auto-publish-prevention for replaced content), 5-anchor compliance posture freshness (Lanham Act 15 USC 1125(a) competitor-mark misuse + FTC Green Guides 16 CFR Part 260 + FTC Section 5 + FTC Endorsement Guides + per-industry hard-prohibited language (FDA + FINRA Rule 2210 + CFPB UDAAP + EPA) + Title VII + ADA Title III + per-state civil rights statutes + per-banner brand-voice spec hard-prohibition posture + CCPA + CPRA + state-comprehensive-privacy + GDPR + NIST AI RMF + ISO 42001 + EU AI Act + per-vendor LLM zero-retention posture), audit-trail completeness (per-Catalog entry record + per-Match decision record + per-Block decision record + per-Replace record).
Engage Completions
Operators publishing AI-generated content across multi-banner + multi-jurisdiction + multi-vertical scope need a forbidden-phrase library that gates out hard-prohibited content the way the claims-allowlist gates in substantiated claims. Completions architects the workflow as a 4-skill bundle layered above the existing OpenAI + Anthropic + Bedrock + Vertex LLM + Pinecone + Weaviate + OPA Rego + Cedar + iManage + NetDocuments ecosystem. Start with the Tier 1 AI Readiness Assessment ($10k, 2-3 weeks), build with the Tier 2 Setup Sprint ($25-50k, 4-8 weeks), or engage Tier 3 Fractional CMO with AI Swarm ($15-25k per month, 6-month minimum).
Related reading
- How to build a claims-allowlist + substantiation file for AI-generated marketing — sibling build-pillar (complement: allowlist gates IN substantiated claims; forbidden-phrase library gates OUT hard-prohibited content)
- How to build a franchisee content-moderation queue — sibling build-pillar (Classify cross-references forbidden-phrase library when franchisee-generated content is checked)
- How to build routing audit trails for AI-output governance — sibling build-pillar (per-Catalog + per-Match + per-Block + per-Replace records emit into this substrate)