Profanity filter + forbidden-phrase library for brand AI content
Regex and semantic-match enforcement of prohibitions on competitor names, off-brand language, regulated claims, and deprecated terms, checked on every AI content output across the catalog.
The problem
You have a banned-words list in a Google Doc — competitors you don't mention by name, deprecated product terms, off-brand slang, claims that legal does not allow. Your AI agents ignore the doc. Generic content-moderation tools handle user-generated content; they do not know your brand's specific forbidden phrases.
You tried OpenAI's Moderation API but it catches hate speech and harassment, not "do not use the word revolutionary in our copy." Google Perspective and Hive cover similar ground at the platform level — built for forum / gaming / social user-generated content at scale, not for brand-produced AI content with operator-specific prohibitions.
Profanity-filter APIs (Bad Words API, ProfanityCensor, WebPurify at $10-$1,000/month) handle generic profanity but require custom integration plus per-brand list maintenance. Content-moderation BPO services (Telus International, TCS, Genpact at $0.10-$2 per item) put humans in the loop for user-generated content — wrong surface for brand-produced AI content.
Your Google Doc list goes stale within a quarter. Competitor names slip into comparison content. Deprecated product names appear in new location pages. Regulated claims appear without substantiation. Trademark-protected competitor terms appear in paid creative.
What success looks like
Every AI-produced content output is checked against an operator-curated, brand-specific phrase library before publishing. Competitor names get caught. Off-brand language ("guys" instead of "team", "revolutionary" when the brand avoids hyperbole) gets caught. Regulated claims requiring substantiation get blocked unless substantiation links exist. Deprecated product terms get blocked. Trademark / IP-protected competitor language gets blocked.
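As a rough illustration of the substantiation rule, the sketch below flags a regulated claim unless the same output carries a link from an approved list. The claim patterns, the `approved_sources` set, and the function name are hypothetical, and treating any one approved link as covering the whole output is a simplifying assumption, not a description of the skill's actual logic.

```python
import re
from typing import List, Set

# Hypothetical regulated-claim patterns; a real list would come from legal and compliance.
REGULATED_CLAIMS = [
    re.compile(r"\bclinically proven\b", re.IGNORECASE),
    re.compile(r"\bkills 99(?:\.\d+)?% of germs\b", re.IGNORECASE),
]
# Loose URL matcher that stops at whitespace, closing brackets, and quotes.
URL = re.compile(r"https?://[^\s)\]>\"']+")


def unsubstantiated_claims(text: str, approved_sources: Set[str]) -> List[str]:
    """Return regulated claims found in text when no approved link accompanies them."""
    links = {u.rstrip(".,;") for u in URL.findall(text)}
    if links & approved_sources:
        # Assumption: one approved link substantiates every claim in the output.
        return []
    return [m.group(0) for p in REGULATED_CLAIMS for m in p.finditer(text)]
```

A stricter variant could map each claim pattern to its own required source instead of accepting any approved link.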
Regex catches exact-string prohibitions cheaply at the edge. LLM-based semantic-match enforcement catches the paraphrased violations that regex cannot detect: synonymous off-brand language and implied prohibited claims.
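The exact-string stage can be pictured with a minimal sketch like the one below. The category names and phrases in `FORBIDDEN` are made up for illustration (only "guys" and "revolutionary" come from the examples above), and the matching rules (case-insensitive, whole-word) are assumptions rather than documented behavior.

```python
import re

# Hypothetical library entries; a real library is operator-curated.
FORBIDDEN = {
    "competitor": ["Acme Widgets"],
    "off_brand": ["guys", "revolutionary"],
    "deprecated": ["WidgetPro 2"],
}


def compile_library(library):
    """Compile one case-insensitive, whole-word pattern per category."""
    compiled = {}
    for category, phrases in library.items():
        # Longest alternatives first so multi-word phrases win over shorter ones.
        alternatives = sorted((re.escape(p) for p in phrases), key=len, reverse=True)
        compiled[category] = re.compile(
            r"(?<!\w)(?:" + "|".join(alternatives) + r")(?!\w)", re.IGNORECASE
        )
    return compiled


def find_violations(text, compiled):
    """Return (category, matched_text, start_offset) for each hit, in text order."""
    hits = []
    for category, pattern in compiled.items():
        for match in pattern.finditer(text):
            hits.append((category, match.group(0), match.start()))
    return sorted(hits, key=lambda hit: hit[2])
```

The lookarounds keep "guys" from matching inside a longer word, which is the kind of false positive a plain substring blocklist tends to produce.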
Multi-brand portfolios and multi-vertical operators get per-brand and per-vertical library extensions through an inheritance hierarchy. Additions and removals flow through PR-style review across corporate, franchisee council, legal, and compliance.
The library composes with the broader brand-voice gate (loop 002) and the per-vertical compliance overlay (loop 001) — together they enforce every operator-defined prohibition at every AI content output across the catalog.
How most operators solve this today
Five tiers of incumbent tools — none enforce operator-specific brand prohibitions on brand-produced AI content at the output gate.
AI content moderation (Hive, Spectrum Labs, Two Hat / Microsoft, OpenAI Moderation API, Google Perspective, Modulate)
Free / $1,000-$50,000+/mo
Built for user-generated content moderation (forums, gaming, social) at scale. Catches hate speech, harassment, sexual content, violence. Does not enforce brand-specific prohibitions on brand-produced content.
Profanity-filter APIs (Bad Words API, ProfanityCensor, WebPurify)
$10-$1,000/month
Generic profanity detection. Requires custom integration plus per-brand list maintenance. No semantic-match, no PR-style review.
Content-moderation BPO (Telus International, TCS, Genpact)
$0.10-$2 per item
Human-in-the-loop moderation of user-generated content. Wrong surface for brand-produced AI content with operator-specific prohibitions.
CMS profanity plugins (WordPress, Drupal modules)
$10-$100/year
Bolted into the operator CMS. Generic profanity filtering. Does not extend to AI agents producing content outside the CMS.
DIY (regex blocklist in code + Google Doc + brand-manager review)
Internal FTE time
Google Doc goes stale within a quarter. No semantic-match; no AI-agent integration; no audit trail.
What changes when this is an agent skill
The Completions forbidden-phrase skill combines deterministic regex filtering with LLM-based semantic-match enforcement on every AI-produced content output. Regex catches exact-string prohibitions cheaply at the edge; semantic-match catches the paraphrased violations (synonymous off-brand language, implied prohibited claims) that regex cannot detect.
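One plausible shape for that two-stage check is sketched below, assuming the regex stage short-circuits so the more expensive semantic stage only runs on text that passed it. `Verdict`, `check_output`, and `stub_semantic_check` are hypothetical names; the stub is a keyword lookup standing in for an LLM classifier so the example runs without a model.

```python
import re
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Verdict:
    allowed: bool
    stage: str  # "regex", "semantic", or "none"
    reasons: List[str]


# Hypothetical exact-string prohibitions.
EXACT = re.compile(r"(?<!\w)(?:revolutionary|guys)(?!\w)", re.IGNORECASE)


def check_output(text: str, semantic_check: Callable[[str], List[str]]) -> Verdict:
    """Run the cheap regex stage first; call the semantic stage only if it passes."""
    exact_hits = [m.group(0) for m in EXACT.finditer(text)]
    if exact_hits:
        return Verdict(False, "regex", exact_hits)
    semantic_hits = semantic_check(text)
    if semantic_hits:
        return Verdict(False, "semantic", semantic_hits)
    return Verdict(True, "none", [])


def stub_semantic_check(text: str) -> List[str]:
    """Stand-in for an LLM classifier; returns the names of violated rules."""
    # Placeholder heuristic only: a real implementation would call a model here.
    paraphrases = {"changes everything": "hyperbole", "game-changing": "hyperbole"}
    lowered = text.lower()
    return sorted({rule for phrase, rule in paraphrases.items() if phrase in lowered})
```

Passing the semantic stage in as a callable keeps the gate testable and lets the classifier be swapped without touching the regex stage.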
The library is operator-curated and brand-specific — competitor names you do not mention, deprecated product terms, off-brand slang, regulated claims requiring substantiation, hyperbole / superlatives the brand avoids, trademark / IP-protected language, plus per-vertical-compliance forbidden terms.
Multi-brand portfolios and multi-vertical operators get per-brand and per-vertical library extensions through an inheritance hierarchy. Corporate sets base prohibitions; brand / vertical sub-libraries extend or override within bounds. PR-style review (corporate, franchisee council, legal, compliance) approves every addition and removal, with an audit trail.
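The inheritance idea can be sketched as a chain of libraries resolved from corporate downward. Everything here is illustrative: the `Library` fields, the `resolve` function, and especially the `locked` set, which is one assumed reading of "within bounds" (phrases that descendants may not lift).

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class Library:
    name: str
    add: FrozenSet[str] = frozenset()     # phrases this level prohibits
    allow: FrozenSet[str] = frozenset()   # inherited phrases this level lifts
    locked: FrozenSet[str] = frozenset()  # phrases descendants may not lift
    parent: Optional["Library"] = None


def resolve(lib: Library) -> FrozenSet[str]:
    """Return the effective forbidden set, applying levels from the root down."""
    chain = []
    node: Optional[Library] = lib
    while node is not None:
        chain.append(node)
        node = node.parent
    effective, locked = set(), set()
    for level in reversed(chain):  # e.g. corporate, then brand, then vertical
        blocked = level.allow & locked
        if blocked:
            raise ValueError(f"{level.name} cannot lift locked phrases: {sorted(blocked)}")
        effective -= level.allow
        effective |= level.add
        locked |= level.locked
    return frozenset(effective)


# Hypothetical example: a brand extends corporate and lifts one unlocked phrase.
corporate = Library(
    "corporate",
    add=frozenset({"Acme Widgets", "revolutionary", "guys"}),
    locked=frozenset({"Acme Widgets"}),
)
brand_a = Library(
    "brand-a", add=frozenset({"WidgetPro 2"}), allow=frozenset({"guys"}), parent=corporate
)
effective_for_brand_a = resolve(brand_a)
```

Raising on an attempt to lift a locked phrase, rather than silently ignoring it, surfaces the conflict at review time instead of at publish time.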
The library composes with the broader brand-voice gate (loop 002) and the per-vertical compliance overlay (loop 001). Together they enforce every operator-defined prohibition at every AI content output — page generator, GBP, review response, social, email, paid creative, product descriptions, CS-assist.
Foundation-skill pricing ($2,000-$4,000/mo, paired with the brand-spec-authoring rental) replaces the Google Doc, the brand-manager review time, and the per-platform profanity-filter subscriptions.
Agents that include this skill
Skills live inside agent rentals. To get this skill in production, hire any of the agents below — context-tuning at onboarding is included in the first month.
Brand-Spec Authoring + Maintenance Agent
Produces and maintains the canonical brand spec every content-producing agent's brand-voice gate enforces.
Early-adopter
$2,000–$3,500/mo
FAQ
- What is a profanity filter for brand content?
- A library of forbidden phrases (profanity, off-brand slang, competitor names, regulated claims, hyperbole) that AI agents check every output against. The Completions skill combines regex and semantic-match enforcement with operator-curated, brand-specific prohibitions.
- How is this different from OpenAI's Moderation API or Google Perspective?
- Those APIs detect hate speech, harassment, sexual content, and violence in user-generated content. This skill enforces BRAND-SPECIFIC prohibitions (competitor names, off-brand language, regulated claims, deprecated product terms) in BRAND-PRODUCED content.
- How is this different from a content-moderation BPO (Telus, TCS, Genpact)?
- BPO services put humans in the loop to moderate user-generated content at $0.10-$2 per item. This skill does programmatic enforcement for brand-produced AI content at the output gate.
- What kinds of phrases does the library include?
- Competitor names, deprecated product terms, off-brand slang, regulated claims requiring substantiation, hyperbole / superlatives the brand avoids, trademark / IP-protected language, profanity, per-vertical-compliance forbidden terms.
- How does semantic-match enforcement work?
- LLM-based semantic classification catches paraphrased violations that regex cannot detect (synonyms of forbidden terms, implied prohibited claims). It works in combination with a regex pre-filter for efficiency.
- Can different brands or verticals have different libraries?
- Yes. Multi-brand portfolios get per-brand libraries; multi-vertical operators get per-vertical extensions. An inheritance hierarchy lets corporate set base prohibitions, which brand / vertical sub-libraries extend or override within bounds.
- How does this compose with brand-voice-gate and the compliance overlay?
- Brand-voice-gate is the broader gate that includes this library plus tone, formality, and structural rules. Per-vertical-compliance-overlay handles regulator-required prohibitions. This skill maintains the brand-specific phrase library; the gate enforces it.