Get-found swarm · Per-Location-Page-Generator Agent · Content-distinctness-gate skill · Build pillar · Published June 26, 2026
How to architect a pre-publish content distinctness gate end-to-end
This guide explains how to architect the content-distinctness-gate skill on the per-location-page-generator agent end-to-end at programmatic-SEO scale. For every page in a portfolio, the gate covers a pre-publish gate spec, cross-page similarity detection, doorway-page detection, LLM-as-judge distinctness evaluation, AI-generated-content detection, thin-content detection, severity tiering, a block/flag/warn/allow decision, routing to the responsible stakeholder, rewrite recommendations, post-publish monitoring, Helpful Content System compliance attestation, and a portfolio-wide audit trail.
What you will build
- Pre-publish gate spec (per portfolio): triggers (page-author submit, batch publish queue, scheduled publish cron), multi-step orchestration, a fallback for vendor failures, and a bypass rule that writes to the audit trail.
- Cross-page similarity detection: MinHash LSH (shingle sizes 3, 5, and 10; 128 or 256 permutations), SimHash, Levenshtein distance, cosine similarity over Sentence-Transformer, OpenAI text-embedding-3, and Cohere embed-v3 embeddings, percentile ranking across pages, and cluster detection with DBSCAN or HDBSCAN.
- Doorway-page detection: checks against Google Quality Rater Guidelines Section 7.4 and the Helpful Content System as of the September 2023 Helpful Content Update; LLM classification into doorway patterns (templated thin content that only swaps keyword permutations, low-effort content written to target search engines, multiple pages with substantially the same content, near-duplicate affiliate content, AI mass-produced content without human editing); confidence scoring; and an LLM rewrite recommendation.
- LLM-as-judge distinctness evaluation: multiple judge models (GPT-4o, Claude Sonnet, Gemini Pro), a prompt template, a distinctness rubric (unique perspective, original research, local-context grounding, expert attribution, information density, helpfulness score), an output schema, ensemble consensus, and citation grounding.
- AI-generated-content detection: a multi-detector ensemble (Originality.ai, GPTZero, Sapling, Writer.com, Content at Scale, ZeroGPT, and the now-deprecated OpenAI Classifier), perplexity and burstiness detection, stylometric detection, watermark detection (OpenAI, Anthropic, Google) where a provider makes one available, confidence scoring, multi-detector consensus, and classification of AI-only versus human-edited text.
- Thin-content detection, severity tiering, and pre-publish decisioning: word-count, density, and information-gain specs; an LLM helpfulness evaluation aligned to the Helpful Content System; question-coverage completeness; actionability; severity tiers (Tier 1 Critical: doorway, thin, or AI mass-produced; Tier 2 High: cross-page similarity over 70%; Tier 3 Medium: 50-70%; Tier 4 Low: borderline stylistic); a confidence score; a Google HCU and spam-update risk estimate; a decision per tier (Tier 1 blocked, Tier 2 flagged for rewrite, Tier 3 warned with a suggestion, Tier 4 allowed with monitoring); a stakeholder-override rule; and a bypass rule with audit trail.
- Pre-publish routing, rewrite recommendation, post-publish monitoring, and Helpful Content System compliance attestation. Routing: Tier 1 goes to the SEO Director as a block; Tier 2 goes to the content team for rewrite; Tier 3 goes to the content team as a suggestion; Tier 4 goes to the content team for monitoring; each route carries an SLA. Rewrite recommendation: LLM-generated suggestions (unique angle, local context, expert attribution, original research, information density, helpfulness) with prioritization and an implementation-cost estimate. Monitoring: Google Search Console impression, CTR, and position anomaly detection; correlation with Google algorithm updates; deindexing detection; helpful-content-update and spam-update impact tracking; takedown or rewrite recommendations. Attestation: compliance with the Helpful Content System updates of September 2022, December 2022, and September 2023 and with the March 2024 Core and Spam updates; people-first content; EEAT (Experience, Expertise, Authoritativeness, Trustworthiness); author credentials; and original research.
Why a single Originality.ai account breaks at programmatic-SEO scale
Originality.ai's AI detection gives you one primitive: an AI score per document, per account. Copyscape, Plagspotter, Plagscan, Quetext, Grammarly, Turnitin, iThenticate, GPTZero, Sapling, Writer.com, Content at Scale, and ZeroGPT likewise each ship their own native content-distinctness primitive behind a single account.
For one page checked against one vendor, a per-document AI score is enough. At programmatic-SEO scale (200 locations, 1,000-plus pages) you also need multi-step pre-publish gate orchestration, cross-page similarity detection (MinHash LSH, Sentence-Transformer), doorway-page detection against Google Quality Rater Guidelines Section 7.4, a multi-LLM judge ensemble, a multi-detector AI-content ensemble, an LLM helpfulness spec for thin content, alignment with the September 2023 Helpful Content Update, severity tiering, a block/flag/warn/allow decision, routing by severity and stakeholder, LLM rewrite recommendations, post-publish Google Search Console anomaly detection, and FDD trademark compliance.
Without that layer, content-distinctness signals stay fragmented across vendors, and the operator has no visibility into pre-publish orchestration, cross-page similarity, doorway-page detection, LLM-as-judge evaluation, multi-detector ensembles, thin content, severity tiering, stakeholder routing, rewrite recommendations, post-publish monitoring, or Helpful Content System compliance.
The operator-side architecture that sits above the per-vendor content-distinctness primitives consists of a pre-publish gate spec, cross-page similarity detection, doorway-page detection, LLM-as-judge distinctness evaluation, AI-generated-content detection, thin-content detection, severity tiering, pre-publish decisioning, pre-publish routing, pre-publish rewrite recommendation, post-publish monitoring, Helpful Content System compliance attestation, and a portfolio audit trail.
What is in market today
Content-similarity vendors
Copyscape, Plagspotter, Plagscan, Quetext, Grammarly Plagiarism, Turnitin, iThenticate, ProWritingAid, Unicheck, PlagiarismCheck, Plagtracker. Each offers a per-account, per-document primitive. None offers cross-page similarity (MinHash LSH, SimHash, Sentence-Transformer, OpenAI or Cohere embeddings), percentile ranking, or cluster detection as its primitive.
AI-content-detector vendors
Originality.ai, GPTZero, Sapling AI Content Detector, Writer.com AI Content Detector, Content at Scale AI Detector, ZeroGPT, OpenAI Classifier (deprecated), Hugging Face detection models, Crossplag, Compilatio. Each offers a per-account, per-document AI score. None offers a multi-detector ensemble with perplexity and burstiness, stylometric, and watermark (OpenAI, Anthropic, Google) detection, consensus scoring, or AI-versus-human-edit classification as its primitive.
LLM-as-judge vendors
OpenAI GPT-4o, Anthropic Claude Sonnet, Google Gemini Pro, Cohere Command R+, AWS Bedrock Guardrails, LangChain Evaluation, Promptfoo, Braintrust, LangSmith, Helicone, Phoenix (Arize), W&B (Weights & Biases) Weave. Each offers a per-API-key, per-call primitive. None offers a distinctness rubric (unique perspective, original research, local context, expert attribution, information density, helpfulness) with multi-LLM ensemble consensus and citation grounding as its primitive.
Google Search Console anomaly vendors
Google Search Console, Ahrefs (GSC connector), Semrush (GSC connector), SEOmonitor, AccuRanker, Wincher, BrightEdge, Conductor, ContentKing, Indexsy, Plerdy. Each offers a per-account, per-property impression history. None offers impression anomaly detection, CTR anomaly detection, position volatility, correlation with Google algorithm updates, deindexing detection, or helpful-content-update and spam-update impact tracking as its primitive.
How the architecture is built
- Pre-publish gate triggers and orchestration. Author submit, batch queue, and scheduled cron triggers; multi-step orchestration; vendor-failure fallback; bypass with audit.
- Per-page content extraction. Isolate the main content, then strip boilerplate and template text.
- Cross-page similarity detection. MinHash LSH, SimHash, Levenshtein distance, cosine similarity over Sentence-Transformer, OpenAI text-embedding-3, and Cohere embed-v3 embeddings, percentile ranking, and DBSCAN or HDBSCAN clustering.
- Doorway-page detection. Google Quality Rater Guidelines Section 7.4, the September 2023 Helpful Content Update, LLM classification, and confidence scoring.
- LLM-as-judge distinctness evaluation. GPT-4o, Claude Sonnet, and Gemini Pro as judges; a distinctness rubric; ensemble consensus; citation grounding.
- AI-generated-content detection. A multi-detector ensemble (Originality.ai, GPTZero, Sapling, Writer.com, Content at Scale, ZeroGPT); perplexity and burstiness; stylometric and watermark detection; multi-detector consensus; AI-versus-human-edit classification.
- Thin-content detection. Word count, density, information gain, an LLM helpfulness evaluation aligned to the Helpful Content System, question coverage, and actionability.
- Severity tiering and risk estimation. Tiers 1 to 4, a confidence score, and a Google HCU and spam-update risk estimate.
- Pre-publish decisioning. Tier 1 blocked, Tier 2 flagged for rewrite, Tier 3 warned with a suggestion, Tier 4 allowed with monitoring; stakeholder override; bypass with audit.
- Pre-publish routing. Tier 1 to the SEO Director; Tier 2 to the content team for rewrite; Tier 3 to the content team as a suggestion; Tier 4 to the content team for monitoring; an SLA per tier.
- Pre-publish rewrite recommendation. Unique angle, local context, expert attribution, original research, information density, and helpfulness, with prioritization and an implementation-cost estimate.
- Post-publish monitoring. Google Search Console impression and CTR anomalies, position volatility, correlation with Google algorithm updates, deindexing detection, helpful-content-update and spam-update impact, and takedown or rewrite recommendations.
- Helpful Content System compliance attestation and portfolio audit trail. The September 2022, December 2022, and September 2023 Helpful Content updates and the March 2024 Core and Spam updates; people-first, EEAT, author-credentials, and original-research attestations; a pre-publish helpful-content LLM evaluation; CSV, SOC 2, and FDD exports; immutable storage.
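The audit-trail item at the end of the list above can be illustrated with a small hash-chained log. This is a minimal sketch, not the agent's actual storage layer: the `AuditTrail` class, its field names, and the `GENESIS_HASH` placeholder are all hypothetical, and the log lives in memory only. Real immutable storage would need a write-once backend, which is outside this sketch.

```python
import csv
import hashlib
import io
import json

GENESIS_HASH = "0" * 64  # assumed placeholder "previous hash" for the first entry


def _digest(body):
    """Hash a canonical JSON form so the same body always hashes the same."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


class AuditTrail:
    """Append-only log; each entry commits to the hash of the entry before it."""

    def __init__(self):
        self.entries = []

    def append(self, page_url, event, detail):
        prev_hash = self.entries[-1]["hash"] if self.entries else GENESIS_HASH
        body = {"page_url": page_url, "event": event, "detail": detail, "prev_hash": prev_hash}
        entry = {**body, "hash": _digest(body)}
        self.entries.append(entry)
        return entry

    def verify(self):
        """Return False if any entry was edited or the chain order was broken."""
        prev_hash = GENESIS_HASH
        for entry in self.entries:
            body = {k: v for k, v in entry.items() if k != "hash"}
            if body["prev_hash"] != prev_hash or _digest(body) != entry["hash"]:
                return False
            prev_hash = entry["hash"]
        return True

    def to_csv(self):
        """Export the trail as CSV text with one header row."""
        buf = io.StringIO()
        writer = csv.writer(buf)
        writer.writerow(["page_url", "event", "detail", "prev_hash", "hash"])
        for e in self.entries:
            writer.writerow([e["page_url"], e["event"],
                             json.dumps(e["detail"], sort_keys=True),
                             e["prev_hash"], e["hash"]])
        return buf.getvalue()
```

Because every entry includes the previous entry's hash, editing an old decision after the fact makes `verify()` fail, which is the property an audit export needs to demonstrate.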
Frequently asked questions
What is a pre-publish content distinctness gate for programmatic SEO?
A pre-publish content distinctness gate evaluates every page in a portfolio before it goes live. It combines a pre-publish gate spec, cross-page similarity detection, doorway-page detection, LLM-as-judge distinctness evaluation, AI-generated-content detection, thin-content detection, severity tiering, a block/flag/warn/allow decision, routing to the responsible stakeholder, rewrite recommendations, post-publish monitoring, Helpful Content System compliance attestation, and a portfolio-wide audit trail.
The gate spec itself defines the triggers (page-author submit, batch publish queue, scheduled publish cron), the multi-step orchestration, the fallback used when a vendor fails, and the bypass rule with its audit trail.
Vendors in the content-distinctness category include Originality.ai, Copyscape, Plagspotter, Plagscan, Quetext, Grammarly Plagiarism, Turnitin, iThenticate, GPTZero, Sapling AI Content Detector, Writer.com AI Content Detector, Content at Scale AI Detector, ZeroGPT, Hugging Face Sentence Transformers, and OpenAI Classifier.
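The trigger, orchestration, and vendor-failure fallback described in this answer might be wired together roughly as follows. This is a sketch under stated assumptions: the trigger names, the `StepResult` shape, and the `run_step` and `run_gate` helpers are hypothetical, and each check is modeled as a plain function from page text to a score rather than as a real vendor client.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

# Assumed identifiers for the three triggers named in the gate spec.
TRIGGERS = {"author_submit", "batch_queue", "scheduled_cron"}

Check = Callable[[str], float]


@dataclass
class StepResult:
    step: str
    score: Optional[float]  # None when neither check produced a score
    source: str             # "primary", "fallback", or "unavailable"


def run_step(name: str, text: str, primary: Check,
             fallback: Optional[Check] = None) -> StepResult:
    """Try the primary check; on any failure try the fallback, if one exists."""
    try:
        return StepResult(name, primary(text), "primary")
    except Exception:
        if fallback is not None:
            try:
                return StepResult(name, fallback(text), "fallback")
            except Exception:
                pass
    return StepResult(name, None, "unavailable")


def run_gate(trigger: str, text: str,
             steps: List[Tuple[str, Check, Optional[Check]]]) -> List[StepResult]:
    """Run every step in order and report which source answered each one."""
    if trigger not in TRIGGERS:
        raise ValueError(f"unknown trigger: {trigger}")
    return [run_step(name, text, primary, fallback)
            for name, primary, fallback in steps]
```

Recording the `source` of each score lets the later decision step treat an "unavailable" result as a reason to hold the page rather than as a silent pass; how strictly to treat it is a policy choice the source leaves open.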
Why does a single Originality.ai account break down at programmatic-SEO scale?
Originality.ai's AI detection gives you one primitive: an AI score per document, per account. Copyscape, Plagspotter, Plagscan, Quetext, Grammarly, Turnitin, iThenticate, GPTZero, Sapling, Writer.com, Content at Scale, and ZeroGPT each ship their own native content-distinctness primitive in the same single-account shape. For one page checked against one vendor, that is enough. At programmatic-SEO scale (200 locations, 1,000-plus pages) you also need multi-step pre-publish gate orchestration, cross-page similarity detection (MinHash LSH, Sentence-Transformer), doorway-page detection against Google Quality Rater Guidelines Section 7.4, a multi-LLM judge ensemble, a multi-detector AI-content ensemble, an LLM helpfulness spec for thin content, Helpful Content System compliance aligned to the September 2023 Helpful Content Update, severity tiering, a block/flag/warn/allow decision, routing by severity and stakeholder, LLM-generated rewrite recommendations, post-publish Google Search Console anomaly detection, and FDD trademark compliance.
How do cross-page similarity detection and doorway-page detection work?
Cross-page similarity detection starts with per-page content extraction: isolate the main content block, then strip boilerplate and template text. The extracted text is compared across the portfolio using MinHash Locality-Sensitive Hashing (shingle sizes 3, 5, and 10; 128 or 256 permutations), SimHash, Levenshtein distance, and cosine similarity over Sentence-Transformer, OpenAI text-embedding-3, and Cohere embed-v3 embeddings. Each page then gets a percentile rank against the rest of the portfolio, and DBSCAN or HDBSCAN groups near-duplicates into clusters.
Doorway-page detection applies a spec based on Google Quality Rater Guidelines Section 7.4 and the September 2023 Helpful Content Update. An LLM classifies each page against five doorway patterns: templated thin content that only swaps keyword permutations, low-effort content written to target search engines, multiple pages with substantially the same content, near-duplicate affiliate content, and AI mass-produced content without human editing. Each classification carries a confidence score and an LLM rewrite recommendation.
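To make the MinHash step concrete, here is a minimal pure-Python sketch that estimates Jaccard similarity between two pages from 128-value signatures over 3-word shingles. It covers signature comparison only and does not build the LSH banding index that a production system would use to avoid comparing every pair. The function names and the choice of seeded BLAKE2b hashes in place of true permutations are assumptions made for illustration.

```python
import hashlib
import re


def shingles(text, k=3):
    """Return the set of k-word shingles in lowercased text."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < k:
        return {" ".join(words)} if words else set()
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}


def _seeded_hash(shingle, seed):
    """Deterministic 64-bit hash; one seed stands in for one permutation."""
    data = f"{seed}:{shingle}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(text, k=3, num_perm=128):
    """For each seed, keep the minimum hash value over all shingles."""
    sh = shingles(text, k)
    if not sh:
        raise ValueError("text contains no tokens")
    return [min(_seeded_hash(s, seed) for s in sh) for seed in range(num_perm)]


def estimated_jaccard(sig_a, sig_b):
    """The fraction of matching signature slots estimates Jaccard similarity."""
    if len(sig_a) != len(sig_b):
        raise ValueError("signatures must have the same length")
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)


def exact_jaccard(text_a, text_b, k=3):
    """Exact Jaccard over shingle sets, useful for checking the estimate."""
    a, b = shingles(text_a, k), shingles(text_b, k)
    return len(a & b) / len(a | b)
```

Two location pages that differ only in the city name share most of their shingles, so their estimated similarity lands well above the tier thresholds, while unrelated pages score near zero. Signatures are fixed-length, so they can be stored per page and compared without re-reading the page text.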
What do LLM-as-judge evaluation, AI-content detection, and thin-content detection do?
LLM-as-judge distinctness evaluation sends each page to multiple judge models (GPT-4o, Claude Sonnet, Gemini Pro) with a shared prompt template. Each judge scores the page against a distinctness rubric (unique perspective, original research, local-context grounding, expert attribution, information density, helpfulness score) and returns its result in a fixed output schema. The scores are then combined by ensemble consensus and checked for citation grounding.
AI-generated-content detection runs a multi-detector ensemble (Originality.ai, GPTZero, Sapling, Writer.com, Content at Scale, ZeroGPT, and the now-deprecated OpenAI Classifier) alongside perplexity and burstiness detection, stylometric detection, and watermark detection (OpenAI, Anthropic, Google) where a provider makes one available. The outputs feed confidence scoring, multi-detector consensus, and a classification of AI-only versus human-edited text.
Thin-content detection applies a word-count spec, a density spec, an information-gain spec, an LLM helpfulness evaluation aligned to the Helpful Content System, a question-coverage completeness spec, and an actionability spec.
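One way to combine judge outputs into an ensemble consensus is to take the median per rubric dimension and flag dimensions where the judges disagree widely. The sketch below assumes each judge has already returned a score from 0 to 5 for every dimension. The 0 to 5 scale, the `max_spread` threshold, and the function names are illustrative assumptions, and no model API is called.

```python
from statistics import median

# The six rubric dimensions named in the distinctness rubric.
RUBRIC = (
    "unique_perspective",
    "original_research",
    "local_context",
    "expert_attribution",
    "information_density",
    "helpfulness",
)


def validate(scores):
    """Reject judge output that misses a dimension or leaves the 0-5 range."""
    missing = [d for d in RUBRIC if d not in scores]
    if missing:
        raise ValueError(f"missing rubric dimensions: {missing}")
    for d in RUBRIC:
        if not 0 <= scores[d] <= 5:
            raise ValueError(f"{d} out of range: {scores[d]}")
    return {d: float(scores[d]) for d in RUBRIC}


def consensus(judge_scores, max_spread=2.0):
    """Median per dimension; a dimension is disputed when judges differ widely."""
    if not judge_scores:
        raise ValueError("at least one judge is required")
    validated = [validate(s) for s in judge_scores.values()]
    dimensions = {}
    for d in RUBRIC:
        values = [s[d] for s in validated]
        dimensions[d] = {
            "score": median(values),
            "disputed": max(values) - min(values) > max_spread,
        }
    overall = sum(v["score"] for v in dimensions.values()) / len(RUBRIC)
    return {
        "dimensions": dimensions,
        "overall": overall,
        "needs_human_review": any(v["disputed"] for v in dimensions.values()),
    }
```

The median keeps one outlier judge from dragging a score up or down, and the disputed flag gives a natural hook for sending a page to human review instead of trusting an averaged number. The same pattern could be reused for the detector ensemble, with detector scores in place of rubric scores.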
What do severity tiering, pre-publish decisioning, routing, and rewrite recommendation do?
Severity tiering assigns each page one of four tiers: Tier 1 Critical (doorway page, thin content, or AI mass-produced), Tier 2 High (cross-page similarity over 70 percent), Tier 3 Medium (cross-page similarity of 50 to 70 percent), and Tier 4 Low (borderline stylistic). Each page also gets a confidence score and a Google HCU and spam-update risk estimate.
Pre-publish decisioning maps the tier to a decision: Tier 1 is blocked before publish, Tier 2 is flagged for rewrite, Tier 3 is warned with a rewrite suggestion, and Tier 4 is allowed with monitoring. A stakeholder-override rule and a bypass rule with audit trail cover exceptions.
Pre-publish routing sends each tier to the right stakeholder under a per-severity SLA: Tier 1 to the SEO Director as a block, Tier 2 to the content team for rewrite, Tier 3 to the content team as a suggestion, and Tier 4 to the content team for monitoring.
Pre-publish rewrite recommendation produces an LLM-generated recommendation per page (add a unique angle, add local-context grounding, add expert attribution, add original research, improve information density, improve helpfulness), then prioritizes the rewrites and estimates their implementation cost.
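The tier, decision, and routing rules in this answer reduce to a small lookup. The sketch below assumes the upstream detectors have already produced boolean flags and a maximum cross-page similarity between 0.0 and 1.0. The `PageSignals` fields, the policy labels, and the override behavior are hypothetical. The source does not say where pages below 50 percent similarity with no other findings belong, so this sketch places them in Tier 4.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PageSignals:
    is_doorway: bool
    is_thin: bool
    is_ai_mass_produced: bool
    max_similarity: float  # highest cross-page similarity, 0.0 to 1.0


# tier -> (decision, stakeholder), following the tier descriptions above.
TIER_POLICY = {
    1: ("block", "seo_director"),
    2: ("flag_for_rewrite", "content_team"),
    3: ("warn_with_suggestion", "content_team"),
    4: ("allow_with_monitoring", "content_team"),
}


def assign_tier(s: PageSignals) -> int:
    """Critical findings win; otherwise the tier follows similarity bands."""
    if s.is_doorway or s.is_thin or s.is_ai_mass_produced:
        return 1
    if s.max_similarity > 0.70:   # "over 70%" is strict, so 0.70 stays Tier 3
        return 2
    if s.max_similarity >= 0.50:
        return 3
    return 4                      # assumption: everything else is Tier 4


def decide(s: PageSignals, override_by: Optional[str] = None) -> dict:
    """Map signals to a decision; an override is recorded, never silent."""
    tier = assign_tier(s)
    decision, route_to = TIER_POLICY[tier]
    result = {"tier": tier, "decision": decision, "route_to": route_to,
              "override_by": None}
    if override_by is not None and decision != "allow_with_monitoring":
        # Assumed override semantics: publish with monitoring, keep the original.
        result.update(decision="allow_with_monitoring",
                      override_by=override_by,
                      original_decision=decision)
    return result
```

Keeping the policy in one table makes the thresholds and routes easy to audit, and the `original_decision` field preserves what the gate would have done when a stakeholder overrides it.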
What do post-publish monitoring, Helpful Content System compliance attestation, and the per-location-page-generator agent bundle do?
Post-publish monitoring watches Google Search Console for impression anomalies, CTR anomalies, and position volatility, and correlates them with Google algorithm updates. Per page, it also covers deindexing detection, helpful-content-update impact tracking, spam-update impact tracking, and takedown or rewrite recommendations.
Helpful Content System compliance attestation covers the September 2022 launch, the December 2022 and September 2023 Helpful Content updates, and the March 2024 Core and Spam updates. It records attestations for people-first content, EEAT (Experience, Expertise, Authoritativeness, Trustworthiness), author credentials, and original research, plus a pre-publish helpful-content LLM evaluation.
The per-location-page-generator agent bundle integrates the content-distinctness-gate skill with sibling skills on the same agent: location-page authoring (build pillar shipped at /how-to-build-per-location-landing-pages-at-scale; the upstream producer of the pages this gate evaluates), per-location page content cannibalization (a complementary cannibalization defense), multi-location internal linking (a consumer of distinctness-gated pages), and multi-location JSON-LD generation (also a consumer of distinctness-gated pages).
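Impression anomaly detection and deindexing detection can be approximated with simple statistics over a page's daily impression counts. The sketch below assumes those counts have already been exported from Google Search Console into a plain list and makes no API calls. The trailing-window z-score method, the 14-day window, the threshold of 3, the `min_sigma` floor, and the seven-day zero rule are all illustrative assumptions rather than the agent's documented method.

```python
from statistics import mean, stdev


def impression_anomalies(daily, window=14, z_threshold=3.0, min_sigma=1.0):
    """Flag days whose impressions deviate sharply from the trailing window.

    Returns a list of (day_index, z_score) tuples.
    """
    if window < 2:
        raise ValueError("window must be at least 2")
    anomalies = []
    for i in range(window, len(daily)):
        baseline = daily[i - window:i]
        mu = mean(baseline)
        # Floor sigma so a perfectly flat baseline does not divide by zero.
        sigma = max(stdev(baseline), min_sigma)
        z = (daily[i] - mu) / sigma
        if abs(z) >= z_threshold:
            anomalies.append((i, z))
    return anomalies


def looks_deindexed(daily, zero_days=7):
    """Heuristic: impressions existed earlier, then stayed at zero."""
    if len(daily) <= zero_days:
        return False
    recent, earlier = daily[-zero_days:], daily[:-zero_days]
    return all(v == 0 for v in recent) and any(v > 0 for v in earlier)
```

A flagged day is only a prompt to investigate. Correlating the flagged dates against known algorithm-update dates, as the monitoring step describes, is what separates a site-specific problem from a broad ranking shift.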
Engage the per-location-page-generator agent
The agent ships the full gate as an orchestration layer above the content-similarity, AI-content-detector, LLM-as-judge, and Google Search Console anomaly vendors you already use. For every page in your portfolio, that layer provides the pre-publish gate spec, cross-page similarity detection, doorway-page detection, LLM-as-judge distinctness evaluation, AI-generated-content detection, thin-content detection, severity tiering, pre-publish decisioning, routing, rewrite recommendations, post-publish monitoring, Helpful Content System compliance attestation, and a portfolio audit trail.
Related reading
- Programmatic SEO (parent commercial pillar — buyer-outcome framing)
- Per-location landing pages at scale (sibling build-pillar on per-location-page-generator agent — upstream producer of pages this gate evaluates)
- Cross-location cannibalization detection (companion architecture — distinctness substrate for cannibalization defense)