
Get-found swarm · Internal Link Orchestration Agent · Orphan-page-detection skill · Build pillar · Published August 21, 2026

How to find orphan pages at multi-location scale

This guide explains how to architect the orphan-page-detection skill on the internal-link-orchestration agent, end to end, at multi-location find-orphan-pages scale. The architecture is defined per portfolio and per banner and consists of a per-canonical-crawl-source-pointer, per-canonical-truth-source-pointer, per-canonical-orphan-class-spec, per-canonical-detection-engine-spec, per-canonical-revenue-impact-estimation-spec, per-canonical-remediation-routing-spec, per-canonical-link-equity-flow-spec, per-canonical-compliance-gate-spec, per-canonical-audit-trail, and per-portfolio-audit-trail.

What you will build

  • Per-portfolio per-banner per-canonical-crawl-source-pointer across 15+ crawler vendors — Screaming Frog SEO Spider + Sitebulb + OnCrawl + DeepCrawl/Lumar + Botify + JetOctopus + ContentKing/Conductor + Ahrefs Site Audit + Semrush Site Audit + Moz Pro Crawler + Sistrix Optimizer + Searchmetrics + Sitechecker.pro + Audisto + Ryte.
  • Per-canonical-truth-source-pointer across 10+ source-of-truth datasets that define "what should exist": XML sitemap + sitemap_index.xml + per-sub-sitemap + database/CMS source (Shopify + WordPress + Sanity + Contentful + Strapi + Webflow + Next.js generated + Sitecore + AEM Adobe Experience Manager) + GSC Performance API URLs + GBP per-location URL list + PIM Akeneo/Salsify/inriver/Pimcore SKU list + internal CRM customer-facing URL list + Vercel/Cloudflare/CDN access logs + server logs Splunk/ELK/DataDog + backlinks file Ahrefs/Semrush/Majestic/Moz Link Explorer/Cognitive SEO + historical 200-OK URL list.
  • Per-canonical-orphan-class-spec with 10 classes: true-orphan (in DB + 200 OK + zero internal links across crawl) + sitemap-only-orphan (sitemap.xml + zero internal links) + database-only-orphan (CMS + not in sitemap + not internally linked) + GSC-orphan (the highest-value diff: Google sees impressions/clicks but internal link graph hides the page) + backlink-only-orphan (third-party linked but not internally) + render-mode-orphan (server-rendered + missing from client-side React/Vue nav after hydration) + pagination-orphan (Google stopped using rel="next"/rel="prev" as an indexing signal in 2019, and canonical-to-page-1 misuse is common) + faceted-navigation-orphan + per-vertical-orphan (MedicalBusiness/FinancialService/Restaurant-Menu schema gap) + stranded-orphan (no link + no inbound external + no traffic = candidate for 410-gone) + per-orphan-confidence-tier.
  • Per-canonical-detection-engine-spec — per-sitemap-vs-crawl-diff + per-database-vs-crawl-diff + per-GSC-vs-crawl-diff (highest-value diff) + per-backlinks-vs-crawl-diff + per-server-log-vs-crawl-diff + per-historical-200-OK-vs-current-crawl-diff + per-set-theoretic-intersection-union + per-continuous-per-N-hour-recrawl + per-on-deploy-recrawl + per-on-CMS-change-event-driven (Sanity/Contentful/Strapi/Webflow/Shopify/WordPress webhook) + per-on-sitemap-change-event-driven + per-detection-confidence-tier.
  • Per-canonical-revenue-impact-estimation-spec: per-baseline-organic-traffic-projection (Bayesian prior from comparable non-orphan + cohort mean from similar template + GSC existing impression floor) + per-CTR-uplift-from-internal-link-equity-flow + per-conversion-rate + per-revenue-per-visit + per-Bayesian-PyMC-Stan-NumPyro-bambi + per-causal-uplift-CATE-T-S-X-DR-learner + per-synthetic-control + per-difference-in-differences-DiD + per-regression-discontinuity-at-internal-link-add-cutoff + per-Monte-Carlo-simulation + per-sensitivity-analysis + per-loss-from-orphan-status-per-page-per-month + per-revenue-impact-confidence-tier + per-FBC-feedback-loop.
  • Per-canonical-remediation-routing-spec: per-auto-add-to-sitemap + per-auto-add-internal-link-from-semantically-related-parent (LLM-ensemble classifier) + per-auto-fix-canonical + per-auto-301-redirect (truly defunct) + per-auto-410-gone (stranded) + per-auto-noindex + per-auto-restore (CMS-only + traffic-demand) + per-editorial-review-queue + per-five-destination-routing-handoff + per-per-vertical-remediation-modifier (healthcare HIPAA + financial FINRA + cannabis state board) + per-multi-arm-bandit-UCB-Thompson + per-remediation-confidence-tier.
  • Per-canonical-link-equity-flow-spec: per-PageRank-style-equity-scoring + per-authority-delta-before-vs-after-orphan-resolution + per-anchor-text-diversity-check (Helmer-Yule index + exact-match/partial-match/branded/naked-URL distribution) + per-per-template-internal-link-allocation-budget + per-link-equity-confidence-tier.
  • Per-canonical-compliance-gate-spec — per-Google-Terms-of-Service + per-robots.txt-respect + per-CMS-API-rate-limit + per-WCAG-2.2-AA + per-ARIA + per-EAA-EN-301-549 + per-Section-508 + per-ADA-Title-III + per-FTC-substantiation-stale-claims + per-FTC-Made-in-USA-stale + per-FTC-Green-Guides-stale + per-FTC-Health-Products-Compliance-Guide-stale + per-FTC-fake-review-rule-of-2024-stale-AggregateRating + per-HIPAA-stale-PHI (orphan healthcare pages may have pre-Safe-Harbor PHI — auto-restore must include de-identification re-scan) + per-FINRA-2210-stale-disclaimer (FINRA-regulated-communication time-bomb requires supervisory review pre-use) + per-FDA-OPDP-Rx-drug-stale-indications-contraindications-black-box-warnings + per-FDA-Part-101-nutrition-label-stale + per-FDA-Part-117-food-safety-stale + per-FDA-cosmetic-rule-stale + per-cannabis-state-board-stale-claim + per-alcohol-TABC-CalABC-SLA-stale-warning + per-Surgeon-General-warning-stale + per-tobacco-FDA-stale + per-California-Prop-65-stale + per-CCPA-CPRA-stale-privacy-disclosure + per-GDPR-Article-13-14-stale-information-notice + per-EU-AI-Act-Article-50-stale-AI-disclosure + per-Digital-Services-Act-Article-30 + per-NIST-AI-RMF + per-ISO-42001 + per-ISO-27001 + per-SOC-2-Type-II + per-OPA-Cedar-Casbin-Cerbos-Oso-policy-as-code + per-compliance-confidence-tier.
  • Per-canonical-cross-skill-handoff + per-canonical-audit-trail — per-handoff-to-30-sibling-skills + per-per-orphan-canonical-audit-record + per-immutable-WORM-storage + per-7-year-IRS-tax-retention + per-7-year-FTC-substantiation-retention + per-7-year-HIPAA-medical-record-retention + per-6-year-SEC-record-retention + per-3-year-FINRA-record-retention.

Why per-vendor-Screaming-Frog-account-flat-crawl-snapshot breaks at multi-location find-orphan-pages scale

A per-vendor Screaming Frog account ships a flat crawl snapshot as its primitive. Typically an SEO opens Screaming Frog on a laptop, points it at the homepage, lets the crawl run for 6-12 hours, exports a CSV of pages found by following internal links, manually imports the sitemap, and diffs the two lists in Excel. That workflow has:

  • No per-canonical-crawl-source taxonomy across the 15+ crawler vendors.
  • No per-canonical-truth-source-pointer across the 10+ source-of-truth datasets.
  • No per-canonical-orphan-class taxonomy across true-orphan/sitemap-only/database-only/GSC-orphan/backlink-only/render-mode/pagination/faceted-nav/per-vertical/stranded.
  • No per-canonical-detection-engine resolving sitemap-vs-crawl-diff/database-vs-crawl-diff/GSC-vs-crawl-diff/backlinks-vs-crawl-diff/server-log-vs-crawl-diff/set-theoretic-intersection-union/continuous/on-deploy/on-CMS-change/on-sitemap-change.
  • No per-orphan revenue-impact estimation resolving Bayesian-PyMC-Stan-NumPyro-bambi/causal-uplift-CATE/synthetic-control/DiD/regression-discontinuity/Monte-Carlo/sensitivity-analysis/loss-from-orphan-status.
  • No per-canonical-remediation-routing resolving auto-add-to-sitemap/auto-add-internal-link-from-semantically-related-parent/auto-fix-canonical/auto-301/auto-410/auto-noindex/auto-restore/editorial-review/five-destination-routing.
  • No per-canonical-link-equity-flow resolving PageRank-style-equity-scoring/authority-delta/anchor-text-diversity/internal-link-allocation-budget.
  • No per-orphan compliance gate with HIPAA stale-PHI / FINRA 2210 stale-disclaimer / FDA OPDP stale-Rx-page / FDA Part 101 stale-nutrition / FDA Part 117 stale-food-safety / cannabis state board stale-claim / alcohol TABC stale-warning / EU AI Act Article 50 / Digital Services Act Article 30 / WCAG / ADA enforcement.
  • No per-orphan audit trail with regulatory-defense retention.

Sitebulb, OnCrawl, DeepCrawl/Lumar, Botify, JetOctopus, ContentKing/Conductor, Ahrefs Site Audit, Semrush Site Audit, Moz Pro Crawler, Sistrix Optimizer, Searchmetrics, Sitechecker.pro, Audisto, and Ryte each ship their own native account-level flat-crawl-snapshot primitive.

At one-site, one-snapshot scale, the per-account flat-crawl-snapshot primitive is enough. At multi-location find-orphan-pages scale, the operator needs a per-canonical-crawl-source-pointer + per-canonical-truth-source-pointer + per-canonical-orphan-class-spec + per-canonical-detection-engine-spec + per-canonical-revenue-impact-estimation-spec + per-canonical-remediation-routing-spec + per-canonical-link-equity-flow-spec + per-canonical-compliance-gate-spec + per-canonical-audit-trail.

The stale-compliance-content anchor is the operationally distinctive constraint for multi-location operators. A 50-500-location operator that accumulates 4,000-12,000 orphan URLs per quarter routinely carries stale FTC substantiation, stale HIPAA-compliant text (pre-Safe-Harbor PHI), stale FINRA 2210 disclaimers, stale FDA OPDP Rx-drug indications/contraindications/black-box-warnings, stale FDA Part 101 nutrition labels, stale cannabis state-board claims, stale alcohol TABC and Surgeon General warnings, and stale California Prop 65 warnings on those orphan pages. When the orphan-detection engine auto-restores or auto-relinks these pages, the system must re-scan them for regulatory currency or risk re-exposing the operator to enforcement actions across the entire portfolio. Per-vendor account-flat-crawl-snapshot primitives produce a list of URLs with no stale-compliance-content awareness: they auto-suggest internal-link additions that propagate years-old regulatory violations across hundreds of locations.
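
The re-scan requirement above can be sketched as a gate that sits in front of every auto-restore or auto-relink. This is a minimal sketch under stated assumptions: `gate_relink`, the finding labels, and the `scan` callable are hypothetical names, the scanners themselves are not shown, and sending unrecognized findings to editorial review is an assumption of the sketch.

```python
from typing import Callable, Iterable

# Hypothetical finding labels mapped to review routes for three verticals
# (healthcare, financial, cannabis). Real scanners would emit their own labels.
REVIEW_ROUTES = {
    "stale-phi": "remediate-immediately",
    "stale-finra-2210-disclaimer": "supervisory-review",
    "stale-state-board-claim": "legal-review",
}


def gate_relink(url: str, scan: Callable[[str], Iterable[str]]) -> dict:
    """Re-scan an orphan before auto-restore/auto-relink; hold it on any finding."""
    findings = sorted(set(scan(url)))
    # Assumption: findings without a dedicated route fall back to editorial review.
    holds = sorted({REVIEW_ROUTES.get(f, "editorial-review") for f in findings})
    return {"url": url, "allowed": not findings, "findings": findings, "holds": holds}
```

The gate is deliberately fail-closed: any finding blocks the automatic action, and the `holds` list says who has to look at the page before it is linked again.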

GSC-orphan (URLs Google sees impressions for but where the internal link graph hides the page) is the second operationally distinctive class: the GSC-orphan is the highest-value remediation target because traffic potential already exists and only the internal-link-equity-flow gap is throttling realization. Per-vendor flat-crawl primitives that do not ingest GSC Performance API cannot detect this class at all.
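
A GSC-vs-crawl diff of this kind reduces to a set difference once URLs are normalized. The sketch below assumes the GSC rows have already been exported into dicts with `url` and `impressions` keys (a hypothetical shape, not the Search Console API response format) and that the crawl output is a plain list of URLs reached by following internal links.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize(url: str) -> str:
    """Lowercase scheme and host, drop the fragment, strip a trailing slash."""
    parts = urlsplit(url.strip())
    path = parts.path.rstrip("/") or "/"
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, parts.query, ""))


def gsc_orphans(gsc_rows, crawled_urls):
    """Return (url, impressions) for GSC URLs the internal-link crawl never reached."""
    crawled = {normalize(u) for u in crawled_urls}
    hits = {}
    for row in gsc_rows:
        url = normalize(row["url"])
        if row["impressions"] > 0 and url not in crawled:
            hits[url] = hits.get(url, 0) + row["impressions"]
    # Highest-impression orphans first: they are the most valuable to relink.
    return sorted(hits.items(), key=lambda kv: kv[1], reverse=True)
```

Normalizing before the diff matters: without it, a trailing slash or a mixed-case host shows up as a false orphan.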

The operator-side architecture that sits above the per-vendor flat-crawl-snapshot primitive is: per-canonical-crawl-source-pointer + per-truth-source-pointer + per-orphan-class-spec + per-detection-engine-spec + per-revenue-impact-estimation-spec + per-remediation-routing-spec + per-link-equity-flow-spec + per-compliance-gate-spec + per-cross-skill-handoff + per-audit-trail + per-portfolio-audit-trail.

What is in market today

Per-platform per-SEO-crawler-vendor

Screaming Frog SEO Spider, Sitebulb, OnCrawl, DeepCrawl/Lumar, Botify, JetOctopus, ContentKing/Conductor, Ahrefs Site Audit, Semrush Site Audit, Moz Pro Crawler, Sistrix Optimizer, Searchmetrics Suite, Sitechecker.pro, Audisto, Ryte. Each ships a per-account flat-crawl-snapshot primitive. None ships the canonical stack (crawl-source pointer, truth-source pointer, orphan class, detection engine, revenue-impact estimation, remediation routing, link-equity flow, compliance gate, audit trail) as its primitive.

Per-platform per-GSC-server-log-vendor

Search Console Performance API, Bing Webmaster Tools, Cloudflare Logs, Vercel Logs, Akamai Logs, Fastly Logs, AWS CloudFront Logs, Splunk, ELK Stack, DataDog Logs, New Relic Logs, Sumo Logic, Logz.io, Screaming Frog Log File Analyser, JetOctopus Logfile, Botify Logfile, OnCrawl Logfile. Each ships a per-account flat-log-row primitive, typically blind to set-theoretic intersection/union across the truth sources (sitemap + database + GSC + backlinks + historical 200-OK URL list). None ships the canonical truth-source pointer as its primitive: XML sitemap, database/CMS (Shopify, WordPress, Sanity, Contentful, Strapi, Webflow, Next.js, Sitecore, AEM), GSC Performance API, GBP per-location URL list, PIM SKU list, internal CRM URL list, CDN access logs, server logs, backlinks file, and historical 200-OK URL list.

Per-platform per-CMS-PIM-vendor

Shopify, WordPress, Sanity, Contentful, Strapi, Webflow CMS, Next.js generated pages, Sitecore, Adobe Experience Manager, Drupal, Joomla, Squarespace, Wix, BigCommerce, Magento/Adobe Commerce, WooCommerce, Akeneo, Salsify, Productsup, inriver, Pimcore, Plytix, Syndigo. Each ships a per-account flat-CMS-row or flat-PIM-attribute primitive, typically blind to detection-engine semantics (sitemap-vs-crawl-diff + database-vs-crawl-diff + GSC-vs-crawl-diff + backlinks-vs-crawl-diff + server-log-vs-crawl-diff + set-theoretic intersection/union + continuous/on-deploy/on-CMS-change/on-sitemap-change). None ships the canonical detection engine as its primitive: sitemap-vs-crawl diff, database-vs-crawl diff, GSC-vs-crawl diff, backlinks-vs-crawl diff, server-log-vs-crawl diff, historical-200-OK-vs-current-crawl diff, set-theoretic intersection/union, continuous per-N-hour recrawl, on-deploy recrawl, event-driven recrawl on CMS change and on sitemap change, and a detection confidence tier.

Per-platform per-compliance-GRC-vendor + per-CMP-vendor

Hyperproof, Drata, Vanta, Thoropass, Tugboat Logic, Compliance.ai, Ascent RegTech, OneTrust, TrustArc, Ketch, Securiti, Privacera, Skyflow, BigID, DataGrail, Transcend, Osano, Cookiebot, Didomi, Sourcepoint, Iubenda, HIPAA Vault, Atlantic Health Solutions, Compliancy Group, HIPAA Secure Now, Foley & Lardner Lawyer Compliance, ABA Center for Professional Responsibility. Each ships a per-account flat-compliance-report or flat-consent primitive, typically blind to per-orphan stale-content semantics: stale FTC substantiation + stale FTC Made-in-USA + stale FTC Green Guides + stale FTC fake-review-rule-of-2024 AggregateRating + stale HIPAA PHI (orphan healthcare pages require a Safe Harbor re-scan) + stale FINRA 2210 disclaimer (a FINRA-regulated-communication time-bomb) + stale FDA OPDP Rx indications/contraindications/black-box warnings + stale cannabis state-board claim + stale alcohol TABC and Surgeon General warning + stale California Prop 65 + stale CCPA privacy disclosure + stale GDPR Article 13-14 information notice + stale EU AI Act Article 50 AI disclosure. None ships the canonical per-orphan stale-content gate as its primitive: FTC substantiation, FTC Made-in-USA, FTC Green Guides, FTC Health Products Compliance Guide, FTC fake-review rule of 2024, HIPAA PHI, FINRA 2210 disclaimer, FDA OPDP Rx, FDA Part 101, FDA Part 117, FDA cosmetic rule, cannabis state board, alcohol TABC, Surgeon General warning, tobacco FDA, California Prop 65, CCPA/CPRA, GDPR Article 13-14, and EU AI Act Article 50.

How the architecture is built

  1. Per-portfolio per-banner per-canonical-crawl-source-pointer-substrate. Per-15-canonical-crawl-source canonical-source.
  2. Per-portfolio per-canonical-truth-source-pointer-substrate. Per-10-canonical-truth-source canonical-source.
  3. Per-portfolio per-canonical-orphan-class-spec. Per-10-orphan-class + per-orphan-confidence-tier canonical-orphan-class.
  4. Per-portfolio per-canonical-detection-engine-spec. Per-sitemap-vs-crawl-diff + per-database-vs-crawl-diff + per-GSC-vs-crawl-diff + per-backlinks-vs-crawl-diff + per-server-log-vs-crawl-diff + per-historical-200-OK-vs-current-crawl-diff + per-set-theoretic-intersection-union + per-continuous-per-N-hour-recrawl + per-on-deploy-recrawl + per-on-CMS-change-event-driven + per-on-sitemap-change-event-driven + per-detection-confidence-tier canonical-detection.
  5. Per-portfolio per-canonical-revenue-impact-estimation-spec. Per-baseline-organic-traffic-projection + per-CTR-uplift-from-internal-link-equity + per-conversion-rate + per-revenue-per-visit + per-Bayesian-PyMC-Stan-NumPyro-bambi + per-causal-uplift-CATE + per-synthetic-control + per-difference-in-differences + per-regression-discontinuity + per-Monte-Carlo + per-sensitivity-analysis + per-loss-from-orphan-status + per-revenue-impact-confidence-tier + per-FBC-feedback-loop canonical-revenue-impact.
  6. Per-portfolio per-canonical-remediation-routing-spec. Per-auto-add-to-sitemap + per-auto-add-internal-link-from-semantically-related-parent + per-auto-fix-canonical + per-auto-301-redirect + per-auto-410-gone + per-auto-noindex + per-auto-restore + per-editorial-review-queue + per-five-destination-routing-handoff + per-per-vertical-remediation-modifier + per-multi-arm-bandit-UCB-Thompson + per-remediation-confidence-tier canonical-remediation.
  7. Per-portfolio per-canonical-link-equity-flow-spec. Per-PageRank-style-equity-scoring + per-authority-delta-before-vs-after-orphan-resolution + per-anchor-text-diversity-Helmer-Yule-index + per-per-template-internal-link-allocation-budget + per-link-equity-confidence-tier canonical-link-equity-flow.
  8. Per-portfolio per-canonical-compliance-gate-spec. Per-Google-Terms-of-Service + per-robots.txt-respect + per-CMS-API-rate-limit + per-WCAG-2.2-AA + per-ARIA + per-EAA-EN-301-549 + per-Section-508 + per-ADA-Title-III + per-FTC-substantiation-stale + per-FTC-Made-in-USA-stale + per-FTC-Green-Guides-stale + per-FTC-Health-Products-Compliance-Guide-stale + per-FTC-fake-review-rule-of-2024-stale-AggregateRating + per-HIPAA-stale-PHI + per-FINRA-2210-stale-disclaimer + per-FDA-OPDP-Rx-stale + per-FDA-Part-101-stale + per-FDA-Part-117-stale + per-FDA-cosmetic-stale + per-cannabis-state-board-stale + per-alcohol-TABC-CalABC-SLA-stale + per-Surgeon-General-warning-stale + per-tobacco-FDA-stale + per-California-Prop-65-stale + per-CCPA-CPRA-stale + per-GDPR-Article-13-14-stale + per-EU-AI-Act-Article-50-stale + per-Digital-Services-Act-Article-30 + per-NIST-AI-RMF + per-ISO-42001 + per-ISO-27001 + per-SOC-2-Type-II + per-OPA-Cedar-Casbin-Cerbos-Oso-policy-as-code canonical-compliance.
  9. Per-portfolio per-canonical-cross-skill-handoff. Per-handoff-to-30-sibling-skills canonical-handoff.
  10. Per-portfolio per-canonical-audit-trail + per-portfolio-audit-trail. Per-per-orphan-canonical-audit-record + per-immutable-WORM-storage + per-7-year-IRS-tax-retention + per-7-year-FTC-substantiation-retention + per-7-year-HIPAA-medical-record-retention + per-6-year-SEC-record-retention + per-3-year-FINRA-record-retention canonical-audit.
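
Step 7 in the list above scores link equity PageRank-style and tracks the authority delta before vs. after an orphan is resolved. A minimal sketch of that comparison follows, assuming the internal link graph is available as a dict of page to outgoing links; the function names and the damping and iteration defaults are illustrative choices, not values from this guide.

```python
def pagerank(graph, damping=0.85, iterations=50):
    """Power-iteration PageRank over {page: [linked pages]}; ranks sum to 1."""
    nodes = set(graph) | {t for targets in graph.values() for t in targets}
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}
    for _ in range(iterations):
        # Pages with no outgoing links spread their rank evenly over all pages.
        dangling = sum(rank[u] for u in nodes if not graph.get(u))
        new = {u: (1 - damping) / n + damping * dangling / n for u in nodes}
        for page, targets in graph.items():
            for target in targets:
                new[target] += damping * rank[page] / len(targets)
        rank = new
    return rank


def authority_delta(graph, parent, orphan):
    """Rank change for `orphan` if `parent` gains one internal link to it."""
    after = {page: list(links) for page, links in graph.items()}
    after.setdefault(parent, []).append(orphan)
    return pagerank(after)[orphan] - pagerank(graph)[orphan]
```

Running `authority_delta` for several candidate parents is one way to compare where a new internal link would move the most equity to the orphan.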

Frequently asked questions

What is orphan-page detection at multi-location scale — and how do you find orphan pages across 50-500 stores?

An orphan page is a URL that exists on the domain (returns 200 OK) but has no internal-link path from another page on the same domain. At multi-location scale, a 50-500-location operator with PDPs + per-location landing pages + per-vertical schema variants + per-state legal disclaimers + per-promo-period pages routinely accumulates 4,000-12,000 orphan URLs per quarter. Finding them requires, per portfolio and per banner, a per-canonical-crawl-source-pointer (Screaming Frog SEO Spider + Sitebulb + OnCrawl + DeepCrawl/Lumar + Botify + JetOctopus + ContentKing/Conductor + Ahrefs Site Audit + Semrush Site Audit + Moz Pro Crawler + Sistrix Optimizer + Searchmetrics + Sitechecker.pro + Audisto + Ryte) + a per-canonical-truth-source-pointer (XML sitemap + sitemap_index.xml + per-sub-sitemap + database/CMS source (Shopify + WordPress + Sanity + Contentful + Strapi + Webflow + Next.js generated + Sitecore + AEM Adobe Experience Manager) + GSC Performance API URLs + GBP per-location URL list + PIM Akeneo/Salsify/inriver/Pimcore SKU list + internal CRM customer-facing URL list + Vercel/Cloudflare/CDN access logs + server logs Splunk/ELK/DataDog + backlinks file Ahrefs/Semrush/Majestic/Moz Link Explorer + historical 200-OK URL list) + per-canonical-orphan-class-spec + per-canonical-detection-engine-spec + per-canonical-revenue-impact-estimation-spec + per-canonical-remediation-routing-spec + per-canonical-link-equity-flow-spec + per-canonical-compliance-gate-spec + per-canonical-audit-trail + per-portfolio-audit-trail.
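
The definition above (a known URL with no internal-link path) can be sketched as a reachability check. This is an illustrative sketch, assuming the crawl has been reduced to a dict of page to outgoing internal links and that `known_urls` is whatever truth-source list the operator trusts; `find_orphans` is a hypothetical name.

```python
from collections import deque


def find_orphans(link_graph, known_urls, start):
    """Known URLs that cannot be reached from `start` by following internal links."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in link_graph.get(page, ()):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return sorted(set(known_urls) - seen)
```

Note that a page which links out but has no inbound path is still reported: outgoing links do not make a page discoverable.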

Why does per-vendor-Screaming-Frog-canonical-account-flat-crawl-snapshot break at multi-location find-orphan-pages scale?

A per-vendor Screaming Frog account ships a flat crawl snapshot as its primitive. Typically an SEO opens Screaming Frog on a laptop, points it at the homepage, lets the crawl run for 6-12 hours, exports a CSV of pages found by following internal links, manually imports the sitemap, and diffs the two lists in Excel. That workflow has:

  • No per-canonical-crawl-source taxonomy across the 15+ crawler vendors.
  • No per-canonical-truth-source-pointer across the 10+ source-of-truth datasets that define "what should exist" (XML sitemap + database/CMS source + GSC Performance API + GBP per-location URL list + PIM SKU list + internal CRM portal pages + Vercel/Cloudflare/CDN access logs + server logs + backlinks file + historical 200-OK URL list).
  • No per-canonical-orphan-class resolving true-orphan (in DB but no internal link) vs sitemap-only-orphan vs database-only-orphan vs GSC-orphan (GSC sees traffic but no internal link path) vs backlink-only-orphan vs render-mode-orphan (server-rendered but missing from client-side nav) vs pagination-orphan vs faceted-navigation-orphan vs per-vertical-orphan vs stranded-orphan (no internal link + no inbound external + no traffic).
  • No per-canonical-detection-engine resolving sitemap-vs-crawl-diff / database-vs-crawl-diff / GSC-vs-crawl-diff / backlinks-vs-crawl-diff / server-log-vs-crawl-diff / set-theoretic intersection/union across all sources / continuous per-N-hour recrawl / on-deploy recrawl / on-CMS-change event-driven / on-sitemap-change event-driven.
  • No per-canonical-revenue-impact-estimation resolving per-page baseline organic traffic projection + per-page CTR uplift from internal-link-equity flow + per-page conversion rate + per-page revenue per visit + Bayesian PyMC/Stan/NumPyro/bambi + causal uplift CATE + synthetic control / DiD / regression discontinuity / loss-from-orphan-status estimate.
  • No per-canonical-remediation-routing resolving auto-add-to-sitemap + auto-add-internal-link-from-semantically-related-parent + auto-fix-canonical + auto-301-redirect (truly defunct) + auto-410-gone (permanently removed) + auto-noindex (intentionally orphaned) + auto-restore (CMS-only orphan + traffic data shows demand) + editorial-review-queue + five-destination-routing-handoff.
  • No per-canonical-link-equity-flow resolving PageRank-style equity scoring + per-page authority delta before-vs-after-orphan-resolution + anchor-text diversity check + per-template internal-link allocation budget.
  • No per-orphan compliance gate with HIPAA stale-PHI / FINRA 2210 stale-disclaimer / FDA OPDP stale-Rx-page / FDA Part 101 stale-nutrition-label / FDA Part 117 stale-food-safety / cannabis state board stale-claim / alcohol TABC stale-warning / EU AI Act Article 50 / Digital Services Act Article 30 / WCAG / ADA enforcement.
  • No per-orphan audit trail with regulatory-defense retention.

Sitebulb, OnCrawl, DeepCrawl/Lumar, Botify, JetOctopus, ContentKing/Conductor, Ahrefs Site Audit, Semrush Site Audit, Moz Pro Crawler, Sistrix Optimizer, Searchmetrics, Sitechecker.pro, Audisto, and Ryte each ship their own native account-level flat-crawl-snapshot primitive. At one-site, one-snapshot scale, the per-account flat-crawl-snapshot primitive is enough. At multi-location find-orphan-pages scale, the operator needs a per-canonical-crawl-source-pointer + per-canonical-truth-source-pointer + per-canonical-orphan-class-spec + per-canonical-detection-engine-spec + per-canonical-revenue-impact-estimation-spec + per-canonical-remediation-routing-spec + per-canonical-link-equity-flow-spec + per-canonical-compliance-gate-spec + per-canonical-audit-trail.

How does per-page orphan-class-detection + per-orphan revenue-impact-estimation work?

Per-portfolio per-banner per-page per-canonical-orphan-class-spec runs per-portfolio per-canonical-per-page-true-orphan (in DB + returns 200 OK + zero internal links across full site crawl) + per-canonical-per-page-sitemap-only-orphan (in sitemap.xml + zero internal links — common when sitemap auto-includes all DB rows but the CMS-rendered navigation skips a category) + per-canonical-per-page-database-only-orphan (in CMS + not in sitemap + not internally linked — common with unpublished or future-publish content) + per-canonical-per-page-GSC-orphan (GSC Performance API shows impressions/clicks for the URL but no internal link path — high-value orphan; the page is winning SERP positions while strangled internally) + per-canonical-per-page-backlink-only-orphan (third-party backlink target + no internal link — equity leakage) + per-canonical-per-page-render-mode-orphan (server-rendered or static-generated but missing from client-side React/Vue nav after hydration — JS-rendered nav gap) + per-canonical-per-page-pagination-orphan (paginated page 2+ exists but not linked from page 1; rel="next"/rel="prev" deprecated 2019 + canonical-to-page-1 misuse common) + per-canonical-per-page-faceted-navigation-orphan (filter combination URL exists but no internal link path) + per-canonical-per-page-per-vertical-orphan (medical procedure page not in MedicalBusiness schema graph + financial product page not in FinancialService schema graph + restaurant menu item page not in Restaurant Menu schema graph) + per-canonical-per-page-stranded-orphan (no internal link + no inbound external + no traffic + no impressions + candidate for 410-gone) + per-canonical-per-page-orphan-confidence-tier + per-canonical-per-page-orphan-explainability. 
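
Several of the classes above can be derived from a handful of per-URL signals. The sketch below is one possible reading, not the skill's actual rule set: `UrlSignals` and `orphan_classes` are hypothetical names, "no traffic" is approximated by zero GSC impressions, and a URL may carry more than one label because the class definitions overlap. Render-mode, pagination, faceted-navigation and per-vertical orphans need template or schema context and are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UrlSignals:
    status_200: bool
    internal_inlinks: int   # inbound internal links found by the crawl
    in_sitemap: bool
    in_cms: bool
    gsc_impressions: int
    external_backlinks: int


def orphan_classes(s: UrlSignals) -> list[str]:
    """Return every orphan class the signals satisfy; empty if the page is linked."""
    if s.internal_inlinks > 0:
        return []
    labels = []
    if s.in_cms and s.status_200:
        labels.append("true-orphan")
    if s.in_sitemap:
        labels.append("sitemap-only-orphan")
    if s.in_cms and not s.in_sitemap:
        labels.append("database-only-orphan")
    if s.gsc_impressions > 0:
        labels.append("GSC-orphan")
    if s.external_backlinks > 0:
        labels.append("backlink-only-orphan")
    if s.gsc_impressions == 0 and s.external_backlinks == 0:
        labels.append("stranded-orphan")
    return labels
```
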
Per-canonical-detection-engine-spec runs per-portfolio per-canonical-sitemap-vs-crawl-diff (set-theoretic A-minus-B where A = sitemap URLs and B = crawl-discovered URLs from internal-link follow) + per-canonical-database-vs-crawl-diff + per-canonical-GSC-vs-crawl-diff (the highest-value diff — URLs Google sees but the internal-link graph hides) + per-canonical-backlinks-vs-crawl-diff (Ahrefs + Semrush + Majestic + Moz Link Explorer + Cognitive SEO backlinks file) + per-canonical-server-log-vs-crawl-diff (URLs that returned 200 in last 90 days of server logs but appear as orphans in current crawl) + per-canonical-per-page-historical-200-OK-vs-current-crawl-diff + per-canonical-set-theoretic-intersection-union (intersection of multiple diffs = high-confidence orphan; union = full orphan candidate pool) + per-canonical-continuous-per-N-hour-recrawl + per-canonical-on-deploy-recrawl + per-canonical-on-CMS-change-event-driven (Sanity + Contentful + Strapi + Webflow + Shopify + WordPress webhook) + per-canonical-on-sitemap-change-event-driven + per-canonical-detection-confidence-tier. 
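
The A-minus-B diffs and the intersection/union step can be sketched with plain sets. This is a minimal sketch with assumptions labeled: `diff_sources` is a hypothetical name, URLs are assumed to be normalized already, and "intersection of multiple diffs" is read as "missing from the crawl according to at least `min_sources` truth sources", which is a choice made for the sketch.

```python
from collections import Counter


def diff_sources(truth_sources, crawled, min_sources=2):
    """Diff each truth source against the crawl, then combine the diffs."""
    crawled = set(crawled)
    # One A-minus-B diff per source: URLs the source knows that the crawl missed.
    diffs = {name: set(urls) - crawled for name, urls in truth_sources.items()}
    votes = Counter(url for missing in diffs.values() for url in missing)
    return {
        "pool": sorted(votes),  # union: full orphan candidate pool
        "high_confidence": sorted(u for u, n in votes.items() if n >= min_sources),
        "votes": dict(votes),
    }
```

The per-URL vote count doubles as a simple input for a detection confidence tier: the more independent sources agree that a URL should exist, the less likely the orphan is a data artifact.
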
Per-canonical-revenue-impact-estimation-spec runs per-portfolio per-canonical-per-orphan-baseline-organic-traffic-projection-if-linked (per-Bayesian-prior-from-comparable-non-orphan-page + per-cohort-mean-from-similar-template + per-GSC-existing-impression-floor) + per-canonical-per-orphan-CTR-uplift-from-internal-link-equity-flow (per-PageRank-equity-injection + per-anchor-text-relevance + per-template-link-position) + per-canonical-per-orphan-conversion-rate + per-canonical-per-orphan-revenue-per-visit + per-canonical-per-orphan-Bayesian-PyMC-Stan-NumPyro-bambi + per-canonical-per-orphan-causal-uplift-CATE-T-S-X-DR-learner + per-canonical-per-orphan-synthetic-control + per-canonical-per-orphan-difference-in-differences-DiD + per-canonical-per-orphan-regression-discontinuity-at-internal-link-add-cutoff + per-canonical-per-orphan-Monte-Carlo-simulation + per-canonical-per-orphan-sensitivity-analysis + per-canonical-per-orphan-loss-from-orphan-status-per-page-per-month + per-canonical-per-orphan-revenue-impact-confidence-tier + per-canonical-per-orphan-FBC-feedback-loop (per-realized-vs-predicted-traffic-uplift-after-remediation + per-realized-vs-predicted-conversion-uplift + per-realized-vs-predicted-revenue-uplift + per-pattern-learning + per-multi-arm-bandit-regret + per-recalibration).
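
The Monte Carlo part of the estimate can be sketched with the standard library alone. Everything numeric here is an assumption for illustration: monthly loss is modeled as impressions × CTR × revenue per visit, CTR is drawn from a Beta distribution, and revenue per visit from a triangular distribution. A production version would take its priors from the Bayesian and cohort inputs listed above.

```python
import random


def monthly_loss_interval(impressions, ctr_alpha, ctr_beta,
                          rpv_low, rpv_mode, rpv_high,
                          draws=10_000, seed=0):
    """Simulate monthly revenue lost to orphan status; return p5/p50/p95."""
    rng = random.Random(seed)  # seeded so repeated runs give the same interval
    samples = sorted(
        impressions
        * rng.betavariate(ctr_alpha, ctr_beta)          # uncertain CTR
        * rng.triangular(rpv_low, rpv_high, rpv_mode)   # uncertain revenue/visit
        for _ in range(draws)
    )

    def pick(q):
        return samples[min(draws - 1, int(q * draws))]

    return {"p5": pick(0.05), "p50": pick(0.50), "p95": pick(0.95)}
```

For example, `monthly_loss_interval(5000, 2, 98, 0.10, 0.40, 1.50)` models a page with a 5,000-impression floor and a CTR centered near 2%. Reporting the interval rather than a point estimate lets the width feed the revenue-impact confidence tier.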

What does per-orphan remediation-routing + per-orphan link-equity-flow + per-orphan compliance-gate do?

Per-portfolio per-banner per-page per-orphan per-canonical-remediation-routing-spec runs per-portfolio per-canonical-per-orphan-auto-add-to-sitemap (when CMS publishes the page but sitemap generator missed it) + per-canonical-per-orphan-auto-add-internal-link-from-semantically-related-parent (per-LLM-ensemble-classifier identifies semantic parent + adds contextual anchor + respects per-template internal-link allocation budget) + per-canonical-per-orphan-auto-fix-canonical (when canonical tag points to wrong URL) + per-canonical-per-orphan-auto-301-redirect (when page is truly defunct + has backlinks to preserve) + per-canonical-per-orphan-auto-410-gone (when stranded orphan with no traffic + no backlinks + no demand signal) + per-canonical-per-orphan-auto-noindex (intentionally orphaned page like checkout or thank-you) + per-canonical-per-orphan-auto-restore (CMS-only orphan + GSC traffic shows demand + revenue impact estimation > threshold) + per-canonical-per-orphan-editorial-review-queue (when confidence below auto-apply threshold) + per-canonical-per-orphan-five-destination-routing-handoff (sibling skill on governance-decision-router agent) + per-canonical-per-orphan-per-vertical-remediation-modifier (healthcare orphan + stale PHI = must remediate immediately; financial orphan + stale FINRA disclaimer = supervisory review; cannabis orphan + stale state-board claim = legal review) + per-canonical-per-orphan-multi-arm-bandit-UCB-Thompson-Epsilon-Greedy-LinUCB-Contextual + per-canonical-per-orphan-remediation-confidence-tier + per-canonical-per-orphan-remediation-explainability. 
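
The routing conditions above can be read as an ordered decision function. The sketch below is one possible ordering, not the skill's actual policy: the `Orphan` fields, the `auto_apply` and `revenue_floor` thresholds, and the precedence of the checks are all assumptions, and the per-vertical modifiers and bandit selection are omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Orphan:
    confidence: float
    intentional: bool = False      # e.g. checkout or thank-you pages
    canonical_ok: bool = True
    defunct: bool = False
    in_cms: bool = True
    in_sitemap: bool = True
    impressions: int = 0
    backlinks: int = 0
    est_monthly_revenue: float = 0.0


def route(o: Orphan, auto_apply: float = 0.8, revenue_floor: float = 100.0) -> str:
    """Pick one remediation destination for an orphan (hypothetical precedence)."""
    if o.confidence < auto_apply:
        return "editorial-review-queue"
    if o.intentional:
        return "auto-noindex"
    if not o.canonical_ok:
        return "auto-fix-canonical"
    if o.defunct:
        if o.backlinks > 0:
            return "auto-301-redirect"       # preserve backlink equity
        return "auto-410-gone" if o.impressions == 0 else "editorial-review-queue"
    if o.in_cms and not o.in_sitemap:
        if o.impressions > 0 and o.est_monthly_revenue > revenue_floor:
            return "auto-restore"
        return "auto-add-to-sitemap"
    if o.impressions == 0 and o.backlinks == 0:
        return "auto-410-gone"               # stranded: no demand signal at all
    return "auto-add-internal-link"
```

Putting the confidence check first means nothing is auto-applied below the threshold, whichever remediation would otherwise have matched.
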
Per-canonical-link-equity-flow-spec runs per-portfolio per-canonical-per-page-PageRank-style-equity-scoring + per-canonical-per-page-authority-delta-before-vs-after-orphan-resolution + per-canonical-per-page-anchor-text-diversity-check (per-Helmer-Yule-anchor-diversity-index + per-exact-match-vs-partial-match-vs-branded-vs-naked-URL-distribution) + per-canonical-per-template-internal-link-allocation-budget (each page template caps its internal-link count to avoid PageRank dilution) + per-canonical-per-page-link-equity-confidence-tier.

Per-canonical-compliance-gate-spec runs the following per-portfolio checks, each prefixed per-canonical-per-orphan-:

  • Platform and crawl: Google-Terms-of-Service + robots.txt-respect + per-CMS-API-rate-limit.
  • Accessibility: WCAG-2.2-AA-orphan-accessibility-regression-check (orphan pages often have stale ARIA and missing alt-text from prior template versions) + ARIA + EAA-EN-301-549 + Section-508 + ADA-Title-III.
  • FTC: FTC-substantiation (orphan PDPs may carry stale claims that violate substantiation when re-linked) + FTC-Made-in-USA-stale + FTC-Green-Guides-stale + FTC-Health-Products-Compliance-Guide-stale + FTC-fake-review-rule-of-2024-stale-AggregateRating.
  • Healthcare and finance: HIPAA-stale-PHI (orphan healthcare pages may have stale PHI that leaked before HIPAA Safe Harbor de-identification was enforced; auto-restore must include a PHI re-scan) + FINRA-2210-stale-disclaimer (orphan financial pages are a FINRA-regulated-communication time-bomb; auto-restore requires supervisory review pre-use).
  • FDA: FDA-OPDP-Rx-drug-stale (Rx drug pages with outdated indications, contraindications, or black-box warnings) + FDA-Part-101-nutrition-label-stale + FDA-Part-117-food-safety-stale + FDA-cosmetic-rule-stale + tobacco-FDA-stale.
  • Regulated-product warnings: cannabis-state-board-stale-claim + alcohol-TABC-CalABC-SLA-stale-warning + Surgeon-General-warning-stale + California-Prop-65-stale.
  • Privacy and platform regulation: CCPA-CPRA-stale-privacy-disclosure + GDPR-Article-13-14-stale-information-notice + EU-AI-Act-Article-50-stale-AI-disclosure + Digital-Services-Act-Article-30.
  • Governance frameworks: NIST-AI-RMF + ISO-42001 + ISO-27001 + SOC-2-Type-II + OPA-Rego-AWS-Cedar-Casbin-Cerbos-Oso-policy-as-code.
  • compliance-confidence-tier.

Stale compliance content is the operationally distinctive constraint for multi-location operators: orphan pages frequently carry stale FTC substantiation, stale PHI, stale FINRA disclaimers, or stale FDA OPDP indications. When the orphan-detection engine auto-restores or auto-relinks these pages, the system must re-scan them for regulatory currency, or it risks re-exposing the operator to enforcement actions across the entire portfolio.
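
The PageRank-style equity scoring and the authority delta before vs after orphan resolution can be illustrated with a plain power-iteration PageRank over the internal link graph. This is a sketch under assumptions: the damping factor, iteration count, adjacency-dict input shape, and the function names `pagerank` and `authority_delta` are hypothetical choices for illustration, and the source does not say which scoring variant the agent uses.

```python
def pagerank(links: dict[str, list[str]], damping: float = 0.85,
             iters: int = 50) -> dict[str, float]:
    """Power-iteration PageRank over an internal link graph.

    `links` maps each source URL to the URLs it links to.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iters):
        new = {p: (1.0 - damping) / n for p in pages}
        for p in pages:
            outs = links.get(p, [])
            if outs:
                share = damping * rank[p] / len(outs)
                for t in outs:
                    new[t] += share
            else:
                # Dangling page: spread its rank evenly over all pages.
                share = damping * rank[p] / n
                for t in pages:
                    new[t] += share
        rank = new
    return rank


def authority_delta(before: dict[str, list[str]],
                    after: dict[str, list[str]]) -> dict[str, float]:
    """Per-page change in score once the orphan has been linked."""
    r_before, r_after = pagerank(before), pagerank(after)
    return {p: r_after[p] - r_before.get(p, 0.0) for p in r_after}


# Toy graph: "/orphan" links out but nothing links to it.
before = {"/": ["/a"], "/a": ["/"], "/orphan": ["/"]}
# After remediation the home page links to the orphan as well.
after = {"/": ["/a", "/orphan"], "/a": ["/"], "/orphan": ["/"]}
delta = authority_delta(before, after)
```

In the toy graph the orphan's score rises once the home page links to it, and the gain comes out of the share that `/a` previously received alone. That trade-off is what a per-template internal-link allocation budget is meant to keep in check.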

What does per-orphan cross-skill-handoff + per-internal-link-orchestration-agent-canonical-bundle do?

Per-portfolio per-orphan per-canonical-per-orphan-cross-skill-handoff runs the following per-portfolio handoffs, each prefixed per-canonical-per-orphan-handoff-to-:

  • Parents: multi-location-orphan-page-detection (parent commercial pillar at /multi-location-orphan-page-detection) + internal-link-orchestration (parent agent).
  • Link and URL skills: url-hierarchy-authoring + canonical-tag-management + link-sculpting-at-scale.
  • Schema and rich results: rich-result-eligibility-scoring-build-pillar (sibling build-pillar at /how-to-build-per-location-rich-result-eligibility-scoring-and-revenue-impact-estimation; adjacent crawl/render surface) + jsonld-generation-build-pillar (sibling build-pillar at /how-to-build-17-schema-class-jsonld-generation-from-master-record; orphan pages often have stale JSON-LD) + continuous-schema-audit + per-vertical-schema-validation + auto-compose-schema + per-vertical-catalog-schema-validation.
  • SEO architecture and tracking: multi-location-seo-architecture + franchise-local-seo-orchestration + ai-overview-tracking + title-rewrite-tracking + seo-alerts + hyper-local-search-trends-build-pillar (sibling build-pillar at /how-to-build-hyper-local-search-trends-ingestion-for-multi-location-content-engines; orphan pages may have demand from emerging keywords).
  • Content compliance: tiered-pre-filter-deterministic-gates-build-pillar (sibling build-pillar at /how-to-build-tiered-pre-filter-deterministic-gates-for-ai-content-compliance; auto-restore drafts pass through this gate) + marketing-content-llm-as-judge-build-pillar (sibling build-pillar at /how-to-build-marketing-content-llm-as-judge-semantic-compliance-scorer; auto-restore drafts pass through the semantic scorer) + marketing-ai-autonomy-profile-configuration-build-pillar (auto-apply vs editorial-review threshold) + per-jurisdiction-compliance-multi-state-franchise-build-pillar + brand-voice-management + forbidden-phrase-library + claims-allowlist-substantiation.
  • Catalog, data, and records: per-sku-description-generation-build-pillar (sibling build-pillar at /how-to-build-sku-by-channel-bulk-description-orchestration-at-catalog-scale; orphan PDPs trigger description refresh) + per-location-per-cohort-two-sigma-anomaly-detection-build-pillar + foot-traffic-integration-build-pillar + routing-audit-trail-build-pillar + master-record-build-pillar + versioned-customer-history-DSAR-build-pillar + versioned-history-regulatory-defense-build-pillar + customer-change-event-emission-build-pillar + multi-source-attribution-preserving-lead-ingestion-build-pillar.
  • Content and customer engagement: event-tie-in-drafting-build-pillar + weather-seasonality-patterns-build-pillar + cs-agent-assist-build-pillar + review-response-drafting-build-pillar (sibling build-pillar at /how-to-build-per-location-ai-review-response-drafting-at-multi-location-scale) + callback-schedule-link-build-pillar.
  • Routing and governance: borderline-routing + five-destination-routing + fbc-override-learning + multi-dimensional-threshold-routing.

Per-internal-link-orchestration-agent-canonical-bundle integrates the orphan-page-detection skill with sibling skills on the same internal-link-orchestration agent: per-canonical-orphan-page-detection (this skill) + per-canonical-url-hierarchy-authoring + per-canonical-canonical-tag-management + per-canonical-redirect-chain-collapsing + per-canonical-anchor-text-diversity-balancing + per-canonical-link-sculpting-at-scale + per-canonical-PageRank-equity-distribution + per-canonical-link-graph-emit-to-page-generator-GBP-citation-link-build-cannibalization-defense.

Per-canonical-end-to-end-SLA covers the full per-canonical-per-orphan chain as one canonical SLA: crawl-source-resolve → truth-source-resolve → orphan-class-detection → revenue-impact-estimation → remediation-routing → link-equity-flow-update → stale-compliance-content-rescan → FBC-feedback-loop.
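
One way to picture the end-to-end SLA is an ordered stage runner that times each step and compares the total to a budget. The sketch below is an assumption-heavy illustration: the stage names are taken from the chain described above, but `build_stages`, `run_pipeline`, the dict-based context, and the single total-latency budget are hypothetical and not drawn from the agent itself.

```python
import time
from typing import Callable

StageFn = Callable[[dict], dict]

# Stage order as named in the end-to-end chain.
STAGE_ORDER = [
    "crawl-source-resolve",
    "truth-source-resolve",
    "orphan-class-detection",
    "revenue-impact-estimation",
    "remediation-routing",
    "link-equity-flow-update",
    "stale-compliance-content-rescan",
    "FBC-feedback-loop",
]


def build_stages(impls: dict[str, StageFn]) -> list[tuple[str, StageFn]]:
    """Bind implementations to the fixed order; a missing stage raises KeyError."""
    return [(name, impls[name]) for name in STAGE_ORDER]


def run_pipeline(stages: list[tuple[str, StageFn]], ctx: dict,
                 sla_seconds: float) -> dict:
    """Run stages in order, record per-stage latency, and check the budget."""
    timings: dict[str, float] = {}
    for name, fn in stages:
        start = time.perf_counter()
        ctx = fn(ctx)  # each stage takes and returns the working context
        timings[name] = time.perf_counter() - start
    total = sum(timings.values())
    return {
        "ctx": ctx,
        "timings": timings,
        "total_seconds": total,
        "within_sla": total <= sla_seconds,
    }
```

Because `build_stages` refuses to run with a stage missing, the stale-compliance-content rescan cannot be silently skipped between link-equity update and the feedback loop. The per-stage timings also show which step consumed the budget when `within_sla` is false.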

What does per-orphan audit-trail + per-canonical-end-to-end-replay do?

Per-portfolio per-orphan per-canonical-audit-trail runs per-portfolio per-canonical-per-orphan-canonical-audit-record. Each record holds per-orphan-ID + per-banner-pointer + per-URL-pointer, followed by these per-…-snapshot fields:

  • Crawl source: per-canonical-crawl-source-snapshot + per-Screaming-Frog-Sitebulb-OnCrawl-DeepCrawl-Lumar-Botify-JetOctopus-ContentKing-Conductor-Ahrefs-Semrush-Moz-Sistrix-Searchmetrics-Sitechecker-Audisto-Ryte-snapshot.
  • Truth source: per-truth-source-snapshot + per-XML-sitemap-snapshot + per-database-CMS-source-snapshot + per-GSC-Performance-API-snapshot + per-GBP-per-location-URL-list-snapshot + per-PIM-SKU-list-snapshot + per-internal-CRM-URL-list-snapshot + per-CDN-access-log-snapshot + per-server-log-snapshot + per-backlinks-file-snapshot + per-historical-200-OK-URL-list-snapshot.
  • Detection: per-orphan-class-snapshot (true / sitemap-only / database-only / GSC / backlink-only / render-mode / pagination / faceted-nav / per-vertical / stranded) + per-orphan-confidence-tier-snapshot + per-detection-engine-set-theoretic-diff-snapshot + per-continuous-on-deploy-on-CMS-change-event-driven-snapshot + per-detection-confidence-tier-snapshot.
  • Revenue impact: per-baseline-organic-traffic-projection-snapshot + per-CTR-uplift-snapshot + per-conversion-rate-snapshot + per-revenue-per-visit-snapshot + per-Bayesian-PyMC-Stan-NumPyro-bambi-snapshot + per-causal-uplift-CATE-snapshot + per-synthetic-control-snapshot + per-DiD-snapshot + per-regression-discontinuity-snapshot + per-Monte-Carlo-snapshot + per-sensitivity-analysis-snapshot + per-loss-from-orphan-status-snapshot + per-revenue-impact-confidence-tier-snapshot.
  • Remediation: per-remediation-routing-snapshot (auto-add-sitemap / auto-add-internal-link / auto-fix-canonical / auto-301 / auto-410 / auto-noindex / auto-restore / editorial-review) + per-vertical-remediation-modifier-snapshot + per-multi-armed-bandit-snapshot + per-remediation-confidence-tier-snapshot.
  • Link equity: per-PageRank-equity-scoring-snapshot + per-authority-delta-snapshot + per-anchor-text-diversity-snapshot + per-internal-link-allocation-budget-snapshot + per-link-equity-confidence-tier-snapshot.
  • Compliance: per-Google-Terms-of-Service-snapshot + per-robots.txt-snapshot + per-WCAG-2.2-AA-snapshot + per-ARIA-snapshot + per-EAA-EN-301-549-snapshot + per-Section-508-snapshot + per-ADA-Title-III-snapshot + per-FTC-substantiation-stale-snapshot + per-FTC-Made-in-USA-stale-snapshot + per-FTC-Green-Guides-stale-snapshot + per-FTC-Health-Products-Compliance-Guide-stale-snapshot + per-FTC-fake-review-rule-of-2024-stale-snapshot + per-HIPAA-stale-PHI-snapshot + per-FINRA-2210-stale-disclaimer-snapshot + per-FDA-OPDP-stale-snapshot + per-FDA-Part-101-stale-snapshot + per-FDA-Part-117-stale-snapshot + per-FDA-cosmetic-stale-snapshot + per-cannabis-state-board-stale-snapshot + per-alcohol-TABC-CalABC-SLA-stale-snapshot + per-Surgeon-General-warning-stale-snapshot + per-tobacco-FDA-stale-snapshot + per-California-Prop-65-stale-snapshot + per-CCPA-CPRA-stale-snapshot + per-GDPR-Article-13-14-stale-snapshot + per-EU-AI-Act-Article-50-stale-snapshot + per-Digital-Services-Act-Article-30-snapshot + per-NIST-AI-RMF-snapshot + per-ISO-42001-snapshot + per-ISO-27001-snapshot + per-SOC-2-Type-II-snapshot + per-OPA-Cedar-Casbin-Cerbos-Oso-policy-snapshot + per-compliance-confidence-tier-snapshot.

Records are kept under per-canonical-immutable-WORM-storage + per-canonical-7-year-IRS-tax-retention + per-canonical-7-year-FTC-substantiation-retention + per-canonical-7-year-HIPAA-medical-record-retention + per-canonical-6-year-SEC-record-retention + per-canonical-3-year-FINRA-record-retention.

Per-canonical-end-to-end-replay runs per-portfolio per-canonical-per-orphan-detection-engine-rewind + per-canonical-per-orphan-revenue-impact-estimation-rewind + per-canonical-per-orphan-remediation-routing-rewind + per-canonical-per-orphan-link-equity-flow-rewind + per-canonical-per-orphan-stale-compliance-rescan-rewind + per-canonical-per-orphan-replay-confidence-tier + per-canonical-per-orphan-replay-explainability.
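
An append-only audit record with replay can be approximated in application code with a hash chain, where each entry commits to the one before it. The sketch below is an assumption, not the agent's storage design: the source specifies immutable WORM storage, and hash chaining is a swapped-in technique shown here only because it makes tampering detectable in a few lines. The `AuditLog` class, its payload keys, and the `GENESIS` constant are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder previous-hash for the first entry


def _digest(payload: dict, prev_hash: str) -> str:
    # Canonical JSON so the same payload always hashes the same way.
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


class AuditLog:
    """In-memory, append-only log; each entry commits to its predecessor."""

    def __init__(self) -> None:
        self._entries: list[dict] = []

    def append(self, payload: dict) -> dict:
        prev = self._entries[-1]["hash"] if self._entries else GENESIS
        entry = {
            "payload": dict(payload),  # shallow copy of the caller's dict
            "prev_hash": prev,
            "hash": _digest(payload, prev),
        }
        self._entries.append(entry)
        return entry

    def verify(self) -> bool:
        """Recompute the chain; any edited or reordered entry breaks it."""
        prev = GENESIS
        for e in self._entries:
            if e["prev_hash"] != prev or e["hash"] != _digest(e["payload"], prev):
                return False
            prev = e["hash"]
        return True

    def replay(self, orphan_id: str) -> list[dict]:
        """Return one orphan's recorded steps in the order they were written."""
        return [e["payload"] for e in self._entries
                if e["payload"].get("orphan_id") == orphan_id]


log = AuditLog()
log.append({"orphan_id": "o-1", "step": "detection", "orphan_class": "GSC"})
log.append({"orphan_id": "o-2", "step": "detection", "orphan_class": "stranded"})
log.append({"orphan_id": "o-1", "step": "remediation-routing",
            "action": "auto-add-internal-link"})
```

Here `log.replay("o-1")` yields the detection record followed by the routing record, which is the shape a rewind of one orphan's decisions would walk through. The chain only detects modification; retention periods and true immutability still have to come from the storage layer.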

Engage the internal-link-orchestration agent

The internal-link-orchestration agent ships per-portfolio per-banner per-canonical-crawl-source-pointer + per-canonical-truth-source-pointer + per-canonical-orphan-class-spec + per-canonical-detection-engine-spec + per-canonical-revenue-impact-estimation-spec + per-canonical-remediation-routing-spec + per-canonical-link-equity-flow-spec + per-canonical-compliance-gate-spec + per-canonical-audit-trail + per-portfolio-audit-trail as the orchestration layer above your existing per-SEO-crawler-vendor + per-GSC-server-log-vendor + per-CMS-PIM-vendor + per-compliance-GRC-vendor + per-CMP-vendor primitives.