Completions

Build pillar · multi-location-seo-architecture agent

How to build internal link equity distribution analysis

Screaming Frog + Sitebulb + OnCrawl + DeepCrawl Lumar + Botify + JetOctopus + Audisto + Ahrefs Site Audit + SEMrush Site Audit + Moz Pro + Sistrix + LinkResearchTools ship per-account flat crawler primitives. The Crawl + Compute + Score + Audit skill bundle on the multi-location-seo-architecture agent sits above the crawler substrate + graph-database substrate (NetworkX + Apache Spark GraphX + Neo4j + ArangoDB + TigerGraph + DGraph + JanusGraph + Amazon Neptune) and writes a per-URL link-equity canonical record across 50k-500k URLs with named regulatory anchors covering Google PageRank US Patent 6285999 + Topical PageRank (Haveliwala) + Reasonable Surfer Model + Hilltop algorithm + Google Search Essentials + per-API license + hiQ Labs/Van Buren CFAA + crawl-budget honoring + USPTO trademark + Lanham Act + per-vertical HIPAA/FINRA/state bar + EU AI Act Article 50.

Published October 31, 2026 · 3,200 words

The 4-skill bundle on the multi-location-seo-architecture agent

One agent. Four coordinated skills. The Crawl + Compute + Score + Audit bundle runs above the crawler substrate (Screaming Frog + Sitebulb + OnCrawl + DeepCrawl + Botify + JetOctopus + Conductor + Audisto + Ahrefs + SEMrush + Moz Pro) + graph-database substrate (NetworkX + Neo4j + ArangoDB + TigerGraph + DGraph + JanusGraph + Amazon Neptune) and writes one canonical per-URL link-equity record.

Crawl

Per-portfolio per-banner per-location concurrent crawl through 15+ crawlers + GSC Crawl Stats + Bing Webmaster + GSC URL Inspection. Per-URL extraction: anchor-text + density + relevance + rel + HTTP status + canonical + hreflang + noindex + meta-robots + X-Robots-Tag + XML sitemap + crawl-depth + crawl-budget. Per-crawl per-domain rate-limit + robots.txt + Sitemap protocol + Google Crawl Budget guidance.
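The robots.txt and rate-limit honoring described above can be sketched with Python's standard-library robots.txt parser. This is a minimal illustration, not the bundle's crawler: the robots.txt body, the `link-equity-crawler` user-agent string, and the one-second fallback delay are all assumptions made for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt body; a real crawl would fetch it from the target domain.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cart/
Crawl-delay: 2
"""

USER_AGENT = "link-equity-crawler"  # hypothetical user-agent string
DEFAULT_DELAY_SECONDS = 1.0         # assumed fallback when no Crawl-delay is declared


def build_parser(robots_body):
    """Parse a robots.txt body that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_body.splitlines())
    return parser


def plan_fetch(parser, url):
    """Return (allowed, delay_seconds) for one URL under the parsed rules."""
    allowed = parser.can_fetch(USER_AGENT, url)
    delay = parser.crawl_delay(USER_AGENT)
    return allowed, float(delay) if delay is not None else DEFAULT_DELAY_SECONDS


parser = build_parser(ROBOTS_TXT)
print(plan_fetch(parser, "https://example.com/locations/austin"))
print(plan_fetch(parser, "https://example.com/cart/checkout"))
```

A scheduler would skip any URL where `allowed` is false and sleep for `delay_seconds` between requests to the same domain.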

Compute

Per-URL link-equity computation on site graph via NetworkX + Apache Spark GraphX + Neo4j + ArangoDB + TigerGraph + DGraph + JanusGraph + Amazon Neptune. Per-URL Internal-PageRank per Brin-Page 1998 + US Patent 6285999 + Personalized PageRank + Topical PageRank (Haveliwala) + Reasonable Surfer + Hilltop + Block Level PR + TrustRank + SimRank + HITS + Salsa. Community detection (InfoMap + Louvain + Leiden + Girvan-Newman). Centrality (betweenness + closeness + eigenvector + Katz + Bonacich).
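To make the Internal-PageRank step concrete, here is a minimal power-iteration sketch in plain Python over a toy five-page site. The graph, the 0.85 damping factor, and the fixed 50 iterations are illustrative assumptions; at 50k-500k URLs the same computation would more plausibly run on one of the graph engines named above (NetworkX, for example, ships a `pagerank` function).

```python
def internal_pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict {url: [urls it links to]}."""
    nodes = set(links) | {t for targets in links.values() for t in targets}
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}
    for _ in range(iterations):
        # Rank held by pages with no outlinks is spread evenly over all pages.
        dangling = sum(rank[u] for u in nodes if not links.get(u))
        new_rank = {u: (1.0 - damping) / n + damping * dangling / n for u in nodes}
        for source, targets in links.items():
            if targets:
                share = damping * rank[source] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical toy site graph: home page, two hubs, two location pages.
SITE = {
    "/": ["/locations/", "/products/"],
    "/locations/": ["/locations/austin", "/locations/dallas", "/"],
    "/products/": ["/"],
    "/locations/austin": ["/"],
    "/locations/dallas": [],
}

scores = internal_pagerank(SITE)
for url, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{score:.4f}  {url}")
```

The scores sum to 1 across the site, so each value reads as the share of internal link equity a URL holds under these assumptions.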

Score

12-dimension composite: Internal-PageRank + Topical-PageRank + Reasonable-Surfer + Hilltop + Block Level PR + anchor-text density + anchor-text relevance + rel distribution + HTTP status + canonical + crawl-depth + revenue-attribution from GA4 + Adobe + Mixpanel + Amplitude. Per-URL severity tier P0-P4 + recommendation type (anchor-text revision + internal-link add/remove + canonical correction + redirect chain repair + crawl-depth reduction + sitemap inclusion + rel correction).
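One way to picture the 12-dimension composite is as a weighted mean of twelve per-URL signals. The sketch below assumes every signal has already been normalized to the range 0 to 1 and defaults to equal weights; the text does not specify a normalization or a weighting scheme, so both are placeholders, and the snake_case dimension names are invented for the example.

```python
# The twelve dimensions named in the text, as hypothetical field names.
DIMENSIONS = [
    "internal_pagerank", "topical_pagerank", "reasonable_surfer", "hilltop",
    "block_level_pr", "anchor_text_density", "anchor_text_relevance",
    "rel_distribution", "http_status", "canonical", "crawl_depth",
    "revenue_attribution",
]


def composite_score(signals, weights=None):
    """Weighted mean of the twelve signals, each assumed normalized to [0, 1]."""
    weights = weights or {name: 1.0 for name in DIMENSIONS}
    missing = [name for name in DIMENSIONS if name not in signals]
    if missing:
        raise ValueError(f"missing dimensions: {missing}")
    total_weight = sum(weights[name] for name in DIMENSIONS)
    return sum(weights[name] * signals[name] for name in DIMENSIONS) / total_weight


example = {name: 0.5 for name in DIMENSIONS}
print(composite_score(example))
```

Raising on a missing dimension, rather than silently treating it as zero, keeps an incomplete crawl from producing a misleadingly low score.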

Audit

Per-URL link-equity WORM record: per-crawl source + license + CFAA + crawl-budget attestation + per-URL graph snapshot + 12-dimension composite + severity tier + recommendation + Lanham trademark-mention disclosure + per-vertical applicability + accessibility WCAG 2.2 AA. Retention: 7-year FTC + 7-year IRS + 7-year HIPAA + 7-year state bar + 6-year SEC + 3-year FINRA + GDPR Article 30 + EU AI Act Article 12 + SOC 2 CC7/CC8.
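A write-once record is easier to defend when each entry carries a content hash that can be recomputed later. The following sketch hashes a per-URL record with SHA-256 over a canonical JSON serialization; the field names are hypothetical, and the choice of SHA-256 and sorted-key JSON is an assumption rather than something the text specifies.

```python
import hashlib
import json


def record_digest(record):
    """SHA-256 hex digest of a record, independent of key order."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical per-URL record; field names are illustrative only.
record = {
    "url": "https://example.com/locations/austin",
    "crawl_source": "example-crawler",
    "severity_tier": "P0",
    "composite": 0.42,
}

digest = record_digest(record)
print(digest)
```

Storing the digest alongside the object in immutable storage lets a later audit confirm that the record read back is byte-for-byte the record that was written.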

The real ecosystem this sits above

Crawl + Compute + Score + Audit does not replace the crawlers or the graph databases. It sits above them, coordinates them, and writes one canonical per-URL link-equity record with named regulatory anchors.

Crawler substrate

  • Screaming Frog + Sitebulb + OnCrawl + DeepCrawl Lumar
  • Botify + JetOctopus + ContentKing Conductor + Audisto
  • Ahrefs Site Audit + SEMrush Site Audit + Moz Pro
  • Sistrix + LinkResearchTools + URL Profiler
  • GSC Crawl Stats + Bing Webmaster + GSC URL Inspection

Graph-database substrate

  • NetworkX + Apache Spark GraphX + Neo4j + ArangoDB
  • TigerGraph + DGraph + JanusGraph + Amazon Neptune
  • igraph + graph-tool community detection
  • InfoMap + Louvain + Leiden + Girvan-Newman + Label Prop
  • Betweenness + closeness + eigenvector + Katz + Bonacich

Revenue-attribution + warehouse

  • GA4 + Adobe Analytics + Mixpanel + Amplitude + Heap
  • Snowflake + BigQuery + Databricks + Redshift + ClickHouse
  • DuckDB + Iceberg + Hudi + Delta Lake table format
  • PageRank Brin-Page 1998 + Google US Patent 6285999
  • Topical PageRank Haveliwala + Reasonable Surfer + Hilltop

Compliance overlay

Five anchors run per-URL before any link-equity-driven recommendation commits. The first anchor is operationally distinctive to internal-link-equity-distribution: Google PageRank patent + Topical PageRank + Reasonable Surfer Model + Hilltop algorithm + per-API license terms intersect CFAA-scraping doctrine + crawl-budget honoring per Google guidance.

Anchor 1: Google PageRank patent + per-API license + CFAA + crawl-budget honoring (operationally distinctive)

Google PageRank algorithm per Brin-Page 1998 + Google US Patent 6285999 (Method for Node Ranking in a Linked Database). Topical PageRank per Haveliwala 2002. Personalized PageRank. Reasonable Surfer Model. Google Hilltop algorithm. Block Level PageRank. TrustRank. SimRank. HITS + Salsa. Google Search Essentials (formerly Webmaster Guidelines). Google Search Quality Rater Guidelines. Per-API license terms (Ahrefs + SEMrush + Moz Mozscape + Sistrix + Screaming Frog License + Botify + OnCrawl + Sitebulb). CFAA 18 USC 1030 + hiQ Labs v LinkedIn 9th Cir 2022 + Van Buren v United States 2021 + Meta v Bright Data ND Cal 2024 scraping doctrine. robots.txt + per-crawl per-domain rate-limit honoring. Sitemap protocol per sitemaps.org. Google Crawl Budget guidance. Google Search Relations guidance (John Mueller + Gary Illyes). PageSpeed Insights + Lighthouse + Core Web Vitals (LCP + CLS + INP, which replaced FID in March 2024). Bing Markup Guidelines + Yandex + Naver.

Anchor 2: USPTO trademark + Lanham + FTC substantiation

USPTO trademark monitoring + Lanham Act 15 USC 1051 et seq. + 15 USC 1125(c) trademark dilution + 15 USC 1125(a) false designation when internal-link anchor text uses competitor brand. FTC Act Section 5 + Pfizer 1972 substantiation when link-equity-driven decisions substantiate marketing claim. FTC MARS. FTC Endorsement Guides 16 CFR Part 255 when internal-link substantiates competitive claim. Robinson-Patman territorial-pricing-discrimination. FDD Item 12 territorial-protection per FTC Franchise Rule 16 CFR 436 + 15-state franchise registration + state UDTPA.
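The anchor-text check in this anchor can be approximated as a simple scan of crawled links against a list of competitor marks. The sketch below only surfaces candidates for human review and makes no legal determination; the mark list, the link dictionaries, and the `matched_mark` field are hypothetical.

```python
import re


def flag_trademark_anchors(links, competitor_marks):
    """Return the links whose anchor text contains any listed mark (case-insensitive)."""
    if not competitor_marks:
        return []
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(mark) for mark in competitor_marks) + r")\b",
        re.IGNORECASE,
    )
    flagged = []
    for link in links:
        match = pattern.search(link["anchor_text"])
        if match:
            flagged.append({**link, "matched_mark": match.group(1)})
    return flagged


# Hypothetical crawl output and mark list.
links = [
    {"source": "/locations/austin", "target": "/compare", "anchor_text": "Compare us with Acme Tires"},
    {"source": "/locations/austin", "target": "/hours", "anchor_text": "Store hours"},
]
for hit in flag_trademark_anchors(links, ["Acme Tires", "Globex"]):
    print(hit["source"], "->", hit["target"], "|", hit["matched_mark"])
```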

Anchor 3: Per-vertical (HIPAA + FINRA + state bar + medical board)

HIPAA 45 CFR 164.502 + 504 + 514 when MedicalBusiness internal-link includes patient-relationship signal + 7-year HIPAA. FINRA Rule 2210 when FinancialService internal-link + 3-year FINRA. SEC Regulation FD + 6-year SEC when public-company-IR internal-link. ABA Model Rule 7.1-7.5 when LegalService internal-link + 7-year state bar + 50-state matrix. State medical board + state professional licensing. FDA OPDP when prescription. DEA Schedule + alcohol TABC + CalABC + cannabis state-board.

Anchor 4: AI governance + accessibility

EU AI Act Article 22 + 26 + 50 + Article 13 + 14 + 15 + Annex III when AI-ML internal-link scoring drives content prioritization. Digital Services Act + DMA. NIST AI Risk Management Framework. ISO 42001. WCAG 2.2 AA + ADA Title III + Section 508 + EAA EN 301 549 when internal-link rendering must remain accessible. SEC Reg S-K material disclosure. FASB ASC 350 intangible-asset.

Anchor 5: Privacy + WORM retention

GDPR Article 6 + 7 + Article 28 + 30. LGPD + DPDP + PIPEDA + Quebec Law 25 + CCPA + CPRA + COPPA + 18-state privacy. Per-vendor LLM zero-retention + per-source DPA. Policy-as-code via OPA Rego + AWS Cedar + Casbin + Cerbos + Oso + Styra DAS + Permit.io. Storage: AWS S3 Object Lock + Azure Blob immutable + Google Cloud Storage Bucket Lock + Wasabi WORM. Retention: 7-year FTC + 7-year IRS + 7-year HIPAA + 7-year state bar + 6-year SEC + 3-year FINRA + GDPR Article 30 + EU AI Act Article 12 + SOC 2 CC7/CC8.
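When several retention regimes apply to one record, a retain-until date has to be derived from them. The sketch below uses the year counts exactly as listed in the text and assumes the longest applicable period wins; that combination rule is an assumption, as are the regime keys and the choice to roll a 29 February creation date forward to 1 March.

```python
from datetime import date

# Year counts as listed in the text; the dictionary keys are hypothetical labels.
RETENTION_YEARS = {
    "FTC": 7, "IRS": 7, "HIPAA": 7, "state_bar": 7, "SEC": 6, "FINRA": 3,
}


def retention_years(applicable):
    """Longest retention period among the applicable regimes (assumed rule)."""
    unknown = set(applicable) - set(RETENTION_YEARS)
    if unknown:
        raise ValueError(f"unknown regimes: {sorted(unknown)}")
    return max((RETENTION_YEARS[name] for name in applicable), default=0)


def retain_until(created, applicable):
    """Date until which a record created on `created` would be locked."""
    years = retention_years(applicable)
    try:
        return created.replace(year=created.year + years)
    except ValueError:
        # 29 February in a non-leap target year: roll forward to 1 March.
        return created.replace(year=created.year + years, month=3, day=1)


print(retain_until(date(2026, 10, 31), ["SEC", "FINRA"]))
```

The resulting date is the kind of value an object-lock retention setting would take when the record is written.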

6-workstream reporting cycle

Every two weeks during a Tier 3 Fractional CMO engagement, six workstreams report against the pre-engagement baseline. No forecast accuracy claims. Process commitments only.

  1. Per-portfolio crawler coverage. Crawlers + GSC + Bing Webmaster + per-URL extraction + crawl-budget consumption + robots.txt compliance.
  2. Compute link-equity coverage. Per-URL Internal-PageRank + Topical-PageRank + Reasonable-Surfer + Hilltop + community detection + centrality measure.
  3. Score 12-dimension distribution. Per-URL severity tier + recommendation type + orphan-page detection + broken-link detection.
  4. Audit canonical-record coverage. Per-URL WORM hash + per-crawl license + CFAA + crawl-budget + Lanham trademark + per-vertical + accessibility.
  5. Regulatory-defense audit coverage. Google PageRank patent + per-API license + CFAA + crawl-budget + Lanham + per-vertical HIPAA/FINRA/state bar + EU AI Act Article 50.
  6. FBC feedback-loop pattern-learning. Per-Google-update impact + per-URL realized-vs-predicted + per-vendor crawl-budget + per-vertical near-miss.
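Two of the signals reported in these workstreams, crawl depth and orphan-page detection, fall out of a single breadth-first traversal of the internal link graph. The sketch below is illustrative: it assumes the list of known URLs comes from a source such as the XML sitemap, and it treats any known URL that cannot be reached by following internal links from the home page as an orphan, which is slightly broader than "zero inlinks".

```python
from collections import deque


def crawl_depths(links, start="/"):
    """Minimum click depth from `start` for every reachable URL."""
    depth = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depth:
                depth[target] = depth[page] + 1
                queue.append(target)
    return depth


def orphan_pages(links, known_urls, start="/"):
    """Known URLs that no chain of internal links from `start` reaches."""
    return sorted(set(known_urls) - set(crawl_depths(links, start)))


# Hypothetical link graph and sitemap URL list.
links = {
    "/": ["/locations/"],
    "/locations/": ["/locations/austin"],
    "/locations/austin": ["/"],
    "/old-promo": ["/"],
}
sitemap_urls = ["/", "/locations/", "/locations/austin", "/old-promo", "/faq"]

print(crawl_depths(links))
print(orphan_pages(links, sitemap_urls))
```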

FAQ

What is internal link equity distribution analysis — and what is the PageRank-patent-times-CFAA-scraping-doctrine problem distinctive to this skill?
A multi-location retail operator with 80-300 stores ships 50,000-500,000 URLs across per-location pages + category pages + product pages + content pages + FAQ pages + AI Overview pages + structured-data pages. Internal links distribute link equity (PageRank) across the site graph. The four-skill bundle on the multi-location-seo-architecture agent (Crawl, Compute, Score, Audit) sits above the crawler substrate (Screaming Frog + Sitebulb + OnCrawl + DeepCrawl Lumar + Botify + JetOctopus + ContentKing Conductor + Audisto + Ahrefs Site Audit + SEMrush Site Audit + Moz Pro) and the graph-database substrate (NetworkX + Apache Spark GraphX + Neo4j + ArangoDB + TigerGraph + DGraph + JanusGraph + Amazon Neptune) and writes a per-URL link-equity canonical record. The operationally distinctive anchor: Google PageRank algorithm per Brin-Page 1998 + Google US Patent 6285999 (Method for Node Ranking in a Linked Database) + Topical PageRank per Haveliwala + Reasonable Surfer Model + Hilltop algorithm + Block Level PageRank + TrustRank + per-API license terms (Ahrefs + SEMrush + Moz Mozscape + Sistrix + Screaming Frog + Botify + OnCrawl + Sitebulb) intersect CFAA-scraping doctrine (hiQ Labs v LinkedIn 9th Cir 2022 + Van Buren v United States 2021 + Meta v Bright Data ND Cal 2024) + robots.txt + per-crawl per-domain rate-limit honoring + Google Search Essentials + Google Search Relations guidance (John Mueller + Gary Illyes). Pairs with per-location internal link recommendation engine (sibling build-pillar, downstream consumer).
Why do Screaming Frog + Sitebulb + OnCrawl + DeepCrawl + Botify + Ahrefs + SEMrush + Moz Pro break at multi-location 50k-500k URL link-equity-distribution scale?
Each crawler ships per-account flat URL inventory + per-URL link-graph extraction. Each SEO platform ships per-account flat PageRank-ish proxy score (Ahrefs URL Rating + Moz Page Authority + SEMrush Authority Score + Sistrix Visibility Index). None computes per-URL Internal-PageRank + Topical-PageRank + Reasonable-Surfer score on the actual site graph using NetworkX + Apache Spark GraphX + Neo4j + ArangoDB + TigerGraph + DGraph + JanusGraph + Amazon Neptune. None scores per-URL with 12-dimension link-equity composite (Internal-PageRank + Topical-PageRank + Reasonable-Surfer + Hilltop + Block Level PR + per-URL anchor-text density + per-URL anchor-text relevance + per-URL rel attribute distribution + per-URL HTTP status + per-URL canonical + per-URL crawl-depth + per-URL revenue-attribution). None enforces per-API license terms + CFAA-scraping doctrine + crawl-budget honoring. None coordinates Lanham Act trademark surface when internal-link anchor text uses competitor brand. None enforces per-vertical (HIPAA + FINRA + state bar + ABA Model Rule 7.1-7.5). None writes a per-URL link-equity audit trail with regulatory-defense retention. The four-skill bundle Crawl + Compute + Score + Audit sits above the crawler + SEO-platform surface — it does not replace it.
How does Crawl + Compute work across the per-URL link-equity graph?
Crawl runs per-portfolio per-banner per-location concurrent crawl through the crawler substrate (Screaming Frog + Sitebulb + OnCrawl + DeepCrawl Lumar + Botify + JetOctopus + ContentKing Conductor + Audisto + Ahrefs Site Audit + SEMrush Site Audit + Moz Pro + Sistrix + LinkResearchTools + URL Profiler + Visual SEO Studio) + GSC Crawl Stats + Bing Webmaster Crawl Control + GSC URL Inspection. Per-URL extraction: anchor-text harvest + anchor-text density + anchor-text relevance + rel attribute (nofollow + sponsored + ugc; a link carrying none of these is followed, informally "dofollow") + HTTP status (200 + 301 + 302 + 304 + 404 + 410 + 500 + 502 + 503) + canonical tag + hreflang + noindex + meta-robots + X-Robots-Tag + XML sitemap presence + crawl-depth + crawl-budget consumption. Per-crawl per-domain rate-limit honoring + robots.txt + Sitemap protocol per sitemaps.org + Google Crawl Budget guidance. Compute runs per-URL link-equity computation on the site graph via NetworkX + Apache Spark GraphX + Neo4j + ArangoDB + TigerGraph + DGraph + JanusGraph + Amazon Neptune. Per-URL Internal-PageRank per Brin-Page 1998 + US Patent 6285999 + Personalized PageRank + Topical PageRank per Haveliwala + Reasonable Surfer Model + Hilltop algorithm + Block Level PageRank + TrustRank + SimRank + HITS + Salsa. Community detection (InfoMap + Louvain + Leiden + Girvan-Newman + Label Propagation). Centrality (betweenness + closeness + eigenvector + Katz + Bonacich).
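The per-URL anchor-text and rel extraction described in this answer can be illustrated with Python's standard-library HTML parser. This is a simplified sketch that assumes well-formed, non-nested anchor tags; the class and function names are hypothetical, and a production crawler from the substrate above would handle far more edge cases.

```python
from html.parser import HTMLParser


class AnchorExtractor(HTMLParser):
    """Collects href, rel tokens, and anchor text for each <a href=...> element."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._current = None

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            attr = dict(attrs)
            if attr.get("href"):
                rel = (attr.get("rel") or "").lower().split()
                self._current = {"href": attr["href"], "rel": rel, "anchor_text": ""}

    def handle_data(self, data):
        if self._current is not None:
            self._current["anchor_text"] += data

    def handle_endtag(self, tag):
        if tag == "a" and self._current is not None:
            # Collapse runs of whitespace inside the anchor text.
            self._current["anchor_text"] = " ".join(self._current["anchor_text"].split())
            self.links.append(self._current)
            self._current = None


def extract_links(html):
    parser = AnchorExtractor()
    parser.feed(html)
    parser.close()
    return parser.links


def is_followed(link):
    """A link is treated as followed when it carries none of the qualifying rel values."""
    return not {"nofollow", "sponsored", "ugc"} & set(link["rel"])


PAGE = (
    '<p><a href="/locations/austin">Austin <b>store</b></a> '
    '<a href="https://partner.example/" rel="nofollow sponsored">Partner</a></p>'
)
for link in extract_links(PAGE):
    print(link["href"], link["rel"], repr(link["anchor_text"]), is_followed(link))
```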
What does Score + Audit do?
Score runs per-portfolio per-URL 12-dimension link-equity composite: Internal-PageRank + Topical-PageRank + Reasonable-Surfer + Hilltop + Block Level PR + per-URL anchor-text density + per-URL anchor-text relevance + per-URL rel distribution + per-URL HTTP status + per-URL canonical + per-URL crawl-depth + per-URL revenue-attribution from GA4 + Adobe Analytics + Mixpanel + Amplitude + Heap + per-URL conversion rate + per-URL bounce rate + per-URL time-on-page + per-URL scroll-depth + per-URL revenue-per-visit RPV. Per-URL severity tier: P0 high-revenue-low-link-equity (link-equity-loss) + P1 high-revenue-high-link-equity (preserve) + P2 low-revenue-high-link-equity (re-route equity downstream) + P3 orphan page detection + P4 broken internal link. Per-URL recommendation: anchor-text revision + internal-link addition/removal + canonical correction + redirect chain repair + crawl-depth reduction + sitemap inclusion + rel attribute correction. Audit writes a per-URL link-equity WORM canonical record: per-crawl source + per-crawl license attestation + per-crawl CFAA + crawl-budget attestation + per-URL graph snapshot + per-URL 12-dimension composite + per-URL severity tier + per-URL recommendation + per-URL trademark-mention disclosure (Lanham 15 USC 1125(a)) + per-URL per-vertical regulatory applicability (HIPAA when MedicalBusiness + FINRA when FinancialService + state bar when LegalService) + per-URL accessibility WCAG 2.2 AA. Storage: AWS S3 Object Lock + Azure Blob immutable + Google Cloud Storage Bucket Lock + Wasabi WORM. Retention: 7-year FTC + 7-year IRS + 7-year HIPAA + 7-year state bar + 6-year SEC + 3-year FINRA + GDPR Article 30 + EU AI Act Article 12 + SOC 2 CC7/CC8.
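The P0-P4 tier definitions in this answer map naturally onto a small decision function. The sketch below is one possible reading: the revenue and equity cutoffs are caller-supplied placeholders, the field names are hypothetical, and evaluating tiers in P0 to P4 order (so the first match wins) is an assumption, since the text does not say how overlapping conditions are resolved.

```python
from dataclasses import dataclass


@dataclass
class UrlSignals:
    revenue: float        # attributed revenue for the URL, any consistent unit
    link_equity: float    # e.g. an Internal-PageRank score
    inlink_count: int     # internal links pointing at the URL
    broken_outlinks: int  # internal links from the URL that resolve to an error


def severity_tier(signals, revenue_cutoff, equity_cutoff):
    """Return 'P0'..'P4', or None when no tier definition applies."""
    high_revenue = signals.revenue >= revenue_cutoff
    high_equity = signals.link_equity >= equity_cutoff
    if high_revenue and not high_equity:
        return "P0"  # high revenue, low link equity
    if high_revenue and high_equity:
        return "P1"  # high revenue, high link equity: preserve
    if high_equity:
        return "P2"  # low revenue, high link equity: re-route equity
    if signals.inlink_count == 0:
        return "P3"  # orphan page
    if signals.broken_outlinks > 0:
        return "P4"  # broken internal link
    return None


page = UrlSignals(revenue=1200.0, link_equity=0.001, inlink_count=2, broken_outlinks=0)
print(severity_tier(page, revenue_cutoff=500.0, equity_cutoff=0.01))
```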
What does this skill connect to on the multi-location-seo-architecture agent and across the swarm?
On the multi-location-seo-architecture agent: multi-location-seo-architecture (parent commercial pillar) + per-location internal link recommendation engine (sibling build-pillar — downstream consumer of link-equity scoring) + url-hierarchy-authoring + canonical-tag-management + orphan-page-detection + title-rewrite-tracking + ai-overview-tracking + seo-alerts. Across the swarm: schema-audit-remediation agent (rich-result eligibility scoring #563 + JSON-LD generation #549 + continuous schema audit + per-vertical schema validation) + local-pack-rank-tracking (#559 + #567 + #571 cluster) + governance-decision-router five-destination routing + master-record + per-jurisdiction compliance multi-state franchise + per-location dynamic content. Build-pillar siblings: tiered pre-filter deterministic gates for AI content compliance + marketing AI autonomy profile configuration + per-location per-cohort two-sigma anomaly detection + #567 SERP history retention. Commercial-pillar parent: /multi-location-seo-architecture.
What does the 6-workstream pre-engagement-baseline reporting cycle look like for this skill?
Every two weeks during the Tier 3 Fractional CMO with AI Swarm engagement, six workstreams report against the pre-engagement baseline. Workstream 1: per-portfolio crawler coverage — crawlers + GSC Crawl Stats + Bing Webmaster + per-URL extraction completeness + crawl-budget consumption + robots.txt compliance. Workstream 2: Compute link-equity coverage — per-URL Internal-PageRank + Topical-PageRank + Reasonable-Surfer + Hilltop + community detection + centrality measure applied. Workstream 3: Score 12-dimension distribution — per-URL severity tier + per-URL recommendation type + per-URL orphan-page detection + per-URL broken-link detection. Workstream 4: Audit canonical-record coverage — per-URL WORM hash + per-crawl license + CFAA + crawl-budget attestation + Lanham trademark surface + per-vertical applicability + accessibility WCAG 2.2 AA. Workstream 5: Regulatory-defense audit coverage — Google PageRank patent US 6285999 + Topical PageRank + Reasonable Surfer + Hilltop + Google Search Essentials + per-API license + hiQ Labs/Van Buren CFAA + crawl-budget + Lanham trademark + per-vertical HIPAA/FINRA/state bar + EU AI Act Article 50. Workstream 6: FBC feedback-loop pattern-learning — per-Google-update impact + per-URL realized-vs-predicted reconciliation + per-vendor crawl-budget reconciliation + per-vertical near-miss tracking.

Engage Completions

Two ways to engage. The Tier 1 AI Readiness Assessment maps the crawler substrate + graph-database substrate + 12-dimension link-equity composite surface against the Crawl + Compute + Score + Audit bundle. The Tier 3 Fractional CMO with AI Swarm embeds 1-2 days per week for 6+ months and runs the bundle end-to-end against the multi-location-seo-architecture agent across the swarm.