Completions

Build pillar · anomaly-detection agent

How to build false-positive suppression for marketing data anomalies

scikit-learn + XGBoost + LightGBM + CatBoost + Isolation Forest + LOF + DBSCAN + HDBSCAN + Autoencoder + Prophet + DeepAR + PyMC + Stan + NumPyro + Bonferroni + Benjamini-Hochberg FDR ship per-model flat anomaly-detection primitives. The Detect + Correct + Gate + Audit skill bundle on the anomaly-detection agent sits above the ML + statistical + Bayesian substrate and writes a per-anomaly-event canonical record with named regulatory anchors covering replication-crisis discipline + multiple-comparisons correction + Rosenbaum bounds + E-value + ASA p-value statement + EU AI Act Article 22 + Annex III + ECOA Reg B + 4/5ths rule + Title VII Bostock + SOX 302/404/906 + CFPB Circular 2022-03.

Published October 28, 2026 · 3,200 words

The 4-skill bundle on the anomaly-detection agent

One agent. Four coordinated skills. The Detect + Correct + Gate + Audit bundle runs above the ML substrate (scikit-learn + XGBoost + LightGBM + CatBoost + TensorFlow + PyTorch) + statistical substrate (scipy.stats + statsmodels + R-stats) + Bayesian substrate (PyMC + Stan + NumPyro + bambi + brms) and writes one canonical per-anomaly-event record.

Detect

Multi-method anomaly detection: supervised + unsupervised (Isolation Forest + LOF + One-Class SVM + DBSCAN + HDBSCAN + Autoencoder + VAE) + forecasting residual (Prophet + DeepAR + LSTM + Transformer) + change-point (PELT + CUSUM + EWMA + Bayesian online changepoint + Hawkes + Chow + Quandt + Chu-Stinchcombe-White) + distribution-shift (KS + AD + Cramer-von Mises + MW-U + Wilcoxon + Kuiper) + SPC (X-bar R + X-bar S + IMR + Western Electric + Nelson).
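As one concrete instance of the change-point family listed above, a two-sided tabular CUSUM can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the agent's implementation: the function name `cusum_alarms`, the 20-point baseline window, and the slack `k` and decision interval `h` defaults are all illustrative.

```python
from statistics import mean, stdev


def cusum_alarms(series, baseline_n=20, k=0.5, h=5.0):
    """Two-sided tabular CUSUM on values standardized against a baseline window.

    baseline_n, k (slack, in sigma units) and h (decision interval) are
    illustrative defaults, not values taken from the article.
    """
    base = series[:baseline_n]
    mu, sigma = mean(base), stdev(base)
    hi = lo = 0.0  # upper and lower cumulative sums
    alarms = []
    for t, x in enumerate(series[baseline_n:], start=baseline_n):
        z = (x - mu) / sigma
        hi = max(0.0, hi + z - k)  # accumulates upward drift
        lo = max(0.0, lo - z - k)  # accumulates downward drift
        if hi > h or lo > h:
            alarms.append(t)
            hi = lo = 0.0  # restart accumulation after signalling
    return alarms


# Hypothetical KPI stream: stable baseline, a quiet stretch, then a level shift.
series = [100, 102] * 10 + [101] * 10 + [106] * 4
alarms = cusum_alarms(series)
```

On this toy stream the detector stays silent through the quiet stretch and first signals on the second shifted observation, which is the behaviour that distinguishes a cumulative method from a single-point threshold.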

Correct

Per-test-family multiple-comparisons correction: Bonferroni (FWER) + Benjamini-Hochberg FDR + Benjamini-Yekutieli + Holm-Bonferroni + Hochberg + Sidak + Storey q-value + Empirical Bayes + local FDR + adaptive FDR + knockoff filter + e-values. Pre-registration of hypothesis + analysis plan + decision threshold per Center for Open Science (cos.io) + AsPredicted + protocols.io + OSF. Power-analysis + sample-size + minimum-detectable-effect. Recalibration: Brier + ECE + reliability diagram + isotonic + Platt + Beta. Rosenbaum bounds + E-value + Sensitivity Analysis.
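To make the difference between family-wise and false-discovery-rate control concrete, the two most common corrections named above can be sketched in a few lines of plain Python. This is a minimal illustration; the function names and the example p-values are hypothetical, and a production system would more likely call an established statistics library.

```python
def bonferroni(pvals, alpha=0.05):
    """Reject H0_i when p_i <= alpha / m (controls the family-wise error rate)."""
    m = len(pvals)
    return [p <= alpha / m for p in pvals]


def benjamini_hochberg(pvals, alpha=0.05):
    """Benjamini-Hochberg step-up procedure (controls the false discovery rate)."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    # Find the largest rank k such that p_(k) <= (k / m) * alpha.
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= rank / m * alpha:
            cutoff = rank
    # Reject every hypothesis ranked at or below the cutoff.
    reject = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= cutoff:
            reject[i] = True
    return reject


# Hypothetical p-values, one per stream in a ten-stream test family.
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
naive_flags = sum(p <= 0.05 for p in pvals)
bonferroni_flags = sum(bonferroni(pvals))
bh_flags = sum(benjamini_hochberg(pvals))
```

On these ten p-values an uncorrected 0.05 threshold flags five streams, Benjamini-Hochberg flags two, and Bonferroni flags one, which shows how the choice of correction trades alert volume against missed anomalies.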

Gate

Five anchors run per-anomaly before any consumer-impacting decision routes: replication-crisis discipline pass + ECOA Reg B + 4/5ths rule + Title VII Bostock + ADEA + ADA + Fair Housing + FCRA Section 604/615 + ECOA 1002.9 + CFPB Circular 2022-03 + EEOC AI Guidance 2024 + NYC Local Law 144 + EU AI Act Article 22 + Annex III + FRIA + SOX 302/404/906 + FASB ASC 606 + FASB ASC 326 CECL + SEC Reg S-K + FTC Act Section 5 + Pfizer 1972.
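The 4/5ths rule in the list above reduces to a ratio of selection rates: each group's rate is compared with the highest group's rate, and a ratio below 0.8 is generally treated as evidence of adverse impact. A minimal sketch follows, assuming "selected" means the favourable routing outcome; the function name, group labels, and counts are hypothetical.

```python
def four_fifths_check(selected, total, threshold=0.8):
    """Return {group: (impact_ratio, passes)} relative to the highest selection rate.

    selected / total map a group label to counts. The labels and the idea of
    treating a favourable routing outcome as "selection" are assumptions.
    """
    rates = {g: selected[g] / total[g] for g in total}
    best = max(rates.values())
    if best == 0:
        raise ValueError("no group has any favourable outcomes; ratio undefined")
    return {g: (r / best, r / best >= threshold) for g, r in rates.items()}


# Hypothetical counts of favourable outcomes per group.
result = four_fifths_check(
    selected={"group_a": 48, "group_b": 30},
    total={"group_a": 100, "group_b": 100},
)
```

Here `group_b` has an impact ratio of 0.30 / 0.48 = 0.625, below the 0.8 threshold, so a gate built on this check would hold the anomaly for review rather than route it.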

Audit

Per-anomaly-event WORM record: per-stream multi-method signal snapshot + per-test pre-registered analysis plan + multiple-comparisons correction applied + Brier + ECE + reliability + Rosenbaum bounds + E-value snapshot + per-anchor gate-pass + adverse-action-notice + FRIA artifact + routing decision. Retention: 7-year FTC + 7-year IRS + 7-year SOX + 6-year SEC + 3-year FINRA + 4-year ECOA Reg B + 3-year EEOC + 5-year FCRA + GDPR Article 30 + EU AI Act Article 12 + SOC 2 CC7/CC8.
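One way to picture the canonical record is as an immutable structure serialized deterministically, so that a content digest can be stored alongside it in write-once storage. The sketch below is an assumption-heavy illustration: the class name, every field name, and the use of SHA-256 over sorted-key JSON are hypothetical choices, not the article's schema.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class AnomalyEventRecord:
    # All field names are hypothetical; the article does not publish a schema.
    event_id: str
    stream_id: str
    detected_at: str          # ISO 8601 timestamp, UTC
    method_signals: dict      # per-method signal snapshot
    correction_method: str    # e.g. "benjamini-hochberg"
    adjusted_p_value: float
    gate_results: dict        # per-anchor pass/fail
    routing_decision: str

    def canonical_json(self) -> str:
        # Sorted keys and fixed separators give byte-identical output
        # for logically identical records.
        return json.dumps(asdict(self), sort_keys=True, separators=(",", ":"))

    def digest(self) -> str:
        return hashlib.sha256(self.canonical_json().encode("utf-8")).hexdigest()


record = AnomalyEventRecord(
    event_id="evt-0001",
    stream_id="location-12/paid-social/ctr",
    detected_at="2026-10-28T00:00:00Z",
    method_signals={"cusum": 8.7, "ewma": 3.1},
    correction_method="benjamini-hochberg",
    adjusted_p_value=0.01,
    gate_results={"anchor_1": True, "anchor_2": False},
    routing_decision="hold-for-review",
)
record_digest = record.digest()
```

Storing the digest next to the object lets a later audit confirm that the retained bytes still match what was written at decision time.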

The real ecosystem this sits above

Detect + Correct + Gate + Audit does not replace the ML libraries or statistical tests. It sits above them, coordinates them with replication-crisis discipline, and writes one canonical per-anomaly-event record with named regulatory anchors.

ML + statistical substrate

  • scikit-learn + XGBoost + LightGBM + CatBoost
  • TensorFlow + PyTorch + Hugging Face Transformers
  • AutoML + DataRobot + H2O.ai + Alteryx + KNIME + Dataiku
  • scipy.stats + statsmodels + R-stats + bambi + brms
  • PyMC + Stan + NumPyro Bayesian substrate

Anomaly + change-point + SPC

  • Isolation Forest + LOF + One-Class SVM + DBSCAN + HDBSCAN
  • Autoencoder + VAE + LSTM + Transformer + Prophet + DeepAR
  • ruptures PELT + CUSUM + EWMA + Bayesian online changepoint
  • Hawkes + Chow + Quandt + Chu-Stinchcombe-White
  • X-bar R + X-bar S + IMR + Western Electric + Nelson Rules

Multiple-comparisons + pre-registration

  • Bonferroni + BH FDR + Benjamini-Yekutieli + Holm-Bonferroni
  • Hochberg + Sidak + Storey q-value + Empirical Bayes
  • Local FDR + adaptive FDR + knockoff filter + e-values
  • Center for Open Science + AsPredicted + protocols.io + OSF
  • G*Power + statsmodels-power + Rosenbaum + E-value

Compliance overlay

Five anchors run per-anomaly before any consumer-impacting decision routes. The first anchor is operationally distinctive to false-positive suppression: replication-crisis discipline + multiple-comparisons correction + Rosenbaum bounds + E-value + ASA p-value statement converge on every anomaly as a statistical-inference accountability checkpoint.
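The "run every anchor before routing" pattern can be sketched as a small gate runner that evaluates each check without short-circuiting, so the per-anchor result is available for the audit record even when an earlier anchor fails. The function name, the event keys, and the two example checks are hypothetical stand-ins, not the agent's actual gates.

```python
from typing import Callable, Dict, Tuple

GateCheck = Callable[[dict], bool]


def run_gates(event: dict, gates: Dict[str, GateCheck]) -> Tuple[bool, Dict[str, bool]]:
    """Evaluate every gate so that each per-anchor result can be recorded."""
    results = {name: bool(check(event)) for name, check in gates.items()}
    return all(results.values()), results


# Hypothetical checks standing in for two of the five anchors.
gates = {
    "replication_discipline": lambda e: e["preregistered"] and e["adjusted_p"] <= e["alpha"],
    "disparate_impact": lambda e: e["impact_ratio"] >= 0.8,
}

event = {"preregistered": True, "adjusted_p": 0.01, "alpha": 0.05, "impact_ratio": 0.62}
routable, per_anchor = run_gates(event, gates)
```

In this example the statistical anchor passes and the disparate-impact anchor fails, so the event is not routable, and both outcomes remain visible in `per_anchor`.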

Anchor 1: Replication-crisis discipline + multiple-comparisons + Rosenbaum + E-value (operationally distinctive)

Ioannidis 2005 PLOS Medicine “Why Most Published Research Findings Are False” + Open Science Collaboration 2015 + Many Labs project + Reproducibility Project: Psychology. Type I + Type II error control. Per-test power-analysis + per-test minimum-detectable-effect + per-test pre-registration of hypothesis + analysis plan + decision threshold per Center for Open Science (cos.io) + AsPredicted.org + protocols.io + Open Science Framework. Per-test-family multiple-comparisons correction: Bonferroni (FWER) + Benjamini-Hochberg FDR + Benjamini-Yekutieli (under dependence) + Holm-Bonferroni + Hochberg + Sidak + Storey q-value + Empirical Bayes + local FDR + adaptive FDR + knockoff filter + e-values. Rosenbaum bounds for unmeasured-confounder sensitivity + E-value + Sensitivity Analysis as Robust Inference. American Statistical Association statement on p-values 2016 + statement on statistical significance 2019. Nature 2019 “Retire statistical significance” comment. Per-vendor model calibration audit: Brier score + ECE + reliability diagram.
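Assuming the E-value referenced here is the VanderWeele-Ding sensitivity measure for unmeasured confounding (as distinct from the lowercase e-values used in multiple testing), it has a closed form on the risk-ratio scale: E = RR + sqrt(RR × (RR − 1)), with RR inverted first when it is below 1. A minimal sketch, with a hypothetical function name:

```python
import math


def e_value(risk_ratio: float) -> float:
    """E-value for an observed risk ratio (VanderWeele-Ding closed form).

    Assumption: the effect is already expressed as a risk ratio. Other effect
    measures need converting first, which this sketch does not attempt.
    """
    if risk_ratio <= 0:
        raise ValueError("risk ratio must be positive")
    rr = risk_ratio if risk_ratio >= 1 else 1.0 / risk_ratio
    return rr + math.sqrt(rr * (rr - 1.0))


# An observed risk ratio of 2.0 gives an E-value of 2 + sqrt(2), about 3.41.
observed = e_value(2.0)
```

Read this way, an unmeasured confounder would need to be associated with both the exposure and the outcome by a risk ratio of about 3.41 each to fully explain away an observed risk ratio of 2.0.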

Anchor 2: ECOA + 4/5ths rule + Title VII + AI-ML disparate-impact

ECOA Regulation B 12 CFR 1002 disparate-impact when AI-ML anomaly-detection routes consumer decisions + 4/5ths rule per Uniform Guidelines on Employee Selection Procedures 1978 + Title VII Civil Rights Act 1964 + Title VII Sex (Bostock v Clayton County 2020) + ADEA + ADA + Fair Housing Act + GINA + Equal Pay Act. EEOC Enforcement Guidance on Software AI Algorithmic Decision-Making 2024 + NYC Local Law 144 AEDT bias audit + Illinois HB 3773 + CA SB-942 + CO SB 21-169. FCRA Section 604/615 + ECOA 1002.9 adverse action + CFPB Circular 2022-03 adverse-action notification for AI-ML credit decisions.

Anchor 3: EU AI Act Article 22 + Annex III + FRIA

EU AI Act Article 22 transparency for automated decisions + Article 26 deployer + Article 50 + Article 13 + 14 + 15 + Annex III high-risk when AI-ML anomaly-detection drives material consequence + Article 6 + 27 Fundamental Rights Impact Assessment. Digital Services Act + DMA. GDPR Article 22 + Article 6 + 7 + 17 + Article 28 + 30. CCPA + CPRA right to opt out of automated decision-making + 18-state privacy. LGPD + DPDP + PIPEDA + Quebec Law 25.

Anchor 4: SOX + FASB + SEC + FTC

Sarbanes-Oxley 302/404/906 when a public-company decision driven by anomaly detection is material to financial reporting. Securities Exchange Act 1934 Section 13(b)(2). FASB ASC 606 revenue recognition. FASB ASC 326 CECL when anomaly informs credit-loss. SEC Reg S-K Item 303. FTC Act Section 5 + Pfizer 1972 substantiation when anomaly substantiates marketing claim.

Anchor 5: Security + AI governance + WORM retention

NIST AI Risk Management Framework + NIST SP 800-30 risk assessment + NIST SP 800-53. ISO 42001 + ISO 27001 + ISO 31000 + SOC 2 Type II. Per-vendor LLM zero-retention + per-source DPA. Policy-as-code via OPA Rego + AWS Cedar + Casbin + Cerbos + Oso + Styra DAS + Permit.io. Storage: AWS S3 Object Lock + Azure Blob immutable + Google Cloud Storage Bucket Lock + Wasabi WORM. Retention: 7-year FTC + 7-year IRS + 7-year SOX + 6-year SEC + 3-year FINRA + 4-year ECOA Reg B + 3-year EEOC + 5-year FCRA + GDPR Article 30 + EU AI Act Article 12 + SOC 2 CC7/CC8.
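When several retention regimes apply to the same record, the binding period is the longest one. The sketch below copies the year-denominated figures from the retention list above as-is and computes a retain-until date; the function name and regime keys are hypothetical, and the entries without a stated period (GDPR Article 30, EU AI Act Article 12, SOC 2) are left out.

```python
from datetime import date

# Periods in years, copied from the retention list in this article.
RETENTION_YEARS = {
    "FTC": 7, "IRS": 7, "SOX": 7, "SEC": 6,
    "FINRA": 3, "ECOA_REG_B": 4, "EEOC": 3, "FCRA": 5,
}


def retain_until(created: date, regimes) -> date:
    """Return the earliest date a record may be released: the longest regime wins."""
    years = max(RETENTION_YEARS[r] for r in regimes)
    try:
        return created.replace(year=created.year + years)
    except ValueError:
        # created is Feb 29 and the target year is not a leap year.
        return created.replace(year=created.year + years, day=28)


# A record touching both ECOA Reg B and FCRA is held for the longer, 5-year period.
release_date = retain_until(date(2026, 10, 28), ["ECOA_REG_B", "FCRA"])
```

A date computed this way is the kind of value that would be supplied as the retain-until setting on whichever immutable storage backend holds the record.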

6-workstream reporting cycle

Every two weeks during a Tier 3 Fractional CMO engagement, six workstreams report against the pre-engagement baseline. No forecast accuracy claims. Process commitments only.

  1. Per-portfolio per-stream anomaly-detection coverage. Streams monitored + per-method enumeration + per-stream signal volume.
  2. Detect multi-method signal distribution. Per-stream per-method signal + per-method false-positive baseline + per-method recalibration.
  3. Correct multiple-comparisons coverage. Per-test-family Bonferroni + BH FDR + Benjamini-Yekutieli + Holm-Bonferroni + Storey q-value applied + pre-registration + power-analysis applied.
  4. Gate-pass/gate-fail distribution. Per-anchor gate-fail + replication-crisis pass + ECOA disparate-impact pass + EU AI Act FRIA coverage.
  5. Regulatory-defense audit coverage. Replication-crisis discipline + Bonferroni + BH FDR + Rosenbaum + E-value + ASA p-value statement + EU AI Act Article 22 + Annex III + ECOA + 4/5ths rule + Title VII + SOX + CFPB Circular 2022-03.
  6. FBC feedback-loop pattern-learning. Per-stream realized-vs-predicted reconciliation + multi-arm-bandit regret + recalibration + per-protected-class disparate-impact retrospective.

FAQ

What is false-positive suppression for marketing data anomalies — and what is the replication-crisis-times-multiple-comparisons problem distinctive to this skill?
A multi-location operator running 32 AI agents continuously monitors thousands of marketing KPI streams: per-location per-channel per-platform per-creative-variant per-audience-segment per-cohort. Naive anomaly detection raises one alert per stream per day per threshold; with thousands of streams tested at a fixed per-test threshold, false alarms accumulate and flood the team with noise. The four-skill bundle on the anomaly-detection agent (Detect, Correct, Gate, Audit) sits above the ML + statistical + Bayesian substrate (scikit-learn + XGBoost + LightGBM + CatBoost + Isolation Forest + LOF + DBSCAN + HDBSCAN + Autoencoder + Prophet + DeepAR + PyMC + Stan + NumPyro + statistical process control + change-point detection + distribution-shift) and writes a per-anomaly-event canonical record. The operationally distinctive anchor: replication-crisis discipline (Ioannidis 2005 PLOS Medicine "Why Most Published Research Findings Are False" + Open Science Collaboration 2015 + Reproducibility Project: Psychology) + per-test multiple-comparisons correction (Bonferroni + Benjamini-Hochberg FDR + Benjamini-Yekutieli + Holm-Bonferroni + Hochberg + Sidak + Storey q-value + Empirical Bayes + local FDR + adaptive FDR + knockoff filter + e-values) + per-test-family error control + Rosenbaum bounds for unmeasured-confounder sensitivity + E-value + pre-registration of analysis plan per Center for Open Science + AsPredicted.org + protocols.io + Open Science Framework + ASA statement on p-values 2016 + ASA statement on statistical significance 2019 + Nature 2019 "Retire statistical significance" comment. Layered on top: ECOA Reg B + 4/5ths rule + Title VII + EU AI Act Article 22 + Annex III when AI-ML anomaly-detection drives material consequence.
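The scale of the problem is easy to quantify. Under the simplifying assumptions that the tests are independent and that nothing is actually anomalous, m tests at per-test level α produce m × α false alarms on average, and the chance of at least one false alarm is 1 − (1 − α)^m. A short sketch, with a hypothetical function name and illustrative stream counts:

```python
def false_alarm_budget(m: int, alpha: float = 0.05):
    """Expected false alarms, family-wise error rate, and Bonferroni threshold.

    Assumes m independent tests with every null hypothesis true; real KPI
    streams are correlated, so treat these numbers as rough guides.
    """
    expected_false_alarms = m * alpha
    fwer = 1.0 - (1.0 - alpha) ** m
    bonferroni_threshold = alpha / m
    return expected_false_alarms, fwer, bonferroni_threshold


# 20 streams: about a 64% chance of at least one false alarm per evaluation.
small = false_alarm_budget(20)
# 1,000 streams: about 50 false alarms expected per evaluation.
large = false_alarm_budget(1000)
```

At 1,000 streams a false alarm in every evaluation cycle is a near certainty, and the Bonferroni per-test threshold falls to 0.00005, which is why the choice of correction method matters at this scale.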
Why do scikit-learn + XGBoost + Isolation Forest + LOF + Prophet + DeepAR break at multi-stream false-positive-suppression scale?
Each ML library ships per-model flat fit + predict primitives. Each anomaly-detection algorithm raises a per-stream per-timestamp flag. None applies per-test-family multiple-comparisons correction (Bonferroni + BH FDR + Benjamini-Yekutieli + Holm-Bonferroni + Hochberg + Sidak + Storey q-value + Empirical Bayes + local FDR + adaptive FDR + knockoff filter + e-values). None coordinates per-test pre-registration of hypothesis + analysis plan + decision threshold per Center for Open Science. None calibrates the per-test threshold with replication-crisis discipline (Brier score + Expected Calibration Error + reliability diagram + isotonic / Platt / Beta recalibration + holdout-set replication + temporal cross-validation). None gates per-anomaly against ECOA Regulation B disparate-impact + 4/5ths rule + Title VII Bostock + ADEA + ADA + Fair Housing + FCRA Section 604/615 + EEOC AI Guidance 2024 + EU AI Act Article 22 + Annex III + CFPB Circular 2022-03 adverse-action notification. None writes a per-anomaly-event audit trail with regulatory-defense retention + pre-registered-analysis-plan snapshot. The four-skill bundle Detect + Correct + Gate + Audit sits above the ML + statistical substrate — it does not replace it.
How does Detect + Correct work with replication-crisis discipline?
Detect runs per-portfolio per-stream multi-method anomaly detection: scikit-learn + XGBoost + LightGBM + CatBoost + TensorFlow + PyTorch supervised + Isolation Forest + LOF + One-Class SVM + DBSCAN + HDBSCAN + Autoencoder + Variational Autoencoder unsupervised + Prophet + DeepAR + LSTM + Transformer forecasting residual + ruptures PELT + binary segmentation + CUSUM + EWMA + Bayesian online changepoint + Hawkes + Chow + Quandt + Chu-Stinchcombe-White change-point + Kolmogorov-Smirnov + Anderson-Darling + Cramer-von Mises + Mann-Whitney U + Wilcoxon + Kuiper distribution-shift + X-bar R + X-bar S + Individual Moving Range + Western Electric Rules + Nelson Rules SPC. Per-stream per-method per-timestamp signal. Correct runs per-test multiple-comparisons correction across the test-family: Bonferroni (family-wise error rate) + Benjamini-Hochberg FDR + Benjamini-Yekutieli (under dependence) + Holm-Bonferroni + Hochberg + Sidak + Storey q-value + Empirical Bayes + local FDR + adaptive FDR + knockoff filter + e-values. Per-test pre-registration of hypothesis + analysis plan + decision threshold per Center for Open Science (cos.io) + AsPredicted.org + protocols.io + Open Science Framework registry. Per-test power-analysis via G*Power + statsmodels-power + per-test sample-size determination + per-test minimum-detectable-effect. Recalibration: Brier score + Expected Calibration Error + reliability diagram + isotonic regression + Platt scaling + Beta calibration. Rosenbaum bounds for unmeasured-confounder sensitivity + E-value + Sensitivity Analysis as Robust Inference.
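The two calibration metrics named in the recalibration step can be computed directly from predicted probabilities and confirmed outcomes. The sketch below assumes each prediction is the detector's stated probability that a flagged event is a true anomaly, with label 1 meaning the anomaly was later confirmed; the function names, the ten equal-width bins, and the sample data are illustrative choices.

```python
def brier_score(probs, labels):
    """Mean squared difference between predicted probability and outcome (0 or 1)."""
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(probs)


def expected_calibration_error(probs, labels, n_bins=10):
    """ECE with equal-width bins: weighted mean of |observed rate - mean prediction|."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[idx].append((p, y))
    n = len(probs)
    ece = 0.0
    for b in bins:
        if not b:
            continue
        mean_pred = sum(p for p, _ in b) / len(b)
        observed = sum(y for _, y in b) / len(b)
        ece += len(b) / n * abs(observed - mean_pred)
    return ece


# Hypothetical detector output: overconfident at both ends.
probs = [0.9, 0.9, 0.9, 0.9, 0.1, 0.1, 0.1, 0.1]
labels = [1, 1, 1, 0, 0, 0, 0, 1]
brier = brier_score(probs, labels)
ece = expected_calibration_error(probs, labels)
```

In this sample the detector says 0.9 where the confirmed rate is 0.75 and 0.1 where it is 0.25, giving a Brier score of 0.21 and an ECE of 0.15; a gap of that size is what isotonic, Platt, or Beta recalibration is meant to close.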
What does Gate + Audit do?
Gate runs 5 anchors per-anomaly before any consumer-impacting decision routes. (1) Replication-crisis discipline pass: per-test power-analysis + per-test multiple-comparisons correction + Rosenbaum bounds + E-value + ASA statement on p-values 2016 + ASA statement on statistical significance 2019 + Nature 2019 + per-test pre-registration evidence. (2) ECOA Regulation B 12 CFR 1002 disparate-impact + 4/5ths rule per Uniform Guidelines 1978 + Title VII Bostock + ADEA + ADA + Fair Housing + GINA + FCRA Section 604/615 + ECOA 1002.9 + CFPB Circular 2022-03 + EEOC AI Guidance 2024 + NYC Local Law 144 AEDT. (3) EU AI Act Article 22 + Article 26 + Article 50 + Article 13 + 14 + 15 + Annex III high-risk + Article 6 + 27 Fundamental Rights Impact Assessment + Digital Services Act + DMA + GDPR Article 22. (4) SOX 302/404/906 when a public-company decision driven by anomaly detection is material to financial reporting + Securities Exchange Act 1934 Section 13(b)(2) + FASB ASC 606 + ASC 326 CECL when anomaly informs credit-loss + SEC Reg S-K Item 303. (5) FTC Act Section 5 + Pfizer 1972 substantiation when anomaly substantiates marketing claim + per-vendor LLM zero-retention. Audit writes a per-anomaly-event WORM canonical record: per-stream multi-method signal snapshot + per-test pre-registered analysis plan + multiple-comparisons correction applied + Brier + ECE + reliability + Rosenbaum bounds + E-value snapshot + per-anchor gate-pass + adverse-action-notice content + Fundamental Rights Impact Assessment artifact + downstream routing. Storage: AWS S3 Object Lock + Azure Blob immutable + GCS Bucket Lock + Wasabi WORM. Retention: 7-year FTC + 7-year IRS + 7-year SOX + 6-year SEC + 3-year FINRA + 4-year ECOA Reg B + 3-year EEOC + 5-year FCRA + GDPR Article 30 + EU AI Act Article 12 + SOC 2 CC7/CC8.
What does this skill connect to on the anomaly-detection agent and across the swarm?
On the anomaly-detection agent: per-location per-cohort two-sigma anomaly detection (sibling build-pillar) + multi-stream severity routing for anomaly detection and compliance ops (sibling build-pillar — downstream consumer of suppressed anomalies). Across the swarm: governance-decision-router five-destination routing + buyer-state-aware BANT scoring (#568 sibling agent — same replication-crisis substrate) + per-platform compliance gating for social posts (#564 same EU AI Act Article 50 substrate) + master-record + per-jurisdiction compliance multi-state franchise. Build-pillar siblings: tiered pre-filter deterministic gates for AI content compliance + marketing AI autonomy profile configuration + per-vertical compliance overlay + root-cause attribution sketch for multi-location KPI diagnosis. Commercial-pillar parent: /anomaly-detection-and-alerting.
What does the 6-workstream pre-engagement-baseline reporting cycle look like for this skill?
Every two weeks during the Tier 3 Fractional CMO with AI Swarm engagement, six workstreams report against the pre-engagement baseline. Workstream 1: per-portfolio per-stream anomaly-detection coverage — streams monitored + per-stream per-method enumeration + per-stream per-timestamp signal volume. Workstream 2: Detect multi-method signal distribution — per-stream per-method signal volume + per-method false-positive baseline + per-method recalibration. Workstream 3: Correct multiple-comparisons coverage — per-test-family Bonferroni + BH FDR + Benjamini-Yekutieli + Holm-Bonferroni + Storey q-value applied + per-test pre-registration evidence + power-analysis applied. Workstream 4: Gate-pass/gate-fail distribution — per-anchor gate-fail rate + replication-crisis discipline pass + ECOA disparate-impact pass + EU AI Act FRIA coverage. Workstream 5: Regulatory-defense audit coverage — replication-crisis discipline + Bonferroni + BH FDR + Rosenbaum + E-value + ASA p-value statement + EU AI Act Article 22 + Annex III + ECOA + 4/5ths rule + Title VII + SOX + CFPB Circular 2022-03. Workstream 6: FBC feedback-loop pattern-learning — per-stream realized-vs-predicted reconciliation + multi-arm-bandit regret + recalibration + per-protected-class disparate-impact retrospective.

Engage Completions

Two ways to engage. The Tier 1 AI Readiness Assessment maps the ML + statistical + Bayesian substrate + multiple-comparisons + pre-registration surface against the Detect + Correct + Gate + Audit bundle. The Tier 3 Fractional CMO with AI Swarm embeds 1-2 days per week for 6+ months and runs the bundle end-to-end against the anomaly-detection agent across the swarm.