Data validation tools with per-vertical rule libraries

Validate every location record against HIPAA, cannabis license-status, alcohol ABV, FDA, and FINRA rules before it promotes to canonical, using maintained per-vertical rule libraries.

The problem

You just realized a healthcare-network location has been live for three weeks without HIPAA-required practitioner credentials in the database. The record passed your generic validation. Your cannabis license at one store lapsed in March, and marketing kept running ads for two months. Your compliance officer catches these problems, but only during audits, which happen quarterly.

Generic data-validation tools like Great Expectations, dbt tests, and Soda.io can validate any data shape, but they require your data team to write the per-vertical rules from scratch. HIPAA fields, ABV labeling, FDA supplement metadata, FINRA disclosure language, state cannabis license-status: each ruleset is custom development that the compliance officer cannot maintain and the data team does not have time to update as regulations shift.

The result: violations get caught after they ship, not before. Records promote to canonical with HIPAA gaps. License lapses propagate into ad spend. Supplement claims ship without FDA-required metadata. The audit happens after the regulator finds the violation, not before.

What success looks like

Every record validates against maintained per-vertical rule libraries before promotion to canonical. HIPAA fields enforced on healthcare locations. Cannabis license-status confirmed within window. Alcohol ABV labeling complete. FDA supplement metadata present. FINRA disclosures included on financial-services pages.

Records that fail vertical validation never promote. The failure routes to compliance-officer review with rule citation and source-data context. The record stays in staging until corrected or explicitly overridden by the officer.

Multi-state operators stack per-jurisdiction overlays through intersection. A pharmacy in California serving cannabis-adjacent supplements gets HIPAA plus state pharmacy board plus state cannabis plus FDA supplement validation simultaneously. The audit trail builds itself for regulator-defense retention.

How most operators solve this today

Operators typically rely on one of four approaches. None ships maintained per-vertical rule libraries for regulated multi-location operations.

  • Generic data-validation (Great Expectations, dbt tests, Soda.io, Monte Carlo, Datafold, Anomalo)

    Free (open source) or $100-$100,000+/yr

    Built for data-engineering teams validating warehouse data shapes. Operators must write the per-vertical rules themselves. HIPAA / FDA / FINRA / state cannabis libraries do not ship out of the box.

  • ETL-bundled validation (Fivetran column-tests, Airbyte schema checks)

    Bundled with ETL subscription

    Light-touch validation for type errors and missing fields. Not built for vertical-compliance schema enforcement.

  • Excel data-validation rules + custom code

    DIY ops-analyst time

    An ops analyst builds rules into the master spreadsheet. Custom SQL or Python validates after the fact. The approach falls apart at 50+ locations and across regulated verticals.

  • Compliance officer + quarterly manual audit

    $120k-$180k FTE

    Catches violations after they ship, not before promotion. Quarterly cadence means months of exposure between audits.

What changes when this is an agent skill

The Completions schema-validation skill ships with maintained per-vertical rule libraries: HIPAA for healthcare, FTC ad-substantiation, FINRA and SEC for financial services, state cannabis for CA/CO/MA/NY/MI, alcohol category, state pharmacy boards, state lottery, and FDA supplements.

Pre-promotion gating means records that fail vertical validation never promote to canonical. The failure routes to compliance-officer review with rule citation, source-data context, and recommended correction. The record stays in staging until the officer corrects it or explicitly overrides with documented rationale.
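
The gating behavior described above can be sketched in a few lines of Python. This is a minimal illustration, not the skill's actual implementation: the `Violation`, `GateResult`, `require_field`, and `gate` names, the `practitioner_credentials` field, and the citation text are all hypothetical placeholders.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Violation:
    rule_id: str          # hypothetical identifier, e.g. "hipaa.practitioner_credentials"
    citation: str         # rule citation shown to the reviewer
    recommendation: str   # recommended correction


@dataclass
class GateResult:
    status: str           # "promoted", "staged_for_review", or "overridden"
    violations: List[Violation] = field(default_factory=list)
    override_rationale: Optional[str] = None


Rule = Callable[[dict], Optional[Violation]]


def require_field(rule_id: str, field_name: str, citation: str) -> Rule:
    """Build a rule that flags a record whose required field is missing or empty."""
    def rule(record: dict) -> Optional[Violation]:
        if not record.get(field_name):
            return Violation(rule_id, citation,
                             f"Populate '{field_name}' from the source system.")
        return None
    return rule


def gate(record: dict, rules: List[Rule],
         override_rationale: Optional[str] = None) -> GateResult:
    """Promote only when every rule passes; otherwise stage unless overridden."""
    violations = [v for v in (rule(record) for rule in rules) if v is not None]
    if not violations:
        return GateResult("promoted")
    if override_rationale and override_rationale.strip():
        # An override keeps the violations on record next to the rationale.
        return GateResult("overridden", violations, override_rationale.strip())
    return GateResult("staged_for_review", violations)


# Illustrative rule only; the field name and citation text are placeholders.
rules = [require_field("hipaa.practitioner_credentials",
                       "practitioner_credentials", "placeholder citation")]

staged = gate({"location_id": "loc-1"}, rules)
promoted = gate({"location_id": "loc-1", "practitioner_credentials": "on file"}, rules)
overridden = gate({"location_id": "loc-1"}, rules,
                  override_rationale="Credential upload pending; approved by officer.")

print(staged.status, promoted.status, overridden.status)
```

In this sketch a blank rationale does not count as an override, which mirrors the requirement that overrides carry documented rationale.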

Composable per-jurisdiction overlays stack through intersection: a record must satisfy every overlay that applies to it, so a multi-state operator gets per-state rule composition without writing the cross-product rules manually. Deterministic checks (regex, keyword, type, format) run first because they are cheap. LLM-based semantic checks then catch probabilistic violations, such as implied health claims or undisclosed sponsorship language, that regex cannot detect.
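
One way to picture overlay stacking and the two-stage check order is the sketch below. It rests on several assumptions: the `OVERLAYS` registry, the rule names, and the license-number pattern are invented for illustration, and the semantic checker is an injected stand-in rather than a real LLM call. Skipping the semantic pass once a deterministic check has failed is a design choice of this sketch, not something the skill is known to do.

```python
import re
from typing import Callable, Dict, List, Optional, Tuple

Check = Callable[[dict], bool]                  # returns True when the record passes
SemanticChecker = Callable[[dict], List[str]]   # returns finding labels; empty means clean

# Made-up pattern for illustration; real license formats vary by state.
LICENSE_NUMBER = re.compile(r"[A-Z0-9-]+")

# Hypothetical overlay registry: overlay name -> (rule id, deterministic check) pairs.
OVERLAYS: Dict[str, List[Tuple[str, Check]]] = {
    "state_cannabis_ca": [
        ("license_status_active", lambda r: r.get("license_status") == "active"),
        ("license_number_format",
         lambda r: bool(LICENSE_NUMBER.fullmatch(r.get("license_number", "")))),
    ],
    "fda_supplements": [
        ("supplement_metadata_present", lambda r: bool(r.get("supplement_metadata"))),
    ],
}


def validate(record: dict, overlay_names: List[str],
             semantic_checker: Optional[SemanticChecker] = None) -> List[str]:
    """Return failure labels; an empty list means every stacked overlay passed."""
    failures = []
    for name in overlay_names:
        for rule_id, check in OVERLAYS[name]:
            if not check(record):
                failures.append(f"{name}:{rule_id}")
    # Assumption: skip the costlier semantic pass when a cheap check already failed.
    if failures or semantic_checker is None:
        return failures
    return [f"semantic:{finding}" for finding in semantic_checker(record)]


semantic_calls = []


def fake_semantic_checker(record: dict) -> List[str]:
    """Stand-in for an LLM call; a keyword test keeps the sketch runnable offline."""
    semantic_calls.append(record["location_id"])
    claims = "cures" in record.get("description", "").lower()
    return ["implied_health_claim"] if claims else []


record = {
    "location_id": "loc-7",
    "license_status": "lapsed",
    "license_number": "CA-123",
    "supplement_metadata": {"serving_size": "1 capsule"},
    "description": "Cures stress fast",
}
stack = ["state_cannabis_ca", "fda_supplements"]

first_pass = validate(record, stack, fake_semantic_checker)
calls_after_first = len(semantic_calls)
second_pass = validate({**record, "license_status": "active"}, stack, fake_semantic_checker)

print(first_pass)
print(second_pass)
```

Adding a jurisdiction in this model means appending another overlay name to `stack`; no cross-product rules are written by hand.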

The audit trail logs every validation decision with rule citation, inputs, outcome, and reviewer (if human-routed). It composes with versioned-history-regulatory-defense for six-to-seven-year retention. This skill is the sibling of per-vertical-compliance-overlay (loop 001): same rule library, enforced at the data-record layer rather than the content-output layer.
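
As a rough sketch of what one audit entry could contain, the example below writes each validation decision as a single JSON line. The field names, the `audit_entry` and `append_entry` helpers, and the input hash are assumptions made for illustration; the actual log format is not described here.

```python
import hashlib
import io
import json
from datetime import datetime, timezone
from typing import Optional, TextIO


def audit_entry(record: dict, rule_id: str, citation: str, outcome: str,
                reviewer: Optional[str] = None,
                now: Optional[datetime] = None) -> dict:
    """Build one entry with rule citation, inputs, outcome, and optional reviewer."""
    now = now or datetime.now(timezone.utc)
    canonical_inputs = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return {
        "timestamp": now.isoformat(),
        "rule_id": rule_id,
        "citation": citation,
        "inputs": record,
        # Assumption: a hash of the inputs gives a compact fingerprint for comparison.
        "inputs_sha256": hashlib.sha256(canonical_inputs.encode("utf-8")).hexdigest(),
        "outcome": outcome,
        "reviewer": reviewer,   # None when no human was involved
    }


def append_entry(stream: TextIO, entry: dict) -> None:
    """Write the entry as one JSON line; real files would be opened in append mode."""
    stream.write(json.dumps(entry, sort_keys=True) + "\n")


log = io.StringIO()  # stands in for an append-only log file
fixed_time = datetime(2024, 1, 1, tzinfo=timezone.utc)

append_entry(log, audit_entry({"location_id": "loc-1"},
                              "hipaa.practitioner_credentials", "placeholder citation",
                              "fail", reviewer="officer@example.com", now=fixed_time))
append_entry(log, audit_entry({"location_id": "loc-2"},
                              "hipaa.practitioner_credentials", "placeholder citation",
                              "pass", now=fixed_time))

entries = [json.loads(line) for line in log.getvalue().splitlines()]
print(len(entries), entries[0]["outcome"], entries[1]["reviewer"])
```

One line per decision keeps the log append-only and easy to hand to a retention system.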

Agents that include this skill

Skills live inside agent rentals. To get this skill in production, hire any of the agents below — context-tuning at onboarding is included in the first month.

  • Master Record Canonicalization Agent

    Ingests every operator source system, resolves per-location fact conflicts, and emits the canonical master record downstream agents consume.

    Early-adopter

    $2,000–$4,000/mo

FAQ

What are data validation tools?
Software that validates data records against rule libraries before promotion to a canonical store. The Completions skill maintains per-vertical rule libraries (HIPAA, cannabis, alcohol, FDA, FINRA) so operators do not write rules from scratch.

How is this different from Great Expectations or Soda.io?
Those tools can validate any data shape, but only through custom rules that operators write themselves. This skill ships maintained per-vertical rule libraries for regulated multi-location operations.

Is this the same as the per-vertical compliance overlay skill?
They are sibling skills. Per-vertical-schema-validation enforces rules at the data-record layer (before promotion to canonical). Per-vertical-compliance-overlay enforces rules at the content-output layer (before marketing content publishes). Both load the same per-vertical rule libraries.

Which verticals are supported on day one?
HIPAA for healthcare, FTC ad-substantiation, FINRA and SEC for financial services, state cannabis (CA, CO, MA, NY, MI initially), alcohol category, state pharmacy boards, state lottery, and FDA supplements. Additional verticals land through the shared skill backlog.

What happens when validation fails?
The record does not promote to canonical. The failure routes to compliance-officer review with rule citation and source-data context. The record stays in staging until corrected or explicitly overridden with documented rationale.

Can a single record stack multiple vertical overlays?
Yes. A pharmacy in California serving cannabis-adjacent supplements stacks HIPAA plus state pharmacy board plus state cannabis plus FDA supplement rules simultaneously. Rules compose by intersection.

How does this compose with multi-source ingestion?
Ingestion pulls per-source data. This skill validates each contribution against per-vertical rules before the canonical record promotes. Then conflict-resolution-policy resolves cross-source disagreements on validated data, and master-record-sync emits change events downstream.
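
The ordering in the last answer (validate each contribution, resolve conflicts on what survives, then emit change events) can be sketched as follows. Everything here is hypothetical: the tuple layout, the `build_canonical` and `change_events` helpers, the phone rule, and the source-priority policy are stand-ins, not the behavior of conflict-resolution-policy or master-record-sync.

```python
from typing import Callable, Dict, List, Tuple

Contribution = Tuple[str, str, str]  # (source, field, value)


def build_canonical(contributions: List[Contribution],
                    is_valid: Callable[[Contribution], bool],
                    source_priority: List[str]) -> Dict[str, str]:
    """Validate each contribution first, then resolve conflicts among the survivors."""
    validated = [c for c in contributions if is_valid(c)]
    canonical: Dict[str, str] = {}
    winning_rank: Dict[str, int] = {}
    for source, field_name, value in validated:
        rank = source_priority.index(source)  # assumed policy: lower index wins
        if field_name not in canonical or rank < winning_rank[field_name]:
            canonical[field_name] = value
            winning_rank[field_name] = rank
    return canonical


def change_events(previous: Dict[str, str], current: Dict[str, str]) -> List[dict]:
    """List the fields whose canonical value changed, in field-name order."""
    return [
        {"field": name, "old": previous.get(name), "new": value}
        for name, value in sorted(current.items())
        if previous.get(name) != value
    ]


def phone_is_valid(contribution: Contribution) -> bool:
    """Illustrative rule: phone values may contain only digits and hyphens."""
    _, field_name, value = contribution
    return field_name != "phone" or value.replace("-", "").isdigit()


contributions = [
    ("pos", "phone", "555-0100"),
    ("crm", "phone", "call front desk"),   # fails validation despite higher priority
    ("crm", "hours", "9-5"),
]

canonical = build_canonical(contributions, phone_is_valid, source_priority=["crm", "pos"])
events = change_events({"phone": "555-0000"}, canonical)

print(canonical)
print(events)
```

Because validation runs before conflict resolution, the higher-priority source cannot win a field with a value that failed its rule.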
