FY27-Q1 ABM Landing Page Copy Matrix

GF #1 Procter and Gamble Greenfield

Databricks · Airflow · Azure
AI factory investment at scale
No active data catalog identified
Arc: data contracts enforce quality before models consume

Hero headline H1 — replaces "Enterprise AI data catalog platform that eliminates data chaos"

The data contract layer your AI factory is missing

Subheadline 2–3 sentences beneath H1

Your Databricks and Airflow pipelines power P&G's AI factory. But without contract enforcement at runtime, models train on whatever arrives. DataHub streams lineage and validates data quality across every pipeline execution — before bad data reaches production.

Pain points Replaces the 4 generic bullets in "Your data catalog wasn't built for this"

Data scientists lose hours debugging model failures caused by upstream schema changes no one caught before training ran
Governance teams can't enforce data quality standards as new AI pipelines deploy across global markets and product lines
Compliance reviews require manual lineage reconstruction across Databricks jobs and Airflow DAGs — every time an audit lands

Discover Replaces "Find and understand data 10x faster"

Discover

Find trusted training data across your entire Databricks estate

Conversational search surfaces the data assets your AI teams need — with lineage, quality scores, and ownership attached — without Slack threads or tribal knowledge.

Observe Replaces "Deliver consistently reliable data"

Observe

Catch pipeline failures before they corrupt AI model inputs

Automated anomaly detection monitors every Airflow run and Databricks job for freshness, schema drift, and volume changes — so your AI factory never trains on degraded data.

Govern Replaces "Automate compliance and governance"

Govern

Enforce data contracts as every Airflow job and Databricks pipeline executes

Set quality contracts once. DataHub enforces them at runtime across every pipeline — blocking non-compliant data before models consume it, not after.

Lineage Replaces "Rapidly resolve data issues"

Lineage

Trace column-level data flows from source through transformation to model

When a model behaves unexpectedly, follow the data upstream — column by column, pipeline by pipeline — to find exactly where quality broke down and what it affects downstream.

Customer story Which story to feature + personalized framing

Feature: Chime

"Like P&G's, Chime's data teams were siloed from the pipelines generating quality issues. Cross-platform lineage and continuous monitoring eliminated the manual reconciliation that was slowing production AI and gave governance teams real-time visibility they'd never had."

OSS #1 Parker Hannifin OSS Tier 1

Databricks · Airflow · Azure Synapse
Filtration Group + Curtis Instruments acquisitions
Arc: OSS portability survives every acquisition

Hero headline H1

One lineage layer for every stack you've acquired

Subheadline 2–3 sentences beneath H1

Filtration Group and Curtis Instruments each brought their own data environments. DataHub maps column-level lineage across all three stacks simultaneously — streaming as each inherited pipeline executes, not reconstructed hours later. And because it's built on open source, your metadata layer survives the next acquisition too.

Pain points 3 account-specific bullets

Data engineers can't trace lineage across acquired stacks — each environment has its own metadata silo with no shared visibility layer
Schema changes in one environment break downstream pipelines in another with no automated impact analysis or warning
As your data estate grows through acquisitions, governance complexity compounds without a unified layer that scales with it

Discover Pillar headline + body

Discover

Search across Databricks, Airflow, and Synapse in a single query

Find any data asset — regardless of which inherited environment it lives in — with automated documentation that stays current as pipelines evolve across all three stacks.

Observe Pillar headline + body

Observe

Detect cross-environment quality failures before they cascade downstream

Automated monitoring watches for anomalies across all three data environments simultaneously — so a schema change in Synapse doesn't silently break Databricks consumers.

Govern Pillar headline + body

Govern

Apply consistent governance policies across all three data stacks simultaneously

Define policies once. DataHub propagates them across Databricks, Airflow, and Synapse — so governance doesn't have to be rebuilt every time a new environment joins the estate.

Lineage Pillar headline + body

Lineage

Trace column-level dependencies across every acquired data environment, live

Column-level lineage streams across Databricks, Airflow, and Synapse as pipelines execute — so your team always knows what connects to what, regardless of which acquisition it came from.

Customer story Which story + framing

Feature: Netflix

"Netflix unified discovery of data, ML, and software assets across a growing, increasingly complex data estate. Cross-domain lineage now enables proactive incident prevention rather than reactive debugging. A parallel for Parker Hannifin's multi-stack challenge."

GF #2 Johnson and Johnson Greenfield

Airflow · dbt · Snowflake · Databricks
Population Analytics: 15PB clinical data
FDA audit traceability requirement
Arc: real-time lineage means always-current chain of custody

Hero headline H1

Automated lineage for 15 petabytes of clinical data

Subheadline 2–3 sentences beneath H1

Your Population Analytics team traces data provenance by hand across Airflow, dbt, Snowflake, and Databricks. FDA submissions require full chain-of-custody — and today that means manual reconstruction. DataHub streams column-level lineage from clinical source through every transformation, so audit trails are always current when regulators ask.

Pain points 3 account-specific bullets

Clinical data engineers spend hours manually reconstructing lineage before each regulatory submission — work that DataHub eliminates entirely
dbt transformation changes break downstream Snowflake tables with no automated impact analysis before deployment
Compliance teams cannot provide real-time audit trails as trial data moves across four platforms in a single day

Discover Pillar headline + body

Discover

Find and verify clinical data sources across your 15PB data estate

Conversational search surfaces verified, documented datasets — with ownership, quality scores, and certification status — so analysts build on data they can trust before submissions.

Observe Pillar headline + body

Observe

Detect quality anomalies in trial data before they reach downstream analytics

Automated assertions monitor freshness, schema stability, and completeness across clinical data pipelines — catching issues before they propagate into Snowflake reports or Databricks models.

Govern Pillar headline + body

Govern

Maintain FDA-ready compliance records with automated chain-of-custody tracking

Data contracts enforce freshness and schema requirements across every clinical pipeline. Automated certification workflows track compliance status by domain — so audit readiness is continuous, not a pre-submission scramble.

Lineage Pillar headline + body

Lineage

Trace data provenance from clinical source through every dbt transformation, live

Column-level lineage streams from source systems through Airflow orchestration and dbt transformations to Snowflake — giving your team an always-current, complete provenance record without manual reconstruction.

Customer story Which story + framing

Feature: Chime

"Chime's compliance and data quality challenges mirror J&J's: siloed pipelines, no real-time lineage, and governance teams working manually against moving targets. DataHub's cross-platform lineage gave them the audit trail they needed without slowing down data teams."

GF #3 Verizon Greenfield

Kafka · Airflow · BigQuery
200PB across Verizon + Frontier stacks
Alation: batch scan limitation
Arc: streaming vs. batch is the core differentiation

Hero headline H1

Real-time metadata for your combined 200PB data estate

Subheadline 2–3 sentences beneath H1

Alation was built for static warehouses — not Frontier's Kafka streams and Airflow pipelines running alongside Verizon's BigQuery environment. DataHub streams metadata as pipelines execute, giving your teams unified lineage across both stacks in seconds rather than the next morning.

Pain points 3 account-specific bullets

Data teams work from yesterday's metadata — Alation's batch scans don't reflect overnight changes in Kafka streams or Airflow pipelines
Frontier's infrastructure and Verizon's BigQuery environment have no shared lineage layer, leaving engineers to trace cross-stack dependencies manually
Incident resolution takes hours because there's no way to follow data flows from a Frontier source system to a consumer BigQuery dashboard

Discover Pillar headline + body

Discover

Search across your combined Verizon and Frontier data estate in real time

Find any data asset across both telecom stacks — with current ownership, quality status, and lineage — without switching tools or waiting for overnight scans to complete.

Observe Pillar headline + body

Observe

Detect Kafka stream quality issues before they surface in BigQuery dashboards

Continuous monitoring catches anomalies in streaming data as it moves — not after consumers have already seen bad numbers in downstream reports.

Govern Pillar headline + body

Govern

Apply consistent governance policies across both telecom data stacks simultaneously

Set policies once in DataHub. They propagate across Verizon's BigQuery environment and Frontier's Airflow pipelines — so integration doesn't mean doubling your governance overhead.

Lineage Pillar headline + body

Lineage

Trace data flows from Frontier source systems through Airflow to BigQuery, live

Column-level lineage streams across both stacks as pipelines run — so when a consumer dashboard shows bad data, your team finds the Frontier source that introduced it in seconds, not hours.

Customer story Which story + framing

Feature: Chime

"Chime unified fragmented data environments — replacing the siloed metadata and manual debugging that followed their own integration work. Cross-platform lineage and continuous monitoring gave teams proactive quality control at a scale Alation couldn't match."

OSS #2 IQVIA OSS Tier 1

Snowflake · dbt · Airflow
Active cloud modernization program
Arc: governance arrives with migration, not after

Hero headline H1

Governance that keeps pace with your cloud migration

Subheadline 2–3 sentences beneath H1

Your Data Architecture team is moving to Snowflake while dbt and Airflow roll out alongside it. Governance typically arrives six months after the migration settles. DataHub's streaming metadata ingests every new pipeline as it onboards, so lineage and documentation arrive with the data, not after it.

Pain points 3 account-specific bullets

New cloud pipelines launch without lineage or documentation, creating technical debt that compounds with every migration sprint
Data governance teams can't keep pace as the modernization program adds new Snowflake tables and dbt models faster than they can document
Analysts hit stale metadata from legacy documentation that doesn't reflect the current cloud state — slowing every analysis

Discover Pillar headline + body

Discover

Find migrated data assets across Snowflake and dbt as they go live

Automated ingestion catalogs every new Snowflake table and dbt model the moment it deploys — so your teams find and trust migrated data immediately, not after manual documentation catches up.

Observe Pillar headline + body

Observe

Catch data quality regressions introduced during migration immediately

Automated quality checks monitor every migrated pipeline for freshness, schema drift, and completeness — so you discover migration-introduced regressions in minutes, not after downstream teams report bad data.

Govern Pillar headline + body

Govern

Apply governance requirements to every new pipeline at ingestion, not after

Data contracts and compliance policies attach to new Snowflake and dbt assets the moment they onboard — so your cloud modernization doesn't inherit the governance debt of your legacy stack.

Lineage Pillar headline + body

Lineage

Trace column-level lineage across your cloud stack from the moment pipelines deploy

Every new Airflow DAG and dbt transformation comes with lineage attached from day one — so your team always knows what a migrated dataset feeds downstream, without waiting for documentation to catch up.

Customer story Which story + framing

Feature: Netflix

"Netflix unified discovery across a fast-growing data estate where documentation couldn't keep pace with deployments. DataHub's automated ingestion and cross-platform lineage eliminated the gap — a direct parallel to IQVIA's migration challenge."

GF #4 State Street Greenfield

Snowflake · Airflow · dbt
Alpha Data Platform: $380B new mandates
Arc: investment SLAs require real-time, not batch, lineage

Hero headline H1

Lineage your compliance teams can actually trust

Subheadline 2–3 sentences beneath H1

The Alpha Data Platform is onboarding new institutional mandates at a pace that compliance teams can't trace manually. DataHub streams column-level lineage through Snowflake, Airflow, and dbt as each mandate's data flows — so investment SLAs are traceable in real time, not reconstructed the next morning.

Pain points 3 account-specific bullets

Compliance teams reconstruct lineage manually for each regulatory review as the Alpha Platform onboards new mandates at record pace
Custody teams can't verify data provenance in real time — batch metadata means answers come the next morning, not when they're needed
Data quality issues in investment workflows go undetected until they surface in downstream reporting — when the cost of fixing them is highest

Discover Pillar headline + body

Discover

Find and verify data sources across the Alpha Data Platform in seconds

Certified, documented data assets — with ownership and quality scores — surface instantly across your Snowflake and dbt environment, so teams build on data they can defend to regulators.

Observe Pillar headline + body

Observe

Monitor data quality across investment workflows with automated assertion checks

Continuous quality monitoring catches freshness, schema, and volume anomalies across every mandate's pipeline — before they reach the investment calculations that matter.

Govern Pillar headline + body

Govern

Maintain real-time audit trails across every mandate's data flow automatically

Data contracts enforce compliance requirements as each mandate onboards. Certification workflows track readiness by domain — so your compliance posture reflects the current state of the Alpha Platform, not last night's batch.

Lineage Pillar headline + body

Lineage

Trace investment data from Snowflake sources through dbt transformations to reports, live

Column-level lineage streams through every step of each mandate's data flow — so custody teams can answer "where did this number come from?" in seconds, not hours.

Customer story Which story + framing

Feature: Chime

"Chime's compliance and cross-team visibility challenges parallel State Street's: manual lineage reconstruction, no real-time quality monitoring, and governance teams working reactively. DataHub gave them continuous compliance visibility without slowing down data operations."

OSS #3 Charles Schwab OSS Tier 1

Snowflake · BigQuery · Redshift · Pub/Sub · Airflow
TD Ameritrade (Forge) integration
Arc: only catalog that crosses all three warehouses simultaneously

Hero headline H1

The only catalog that sees across all three of your warehouses

Subheadline 2–3 sentences beneath H1

Snowflake Horizon covers Snowflake. Nothing covers Snowflake, BigQuery, and Redshift together. DataHub maps real-time lineage across all three as Forge data flows through Pub/Sub and Airflow — giving your teams unified visibility that no single-warehouse tool can provide.

Pain points 3 account-specific bullets

Data teams debug pipeline failures without knowing which of three warehouses introduced the issue — each has its own metadata silo
The Forge integration adds new cross-warehouse data flows that no existing tool in your stack can trace end to end
Governance policies applied to Snowflake don't automatically extend to BigQuery or Redshift — leaving two-thirds of the estate ungoverned

Discover Pillar headline + body

Discover

Search across Snowflake, BigQuery, and Redshift in a single unified query

Find any data asset — regardless of which warehouse it lives in — with current ownership, quality score, and usage patterns, without switching tools or knowing which environment to search first.

Observe Pillar headline + body

Observe

Detect quality issues across all three warehouses before they surface downstream

Automated monitoring watches for freshness, schema, and volume anomalies across Snowflake, BigQuery, and Redshift simultaneously — so a Forge integration issue doesn't silently corrupt reports.

Govern Pillar headline + body

Govern

Apply consistent governance policies across all three warehouse environments

Set a policy once. DataHub enforces it across Snowflake, BigQuery, and Redshift — so PII classification and compliance requirements don't have to be managed three times over.

Lineage Pillar headline + body

Lineage

Trace column-level data flows from Pub/Sub through Airflow to all three warehouses, live

Column-level lineage streams across your entire stack as Forge data moves through Pub/Sub, Airflow, BigQuery, Snowflake, and Redshift — giving your team a single map of every dependency, live.

Customer story Which story + framing

Feature: Chime

"Chime operated across fragmented data environments where producers and consumers were siloed and quality issues were invisible until they broke things downstream. DataHub's cross-platform lineage replaced manual debugging — a parallel to Schwab's multi-warehouse challenge."

OSS #4 Intel OSS Tier 1

Airflow · Snowflake · Databricks · Kafka
Platform efficiency mandate
Arc: OSS survives any restructuring; vendor lock-in doesn't

Hero headline H1

One metadata layer. Every platform your teams run.

Subheadline 2–3 sentences beneath H1

Airflow, Snowflake, Databricks, Kafka — each generating metadata in its own silo. DataHub ingests lineage from all four simultaneously without custom scripts or manual documentation. And because it's built on open source, your metadata layer is yours — regardless of which platforms get consolidated.

Pain points 3 account-specific bullets

Data engineers manually document pipeline dependencies across four platforms — documentation that's outdated the moment it's written
Impact analysis before a schema change means cross-referencing four separate tools with no automated lineage — a bottleneck the efficiency mandate was meant to eliminate
Consolidation planning is impossible when metadata silos prevent anyone from seeing which pipelines actually depend on which assets

Discover Pillar headline + body

Discover

Find data assets and pipeline dependencies across all four platforms in seconds

Conversational search across Airflow, Snowflake, Databricks, and Kafka — with automated documentation that stays current as pipelines change, without any manual cataloging work.

Observe Pillar headline + body

Observe

Detect cross-platform quality failures before they cascade into production

Automated monitoring watches for anomalies across all four platforms simultaneously, so a Kafka schema change doesn't silently break a downstream Databricks model.

Govern Pillar headline + body

Govern

Govern Airflow, Snowflake, Databricks, and Kafka from one control plane

Apply governance policies once. DataHub enforces them across all four platforms — eliminating the per-tool overhead that compounds with every environment your teams run.

Lineage Pillar headline + body

Lineage

Trace column-level dependencies across all four platforms — no custom scripts required

Column-level lineage streams simultaneously from Airflow, Snowflake, Databricks, and Kafka as pipelines execute — giving your team the dependency map your efficiency mandate requires, without building it by hand.

Customer story Which story + framing

Feature: Netflix

"Netflix unified discovery across data, ML, and software assets — a growing, complex estate with no single tool that could see across it. DataHub's cross-domain lineage gave them proactive incident prevention at scale. The parallel to Intel's four-platform consolidation challenge is direct."

OSS #5 Nike OSS Tier 1 HOT

Airflow · dbt · Databricks · Snowflake
128 active devs · "Win Now" mandate
Arc: stale metadata makes DAG debugging a days-long ordeal

Hero headline H1

Live lineage for every DAG your team runs

Subheadline 2–3 sentences beneath H1

When a pipeline fails and metadata is hours old, your engineers trace it across Airflow, Databricks, Snowflake, and dbt by hand. DataHub streams lineage as every DAG (Directed Acyclic Graph) executes — so root causes take minutes to find, not days. That's what your Win Now turnaround demands from data infrastructure.

Pain points 3 account-specific bullets

Data engineers spend hours debugging pipeline failures without a unified view of how Airflow DAGs connect to Databricks jobs and Snowflake tables
Schema changes in dbt break downstream Snowflake tables with no automated impact analysis before the change deploys
Pipeline documentation is perpetually stale — metadata reflects last week's state, not what's running right now across four platforms

Discover Pillar headline + body

Discover

Find trusted data assets across your Airflow, Databricks, Snowflake, and dbt stack instantly

Conversational search surfaces any data asset — with current ownership, quality status, and lineage attached — so your 128 engineers stop hunting and start building.

Observe Pillar headline + body

Observe

Detect DAG failures and data quality issues the moment they occur

Automated anomaly detection monitors every Airflow run for freshness, schema drift, and volume changes — so your team gets alerted when a DAG breaks, not when a marketing analyst notices a dashboard gap.

Govern Pillar headline + body

Govern

Enforce data contracts across your marketing analytics pipeline automatically

Set quality contracts once. DataHub enforces them across every Airflow, Databricks, and Snowflake pipeline — so the $5B marketing analytics investment runs on data that actually meets the standard.

Lineage Pillar headline + body

Lineage

Trace column-level lineage from Airflow through Databricks to Snowflake as every DAG runs

When a pipeline breaks, follow the data upstream — column by column, DAG by DAG — from symptom to root cause in minutes. No cross-platform archaeology. No manual tracing across four tools.

Customer story Which story + framing

Feature: Netflix

"Netflix is an engineering-led org running data infrastructure at scale — like Nike — where reactive debugging costs too much. DataHub gave them cross-domain lineage and proactive incident prevention, shifting the team from fighting fires to preventing them. That's the Win Now story."

OSS #6 Adobe OSS · Deploy HOT

Databricks · dbt · Snowflake
192 active devs · 90 Databricks teams · OSS deployed
Semrush acquisition: 3,000+ new sources
Arc: OSS-to-Cloud upgrade, not replacement

Hero headline H1

Your OSS deployment, scaled for 3,000 new sources

Subheadline 2–3 sentences beneath H1

192 engineers already run DataHub OSS across 90 Databricks teams. The Semrush acquisition adds 3,000+ new data sources that need the same lineage and governance — immediately. DataHub Cloud closes the gap: automated ingestion, data contracts, and quality monitoring at the scale your OSS deployment can't handle alone.

Pain points 3 account-specific bullets

3,000+ Semrush data sources will onboard without lineage or governance unless an automated system handles ingestion at acquisition pace
The OSS deployment requires manual ingestion configuration, which doesn't scale to hundreds of new sources arriving simultaneously
Data contracts enforced manually across 90 Databricks teams create governance overhead that grows with every source added from the Semrush estate

Discover Pillar headline + body

Discover

Find and understand every data source across your 90-team Databricks mesh

Automated ingestion catalogs every Semrush and Adobe source as it onboards — with lineage, ownership, and documentation attached — so your 90 teams can find and trust new data immediately.

Observe Pillar headline + body

Observe

Detect quality issues across Semrush and Adobe sources before teams consume bad data

Automated quality checks run across every new source as it integrates — catching freshness, schema, and completeness issues before 90 teams build downstream on data that hasn't been validated.

Govern Pillar headline + body

Govern

Automate data contracts across all 90 teams and 3,000+ new sources simultaneously

DataHub Cloud automates the contract enforcement and compliance monitoring that require manual configuration in OSS, so governance scales with your Semrush integration, not after it.

Lineage Pillar headline + body

Lineage

Trace column-level lineage from every Semrush source through your Databricks stack, automatically

Every new Semrush source gets column-level lineage attached at ingestion — streamed through your Databricks transformations in real time, so your 90 teams always know where their data comes from.

Customer story Which story + framing

Feature: Netflix

"Netflix is an engineering-first org where DataHub's OSS roots are well understood. The story — unifying discovery across a growing, decentralized data estate — maps directly to Adobe's 90-team mesh challenge. Positioning: this is what your OSS deployment becomes at full enterprise scale."

Landing Page Copy Matrix — Top 10