| Tier / Signal | 👩‍💻 Data Engineer | 🏗 Data Platform Lead | 🎯 CDO / VP Data | 🤖 AI / ML Team |
|---|---|---|---|---|
| T1 · Tier 1: OSS Users (DataHub Core). Confirmed DataHub Core users. Message: upgrade to Cloud for observability, incident management, compliance, MCP Server, and AI Context Graph. This is an upgrade conversation, not a discovery conversation. | | | | |
| Pain Point | Freshness monitoring and volume checks are built by the team — manual work that pulls engineers away from the pipelines that matter | Incident response starts with Slack and ends with a retrospective — no SLA tracking or cross-team health visibility | AI agents need certified metadata access at machine scale — an infrastructure gap that's quietly blocking the AI roadmap | Training data freshness issues surface after the model runs — no automated layer catching bad data before it's consumed |
| T1 · OSS Ad 1 | Problem What if your pipelines told you when they broke — not your users? Freshness monitoring, volume checks, and automated Slack alerts — the observability layer your team's been hand-rolling, ready out of the box in DataHub Cloud. | Problem Your incident response is only as fast as whoever noticed it first. Without SLA dashboards and cross-team data health visibility, every incident starts with a Slack message and ends with a retrospective. There's a better way. | Problem Your AI roadmap is waiting on infrastructure you haven't built yet. AI agents need real-time access to certified metadata at machine scale. DataHub Cloud has that infrastructure ready — your team doesn't need to build it from scratch. | Problem What if stale training data got flagged before your model ran? When freshness issues surface manually after model execution, you're always a step behind. Automated monitoring means your team catches problems, not consequences. |
| Ad 2 | Solution Automated assertions catch stale data before consumers do. DataHub Cloud monitors freshness, schema stability, and volume on a schedule — alerting your team before anything breaks downstream. | Solution Incident SLA dashboards. Ready when you are. DataHub Cloud gives platform leads real-time visibility into data health across every team — operational tooling that comes with the platform, not after it. | Solution Give AI agents trusted metadata. In real time. DataHub Cloud's MCP Server and Context Graph let AI agents query certified, live metadata — the infrastructure that makes agentic AI actually work at enterprise scale. | Solution Automated freshness monitoring built for production AI. DataHub Cloud catches training data quality issues before your models consume them — automated assertions, real-time alerts, no manual intervention required. |
| Ad 3 | Why DataHub You're already running DataHub. Cloud adds observability. No migration. No retraining. Just automated monitoring, freshness alerts, and data health scores — ready on the platform your team already knows. 📊 58% faster to resolve data-related outages — IDC Business Value of DataHub Cloud, 2026 | Why DataHub Built on what your team already knows. Same DataHub your team already runs. Cloud adds incident management, SLA dashboards, and compliance visibility — the operational layer that makes your platform enterprise-ready. 📊 17–18% higher productivity across data engineering teams — IDC, 2026 | Why DataHub Your DataHub foundation. Cloud's AI readiness layer. You've already solved discovery and lineage. Cloud turns that foundation into production-grade AI infrastructure — MCP Server, Context Graph, and certified metadata at scale. 📊 119% more AI/ML models successfully reaching production — IDC, 2026 | Why DataHub MCP Server. Context Graph. Built for AI teams. Worth it. ML teams at scale need metadata infrastructure that works at machine speed. Cloud adds exactly that — built on the DataHub your org already runs. 📊 119% more AI/ML models reaching production — IDC Business Value of DataHub Cloud, 2026 |
| T2 · Tier 2: AI Tooling Confirmed. Reodev-confirmed AI stack (OpenAI, SageMaker, LangChain, Vertex, MLflow, etc.). Message: they're building AI and the metadata/governance layer hasn't kept pace. Risk: undocumented, unvalidated training data reaching production. | | | | |
| Pain Point | Lineage doesn't connect training data to model outputs — debugging AI pipeline breaks is hours of manual Slack threads | Discovery is bottlenecked on the platform team — analysts slow AI timelines waiting for data answers | No visibility into what's training the models — no way to certify AI-ready data before production | Feature discovery takes days — training datasets are undocumented across fragmented systems |
| T2 · AI Ad 1 | Problem AI pipeline broke. Nobody knows why. Yet. When lineage doesn't connect your training data to model outputs, debugging means Slack threads and hours of manual digging. | Problem Your analysts ask the platform team. Every time. When data discovery isn't self-serve, your central team becomes the bottleneck — and AI timelines move at the speed of that queue. | Problem What's training your AI models right now? You don't know. If you can't certify the data feeding your models before they go live, you don't have an AI governance strategy. You have a risk. | Problem Feature discovery takes days. Your sprint doesn't have them. When training datasets are undocumented and fragmented, feature reuse is impossible — and every ML team rebuilds what already exists. |
| Ad 2 | Solution Trace AI pipeline failures to the source. Fast. DataHub maps lineage from raw data through every transformation to your model inputs — root cause in minutes, not hours. | Solution Self-serve discovery. No platform team required. DataHub's conversational search lets analysts find certified data in seconds — cutting the discovery requests that bottleneck your platform team daily. | Solution Certify the data feeding your models before they go live. DataHub tracks which datasets are AI-certified, who owns them, and when they were last validated — so governance keeps pace with model deployment. | Solution Find certified training data in seconds, not days. DataHub's conversational search surfaces documented, certified datasets with lineage and quality signals — so ML teams build on data they can trust. |
| Ad 3 | Why DataHub Connects to your AI stack. Automatically. DataHub integrates with Databricks, Snowflake, dbt, Airflow, and your AI tooling — automated lineage, no manual metadata maintenance. | Why DataHub DataHub is the metadata layer your AI stack is missing. Discovery, lineage, and governance in one platform — so your platform team builds infrastructure, not bottlenecks. | Why DataHub DataHub is the governance layer your AI roadmap requires. AI readiness isn't just model performance — it's certified data, documented lineage, and automated compliance. DataHub delivers all three. 📊 119% more AI/ML models reaching production · 20% governance team efficiency gains — IDC, 2026 | Why DataHub DataHub connects ML pipelines to documented, trusted data. Automated lineage from raw sources to training features. Quality assertions on every dataset your models consume. Zero manual catalog work. |
| T3 · Tier 3: Modern Data Stack (no AI tooling confirmed). Uses dbt, Snowflake, Airflow, or Databricks, confirmed via tech stack tags. Message: the modern stack solves infrastructure but not visibility. Lineage gaps, blind deploys, and discovery friction are the cost. | | | | |
| Pain Point | dbt schema changes break downstream dashboards — no way to know what's affected before deploying | Discovery still routes through the platform team — the stack is modern but self-serve isn't real | The data team is building what a catalog should provide — governance overhead grows with the stack | dbt models run on schedule — no visibility into which outputs are actually safe to use as training data |
| T3 · Stack Ad 1 | Problem You changed a dbt model. Then a dashboard broke. Without column-level lineage across your stack, every schema change is a blind deploy. You find out what broke when analysts start filing tickets. | Problem Discovery is still a Slack message to your team. Your stack is modern. Your catalog isn't. Every data question still routes through your central team, the bottleneck your modern stack was supposed to eliminate. | Problem Your data team is building what a catalog should do. Your team builds documentation, governance, and lineage by hand. That is work the data platform should automate. | Problem Your dbt models run. You don't know which to trust. Without automated quality monitoring, you can't tell a healthy model output from a stale one, and models trained on bad data are expensive to discover late. |
| Ad 2 | Solution See every downstream impact before you deploy. DataHub's column-level lineage maps exactly what breaks when you change a dbt model, Snowflake schema, or Airflow DAG — before you push anything. | Solution One catalog for your whole data stack. Actually self-serve. DataHub's conversational search makes every dbt model, Snowflake table, and Airflow output discoverable in seconds — no ticket, no Slack message required. | Solution Automated governance across your full modern stack. DataHub ingests lineage, documentation, and ownership from dbt, Snowflake, and Airflow automatically — so your team governs, not manually catalogs. | Solution Automated quality monitoring on every dbt + Snowflake pipeline. DataHub runs freshness, volume, and schema assertions on a schedule — and flags failures before your ML pipelines consume unreliable outputs. |
| Ad 3 | Why DataHub Column-level lineage across dbt, Airflow, and Snowflake. Automated. DataHub connects your full modern stack in one lineage graph — no custom build, no manual tagging. 100+ native integrations, including yours. | Why DataHub DataHub connects what your modern stack leaves invisible. Discovery, lineage, and documentation across dbt, Snowflake, and Airflow — unified in one catalog that updates automatically as your pipelines change. 📊 91% faster data searches · 17–18% higher team productivity — IDC, 2026 | Why DataHub DataHub was built for the stack you're already running. Native integrations with dbt, Snowflake, Airflow, Databricks, and 100+ more — automated ingestion means your catalog stays current without human maintenance. | Why DataHub Lineage from raw sources to training features. Zero manual work. DataHub traces data flows from Snowflake ingestion through dbt transformations to the features your models consume — with quality signals at every step. |
| T4 · Tier 4: Industry Fallback (ICP, no tech stack confirmed). ICP accounts with no confirmed tech stack or OSS signal. Message: broad data quality, discovery, and AI readiness pain framed by industry context. Served by industry targeting across sectors (Finance, Healthcare, Retail, Manufacturing, Media). | | | | |
| Pain Point | Data pipeline failures are caught in production — debugging takes hours because there's no lineage or observability layer | 30%+ of data team time goes to answering discovery questions — ad-hoc data requests are slowing every team | The AI roadmap is stalled on data trust — no governance framework means models can't be certified for production | Training datasets are undocumented and hard to find — feature reuse is manual and unreliable across teams |
| T4 · ICP Ad 1 | Problem Your team debugs data issues instead of building pipelines. When data quality failures hit production with no lineage to trace, engineers spend hours on root cause analysis that should take minutes. | Problem Your data team spends 30% of their time finding data. When discovery isn't self-serve, every business question becomes a request ticket — and your platform team's roadmap pays the price. | Problem Your AI roadmap needs data you can actually trust. Today. Production AI requires certified training data, documented lineage, and automated governance — and most enterprise data platforms weren't built for this. | Problem Good models need good data. Yours isn't documented. Without a catalog that connects training datasets to quality signals, your ML team spends more time validating data than building models. |
| Ad 2 | Solution Find the root cause of data failures in minutes. DataHub's cross-platform lineage traces failures from production dashboards back to the source table or transformation that introduced them — automatically. | Solution Self-serve data discovery across your organization. DataHub's AI-powered search makes every dataset, table, and pipeline findable in seconds — with certification signals so teams know what's safe to use. | Solution AI-ready governance starts with certified, documented metadata. DataHub automates PII classification, data certification, and lineage documentation — so your governance program keeps pace with model deployment. | Solution Find, validate, and trust training data. Without the manual work. DataHub's catalog surfaces certified training datasets with quality scores, ownership, and lineage — so ML teams build faster on data they can actually trust. |
| Ad 3 | Why DataHub DataHub brings observability to every pipeline you run. Discovery, lineage, and observability unified across 100+ platforms — so your data team catches issues before production, not after analysts file tickets. | Why DataHub DataHub handles the discovery overhead your platform shouldn't. Conversational search, automated documentation, and certified data signals — so your platform team ships infrastructure, not answers to data questions. 📊 91% faster data searches · up to 25% reduction in Snowflake storage costs — IDC, 2026 | Why DataHub DataHub is how enterprise data teams reach AI readiness. Netflix, Visa, and 3,000+ organizations use DataHub to build the certified metadata foundation that production AI requires at enterprise scale. | Why DataHub DataHub is the data layer your AI team has been missing. Automated lineage from source to feature store, quality monitoring on training pipelines, and self-serve discovery — all in one platform built for AI-era data teams. |