Lineage you can show, tests you can point to, freshness you can promise.
Real-time where it earns it, batch where it doesn’t.
Snowflake, BigQuery, ClickHouse, or Postgres — picked for fit.
Every model has tests; every column has lineage.
When a dashboard breaks, you know why, where, and who to call.
Documentation is part of the deliverable, not an afterthought when someone asks.
Bronze/silver/gold or Kimball — chosen for fit, not fashion. Schema documented, ownership claimed per dataset.
Every transformation in version control. Tests on every primary key, foreign key, accepted value, and freshness expectation.
OpenLineage- or Marquez-backed. Drill from a dashboard tile back to the raw source. Impact analysis before any change ships.
Written per dataset. Paged when broken. Public to the consumers of the data, not buried in a team channel.
Lightdash, Metabase, or your tool. Semantic layer with one definition per metric — finally.
Hightouch/Census or custom. Warehouse-to-CRM/Email/Product so the rest of the company gets fed from the same trusted source.
From “help us stop arguing about numbers” to a full embedded data platform team.
The first eight weeks resolve the reconcile gap. After that, we focus on speed, depth, and self-serve.
Map current state. Identify the top 10 places numbers disagree across surfaces. Pick the canonical source for each metric, in writing.
Add dbt tests on every primary key and foreign key. Turn on lineage. Every dashboard tile must trace to a source.
Pipelines refactored where freshness or reliability requires it. SLAs published per dataset. Alerts wired to the right humans.
Semantic layer + BI rollout. Or, if the org is bigger, a multi-team platform with quotas and chargeback.
If something isn’t answered here, ask in your intro email — we keep this list short on purpose.
Snowflake if your team is mostly SQL and you want a more polished UI/security model. BigQuery if you’re GCP-native and willing to take fewer guardrails for better scale-out economics. ClickHouse if you’re event-heavy.
Almost always yes. The alternatives (notebooks, stored procs, ad-hoc views) lose traceability and tests within a quarter. dbt isn’t magic; it’s discipline made cheap.
No — real-time costs ~3× more to build and 5× more to operate. We make it real-time where the business decision requires it (fraud, ops dashboards) and batch everywhere else.
No — we extend it. We bring senior data engineers and analytics engineers; your team owns domain knowledge. Most engagements end with your team running the platform we built.
Send a paragraph about the problem. We’ll come back inside 48 hours with a written take — team shape, cost envelope, riskiest assumptions.