Service 06 / 10 / Data & AI From $54k/mo × 15-min freshness SLA

Data engineering
that ties out.

Pipelines that don’t silently lie. Warehouses, CDC, lineage, dbt — and the dashboards leadership actually trusts. Numbers that reconcile, not numbers that argue.


$54k · From / month
15m · Freshness SLA
< 1% · Reconcile drift

Engineers

Sec. 01 What ships

A warehouse you
can defend in a meeting.

Lineage you can show, tests you can point to, freshness you can promise.

A / Ingest

CDC & batch

Real-time where it earns it, batch where it doesn’t.

Debezium / Fivetran · choice
Kafka · if needed
Schema registry · enforced
Late-arrival handling · tested
Backfills · idempotent
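
What an idempotent backfill can look like, as a minimal sketch: replace a whole day's partition inside one transaction, so a rerun lands on the same rows instead of doubling them. The `orders` table, its columns, and the `backfill_day` helper are hypothetical, and SQLite stands in for the warehouse purely so the example runs anywhere.

```python
import sqlite3

def backfill_day(conn, day, rows):
    """Replace one day's partition so that reruns give the same result."""
    with conn:  # one transaction: the delete and inserts commit together or roll back
        conn.execute("DELETE FROM orders WHERE order_date = ?", (day,))
        conn.executemany(
            "INSERT INTO orders (order_id, order_date, amount) VALUES (?, ?, ?)",
            [(r["order_id"], day, r["amount"]) for r in rows],
        )

# Hypothetical table and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id TEXT PRIMARY KEY, order_date TEXT, amount REAL)"
)
source_rows = [
    {"order_id": "a1", "amount": 10.0},
    {"order_id": "a2", "amount": 5.5},
]

backfill_day(conn, "2024-01-01", source_rows)
backfill_day(conn, "2024-01-01", source_rows)  # rerun of the same day: no duplicates

row_count, total = conn.execute("SELECT COUNT(*), SUM(amount) FROM orders").fetchone()
```

Delete-then-insert per partition is one option among several; a merge/upsert keyed on the primary key gets to the same place when partitions are too large to rewrite.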

B / Warehouse

Warehouse core

Snowflake, BigQuery, ClickHouse, or Postgres — picked for fit.

Snowflake / BQ · default
ClickHouse · event-heavy
DuckDB · edge use
Iceberg · when shared
Time travel · enabled

C / Transform

dbt & tests

Every model has tests; every column has lineage.

dbt Core / Cloud · PR-gated
Test coverage · ≥ 95%
Documentation · auto + manual
Macros library · shared
Exposures · declared

D / Trust

Quality & lineage

When a dashboard breaks, you know why, where, and who to call.

Lineage UI · Marquez / OpenLineage
Alerts · per-model
Reconcile checks · daily
Anomaly detection · baseline
SLA per dataset · written
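
A daily reconcile check can be as small as this sketch: compare a source-of-truth total with the warehouse total and flag anything at or above the 1% drift target. The definition of drift (absolute difference over the source figure), the function names, and the example totals are assumptions for illustration.

```python
def reconcile_drift(source_total: float, warehouse_total: float) -> float:
    """Relative drift of the warehouse figure against the source figure."""
    if source_total == 0:
        # No baseline to divide by: agree only if both sides are zero.
        return 0.0 if warehouse_total == 0 else float("inf")
    return abs(warehouse_total - source_total) / abs(source_total)

def check_reconcile(name, source_total, warehouse_total, max_drift=0.01):
    """Return a small report; 'ok' is True while drift stays under max_drift."""
    drift = reconcile_drift(source_total, warehouse_total)
    return {"check": name, "drift": drift, "ok": drift < max_drift}

# Hypothetical totals: the warehouse is 400 over a 100,000 source figure.
result = check_reconcile("daily_revenue", 100_000.0, 100_400.0)
```

In practice the two totals would come from a query against the operational system and a query against the modelled table, run on the same cut-off.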

Sec. 02 Deliverables

A pipeline you
can read in a week.

Documentation is part of the deliverable, not an afterthought when someone asks.

Conformed warehouse

Bronze/silver/gold or Kimball — chosen for fit, not fashion. Schema documented, ownership claimed per dataset.

dbt project

Every transformation in version control. Tests on every primary key, foreign key, accepted value, and freshness expectation.
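
In a dbt project these checks are normally declared with the built-in generic tests (`unique`, `not_null`, `relationships`, `accepted_values`). The sketch below is not dbt code; it shows in plain Python what each test asserts, over hypothetical `customers` and `orders` rows, with helper names invented for illustration.

```python
def failing_primary_keys(rows, key):
    """Key values that are null or appear more than once (unique + not_null)."""
    seen, bad = set(), set()
    for row in rows:
        value = row.get(key)
        if value is None or value in seen:
            bad.add(value)
        seen.add(value)
    return bad

def failing_foreign_keys(child_rows, fk, parent_rows, pk):
    """Non-null child values with no matching parent row (relationships)."""
    parents = {row[pk] for row in parent_rows}
    return {
        row[fk]
        for row in child_rows
        if row[fk] is not None and row[fk] not in parents
    }

def failing_accepted_values(rows, column, accepted):
    """Values outside the agreed list (accepted_values)."""
    return {row[column] for row in rows if row[column] not in accepted}

# Hypothetical data with one failure of each kind.
customers = [{"id": 1}, {"id": 2}]
orders = [
    {"id": 10, "customer_id": 1, "status": "paid"},
    {"id": 11, "customer_id": 3, "status": "refunded"},  # customer 3 does not exist
    {"id": 11, "customer_id": 2, "status": "unknown"},   # duplicate id, bad status
]

pk_failures = failing_primary_keys(orders, "id")
fk_failures = failing_foreign_keys(orders, "customer_id", customers, "id")
value_failures = failing_accepted_values(orders, "status", {"paid", "refunded"})
```

Each helper returns the offending values rather than a boolean, mirroring how a failing test is most useful when it names the rows to look at.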

Lineage graph

OpenLineage- or Marquez-backed. Drill from a dashboard tile back to the raw source. Impact analysis before any change ships.
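
Drill-back and impact analysis are both graph walks. A minimal sketch, assuming lineage has already been exported as a mapping from each node to the nodes it reads from; the node names and both helper functions are hypothetical, and a real deployment would read these edges from the lineage backend rather than a literal dict.

```python
from collections import deque

# Hypothetical lineage edges: each node maps to the nodes it reads from.
UPSTREAM = {
    "dashboard.revenue_tile": ["mart.daily_revenue"],
    "mart.daily_revenue": ["stg.orders", "stg.refunds"],
    "stg.orders": ["raw.orders"],
    "stg.refunds": ["raw.refunds"],
    "dashboard.refund_tile": ["stg.refunds"],
}

def trace_to_sources(node, upstream=UPSTREAM):
    """Walk upstream and return the nodes that have no parents (raw sources)."""
    sources, seen, queue = set(), {node}, deque([node])
    while queue:
        current = queue.popleft()
        parents = upstream.get(current, [])
        if not parents:
            sources.add(current)
        for parent in parents:
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return sources

def impacted_by(node, upstream=UPSTREAM):
    """Invert the edges and walk downstream: everything a change could touch."""
    downstream = {}
    for child, parents in upstream.items():
        for parent in parents:
            downstream.setdefault(parent, []).append(child)
    impacted, queue = set(), deque([node])
    while queue:
        for child in downstream.get(queue.popleft(), []):
            if child not in impacted:
                impacted.add(child)
                queue.append(child)
    return impacted

tile_sources = trace_to_sources("dashboard.revenue_tile")
refund_impact = impacted_by("stg.refunds")
```

Here `tile_sources` answers "where does this tile come from?" and `refund_impact` answers "what breaks if we change this model?".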

Freshness & quality SLAs

Written per dataset. Paged when broken. Public to the consumers of the data, not buried in a team channel.
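
A freshness SLA reduces to comparing a dataset's last load time with the clock. The sketch below uses the 15-minute figure from this page; the `freshness_status` helper and the timestamps are hypothetical, and paging is left out because it depends on the alerting tool in use.

```python
from datetime import datetime, timedelta, timezone

FRESHNESS_SLA = timedelta(minutes=15)

def freshness_status(last_loaded_at, now, sla=FRESHNESS_SLA):
    """Report how far behind a dataset is and whether that breaches the SLA."""
    lag = now - last_loaded_at
    return {"lag_minutes": lag.total_seconds() / 60, "breached": lag > sla}

# Hypothetical timestamps; a real check would read the latest load time
# from the dataset itself and use the current UTC time.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
fresh = freshness_status(datetime(2024, 1, 1, 11, 50, tzinfo=timezone.utc), now)
stale = freshness_status(datetime(2024, 1, 1, 11, 30, tzinfo=timezone.utc), now)
```

Passing `now` in, rather than reading the clock inside the function, keeps the check deterministic and easy to test.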

Self-serve BI layer

Lightdash, Metabase, or your tool. Semantic layer with one definition per metric — finally.

Reverse-ETL routes

Hightouch/Census or custom. Warehouse-to-CRM/Email/Product so the rest of the company gets fed from the same trusted source.

Sec. 03 Pricing — three tiers

Three shapes
of data work.

From “help us stop arguing about numbers” to a full embedded data platform team.

Reconcile

Trust sprint

From $54k/mo · 2 engineers

Audit current pipeline + warehouse
Top 10 reconcile gaps closed
Tests + lineage on critical models
8–12 week minimum

Most common

Build squad

From $96k/mo · 3 engineers

CDC ingestion, dbt baseline
Quality / freshness SLAs live
Semantic layer + BI rollout
Quarterly cost review

Platform

Data platform

From $164k/mo · 5 engineers

Multi-team self-serve data platform
Reverse-ETL + activation
Per-team quotas + chargeback
Embedded analytics engineer

Sec. 04 How the engagement unfolds

Eight weeks
to numbers that agree.

The first eight weeks resolve the reconcile gap. After that, we focus on speed, depth, and self-serve.

01 / Week 1–2

Audit & reconcile

Map current state. Identify the top 10 places numbers disagree across surfaces. Pick the canonical source for each metric, in writing.
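
One way to find the top 10 is to collect the same metric from every surface that reports it and rank by how far the values spread. A minimal sketch, assuming non-negative metric values; the surfaces, metrics, and numbers are invented for illustration.

```python
def rank_disagreements(metric_values, top_n=10):
    """Rank metrics by relative spread across surfaces, largest first.

    Assumes non-negative values; spread is (max - min) / max.
    """
    ranked = []
    for metric, by_surface in metric_values.items():
        values = list(by_surface.values())
        low, high = min(values), max(values)
        spread = (high - low) / abs(high) if high else 0.0
        ranked.append((metric, spread))
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked[:top_n]

# Hypothetical readings of the same metrics from different surfaces.
reported = {
    "monthly_revenue": {"finance_dashboard": 1000.0, "crm": 900.0, "product_dashboard": 1000.0},
    "active_users": {"product_dashboard": 500.0, "crm": 500.0},
    "churned_accounts": {"finance_dashboard": 40.0, "crm": 50.0},
}

gaps = rank_disagreements(reported)
```

The ranking only says where numbers disagree; choosing the canonical source for each metric remains a written decision.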

02 / Week 3–4

Tests + lineage

Add dbt tests on every primary key and foreign key. Turn on lineage. Every dashboard tile must trace to a source.

03 / Week 5–6

Freshness SLAs

Pipelines refactored where freshness or reliability requires it. SLAs published per dataset. Alerts wired to the right humans.

04 / Week 7+

Self-serve or platform

Semantic layer + BI rollout. Or, if the org is bigger, a multi-team platform with quotas and chargeback.

Sec. 05 Frequently asked

Things buyers ask
on the first call.

If something isn’t answered here, ask in your intro email — we keep this list short on purpose.

Snowflake or BigQuery?

Snowflake if your team is mostly SQL and you want a more polished UI/security model. BigQuery if you’re GCP-native and willing to take fewer guardrails for better scale-out economics. ClickHouse if you’re event-heavy.

Do we still need dbt?

Almost always yes. The alternatives (notebooks, stored procs, ad-hoc views) lose traceability and tests within a quarter. dbt isn’t magic; it’s discipline made cheap.

Real-time everywhere?

No. Real-time costs ~3× more to build and 5× more to operate than batch. We go real-time where the business decision requires it (fraud, ops dashboards) and stay batch everywhere else.

Will you replace our data team?

No — we extend it. We bring senior data engineers and analytics engineers; your team owns domain knowledge. Most engagements end with your team running the platform we built.

Sec. 06 Pairs well with

Other things
we do well.

05 / Data & AI

AI engineering

Clean data is what makes AI features land.

From $84k/mo →

04 / Run

Cloud & DevOps

The infra under the warehouse.

From $48k/mo →

P-05 / Product

Postgres playbook

60-page perf playbook, off-the-shelf.

$40 · one-time →

Got something hard
that needs to be real?

Send a paragraph about the problem. We’ll come back inside 48 hours with a written take — team shape, cost envelope, riskiest assumptions.

hello@kvb.dev
