Analytics engineering for small teams: the six stages most teams skip past
A staged maturity model for analytics engineering for small teams, with the operational signal that tells you it's time to move and the failure mode at each step.

In April 2026, dbt-core was downloaded from PyPI 97.5 million times in a single month. There are roughly 90,000 dbt projects running in production worldwide. Somewhere inside that number is a team of seven people whose entire data stack is a Google Sheet, three Looker Studio dashboards, and a Postgres replica that one engineer hits with raw SQL when somebody asks a question. That team is fine. They might not need dbt yet. They probably will in six months, and the worst thing they can do between now and then is jump straight to a three-tier mart split with a semantic layer.
This piece is for that team, and the half dozen variants of it I see every quarter. Five to fifty people, no dedicated data engineer, one founder or ops lead who got handed "analytics" because they were the most comfortable with spreadsheets. Mixed Brazilian and US/EU operators. Already paying for at least one BI tool. Already arguing about which revenue number is correct.
What follows is a six-stage maturity model. Each stage names the operational signal that tells you it's time to move, what's load-bearing at the stage and what's premature, and the single failure mode that kills the most teams there. The dominant failure across every stage is the same: skipping ahead. Burning months on a semantic layer before staging is even stable. Building three mart subdirectories when you have four mart models. dbt's own documentation says it plainly: "if you have less than 10 marts models and aren't having problems developing and using them, feel free to forego subdirectories completely." Most teams ignore that line on day one.
Where the work actually goes
Data professionals on what consumes most of their time
- Maintaining or organizing datasets: 55%
- Maintaining platforms or infrastructure: 26%
- Other (analysis, modeling, building): 19%
The job, in other words, is mostly plumbing. The maturity model below is an argument about which plumbing is worth installing at which stage, and which is premature.
Stage 1: Spreadsheets, and that's fine
Most small companies start their analytics life in a spreadsheet, and most of them should stay there longer than they think. A spreadsheet is a database, a query engine, a chart library, and a presentation tool in one file. For a team of seven trying to figure out whether a price test worked or which campaign is converting, the spreadsheet is genuinely the right answer. There's no infrastructure to maintain, no schema migration, no SQL to write, and the iteration loop is two seconds.
I don't have a defensible number for how many small companies live in spreadsheets. The published surveys all sample respondents who already self-identify as data professionals, which biases the answer. But every operator I've worked with under thirty employees runs at least 60% of their reporting through Sheets or Excel, and that's not the failure mode. The failure mode is what they do next.
The signal it's time to move. The same spreadsheet has been edited by four people this month and two of them have started copying it before changing it. Or: a number you trusted last quarter is now wrong, and nobody can reconstruct why. Or: you have to explain to a new hire which tab is the "real" pipeline view. Once the spreadsheet is being forked rather than shared, you've outgrown stage 1.
What's load-bearing. Naming conventions. A single owner per file. A weekly rhythm where the numbers get refreshed and somebody actually looks at them. None of this is glamorous, and skipping it is what creates the chaos that warehouses are then blamed for not solving.
What's premature. A warehouse. A BI tool. dbt. Anyone whispering about a data lake.
Failure mode. Treating spreadsheet-stage as a moral failing and jumping straight to a Snowflake account because someone's friend at a Series B told them to. The average month-one bill is small, the average month-six bill is not, and the team still does its actual reporting in the same spreadsheet because the warehouse never got populated.
Stage 2: Metrics in production code
Stage 2 is the first time a metric stops being a cell reference and starts being a query. Usually this means somebody (often a backend engineer who got tired of being asked) wrote a SQL view in the production database, or a small Python script that hits the API, computes the number, and writes it somewhere. The number is now reproducible. Two people running it get the same answer. That's the entire promise of stage 2.
The shape varies. Some teams put the SQL in a `metrics.sql` file in their app repo and a cron job runs it nightly. Some hit a read replica from Metabase. Some use Hex or a notebook. The common thread is that the metric definition lives in code somebody can review, not in a spreadsheet someone can rewrite without telling you.
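The stage-2 shape is small enough to show whole. A minimal sketch of a versioned metric query, assuming a Postgres-style `payments` table — every table and column name here is illustrative, not from any specific schema:

```sql
-- metrics/monthly_revenue.sql (hypothetical file; names are illustrative)
-- One canonical definition of monthly revenue, reviewed in git like any code.
create or replace view analytics.monthly_revenue as
select
    date_trunc('month', paid_at)   as revenue_month,
    count(distinct customer_id)    as paying_customers,
    sum(amount_cents) / 100.0      as revenue_usd
from payments
where status = 'succeeded'         -- exclude failed charges
group by 1;
```

Whether a cron job materializes this nightly or Metabase reads the view directly matters less than the fact that the definition lives in one reviewable file.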
The signal it's time to move. You have more than ten of these queries. Two of them disagree on the same metric. Or you've started copying the same JOIN across three queries because three different reports need the same customer table.
What's load-bearing. Version control on the SQL. A single canonical definition for the three or four metrics the company actually argues about (revenue, MRR, active users, whatever yours are). A schedule, even if it's a cron job glued together with tape.
What's premature. dbt. A modeling layer. Tests. You don't have enough models for tests to pay for themselves yet.
Failure mode. Letting every team write their own version of the same query. The CRM team builds a "deals closed this month" report, finance builds another one from Stripe, the growth team wires Looker straight to a transactional replica with their own SQL, and now you have three definitions of revenue and an executive who has stopped trusting any of them. I wrote about that pattern in detail in duplicated metrics: the silent problem killing trust in BI. Stage 2 is where the duplication starts. Catching it here is cheap. Catching it at stage 5 is not.
Stage 3: Your first dbt project
This is the stage everyone wants to skip past, and the one that most determines whether the team's analytics function compounds or thrashes. Stage 3 is a dbt project with two layers: staging models that are 1:1 with raw source tables (renamed columns, cast types, no business logic) and a small set of marts that downstream tools read from. Tests on grain. Sources defined. CI on every PR. That's the whole project.
Tristan Handy, who founded dbt Labs, described the original idea as "analytics as software, authored by anyone who knows SQL, to bring together the best of both worlds of analytics and engineering." The keyword is "anyone who knows SQL". You don't need a data engineer to operate stage 3. You need someone who can write SELECT statements and is willing to put them in git.
The cost objection is usually overblown at this stage and underblown at the next one. Local DuckDB and the BigQuery free tier give a working warehouse at zero monthly cost for a team that's only ingesting a few sources. MotherDuck's Free Lite tier exists for the same reason. Pricing pages get scary further up the curve, when ELT jobs run every fifteen minutes across 500 GB of data and ten BI users hit the warehouse all day. That's not stage 3.
What a small warehouse costs each month
On-demand pricing in April 2026, USD, excluding personnel
The shape of that range is the point. Stage 3 lives in the first three rows. The last row is what stage 4 and stage 5 start to cost when the warehouse becomes load-bearing for the whole company — still trivial compared to one mid-level salary, but no longer "coffee subscription" money. The teams that get blindsided are the ones who priced their stack against row one and then ended up running row five.
The signal it's time to move. You're at stage 2 and the duplication has already started, or you can feel it about to. Somebody asked "wait, is that the same revenue number as the board deck?" and the answer took ninety minutes to produce.
What's load-bearing. Staging models that match raw sources. Tests on the grain of every model: unique and not_null on the primary key, relationships on every foreign key. CI that runs the tests on every PR. Sources defined with a freshness check. That's it.
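That whole test list fits in a few lines of YAML. A sketch of the schema file, carrying forward the hypothetical `stg_stripe__payments` naming:

```yaml
# models/staging/stripe/_stripe__models.yml (hypothetical project layout)
version: 2

sources:
  - name: stripe
    freshness:
      warn_after: {count: 24, period: hour}
    tables:
      - name: payments
        loaded_at_field: created

models:
  - name: stg_stripe__payments
    columns:
      - name: payment_id
        tests: [unique, not_null]   # the grain: one row per payment
      - name: customer_id
        tests:
          - relationships:
              to: ref('stg_stripe__customers')
              field: customer_id
```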
What's premature. An intermediate layer. Custom macros. The dbt-utils package. Snapshots. The semantic layer. Any folder named `marts/finance/` when you have two finance models. The dbt Labs structure guide is explicit on this: "Don't split and optimize too early." Most teams read that line and split anyway.
Failure mode. Skipping stage 3 to go straight to stage 4 or 5. Building marts with no staging layer underneath, so every business model is doing its own renaming and casting and the same column gets defined four different ways across the project. Or adding the dbt Semantic Layer before there's a single mart model worth exposing. The anti-pattern is well-documented enough that Benn Stancil wrote a whole essay about it. His words, on dbt projects gone wrong: "Customers struggled to manage it, and a lot of dbt projects became briar patches of unparsable Jinja, entangled Python and SQL models, and an incomplete semantic layer." The briar patch starts at stage 3, when somebody decides the toy version isn't sophisticated enough.
Stage 4: Your first mart layer
Stage 4 is when the staging layer is stable, the tests are green, and you start building business-meaningful tables: `fct_orders`, `dim_customers`, `fct_subscription_events`. The names are boring on purpose. Each mart answers a recurring question the business is already asking, and the mart is built once so that every dashboard, every analyst, and every notebook can read from the same definition.
The discipline at stage 4 is restraint. Only build a mart for a question somebody is actually asking. If finance asks for revenue weekly and growth asks for activated users weekly, you have two marts. You don't have a "customer 360" mart yet, because nobody is asking for one.
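A mart at this stage is a handful of joins over staging models and a stated grain. A sketch of what `fct_orders` might look like — the staging model names are invented, not prescribed:

```sql
-- models/marts/fct_orders.sql (hypothetical; staging model names invented)
-- Grain: one row per order. Every dashboard reads revenue from here,
-- not from its own join against staging.
select
    o.order_id,
    o.customer_id,
    c.customer_segment,
    o.ordered_at,
    p.amount_usd as order_revenue_usd
from {{ ref('stg_shop__orders') }} as o
left join {{ ref('stg_shop__customers') }} as c using (customer_id)
left join {{ ref('stg_stripe__payments') }} as p using (order_id)
```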
The signal it's time to move. Three or more dashboards are joining staging tables directly and reproducing the same logic. Your BI tool has its own SQL view with a definition of "active customer" that's drifted from the one in the warehouse.
What's load-bearing. A documented grain for every mart (one row per what, per what time period). Tests on the grain. A single owner per mart. A short doc somewhere — a YAML description, a README, a Notion page — that says what the mart is for.
What's premature. Subdirectories like `marts/finance/`, `marts/marketing/`, `marts/product/` when you have six mart models total. Custom incremental strategies. A DAG of intermediate models that exists to make the marts "cleaner".
Failure mode. Building a mart for every conceivable question. There's a well-known dbt Discourse thread where a practitioner asks the community whether 500 models across five dbt projects in one repo is normal. It is not normal. It is a team that confused "having a mart" with "answering a question". The mart layer exists to compress the surface area of the warehouse, not expand it.
Data team size at scaleups
Headcount as a share of total workforce, by company type
Translate the median into a 30-person company and you get one data person. That's the constraint everything in the next stage has to fit inside. Governance for a team of one is not a committee, it's a habit.
Stage 5: Your first BI tool, with governance
Stage 5 is where most teams either consolidate or fragment, and the difference shows up within a year. By this point you have a dbt project, a few stable marts, and three or four people who want to make their own charts. The question is whether they make those charts on top of the marts or alongside them.
Most small teams already have a BI tool by the time they reach stage 5. The trap isn't the tool, it's the proliferation. A 2024 community survey from Modern Data 101 asked data teams how many distinct tools they use day-to-day, and the distribution was lopsided in a way that shouldn't be a surprise to anyone who's seen a stage-5 stack from the inside.
How many tools data teams actually use day-to-day
Distribution across 230+ data practitioners
- 1–4 tools: 30%
- 5–7 tools: 60%
- 10+ tools: 10%
Madison Schott, a data lead at ConvertKit, summarized the symptom in the same survey: "creating metrics within different tools. And then, you have three different answers for the same question." That's the stage 5 failure mode in one sentence. The fix isn't a semantic layer yet — that's stage 6 territory. The fix at stage 5 is policy. Every metric that appears in an executive dashboard comes from a mart model. Every BI user who wants a new metric files a small request. The dbt project owns the truth, and the BI tool consumes it.
The signal it's time to move. Two BI tools are in production and somebody has reproduced the same metric in both. Or: an executive asks for a number and the analyst's first question back is "from which dashboard?".
What's load-bearing. A single canonical BI tool for governed dashboards. Documented ownership for each dashboard. A refresh schedule somebody actually checks. Naming conventions for dashboards (the grim but useful kind: `FIN_`, `GROWTH_`, `EXP_`). A short living doc that lists the metrics that come from the warehouse versus the metrics that don't, and why.
What's premature. A semantic layer. Row-level access control across the whole warehouse. A data catalog. A formal data governance committee. You have eleven people. The committee is two people in a Slack DM.
Failure mode. Adding a fourth and fifth tool to "fix" what is actually a definitional problem. The Modern Data 101 number is a warning here. Every additional tool is a new place where a metric can disagree with itself. The Locally Optimistic essay Run Your Data Team Like A Product Team describes the adjacent failure: the data team becomes a service desk, bolted on to other functions, with no compounding work. A small team at stage 5 is one or two bad decisions away from that ticket-shop trajectory.
Stage 6: The team owns this
Stage 6 is the goal, and most small teams never get there because they exhaust themselves on stages 4 and 5. The marker of stage 6 is simple: people who don't sit on the data team are adding models, fixing tests, and writing documentation. The CFO references a warehouse number in a board update without asking. The new hire opens a PR against the dbt repo in their second week. Decisions reference dashboard output instead of a Slack thread of screenshots.
That doesn't happen by accident. It happens when stages 1 through 5 were done in order, when the toolchain stayed small, and when the people who built the project taught the people who joined later. The dbt Labs 2025 survey found that 80% of analytics professionals were using AI in some part of their workflow, up from 30% the year before. AI assistance — Cursor on the modeling code, an LLM over the project's docs, a copilot on ad-hoc analysis — accelerates stage 6 more than any earlier stage, because the scaffolding it has to read from is finally clean enough to be useful. Bad scaffolding plus AI is the worst of both worlds.
The signal you're here. A non-data-team person opened the last meaningful PR. The number of "what does this metric mean" Slack questions has dropped. Onboarding a new hire onto the warehouse is a written page, not a verbal lore-dump.
What's load-bearing. Documentation. CI that's strict enough to block bad PRs but fast enough not to be hated. A model owner field that's actually maintained. A ritual where somebody reviews the warehouse's health monthly — failed tests, stale sources, unused models — and prunes.
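The monthly review doesn't need tooling beyond the dbt CLI. A sketch of the ritual as three commands, run from the project root — the commands are standard dbt, but what counts as "stale" or "unused" is a judgment call your team makes by hand:

```shell
dbt source freshness   # which sources have gone stale since last month?
dbt test               # which tests are failing, and has anyone noticed?
# Full model inventory, to cross-check against actual BI usage by hand:
dbt ls --resource-type model --output name
```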
What's still premature for most teams. A formal data platform team. A dedicated head of data. A self-service "anyone can query anything" culture that bypasses the marts. Stage 6 is not "everyone is a data engineer". It's "the small group of people who use the warehouse trust it, and a slightly larger group of people knows how to ask good questions of it".
Failure mode. Believing stage 6 is permanent. It's not. A team that reached stage 6 in 2024 can be back at stage 4 by 2026 if a key person left and nobody replaced the maintenance ritual. Stage 6 is a state, not an endpoint.
The objection: do we need any of this?
The strongest version of the counter-argument runs like this. The 55% number from dbt's own 2024 survey, where data professionals say organizing and maintaining datasets is their single biggest task, isn't a sign that the work is necessary. It's a sign that the analytics-engineering toolchain has invented the work. A small team could ignore the whole ladder, run reports out of a spreadsheet, and let the founder ask the engineer for SQL when they need a number.
That argument lands for the first dozen people. It stops landing somewhere between twenty and forty. The reason is that the cost of disagreement scales superlinearly. The Synq numbers earlier in this piece are the constraint: at the median scaleup, a 30-person company has one data person. That person cannot be the only place truth lives. Either the warehouse holds the truth and they curate it, or every team holds its own and the company starts running on contradicting numbers. Stage 3 onward is the cheaper of those two options, and it gets cheaper the earlier it starts.
Dengsøe's other observation is the corrective: "If you over-index on analytical roles, you may risk slowing everyone down as the quality of the data platform starts to deteriorate." Hiring three analysts to paper over a missing stage 3 doesn't work. The platform comes first. The analysts are the multiplier, not the substitute.
So which stage are you actually at?
The honest answer for most teams is one stage earlier than they think. The signals are operational, not aspirational. A few questions to ask, in order:
- Can two people on your team produce the same revenue number for last quarter without calling each other? If no, you are stage 1 or 2, regardless of what tools you've bought.
- When somebody asks "where does this number come from?", is the answer a query somebody can read, or a spreadsheet somebody has to remember? If the second, you are stage 1.
- Does every dashboard you trust read from a defined model in a warehouse, or do half of them read from a transactional replica with bespoke SQL? If the second, you are somewhere between stage 2 and stage 3 — the bespoke SQL is the duplication starting.
- If you opened your dbt project right now (assuming you have one), is there a `staging/` directory with one model per raw source, and do those models do nothing but rename and cast? If no, you are stage 3 in name and stage 2 in practice.
- When a non-data person needs a new metric, is there a path that doesn't involve waiting for the one engineer who knows the warehouse? If no, you are stage 4 at best, regardless of how many marts you've built.
- Has anyone outside the data team committed to the dbt project in the last quarter? If no, you are not at stage 6, even if the project is sophisticated.
Take the lowest stage any of those questions points to. That's where you are. The temptation, especially after reading a piece like this, is to skip ahead and address the most interesting failure. Resist it. The teams that compound are the ones that did stage 3 properly before they touched stage 4, and stage 4 properly before they touched stage 5. The teams that don't compound are the ones with a half-built semantic layer on top of a staging layer that was never finished.
A 30-minute call to walk through the diagnostic questions above against your real warehouse, real dashboards, and real team. If a 1 to 2 week engagement makes sense after that, we'll scope it. If not, you'll have a clearer map than when we started.
Map your team's actual stage
Sources
dbt-core PyPI download statistics, April 2026: pypistats.org; the 90,000 production projects figure was cited at Coalesce 2025 and reported by Hugo Lu in DataOps Leadership.
dbt Labs State of Analytics Engineering surveys: the 2024 edition (n=456) reporting that 55% of data professionals name organizing and maintaining datasets as their #1 task is at getdbt.com; the 2025 edition (n=459) — 56% citing data quality as the main challenge, 80% using AI in their workflow up from 30% the year before — is at State of Analytics Engineering 2025. Note that dbt Labs surveys recruit through dbt's channels, so these numbers reflect the dbt-leaning side of the practitioner population.
Mikkel Dengsøe's analysis of data team size at 100 US and EU scaleups, with the median 3% of workforce figure: Synq blog.
Modern Data 101 community survey on data stack complexity (n=230+, 2024), source for the 70% juggling 5–10 tools and 63% spending more than 20% of their time on stack maintenance, including the Madison Schott quote on duplicated metrics: The Current Data Stack Is Too Complex.
Tristan Handy's framing of analytics engineering, First Round Review interview, 2024; Benn Stancil on dbt's failure modes, benn.substack.com; the dbt Labs project structure guidance and the "don't split and optimize too early" rule come from How We Structure Our dbt Projects.
On running a small data team without devolving into a ticket shop: Run Your Data Team Like A Product Team (Locally Optimistic). The dbt Discourse thread on a 500-model project across five dbt projects: Amount of models in one project for one team.
Warehouse pricing references, all April 2026: BigQuery, MotherDuck pricing change 2026, Snowflake pricing, and MotherDuck data warehouse TCO.
Related reading
For the smallest viable dbt project — staging conventions, the four-test rule, CI from day one — see dbt for small teams: how to start without overcomplicating. For the duplication failure mode that stages 4 and 5 are designed to fight, see duplicated metrics: the silent problem killing trust in BI. For why all of this matters before you ship an AI feature on top of it, see fix your data before adopting generative AI.