We modeled how a typical product squad actually allocates its time across the five stages of the product development meta-framework — Discover, Define, Design, Execute, and Measure — and compared this allocation against an independent assessment of how much each stage contributes to product success. The result reveals a severe, systemic misalignment: the stages that determine whether a product succeeds receive only a fraction of the time their importance warrants, while execution — the most automatable stage — consumes the majority of squad effort. This misalignment is not a management failure. It is a structural artifact of an era in which human hands were required to write code. That era is ending.
Across every industry that builds products, the same five-stage process repeats under different names. We call this the product development meta-framework:

1. **Discover** — understand the users and the problem worth solving
2. **Define** — turn discoveries into scoped problem statements, requirements, and success metrics
3. **Design** — shape the solution: UX, prototypes, and architecture
4. **Execute** — build, test, and deploy
5. **Measure** — close the loop between what was built and whether it worked
This framework is not specific to software. Doctors follow it (SOAP notes), lawyers follow it (IRAC), researchers follow it (IMRaD). What makes software product development interesting is that one stage — Execute — has historically required a disproportionate amount of skilled human labor, distorting the entire process around it.
The question this report investigates is simple: does the time allocation across these five stages reflect their actual importance to product success? The answer, as we will show, is no — and the divergence is large enough to constitute a structural failure of how the industry organizes itself.
Our model is built from the ground up using a reference squad composition. Rather than using per-person hours alone, we weight each role's contribution by its headcount — because a squad with 8 software engineers and 1 PM has a very different time distribution than the raw per-person numbers suggest.
The default squad modeled is a typical growth-stage product team: 1 PM, 2 UX Designers, 8 Software Engineers, 1 Data Scientist, 2 QA Engineers, 1 DevOps Engineer, and 1 Analyst — 16 people in total.
| Role | Headcount | Discover hrs/person | Define hrs/person | Design hrs/person | Execute hrs/person | Measure hrs/person | Total hrs/person |
|---|---|---|---|---|---|---|---|
| PM | 1 | 80 | 60 | 30 | 40 | 30 | 240 |
| UX Designer | 2 | 70 | 40 | 120 | 20 | 15 | 265 |
| SWE | 8 | 10 | 25 | 40 | 200 | 15 | 290 |
| Data Scientist | 1 | 20 | 20 | 35 | 80 | 50 | 205 |
| QA | 2 | 5 | 15 | 15 | 80 | 20 | 135 |
| DevOps | 1 | 5 | 10 | 20 | 60 | 30 | 125 |
| Analyst | 1 | 30 | 35 | 15 | 10 | 25 | 115 |
Man-hours per step are computed as the sum of (hours-per-person × headcount) across all roles. This headcount weighting is what makes the model representative: a single SWE spending 200 hours in Execute becomes 1,600 man-hours across 8 engineers. Compare that to the PM's 40 hours there, and the imbalance becomes stark.
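The computation reduces to a weighted sum over the table above. A minimal sketch (the data structure and function names are ours, not from any standard tooling):

```python
# Productive hours per person per cycle, taken from the table above,
# keyed as role -> (headcount, {step: hrs/person}).
SQUAD = {
    "PM":             (1, {"Discover": 80, "Define": 60, "Design": 30,  "Execute": 40,  "Measure": 30}),
    "UX Designer":    (2, {"Discover": 70, "Define": 40, "Design": 120, "Execute": 20,  "Measure": 15}),
    "SWE":            (8, {"Discover": 10, "Define": 25, "Design": 40,  "Execute": 200, "Measure": 15}),
    "Data Scientist": (1, {"Discover": 20, "Define": 20, "Design": 35,  "Execute": 80,  "Measure": 50}),
    "QA":             (2, {"Discover": 5,  "Define": 15, "Design": 15,  "Execute": 80,  "Measure": 20}),
    "DevOps":         (1, {"Discover": 5,  "Define": 10, "Design": 20,  "Execute": 60,  "Measure": 30}),
    "Analyst":        (1, {"Discover": 30, "Define": 35, "Design": 15,  "Execute": 10,  "Measure": 25}),
}

STEPS = ["Discover", "Define", "Design", "Execute", "Measure"]

def man_hours_per_step(squad):
    """Sum (hours-per-person × headcount) across all roles, per step."""
    return {
        step: sum(headcount * hours[step] for headcount, hours in squad.values())
        for step in STEPS
    }

man_hours = man_hours_per_step(SQUAD)          # e.g. Execute -> 1,990 man-hours
total = sum(man_hours.values())                # 3,805 productive man-hours
shares = {step: h / total for step, h in man_hours.items()}
```

Running this reproduces the step totals used throughout the report: Execute alone accounts for 1,990 of the 3,805 productive man-hours, roughly 52% of the productive budget.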
The Execute step, driven almost entirely by SWE hours, dwarfs every other step in raw man-hour terms. This is the organizing fact around which all other analysis flows.
The most common mistake in analyzing time allocation is to derive "importance" from the existing time distribution — in effect, using the output to validate the input. We explicitly rejected this approach.
Our importance vector is a single set of weights assigned to each step based on one criterion: how much does the quality of work in this step determine whether a product achieves product-market fit or fails? This is deliberately independent of how many people are involved, how long it takes, or how it has traditionally been resourced.
| Step | Importance weight (normalized share) | Rationale |
|---|---|---|
| Discover | 32% | Getting the problem wrong is the single largest cause of product failure. No excellent execution recovers from building something nobody needs. Asymmetric downside: failure here almost guarantees failure overall. |
| Define | 24% | Fuzzy definition creates compounding misalignment downstream. Scope drift and wrong success metrics are more expensive to fix at every subsequent step. |
| Design | 17% | Architecture and UX decisions create technical and product debt that's expensive to unwind. Bad calls here are recoverable but costly. |
| Execute | 11% | In an AI-assisted world, execution is increasingly commoditized. It is table stakes — necessary but no longer the primary differential lever for success. |
| Measure | 16% | Chronically underweighted in practice. Critical to compounding — products that can't measure outcomes can't iterate toward PMF. Its importance is structural and long-term. |
With both dimensions normalized to sum to 100%, we need a formula that captures how well time allocation tracks importance — not just the size of the gap, but how disproportionate it is.
An early version of this model used absolute percentage-point differences to measure alignment. This fails in a headcount-weighted model because the gaps become very large — Execute might show a 39pp gap — and a linear scaling factor can't represent both large and small gaps accurately across the same chart.
The correct measure is the ratio between importance and time, not the difference. Perfect alignment means time is proportional to importance. Any deviation in either direction — too much time or too little — should be penalized symmetrically.
alignment[s] = min( imp[s] / time[s], time[s] / imp[s] )

Where imp[s] is the normalized importance share of step s, and time[s] is the normalized time share of step s.
This formula maps to 100% when imp equals time exactly, and approaches 0% as the ratio diverges in either direction. A step receiving 4.5× more time than its importance warrants scores approximately 22%.
The overall score is importance-weighted — steps with high importance that are badly misaligned drag the score down more than low-importance steps that are badly misaligned.
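A minimal sketch of this scoring, assuming the min-of-ratios form described above (it reproduces the stated properties: 100% at equality, symmetric penalties, ≈22% at a 4.5× ratio; function names are ours):

```python
# Importance weights from the table above, normalized to sum to 1.0.
IMPORTANCE = {"Discover": 0.32, "Define": 0.24, "Design": 0.17,
              "Execute": 0.11, "Measure": 0.16}

def alignment(imp, time):
    """Per-step alignment in [0, 1]: 1.0 when the time share equals the
    importance share; symmetric, so over- and under-investment in either
    direction are penalized equally."""
    return min(imp / time, time / imp)

def overall_score(importance, time_shares):
    """Importance-weighted overall score: a badly misaligned high-importance
    step drags the score down more than a badly misaligned low-importance one."""
    return sum(importance[s] * alignment(importance[s], time_shares[s])
               for s in importance)

# A step given 4.5x more time than its importance warrants scores 1/4.5 ~ 22%.
```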
The chart below shows the normalized comparison: each bar represents the share of total squad man-hours going to a step, broken down by role. The dashed outline shows the importance target. The line shows the ratio-based alignment score for each step.
Discover receives the second-smallest share of productive hours (only Measure receives less) despite driving the largest share of product outcomes. The primary contributors to this step are the PM (80h), UX Designers (70h per person × 2 = 140h), and the Analyst (30h). SWEs contribute only 10h each — typically initial feasibility spikes. Total productive man-hours: 365h, or 9.6% of the squad's productive budget.
The reason for this underinvestment is structural: discovery work is harder to schedule, harder to measure, and produces no visible artifact by the end of a sprint. In an industry organized around shipping, time flows toward things that produce commits.
Define is where discoveries are turned into scoped problem statements, requirements, architecture decisions, and success metrics. Its underinvestment is somewhat less severe than Discover's, partly because PRD writing and architecture planning have more visible deliverables that sprint planning can accommodate.
However, the quality of Define work is frequently undermined by the fact that Discover was underfunded. Teams define solutions to poorly understood problems, building in compounding misalignment from the start.
Design is the only step where time and importance are well matched. UX Designers invest more per person here than any other role (120h per person × 2 = 240h), and their work — wireframes, prototypes, system architecture — has enough visible artifact weight to receive appropriate scheduling. SWEs contribute 40h each (320h total), primarily in technical design and architecture review.
Execute is the organizing center of the pre-AI product team. With 8 SWEs each spending 200 productive hours here (1,600h), plus 160h from QA, 80h from the Data Scientist, 60h from DevOps, and 90h from the PM, UX Designers, and Analyst combined, the step consumes approximately 1,990 productive man-hours — more than the entire rest of the framework combined.
This is not a mistake. It reflects the genuine historical reality that writing, testing, and deploying code required human beings. The misalignment this creates — 4.6× more time than the step's outcome contribution warrants — is a structural artifact of that constraint, not of poor management.
Measure receives similar importance to Design (16%) but half the time allocation. This reflects a near-universal pattern: teams ship features and immediately pivot to planning the next sprint, never closing the loop between what was built and whether it worked. The result is a product development process that cannot compound — each cycle starts from approximately the same knowledge base as the last.
On top of productive hours, every role in a multi-person squad incurs coordination overhead: time spent on standups, sprint ceremonies, PR reviews, stakeholder readouts, alignment meetings, and knowledge re-transfer. This time carries zero importance weight — it exists entirely because multiple humans need to synchronize state that a single integrated system would hold natively.
We modeled coordination overhead as a separate layer on top of productive hours, estimated as follows:
| Role | Discover | Define | Design | Execute | Measure | Source of overhead in Execute |
|---|---|---|---|---|---|---|
| PM | 20h | 35h | 25h | 40h | 20h | Stakeholder readouts, sprint planning, escalations |
| UX Designer | 15h | 20h | 30h | 20h | 10h | Design review cycles, handoff back-and-forth |
| SWE | 5h | 15h | 15h | 60h | 8h | Standups (~5h), ceremonies (~24h), PR reviews (~36h) |
| Data Scientist | 10h | 15h | 10h | 25h | 20h | Analysis review loops, stakeholder explainers |
| QA | 3h | 10h | 8h | 25h | 10h | Bug triage, retesting coordination, sprint ceremonies |
| DevOps | 3h | 8h | 8h | 20h | 12h | Deployment coordination, incident response syncs |
| Analyst | 12h | 15h | 8h | 10h | 15h | Report review cycles, stakeholder alignment |
The Define step having the highest overhead share (36.8%) is counterintuitive but logical: it is the step that requires the most cross-functional alignment. PMs, designers, engineers, and analysts all need to agree on scope, requirements, and architecture — and that agreement costs time. The actual discovery and definition work is often less time-consuming than the alignment ceremonies around it.
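These shares fall out of weighting the overhead table by the same headcounts as the productive model and dividing by each step's combined (productive + overhead) man-hours. A sketch, with productive step totals carried over from the earlier model:

```python
# Coordination overhead hours per person, from the overhead table above.
HEADCOUNT = {"PM": 1, "UX Designer": 2, "SWE": 8, "Data Scientist": 1,
             "QA": 2, "DevOps": 1, "Analyst": 1}

OVERHEAD = {  # role -> {step: overhead hrs/person}
    "PM":             {"Discover": 20, "Define": 35, "Design": 25, "Execute": 40, "Measure": 20},
    "UX Designer":    {"Discover": 15, "Define": 20, "Design": 30, "Execute": 20, "Measure": 10},
    "SWE":            {"Discover": 5,  "Define": 15, "Design": 15, "Execute": 60, "Measure": 8},
    "Data Scientist": {"Discover": 10, "Define": 15, "Design": 10, "Execute": 25, "Measure": 20},
    "QA":             {"Discover": 3,  "Define": 10, "Design": 8,  "Execute": 25, "Measure": 10},
    "DevOps":         {"Discover": 3,  "Define": 8,  "Design": 8,  "Execute": 20, "Measure": 12},
    "Analyst":        {"Discover": 12, "Define": 15, "Design": 8,  "Execute": 10, "Measure": 15},
}

# Headcount-weighted productive man-hours per step, from the productive model.
PRODUCTIVE = {"Discover": 365, "Define": 435, "Design": 690,
              "Execute": 1990, "Measure": 325}

STEPS = ["Discover", "Define", "Design", "Execute", "Measure"]

def overhead_share(step):
    """Fraction of a step's total man-hours spent on coordination."""
    oh = sum(HEADCOUNT[role] * hours[step] for role, hours in OVERHEAD.items())
    return oh / (oh + PRODUCTIVE[step])
```

Evaluating `overhead_share` across all five steps confirms Define as the worst offender, at roughly 37% of its total man-hours.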
The misalignment described in this report is not an accident, and it is not the result of poor product management. It is the rational outcome of an era in which writing code was genuinely the scarce, expensive, bottleneck activity — and therefore the activity around which everything else was organized.
That era is ending. AI-assisted coding tools have begun commoditizing the Execute step. As this commoditization deepens, the structural logic that justified the current time distribution dissolves. Two futures become possible.
In the first future, engineers are replaced by AI code generation and the same process — underinvesting in discovery, rushing to execution — simply runs faster. Wrong things are built more efficiently than ever. Companies burn AI credits shipping features nobody wanted.
In the second, teams redirect reclaimed execution hours into Discover and Measure — the two most underinvested, highest-importance steps. Product teams shrink but improve. The limiting factor shifts from hands to thinking.
If time allocation were to converge toward the importance vector, what would a corrected squad look like? We model two scenarios: a minimal correction (close the gap by 50%) and an AI-native allocation (time tracks importance proportionally).
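Both scenarios are linear interpolations between the current time shares and the importance vector. A sketch, where the rounded current shares (overhead-inclusive) are derived from the model above and the interpolation helper is ours:

```python
IMPORTANCE = {"Discover": 0.32, "Define": 0.24, "Design": 0.17,
              "Execute": 0.11, "Measure": 0.16}

# Current overhead-inclusive time shares, rounded, from the model above.
CURRENT = {"Discover": 0.092, "Define": 0.131, "Design": 0.178,
           "Execute": 0.505, "Measure": 0.094}

def corrected(current, importance, k):
    """Close the gap between time and importance by fraction k.
    k=0.5 -> minimal correction; k=1.0 -> AI-native (time tracks importance)."""
    return {s: current[s] + k * (importance[s] - current[s]) for s in current}

minimal = corrected(CURRENT, IMPORTANCE, 0.5)    # Execute drops to ~31%
ai_native = corrected(CURRENT, IMPORTANCE, 1.0)  # Execute drops to 11%
```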
The current industry time allocation reflects the constraints of an era that is ending. The steps that matter most for product success — Discover (32%), Define (24%), and Measure (16%) — together receive only 32% of squad time, while the most automatable step receives 50%. The first generation of product teams to redirect that freed execution time upstream will build systematically better products than their peers — not because they work harder, but because they work in the right places.
Discover + Define together drive 56% of product success. They receive 23% of squad time. This is the fundamental misalignment the industry must correct.
As Execute is automated, these hours can flow to Discover and Measure — the two steps most underinvested relative to their importance.
This overhead exists only because multiple humans must synchronize state. A system that holds full product context collapses this tax structurally.
Teams that automate Execute without restructuring their Discover and Measure investment will simply build wrong things faster — at lower cost, until they run out of money.