Written by aka
on 2026-03-02

Sprouted: Hypothesis Trees as a Meta-Framework Above Spec-Driven Development

Hey, it's aka. I came up with a new framework! So I'm publishing it.

Ideas are first-come, first-served, right? So while there are still parts to refine, I'm putting it out there anyway. Taking the position that "requirements, specs, and everything else are hypotheses", this framework manages decision-making through a recursive tree structure of Why/What/How. The article also covers comparisons with SDD tools and the framework's relationship to GORE, HDD, and DR.

⚠️ Disclaimer: Nearly 100% of this article was written with the assistance of generative AI for idea refinement and writing, reviewed by aka.

Introduction: Discomfort with Spec-Driven Development

In 2025, Spec-Driven Development (SDD) is gaining attention. Tools that "write specs first, then have AI generate code" are emerging one after another, and there's a growing call to move beyond "vibe coding".

The trend itself is sound. Clarifying intent before implementation produces better results than writing code from vague prompts. However, when actually using these tools, a certain discomfort lingers.

If requirements change, specs change too.

SDD tools say "treat specs as the source of truth". But if the requirements underlying those specs can change, how much meaning is there in treating specs as "settled"? Aren't requirements, specs, and designs all hypotheses?

Furthermore, obsessing over the labels "requirements", "specifications", and "design" seems questionable. The essence is that what you want to achieve and the means to achieve it can be separated at each layer — what you call each layer isn't essential.

This article defines this way of thinking as a framework called "Sprouted". Sprouted doesn't negate SDD — it provides a higher-level structure that encompasses SDD.

The Core of Sprouted: Recursive Tree Structure of Why / What / How

Basic Structure

In Sprouted, every development decision is managed as a node in a tree structure. Each node has three attributes:

Why (reason): The premise for this node's existence. Has two aspects: motivation and constraints
- Motivation: Why this is needed. Derived from the parent's What
- Constraints: Under what premises/conditions are we thinking? Derived from the parent's How
What (objective): What to achieve. The outcome that addresses the motivation in Why
How (means): Choice of means. The concrete approach to realize What under Why's constraints

As a user, you only need to think about Why / What / How. However, being aware that Why contains both motivation and constraints improves the quality of the Whys you write.
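
As a rough illustration of the three attributes, a node could be modeled like this in Python. This is a minimal sketch, not a reference implementation: the class and field names (`Why`, `Node`, `motivation`, `constraints`, `children`) are assumptions made for this example, and the values are borrowed from the TODO app example used in this article.

```python
from dataclasses import dataclass, field


@dataclass
class Why:
    """The premise for a node: why it is needed, and under what conditions."""
    motivation: str                                        # derived from the parent's What
    constraints: list[str] = field(default_factory=list)  # derived from the parent's How


@dataclass
class Node:
    """One Sprouted node: Why / What / How, plus any child nodes."""
    why: Why
    what: str   # what to achieve
    how: str    # the chosen means
    children: list["Node"] = field(default_factory=list)


# A root node. It has no parent, but its constraints are still written down.
root = Node(
    why=Why(
        motivation="People forget tasks and it causes problems",
        constraints=["Targeting smartphone users", "Individual dev scale"],
    ),
    what="Enable task recording and management",
    how="Build a TODO app",
)
print(root.why.constraints)
```

Keeping motivation and constraints as separate fields inside `Why` mirrors the point above: the user still thinks in three attributes, but the two aspects of Why stay distinguishable.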

Inter-Layer Connections: Two Lines Descend Through Layers

The key to understanding Sprouted's recursive structure is that inter-layer connections consist of two lines.

Parent's What → Child's Why motivation (motivation chain): Digging into "what to achieve" creates the next layer's "why this is needed"
Parent's How → Child's Why constraints (constraint chain): "Which means was chosen" determines the next layer's premises and constraints

Within each node, motivation generates What, and constraints narrow How's options.

Let's look at a TODO app as an example:

graph TD
    subgraph "Layer 1"
        WM1["Why (Motivation): People forget tasks and it causes problems"]
        WC1["Why (Constraints): Targeting smartphone users / Individual dev scale"]
        What1["What: Enable task recording and management"]
        How1["How: Build a TODO app"]
        WM1 --> What1
        WC1 --> How1
        What1 --> How1
    end

    subgraph "Layer 2"
        WM2["Why (Motivation): Users regularly create, complete, and modify tasks"]
        WC2["Why (Constraints): TODO app (Web/Mobile) is the premise"]
        What2["What: Add, complete, and list tasks"]
        How2["How: Implement with React + SQLite"]
        WM2 --> What2
        WC2 --> How2
        What2 --> How2
    end

    subgraph "Layer 3"
        WM3["Why (Motivation): List view becomes hard to read as task count grows"]
        WC3["Why (Constraints): React + SQLite is the premise. SQL ORDER BY/WHERE available"]
        What3["What: Sort and filter tasks by deadline and priority"]
        How3["How: Build sort/filter API with query parameters"]
        WM3 --> What3
        WC3 --> How3
        What3 --> How3
    end

    What1 -.->|"Motivation chain"| WM2
    How1 -.->|"Constraint chain"| WC2
    What2 -.->|"Motivation chain"| WM3
    How2 -.->|"Constraint chain"| WC3

    style WM1 fill:#4a9e4a,color:#fff
    style WM2 fill:#4a9e4a,color:#fff
    style WM3 fill:#4a9e4a,color:#fff
    style WC1 fill:#8B4513,color:#fff
    style WC2 fill:#8B4513,color:#fff
    style WC3 fill:#8B4513,color:#fff
    style What1 fill:#3a7bd5,color:#fff
    style What2 fill:#3a7bd5,color:#fff
    style What3 fill:#3a7bd5,color:#fff
    style How1 fill:#e67e22,color:#fff
    style How2 fill:#e67e22,color:#fff
    style How3 fill:#e67e22,color:#fff

This chain continues as deep as needed. And every layer has the same structure. Whether you call Layer 1 "requirements", Layer 2 "specifications", or Layer 3 "design" is up to you. The structure is identical.
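
The two chains in the diagram above can be sketched in code. This is an illustrative sketch under assumptions: `Node`, `add_child`, `motivation_from`, and `constraints_from` are hypothetical names, and the motivation and constraint texts are still written by a person. The code only records which parent What and parent How they were derived from.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    motivation: str                          # Why (motivation)
    constraints: list[str]                   # Why (constraints)
    what: str
    how: str
    motivation_from: Optional[str] = None    # parent's What (motivation chain)
    constraints_from: Optional[str] = None   # parent's How (constraint chain)
    children: list["Node"] = field(default_factory=list)


def add_child(parent: Node, motivation: str, constraints: list[str],
              what: str, how: str) -> Node:
    """Attach a child and record both lines that descend from the parent."""
    child = Node(motivation, constraints, what, how,
                 motivation_from=parent.what, constraints_from=parent.how)
    parent.children.append(child)
    return child


layer1 = Node(
    motivation="People forget tasks and it causes problems",
    constraints=["Targeting smartphone users", "Individual dev scale"],
    what="Enable task recording and management",
    how="Build a TODO app",
)
layer2 = add_child(
    layer1,
    motivation="Users regularly create, complete, and modify tasks",
    constraints=["TODO app (Web/Mobile) is the premise"],
    what="Add, complete, and list tasks",
    how="Implement with React + SQLite",
)
layer3 = add_child(
    layer2,
    motivation="List view becomes hard to read as task count grows",
    constraints=["React + SQLite is the premise", "SQL ORDER BY/WHERE available"],
    what="Sort and filter tasks by deadline and priority",
    how="Build sort/filter API with query parameters",
)

print(layer3.motivation_from)   # the What that Layer 3's motivation digs into
print(layer3.constraints_from)  # the How that Layer 3's constraints come from
```

Every layer uses the same `Node` type, which is the point of the recursion: nothing in the structure knows whether a layer is called "requirements", "specifications", or "design".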

How to Write Why

There's an important point about writing Why.

Motivation should describe the "user or real-world premise" that emerges when digging into the parent's What. It must not be a rephrasing of the parent's How. Not "to make the TODO app work" but rather "users regularly create, complete, and modify tasks". The former merely repeats the parent's How, while the latter is a real-world premise that emerges from digging into the parent's What of "enable recording and management".

Constraints should describe the premises/conditions created by choosing the parent's How. Choosing "TODO app" as How makes Web/Mobile the premise. Choosing "React + SQLite" makes SQL ORDER BY available. Multiple constraints can be combined in a single node.

Layer 1 constraints also exist. Just because there's no parent doesn't mean it's empty. Business constraints, market conditions, resource limitations — everyone has them in their heads but rarely makes them explicit. If this foundation cracks, the entire tree is affected.

Nodes as Hypothesis Verification Cycles

Each node corresponds to a small cycle of "premise → hypothesis → verification → result".

Why → Premise/hypothesis (given this problem, under these constraints)
What → What to verify (if this is achieved, the problem should be solved)
How → Verification method (this is how we'll confirm/build it)
Result → What actually happened

In other words, each node is a small hypothesis verification cycle.
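
Read this way, a node only needs one extra slot for the result. The sketch below is an assumption-laden illustration: `HypothesisNode`, `has_result`, and `as_cycle` are hypothetical names, and the recorded result is a placeholder, not real data.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HypothesisNode:
    why: str                      # premise: the problem and the constraints
    what: str                     # what to verify
    how: str                      # verification method
    result: Optional[str] = None  # what actually happened; None until observed

    def has_result(self) -> bool:
        # Only says a result was recorded, not that the hypothesis held.
        return self.result is not None

    def as_cycle(self) -> list[str]:
        """Render the node as premise -> hypothesis -> verification -> result."""
        return [
            f"Premise: {self.why}",
            f"To verify: {self.what}",
            f"Method: {self.how}",
            f"Result: {self.result if self.has_result() else '(not yet observed)'}",
        ]


node = HypothesisNode(
    why="People forget tasks and it causes problems",
    what="Enable task recording and management",
    how="Build a TODO app",
)
print("\n".join(node.as_cycle()))

node.result = "(placeholder) outcome recorded after a user test"
print(node.has_result())
```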

1-to-Many Options and Subtree Switching

The relationships between Why and What, and between What and How, are one-to-many: one Why can lead to many Whats, and each What can lead to many Hows.

This is a crucial point in Sprouted. If any of Why, What, or How changes, the subtree below may need to be switched.

First, let's look at a case where one Why has multiple Whats:

graph TD
    W["🌱 Why (Motivation): People forget tasks and it causes problems"]

    W --> WA["What①: Enable task recording and management"]
    W --> WB["What②: Notify when task deadlines approach"]
    W --> WC["What③: Automatically extract tasks from daily activities"]

    WA --> HA1["How: TODO app"]
    WA --> HA2["How: Sticky note app"]
    WA --> HA3["How: LINE bot"]

    WB --> HB1["How: Push notifications"]
    WB --> HB2["How: Email reminders"]
    WB --> HB3["How: Calendar integration"]

    WC --> HC1["How: AI auto-extraction"]
    WC --> HC2["How: Voice memo auto-analysis"]

    style W fill:#4a9e4a,color:#fff
    style WA fill:#3a7bd5,color:#fff
    style WB fill:#3a7bd5,color:#fff
    style WC fill:#3a7bd5,color:#fff
    style HA1 fill:#e67e22,color:#fff
    style HA2 fill:#e67e22,color:#fff
    style HA3 fill:#e67e22,color:#fff
    style HB1 fill:#e67e22,color:#fff
    style HB2 fill:#e67e22,color:#fff
    style HB3 fill:#e67e22,color:#fff
    style HC1 fill:#e67e22,color:#fff
    style HC2 fill:#e67e22,color:#fff

When What changes, the entire set of How options changes. The How options for What① "recording and management" (TODO app, sticky note app, LINE bot) are completely different from the How options for What② "notifications" (push notifications, email reminders, calendar integration).

In actual development, you select one option at each layer. The important thing is to record the options that weren't chosen and the reasons why.
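
One way to make "record the options that weren't chosen and the reasons why" hard to skip is to refuse a decision that lacks those reasons. The following is a minimal sketch, assuming a hypothetical `decide` helper and a plain dict as the record. The rejection reasons are placeholders invented for the example.

```python
def decide(what, options, chosen, reasons):
    """Pick one How for a What; refuse unless every unchosen option has a reason."""
    if chosen not in options:
        raise ValueError(f"{chosen!r} is not one of the options")
    rejected = [o for o in options if o != chosen]
    missing = [o for o in rejected if not reasons.get(o)]
    if missing:
        raise ValueError(f"no reason recorded for rejecting: {missing}")
    return {
        "what": what,
        "chosen": chosen,
        "rejected": {o: reasons[o] for o in rejected},
    }


decision = decide(
    what="Enable task recording and management",
    options=["TODO app", "Sticky note app", "LINE bot"],
    chosen="TODO app",
    reasons={  # placeholder reasons, invented for this example
        "Sticky note app": "placeholder: weak at tracking completion",
        "LINE bot": "placeholder: depends on a third-party platform",
    },
)
print(decision["rejected"])
```

The recorded alternatives are what make a later switch cheap to reason about: the candidates and the reasons for passing on them are already in the tree.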

When Something Changes, Review the Subtree

This is the practical point of Sprouted.

When Why's motivation changes. If the motivation "people forget tasks and it causes problems" itself collapses (e.g., users aren't actually forgetting tasks — they just can't prioritize), you may need to review all the What, How, and deeper layers below it.

When Why's constraints change. If the constraint "individual dev scale" changes to "team development", the How options and subsequent designs may change.

When What changes. If Why stays the same but you switch from What① "recording and management" to What② "notifications", the set of How options changes too. The designs considered for the TODO app become less relevant, and notification design becomes newly necessary.

When How changes. If both Why and What stay the same but How switches from "TODO app" to "LINE bot", the next layer's constraints change, and the child nodes' considerations change accordingly.

graph TD
    subgraph "Before change"
        B_W["Why (Motivation): People forget tasks"]
        B_W --> B_What["What: Enable recording and management"]
        B_What --> B_How["How: TODO app ✅"]
        B_How --> B_C1["CRUD API design"]
        B_How --> B_C2["React UI design"]
        B_How --> B_C3["SQLite schema design"]
    end

    subgraph "After switching How"
        A_W["Why (Motivation): People forget tasks"]
        A_W --> A_What["What: Enable recording and management"]
        A_What --> A_How["How: LINE bot ✅"]
        A_How --> A_C1["Messaging API integration"]
        A_How --> A_C2["Rich menu design"]
        A_How --> A_C3["Conversation flow design"]
    end

    style B_W fill:#4a9e4a,color:#fff
    style B_What fill:#3a7bd5,color:#fff
    style B_How fill:#e67e22,color:#fff
    style B_C1 fill:#ddd,color:#333
    style B_C2 fill:#ddd,color:#333
    style B_C3 fill:#ddd,color:#333
    style A_W fill:#4a9e4a,color:#fff
    style A_What fill:#3a7bd5,color:#fff
    style A_How fill:#e67e22,color:#fff
    style A_C1 fill:#f9e79f,color:#333
    style A_C2 fill:#f9e79f,color:#333
    style A_C3 fill:#f9e79f,color:#333

In other words, when any of Why / What / How changes, the subtree below may need review. The higher the node, the bigger the impact when changed; the lower the node, the more localized the impact. This is the concrete manifestation of the philosophy "everything is a hypothesis, and the only difference is a gradient of changeability".
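
The "review the subtree" rule is a plain tree traversal. Below is a sketch under assumptions: `Node`, `needs_review`, and `flag_subtree` are hypothetical names, and the tree reproduces the "Before change" diagram above.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    children: list["Node"] = field(default_factory=list)
    needs_review: bool = False


def flag_subtree(changed: Node) -> list[str]:
    """Mark every descendant of a changed node for review and return their labels."""
    flagged = []
    stack = list(changed.children)
    while stack:
        current = stack.pop()
        current.needs_review = True
        flagged.append(current.label)
        stack.extend(current.children)
    return flagged


how = Node("How: TODO app", children=[
    Node("CRUD API design"),
    Node("React UI design"),
    Node("SQLite schema design"),
])
what = Node("What: Enable recording and management", children=[how])
why = Node("Why (Motivation): People forget tasks", children=[what])

print(len(flag_subtree(how)))  # descendants affected by switching How
print(len(flag_subtree(why)))  # descendants affected by a collapsed Why
```

Calling `flag_subtree` on the Why node touches more nodes than calling it on the How node, which is the gradient of changeability expressed as a count.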

Sunk Costs Exist in Reality

However, even if "switching the entire subtree" is logically sound, reality isn't that simple. Design may already be in progress, code may already be written, and the team may already be moving in that direction. Sunk costs are real.

That's precisely why visualizing the tree structure with Sprouted has value. When deciding whether to switch, if you can see "where we are now and what would be affected", you can make structural decisions rather than emotional ones. Switch everything? Reuse parts? What's the blast radius? See all that, then choose whether to "accept sunk costs and switch" or "stay the course".

Sprouted isn't a framework for "always making the right choice". Whether a choice was right can only be known after the fact. But if you can see what hypotheses you're standing on and what would be affected if something changed, you can do your best. That's what Sprouted is for.

When Why Has Multiple Sources

There are cases where multiple parents lead to the same How. For example, "preventing forgotten tasks" and "team progress visibility" — two different Whys — might both arrive at "TODO app" as How. This means the tree structure becomes a DAG (Directed Acyclic Graph).

However, if Why or What differs, the details of How also differ subtly. A TODO app for individual task-forgetting prevention and one for team progress visualization need different features. Hasty unification produces something half-baked for both purposes. And since the Whys differ, switching the subtree when one Why changes can cause unintended impact on the other tree.

This issue is discussed in detail in the "Trap of Unification" section below.

Everything Is a Hypothesis

The Illusion of Certainty

Traditional development processes have implicitly assumed "requirements are fixed, specs change" or "if you lock down specs, code stabilizes". SDD tools similarly treat specs as the "source of truth" — settled facts.

But every layer is a hypothesis.

"People forget tasks and it causes problems" is a hypothesis — we don't know if it's really true. Even if people are struggling, "a TODO app is the optimal solution" is also a hypothesis. Specs and code are all "bets" based on that hypothesis.

The assumption that specs are "settled" creates SDD's rigidity. When specs change, it's not "poor spec management" — the underlying hypothesis collapsed. This is a normal process and should be welcomed.

The Gradient of Changeability

Even if everything is a hypothesis, changeability varies.

Nodes closer to the root (Why-side): Abstract and tend to be stable. But the impact is large when they change
Nodes closer to the leaves (How-side): Concrete and tend to change. But impact is localized

The traditional view of "requirements are stable, code is unstable" is actually a direct property of this tree structure. However, traditionally, specific layers were given special treatment. In Sprouted, every layer is treated equally as a hypothesis, and the only difference is a gradient of changeability.

Management by Confidence Level

If everything is a hypothesis, you need a mechanism to manage "how much can we trust this?" In Sprouted, each node has a confidence level.

The conditions for confidence to rise are clearly definable:

Results are in and the hypothesis is verified (backed by user tests, production data, etc.)
All child nodes have reached high confidence (bottom-up confidence propagation)
Stable without changes for an extended period (track record over time)

Nodes whose confidence exceeds a threshold are proposed for "locking". Changing a locked node triggers visualization of the impact scope (subtree) and explicit display of change costs.

This maintains practical stability while preserving the "everything is a hypothesis" philosophy. When a hypothesis collapses, the entire subtree is automatically flagged for reconsideration.
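
A rough sketch of how confidence, lock proposals, and collapse could fit together follows. Everything numeric here is an assumption: the article does not fix a scale or a threshold, so the 0.0 to 1.0 range, `LOCK_THRESHOLD = 0.8`, and the sample confidence values are invented for illustration, as are the names `lock_proposals`, `children_all_confident`, and `collapse`.

```python
from dataclasses import dataclass, field

LOCK_THRESHOLD = 0.8  # assumed value; the article does not specify one


@dataclass
class Node:
    label: str
    confidence: float = 0.5   # assumed 0.0-1.0 scale
    locked: bool = False
    needs_review: bool = False
    children: list["Node"] = field(default_factory=list)


def children_all_confident(node, threshold=LOCK_THRESHOLD):
    """Bottom-up condition: the node has children and all reached high confidence."""
    return bool(node.children) and all(c.confidence >= threshold for c in node.children)


def lock_proposals(node, threshold=LOCK_THRESHOLD):
    """Labels of unlocked nodes whose confidence meets the threshold."""
    found = [node.label] if (not node.locked and node.confidence >= threshold) else []
    for child in node.children:
        found.extend(lock_proposals(child, threshold))
    return found


def collapse(node):
    """A hypothesis collapsed: unlock and flag the node and its whole subtree."""
    node.needs_review = True
    node.locked = False
    flagged = [node.label]
    for child in node.children:
        flagged.extend(collapse(child))
    return flagged


sort_api = Node("Sort/filter API", confidence=0.9)
stack = Node("React + SQLite", confidence=0.85, children=[sort_api])
app = Node("TODO app", confidence=0.6, children=[stack])

print(lock_proposals(app))          # nodes that could be proposed for locking
print(children_all_confident(app))  # whether the bottom-up condition holds for the root
print(collapse(stack))              # nodes flagged when "React + SQLite" collapses
```

How the three conditions for rising confidence are weighted is left open here. The sketch only shows that proposing locks and flagging subtrees are mechanical once each node carries a confidence value.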

The Trap of Unification

When How Looks the Same Despite Different Why and What

Developers have a bias toward unification. When they spot similar features, they think "can't we merge these?" However, unifying things where How merely looks the same but Why and What differ results in something half-baked for both purposes.

Moreover, after unification, if one Why grows stronger, the other purpose tends to be sacrificed.

Typical Failure Patterns

Feature creep through consolidation. Suppose a memo app crams "memos", "task management", and "document management" into one app. Structuring with Sprouted makes it clear that these are three different trees from three different Whys:

Why①: "Don't want to forget ideas" → What: "Quick memo-taking" → How: "Lightweight memo feature"
Why②: "Want to organize work" → What: "Task management" → How: "TODO feature"
Why③: "Want to consolidate materials" → What: "Long-form document management" → How: "Rich editor"

If this were visualized, the question "these are three separate trees — should they really be one app?" could have been raised.

Duplication of the same purpose. The reverse pattern also exists. Suppose a company built 5 or 6 messaging apps. They all started from the same Why: "want to communicate with people far away", but each department built a separate tree. Managing it as one tree would have enabled structural decisions at the What level: "chat and video calls should be separate" or "business and personal should be unified".

Successful separation. On the other hand, a company offering project management tools separated distinct tool groups (issue tracker / knowledge base / lightweight task board) for a common Why of "teams want to produce results". Why was shared but What differed, so separating the How was the right call.

Criteria for Unification

In Sprouted, the criteria for whether to unify are structurally determined:

If Why and What are the same, it's fine to unify How
If Why is the same but What differs, separate by default
If Why differs, manage as separate trees even if How looks similar

Even when unifying, if you decompose How further, you can separate the truly common parts from the parts unique to each Why. Coarse-grained unification is the root cause of many problems — finer granularity often reveals parts that can coexist.

When Reuse Is Fine

Conversely, when similar nodes appear in different branches, if Why / What / How all match completely, it's the same thing and can be reused.

The criteria are simple:

Why (both motivation and constraints) the same? → Yes
What the same? → Yes
How the same? → Yes
→ Reuse OK

If even one is No, manage them as separate nodes even if they look similar. If something changes, you want to switch just one side — but unification would drag the other along.
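
The unification and reuse criteria above can be written as two small checks. This is a sketch with a loud assumption: "the same" is reduced to string equality, whereas in practice it is a judgment call. The names `Node`, `unification_advice`, and `can_reuse` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    motivation: str
    constraints: tuple[str, ...]
    what: str
    how: str


def same_why(a: Node, b: Node) -> bool:
    # Why covers both motivation and constraints.
    return a.motivation == b.motivation and a.constraints == b.constraints


def unification_advice(a: Node, b: Node) -> str:
    """Apply the three unification rules in order of precedence."""
    if not same_why(a, b):
        return "separate trees"
    if a.what != b.what:
        return "separate by default"
    return "fine to unify How"


def can_reuse(a: Node, b: Node) -> bool:
    """Reuse only when Why, What, and How all match."""
    return same_why(a, b) and a.what == b.what and a.how == b.how


constraints = ("Targeting smartphone users", "Individual dev scale")
record = Node("People forget tasks and it causes problems", constraints,
              "Enable task recording and management", "TODO app")
notify = Node("People forget tasks and it causes problems", constraints,
              "Notify when task deadlines approach", "Push notifications")
memo = Node("Don't want to forget ideas", constraints,
            "Quick memo-taking", "Lightweight memo feature")

print(unification_advice(record, notify))  # same Why, different What
print(unification_advice(record, memo))    # different Why
print(can_reuse(record, record))           # identical in every attribute
```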

Comparison with Existing SDD Tools

Current SDD Tool Structure

As of 2025, major SDD tools all adopt linear pipelines. Typical workflows look like:

Requirements → Design → Tasks → Implementation
Constitution → Specify → Plan → Tasks → Implement
Spec (persistent source of truth) → Plan → Code generation

All share the basic structure of "write specs, then AI generates code based on them". The approach is to structure requirements, define immutable principles, and maintain specs as persistent artifacts.

Structural Problems SDD Tools Face

These tools have several structural limitations.

First, they treat specs as "settled". They position specs as the source of truth and progress linearly from requirements to design to tasks. But as discussed, requirements, specs, and everything else are hypotheses with a gradient of confidence. Treating specs as settled means "spec-implementation drift" becomes a problem when changes occur. If treated as hypotheses from the start, changes are naturally processed as "hypothesis updates".

Second, Why isn't built into the structure. Some tools generate user stories, but how those connect to higher-level Whys is unmanaged. Immutable principles aren't treated as objects of hypothesis verification. Since "why does this spec exist?" can't be structurally traced, the basis for evaluating spec validity is easily lost.

Third, change propagation isn't structurally solved. When specs change, which designs and code are affected is still tracked manually. There's a risk of outdated specs misleading AI agents into generating implementations that don't match reality.

Fourth, there's no granularity flexibility. Even a small bug fix gets the same heavyweight process as a major feature development. The mechanism for adjusting process depth to problem size is weak.

The Gap Sprouted Fills

Sprouted provides structural solutions to these problems.

SDD Problem	Sprouted's Approach
Specs treated as settled	Every node is a hypothesis. Gradient management via confidence levels
Why not built into structure	Each node has Why (motivation+constraints)/What/How, with Why as the starting point
Change propagation is manual	Tree structure automatically identifies impact scope as subtrees
No granularity flexibility	Tree depth naturally determines granularity. How deep to go is your choice
No criteria for unification	Structural judgment based on Why/What match

Importantly, Sprouted doesn't negate SDD. SDD tools excel at the leaf-node level transformation of "spec → code generation". Sprouted structurally manages what's above: "why write this spec?", "is this spec really correct?", and "if the spec changes, what's affected?"

When Sprouted's tree leaf nodes become sufficiently concrete How, they're equivalent to the "specs" that existing SDD tools receive. In other words, Sprouted doesn't replace existing SDD tools — it sits above them as a meta-framework.

Relationship with Existing Approaches

Sprouted's philosophy intersects with multiple fields in software engineering. Here we examine three particularly relevant approaches: Goal-Oriented Requirements Engineering (GORE), Hypothesis-Driven Development (HDD), and Design Rationale (DR).

To state the conclusion upfront: Sprouted can be positioned as combining GORE's structuredness with HDD's hypothesis management, while using LLMs to attempt to overcome the practicality barrier that DR has not solved in 50 years.

Goal-Oriented Requirements Engineering (GORE)

What Is GORE?

GORE is an approach researched for over 20 years in requirements engineering, using goals as the starting point to elicit, model, and analyze requirements. Representative frameworks include KAOS (Keep All Objectives Satisfied) and i* (iStar).

KAOS progressively concretizes top-level goals through AND/OR decomposition, ultimately breaking them down to levels assignable to agents (humans or software). Each goal can be formally defined in temporal logic (LTL), and obstacle analysis (factors preventing goal achievement) is systematically conducted. i* models dependency relationships between actors in organizations and handles the dimensions of "why", "who", and "how".

Sprouted's recursive tree structure, WHY/HOW decomposition, and explicit management of alternatives superficially resemble GORE's basic ideas. However, the two differ in their underlying epistemology and target layers.

Epistemological Difference: Are Goals "Correct" or "Hypotheses"?

GORE takes the stance that "goals are correct things elicited from stakeholders, and the job is to decompose them formally and completely". There's no structural mechanism for managing the possibility that goals themselves are wrong.

Sprouted takes the stance that "all nodes are hypotheses existing within a gradient of confidence". Even the top-level Why is a bet until verified, and when it collapses, the entire subtree is reviewed.

This difference becomes clear when attempting conversion. Modeling a TODO app in KAOS, the starting point is the goal "tasks are properly managed". Trying to bring this into Sprouted, you can't write the Why. In KAOS, the goal itself is the starting point, but in Sprouted, the Why "people forget tasks and it causes problems" is the starting point, and the goal-equivalent What is derived from there.

The reverse is the same: converting Sprouted nodes to KAOS loses confidence levels, hypothesis management, and the two-line connection of motivation and constraints. KAOS has no concept of "a goal might be wrong", so Sprouted's core hypothesis management structure has nowhere to go.

In other words, a clean bidirectional conversion between the two doesn't work.

Layer Difference: Consistency Within the Tree vs. Validity of the Tree Itself

The epistemological difference is also a difference in target layers.

KAOS formal verification works by defining goals in temporal logic, describing the system's state transition model, then having a model checker explore all paths to report "there's a path that can't reach this goal". This is powerful, but it can only verify "gaps within the defined state space" — the enumeration of state variables and operations is done by humans. Gaps in the world that weren't included in the model are fundamentally undetectable.

Sprouted starts from the premise that "the tree itself might be wrong". Even if KAOS perfects the consistency within the goal tree for "managing tasks with a TODO app", if the user's real problem was "prioritization" not "forgetting", the entire tree becomes meaningless. Sprouted supports the judgment of "replacing the tree itself".

In other words, GORE verifies consistency within the tree, while Sprouted questions the validity of the tree itself. The two aren't opposed — they operate at different layers.

Incorporating GORE's Insights

Among GORE's strengths, those that naturally connect with Sprouted's philosophy were incorporated as rules.

Obstacle analysis → Incorporated as a SHOULD rule. KAOS has a mechanism for systematically identifying obstacles that prevent goal achievement. Since this naturally connects with Sprouted's "everything is a hypothesis" philosophy, a rule was added: "for important nodes, describe risks that could collapse the hypothesis" (see Appendix "Rules for Hypothesis Management").

AND/OR decomposition distinction → Incorporated as a SHOULD rule. KAOS explicitly distinguishes whether sub-goals are "all required (AND)" or "alternatives (OR)". Since this affects change impact analysis ("can we still achieve the parent Why if we give up What②?"), a rule was added to explicitly note the AND/OR relationship of What groups (see Appendix "Rules for Options and Decision-Making").

Non-functional requirements → Documented as handleable within existing structure. GORE has the concept of softgoals (requirements that can't be fully satisfied but should be met to a sufficient level). In Sprouted, non-functional requirements can also be described with Why/What/How; if the achievement criteria are unclear, the node's confidence level is set lower (see Appendix "Rules for Hypothesis Management").
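
The AND/OR distinction above matters for impact analysis, and the question "can we still achieve the parent Why if we give up What②?" can be answered mechanically once the relation is written down. A minimal sketch, assuming hypothetical names (`WhatGroup`, `relation`, `still_achievable`) and treating the three Whats from the earlier TODO example as alternatives purely for illustration:

```python
from dataclasses import dataclass


@dataclass
class WhatGroup:
    why: str
    relation: str      # "AND": all Whats required; "OR": Whats are alternatives
    whats: list[str]


def still_achievable(group: WhatGroup, dropped: set[str]) -> bool:
    """Can the parent Why still be served after giving up the dropped Whats?"""
    remaining = [w for w in group.whats if w not in dropped]
    if group.relation == "AND":
        return len(remaining) == len(group.whats)
    if group.relation == "OR":
        return len(remaining) > 0
    raise ValueError(f"unknown relation: {group.relation!r}")


group = WhatGroup(
    why="People forget tasks and it causes problems",
    relation="OR",
    whats=[
        "Enable task recording and management",
        "Notify when task deadlines approach",
        "Automatically extract tasks from daily activities",
    ],
)
print(still_achievable(group, {"Notify when task deadlines approach"}))

group.relation = "AND"  # the same Whats, now all required
print(still_achievable(group, {"Notify when task deadlines approach"}))
```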

The following were scoped out:

Formal verification. KAOS formal verification has the strength of completely exploring defined state spaces. However, the cost of describing state transition models is high, and the model's validity itself depends on human judgment. Sprouted takes natural language and LLM affinity as a design principle, which trades off against formal verification. However, if LLMs can in the future auto-generate state transition models from natural language, there's room for partial formal verification integration with high-confidence nodes (see "Outlook" section).

Actor/agent modeling. i* explicitly models "who depends on whom". An important perspective, but adding Who to the three attributes of Why/What/How increases cognitive cost per node, contradicting the design principle of "reducing practitioners' cognitive cost". Left as room for future extension.

Hypothesis-Driven Development (HDD)

What Is HDD?

Hypothesis-Driven Development applies Lean Startup thinking directly to software development processes. It replaces "requirements" with "hypotheses" and treats new feature/service development as "a series of experiments".

A typical HDD process:

Write out assumptions
Convert to hypotheses. A common template: "We believe that [feature X] will result in [outcome Y]. We will know we have succeeded when [metric Z]"
Design experiments (A/B tests, prototypes, user interviews, etc.)
Execute experiments and analyze results
Decide to persevere or pivot
Move to next hypothesis

For example, hypothesize "making hotel images larger will increase booking rates", run an A/B test, and measure whether booking rates increase by 5% within 48 hours.
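
For reference, the template and the persevere-or-pivot step can be sketched as follows. The names `Hypothesis`, `statement`, and `persevere_or_pivot` are hypothetical, and the observed lift values passed in are made up for the example rather than taken from any real experiment.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    feature: str        # [feature X]
    outcome: str        # [outcome Y]
    metric: str         # [metric Z]
    target_lift: float  # minimum relative increase that counts as success

    def statement(self) -> str:
        return (f"We believe that {self.feature} will result in {self.outcome}. "
                f"We will know we have succeeded when {self.metric}.")


def persevere_or_pivot(hypothesis: Hypothesis, observed_lift: float) -> str:
    """Compare the measured lift against the target stated up front."""
    return "persevere" if observed_lift >= hypothesis.target_lift else "pivot"


hotel = Hypothesis(
    feature="making hotel images larger",
    outcome="higher booking rates",
    metric="booking rates increase by 5% within 48 hours",
    target_lift=0.05,
)
print(hotel.statement())
print(persevere_or_pivot(hotel, observed_lift=0.07))  # made-up measurement
print(persevere_or_pivot(hotel, observed_lift=0.01))  # made-up measurement
```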

Academically, the concept of "Hypotheses Engineering" has also been proposed, arguing that just as requirements engineering handles requirements, hypotheses need to be elicited, documented, analyzed, and prioritized. Here, "assumptions" are implicitly accepted understandings, while "hypotheses" are verifiable explicit statements.

Commonalities with Sprouted

Among existing approaches, HDD is the closest to Sprouted at the epistemological level.

"Everything is a hypothesis" epistemology. HDD also declares "we don't do projects anymore. Only experiments." The same stance as Sprouted's "all nodes are hypotheses".

Hypothesis verification cycle. HDD's "hypothesis → experiment → result → pivot or persevere" closely corresponds to Sprouted's "Why → What → How → result" cycle.

Negation of requirements. HDD argues "replace requirements with hypotheses", while Sprouted argues "specs are hypotheses". Both point in the same direction.

The Decisive Difference: Presence of Structure

While sharing epistemology, HDD and Sprouted have a decisive difference.

First, HDD hypotheses have no hierarchical structure. HDD hypotheses are basically managed as a flat list. "Make hotel images larger", "change button color", "change pricing display" sit side by side. Without parent-child relationships, you can't structurally analyze which other hypotheses are affected when one collapses.

Sprouted manages hypotheses in a tree structure. If "users forget tasks" collapses, the impact on "TODO app", "React+SQLite", and "sort API" below it is structurally visible.

Second, there's no Why/What/How separation. HDD's template "We believe that [X] will result in [Y]" mixes means (X) and outcomes (Y) in one sentence. There's no mechanism to separate Why's motivation and constraints, What, and How, or to track motivation and constraint chains as Sprouted does.

Third, there's no change propagation concept. In HDD, when a hypothesis is refuted, you "pivot" — but analyzing what's affected and how far is done in people's heads. In Sprouted, impact scope is automatically identified as subtrees.

Fourth, the scope of application differs. HDD specializes in hypothesis verification against product metrics (conversion rates, DAU, etc.). Integration with A/B testing and feature flags is assumed, with emphasis on production data collection. Sprouted focuses on structuring the upstream questions: "what should we build in the first place?", "why build it?", "is this spec really correct?"

Sprouted as a Structured Version of HDD

In summary, HDD shares the "treat as hypothesis" epistemology but lacks hypothesis structuring and change impact analysis. Sprouted's unique contribution is combining HDD's hypothesis management with GORE's structuredness.

A division of roles is also conceivable: HDD handles everyday feature-level hypothesis verification ("will this feature be used?") while Sprouted handles the structural hypothesis management above ("why is this feature needed, and if something changes, what's affected?").

Design Rationale (DR)

What Is DR?

Design Rationale is an approach for explicitly recording and managing the reasons behind decisions made during the design process, originating from IBIS (Issue-Based Information System) developed by W.R. Kunz and Horst Rittel in 1970. Since then, multiple variants have been proposed: PHI (Procedural Hierarchy of Issues), QOC (Questions, Options, and Criteria), DRL (Decision Representation Language), and more.

DR's basic structure is strikingly close to Sprouted.

IBIS records decisions in the structure: Issue → Position → Argument. Multiple Positions are presented for an Issue, and each Position has supporting and opposing Arguments attached.

QOC analyzes design space in the structure: Question → Option → Criteria. Multiple Options line up for a Question, and each Option is evaluated against each Criteria.

DGA records decision rationale in the structure: Decision → Goal → Alternatives.

Comparing these with Sprouted reveals structural correspondences:

Design Rationale	Sprouted
Question / Issue / Decision	Close to What (what to achieve)
Option / Position / Alternative	Close to How (means candidates)
Criteria / Argument / Goal	Close to Why (motivation and constraints)
Recording unchosen options	Recording unchosen How and reasons

Commonalities with Sprouted

Beyond structural similarity, there are commonalities at the philosophical level.

Making decision reasons explicit. DR's core is recording "why this design was chosen", which aligns with Sprouted's "Why is the starting point" philosophy.

Recording alternatives. DR emphasizes recording "unchosen options and their reasons". Sprouted's rule to "record unchosen candidates and their reasons" directly corresponds to DR's insights.

Hierarchical decomposition. PHI and extended IBIS versions decompose Issues hierarchically. This corresponds to Sprouted's recursive tree structure.

Decisive Differences

First, DR has no hypothesis management. DR is an approach for "recording the rationale behind decisions made" and has no structural mechanism for managing the possibility that the decision itself is wrong. There's no confidence level concept and no epistemology of "this decision is a hypothesis and a bet until verified".

Second, there's no change propagation structure. DR records the rationale for individual decisions, but the mechanism for tracking which other decisions are affected when one changes is weak. QOC's Questions exist independently, lacking parent-child relationships or connection structures equivalent to Sprouted's "motivation chain" and "constraint chain".

Third, DR tends toward after-the-fact recording. QOC's original paper itself states that design space analysis is not a record of the design process but a co-product constructed alongside the design. In practice, however, rationale is often written up after design is complete. Sprouted assumes building the tree before design and proceeding based on the tree.

Why DR Hasn't Spread for 50 Years

Design Rationale's history contains the most important lesson for Sprouted. Despite having a structure close to Sprouted's (What/How/Why separation ≈ Question/Option/Criteria), it has failed to achieve practical adoption for 50 years.

The causes have been clearly analyzed:

High recording cost. Both IBIS and QOC require designers to manually record Issue → Position → Argument or Question → Option → Criteria for every decision. The speed of design and the cost of recording rationale don't balance.

Constraining design thinking. The argumentative perspective forces structure on design thinking, which designers find cumbersome. Designers want to think freely, but the process of "first define the Issue, then enumerate Positions..." is felt to inhibit creativity.

Tools separated from development environment. Dedicated tools (gIBIS, QuestMap, Compendium, etc.) are required, not integrated into daily development workflows.

Nobody reads it. The painstakingly written rationale is rarely referenced later. It's only used when new members join or during major refactoring, making the recording cost not worth it.

How Sprouted and LLMs Might Solve DR's Problems

The first three of these causes (recording cost, constrained thinking, and separate tools) stem from "humans manually structuring". This connects directly to what's described in Sprouted's "Outlook: LLM Affinity" section.

Recording cost → LLMs auto-extract structure from natural language. Designers just write their thoughts in natural language; the classification into Why/What/How is done by the LLM.

Design thinking constraints → Write freely in natural language, then LLMs propose structuring afterward. Since structuring happens after the thinking is written down, not while it is happening, it doesn't inhibit creativity.

Tool integration → LLMs operate within existing development environments and chat tools. No switching to dedicated tools needed.

In other words, LLMs may solve the "cost of structuring" problem that Design Rationale couldn't solve for 50 years. The claim that Sprouted is "a framework that becomes practical only with the advent of LLMs" is supported by DR's history, though it has yet to be proven in practice.

Sprouted's Positioning

Organizing the relationships with the three existing approaches reveals the position Sprouted fills.

Borrowed from GORE: Structuredness. Goal tree decomposition, AND/OR decomposition, obstacle analysis. But with different epistemology.

Shared with HDD: Epistemology. The stance of "everything is a hypothesis" and "hypotheses, not requirements". But HDD lacks structure.

Shared with DR: Decision-making structure. What/How/Why separation ≈ Question/Option/Criteria. But DR lacks hypothesis management and failed at practical adoption.

Sprouted is positioned as integrating GORE's structuredness × HDD's hypothesis management × DR's decision recording, attempting to overcome DR's 50-year practicality barrier through LLMs.

Outlook: LLM Affinity

Sprouted is also a framework that becomes practical for the first time with the advent of LLMs.

Structuring Support

If users describe nodes in natural language, LLMs can provide structuring support like "this reads more like a What" or "the Why motivation is vague — making it more specific would make child node selection easier". LLMs can also detect logic jumps (jumping from Why straight to How, skipping What) or cases where Why/What/How are mixed in a single node, proposing separation.

This opens up a framework that was previously only usable by "people who can do this abstraction themselves" to a wider range of developers.

Ambiguity Detection and Confidence Updates

Nodes written in natural language retain ambiguity. LLMs can detect "this What is open to multiple interpretations" and reflect this in confidence evaluation.

Also, because Why (motivation/constraints)/What/How are separated, the scope of ambiguity narrows. Compared with writing everything in one sentence, separating by attribute makes the ambiguous points easier to identify.

Change Impact Analysis

When a parent node's hypothesis collapses, LLMs can estimate the impact on descendant nodes at the natural language level. Warnings like "if this requirement changes, it conflicts with this spec" or "this subtree needs reconsideration" can be auto-generated.
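
The mechanical half of this is plain tree traversal, which can be sketched as follows. This is a minimal sketch, not Sprouted's actual implementation: the `Node` class, its `needs_review` flag, and `flag_subtree` are hypothetical names, and the natural-language estimation an LLM would add on top is not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical minimal node; the field names are assumptions."""
    name: str
    children: list["Node"] = field(default_factory=list)
    needs_review: bool = False


def flag_subtree(changed: Node) -> list[str]:
    """Mark every descendant of `changed` for review and return their names."""
    flagged: list[str] = []
    stack = list(changed.children)
    while stack:
        node = stack.pop()
        node.needs_review = True
        flagged.append(node.name)
        stack.extend(node.children)
    return flagged


# Illustrative tree: the root hypothesis collapses, so its subtree is flagged.
leaf_a = Node("store tasks locally")
leaf_b = Node("sync tasks across devices")
mid = Node("manage tasks", children=[leaf_a, leaf_b])
root = Node("stay on top of daily work", children=[mid])
other = Node("unrelated branch")

flagged = flag_subtree(root)
print(sorted(flagged))
```

Only descendants are flagged. The changed node itself and unrelated branches are left alone, which is what keeps the review scope smaller than "re-check everything".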

Duplication Detection and Unification Judgment

When similar nodes appear in different branches, LLMs can propose "these two nodes have matching Why/What/How — unify them?" Conversely, they can warn "the How is the same but Why differs — are you sure you want to unify?"

Confidence-Linked Formal Verification Integration

In the GORE field, methods are established for defining goals in temporal logic, describing system state transition models, and having model checkers explore all paths to detect gaps. The biggest cost of this approach was humans enumerating state variables and operations.

If LLMs can auto-generate state transition models from Sprouted's natural language nodes, partial formal verification linked with the confidence mechanism becomes possible.

Specifically, for nodes with high confidence that are lock candidates, LLMs extract state variables and operations from the Why/What/How descriptions and pass them to a model checker. The model checker reports "this path can't reach What", and humans decide whether to modify How or accept it as a risk.

Running this verification when confidence is low is pointless (it'll change anyway). The right time is when confidence rises and you're asking "should we lock this?" — formal verification as a final check before locking. This preserves Sprouted's hypothesis management philosophy while gaining formal guarantees for solidified parts.

However, formal verification only guarantees "completeness within the defined state space" — gaps in the world not included in the model are undetectable. Sprouted's "everything is a hypothesis" stance assumes this limitation, positioning formal verification not as a replacement for hypothesis management but as reinforcement.
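
The timing rule above (verify only when a node is about to be locked) can be sketched as a simple gate. Everything here is an assumption for illustration: the 0.0 to 1.0 confidence scale, the `LOCK_THRESHOLD` value, and the `check` callback, which merely stands in for a real model checker run.

```python
from dataclasses import dataclass
from typing import Callable

LOCK_THRESHOLD = 0.8  # assumed cutoff; the article does not define a numeric scale


@dataclass
class Node:
    name: str
    confidence: float  # assumed scale: 0.0 (pure guess) to 1.0 (verified)
    locked: bool = False


def lock_candidates(nodes: list[Node], threshold: float = LOCK_THRESHOLD) -> list[Node]:
    """Return unlocked nodes whose confidence is high enough to be worth verifying."""
    return [n for n in nodes if not n.locked and n.confidence >= threshold]


def try_lock(node: Node, check: Callable[[Node], bool]) -> bool:
    """Lock the node only if the supplied check passes; otherwise leave it open."""
    if check(node):
        node.locked = True
    return node.locked


nodes = [
    Node("tasks persist across restarts", confidence=0.9),
    Node("tasks sync in real time", confidence=0.4),
]
candidates = lock_candidates(nodes)
# Stand-in for a model checker; it always passes in this example.
results = [try_lock(n, lambda node: True) for n in candidates]
print([n.name for n in nodes if n.locked])
```

Low-confidence nodes never reach the checker, so verification effort is spent only on parts of the tree that are about to solidify.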

Connection to Existing SDD Tools

When Sprouted's leaf nodes become sufficiently concrete, they can be directly passed to existing SDD tools or AI coding agents. LLMs can also assist in determining "has this been decomposed enough?"

Sprouted sits at the top of the toolchain, managing from "why build it" to "what specifically to build" as a hypothesis tree. "How to turn it into code" is delegated to existing SDD tools.

Conclusion

Sprouted's essence is the philosophy of managing all software development decisions as a "hypothesis tree".

Requirements, specs, designs, and code are all hypotheses. The only difference is a gradient of abstraction and changeability. With this recognition, there's no need to give special treatment to any particular layer, and the same operations (decomposition, tracing, change impact analysis, hypothesis verification) can be applied to all layers.

The framework name "Sprouted" comes from the metaphor that "hypotheses sprout from the seed of Why and grow into a tree". If the seed is good, the tree grows healthy; if the seed (Why) is wrong, no amount of careful spec-writing is meaningful.

SDD tools made "spec → code" transformation efficient. Sprouted structurally manages why that spec exists, whether it's really correct, and what's affected when it changes. The two aren't opposed — combining them enables a more robust development process.

All development begins with a single seed — a "Why".

Appendix: Sprouted Rules and Checklists

This framework's rules are defined in three levels following RFC 2119:

MUST: Without this, the framework doesn't function
SHOULD: Improves quality, but may be omitted depending on circumstances
MAY: Beneficial if done, but no problem if not

Rules for Node Structure

MUST: Each node has Why / What / How. This is Sprouted's minimum unit. If any one is missing, the node is incomplete and the basis for decisions or means can't be traced.

MUST: What is derived from the node's own Why motivation. A What without motivation can't explain "why is this needed?" "What to build" must be preceded by "why is it needed?"

SHOULD: Write both motivation and constraints in Why. Consciously separating motivation (from parent's What) and constraints (from parent's How) improves Why quality and keeps What and How choices on track. However, strictly separating them in writing isn't mandatory.

SHOULD: Don't make Why's motivation a rephrasing of the parent's How. Not "to make the TODO app work" but "users regularly create, complete, and modify tasks". Rephrasing the parent's How has no information value and can't be used to derive What.

SHOULD: Explicitly state constraints in Layer 1's Why too. Business constraints, market conditions, resource limitations, etc. Unstated premises can't even be recognized when they collapse.

MAY: Unify the description format for Why / What / How. Free-form natural language is fine, but establishing templates when used by a team reduces ambiguity.
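
As one possible reading of the node-structure rules above, a node can be sketched as a small record with a completeness check. The class and field names are hypothetical, and the What and How strings are made up for illustration. Only the motivation text comes from the TODO example in the rules.

```python
from dataclasses import dataclass, field


@dataclass
class Why:
    motivation: str                                       # derived from the parent's What
    constraints: list[str] = field(default_factory=list)  # derived from the parent's How


@dataclass
class Node:
    why: Why
    what: str
    how: str

    def missing_parts(self) -> list[str]:
        """Return the names of the Why / What / How parts that are empty."""
        parts = {"why": self.why.motivation, "what": self.what, "how": self.how}
        return [name for name, text in parts.items() if not text.strip()]


complete = Node(
    why=Why("users regularly create, complete, and modify tasks",
            constraints=["runs in a browser"]),
    what="edit a task list with few clicks",
    how="single-page list with inline editing",
)
draft = Node(why=Why("users forget deadlines"), what="remind users before a deadline", how="")

print(complete.missing_parts())
print(draft.missing_parts())
```

An empty constraints list is allowed here because writing constraints is a SHOULD, while a missing Why motivation, What, or How makes the node incomplete under the MUST rule.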

Rules for Hierarchy and Decomposition

MUST: Parent's What generates child's Why motivation, parent's How generates child's Why constraints. This is the foundation of Sprouted's recursive structure. If this connection breaks, inter-layer traceability is lost.

SHOULD: Don't pack multiple concerns into one node. Separate things with different Whys into different nodes. For example, "quick memo-taking" and "searching and organizing memos" come from different Whys and are different trees — they shouldn't be combined in one node.

SHOULD: Match decomposition depth to problem size. A small bug fix doesn't need a tree 3-4 layers deep. Conversely, ending a large feature development at 1 layer risks oversight.

MAY: Delegate to existing SDD tools or coding agents when leaf nodes are sufficiently concrete. Sprouted manages "why to build" and "what to build"; "how to turn it into code" can be left to existing tools.
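
The recursive connection in the first rule of this section can be sketched like this. The `Node` fields and the `sprout` and `motivation_chain` helpers are hypothetical names. Copying the parent's What and How verbatim into the child's Why is a simplification; in practice the child's motivation would be worded in its own terms.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    motivation: str          # Why: motivation
    constraints: list[str]   # Why: constraints
    what: str
    how: str
    parent: Optional["Node"] = None

    def sprout(self, what: str, how: str) -> "Node":
        """Create a child: its motivation comes from this What, its constraints from this How."""
        return Node(motivation=self.what, constraints=[self.how],
                    what=what, how=how, parent=self)

    def motivation_chain(self) -> list[str]:
        """Walk up to the root and return the motivations from root to this node."""
        chain: list[str] = []
        node: Optional[Node] = self
        while node is not None:
            chain.append(node.motivation)
            node = node.parent
        return list(reversed(chain))


root = Node(
    motivation="users regularly create, complete, and modify tasks",
    constraints=["solo developer"],
    what="manage a personal task list",
    how="single-user web app with local storage",
)
child = root.sprout(what="tasks survive a browser restart", how="persist tasks on every change")
print(child.motivation_chain())
```

Because each child keeps a link to its parent, the chain from any leaf back to the Layer 1 Why can always be rebuilt, which is the traceability the MUST rule protects.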

Rules for Options and Decision-Making

MUST: Consider multiple How candidates for each What. Don't jump to the first means that comes to mind — line up options and choose. Even if there's only one candidate, at least consider "is there another option?"

SHOULD: Record unchosen options and their reasons. This makes reconsideration easier when premises change later. Without reasons, you can't trace why that means was chosen.

SHOULD: When multiple Whats exist in parallel, explicitly note whether they're all required (AND) or alternatives (OR). Dropping one AND-related What means the parent Why can't be achieved, while OR-related ones have alternatives. Without this distinction during change impact analysis, it's hard to judge "is it OK to give up this What?"

MAY: List multiple What candidates before choosing. Considering "are there other angles?" at the What level too, not just How, can lead to more fundamental solutions.
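
The AND/OR rule lends itself to a small check for "is it OK to give up this What?". This is a sketch under assumptions: `Relation`, `WhatGroup`, and `can_drop` are hypothetical names, and the example Whats are made up.

```python
from dataclasses import dataclass, field
from enum import Enum


class Relation(Enum):
    AND = "all required"
    OR = "alternatives"


@dataclass
class WhatGroup:
    """Sibling Whats under one parent Why."""
    relation: Relation
    whats: list[str] = field(default_factory=list)

    def can_drop(self, what: str) -> bool:
        """True if the parent Why can still be achieved after dropping `what`."""
        if what not in self.whats:
            raise ValueError(f"unknown What: {what}")
        if self.relation is Relation.AND:
            return False  # every AND-related What is needed
        return len(self.whats) > 1  # an OR group needs at least one survivor


required = WhatGroup(Relation.AND, ["create tasks", "complete tasks"])
alternatives = WhatGroup(Relation.OR, ["email reminders", "push notifications"])

print(required.can_drop("create tasks"))
print(alternatives.can_drop("email reminders"))
```

Without the recorded relation, both groups would look identical in the tree and the drop decision would have to be reasoned out again each time.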

Rules for Hypothesis Management

MUST: Treat all nodes as hypotheses. Nothing is a settled fact. Requirements, specs, and code all exist within a gradient of confidence.

SHOULD: Give each node a confidence level. Explicitly stating "how likely is this to be correct?" makes change decisions easier. Consider locking high-confidence nodes; actively verify low-confidence ones.

SHOULD: When a parent node changes, check the affected subtree. Ideally this would be automatic, but even manually maintaining the habit of "parent changed, so review children" improves quality.

SHOULD: For important nodes, describe risks that could collapse the hypothesis. Thinking ahead about "what could disprove this hypothesis?" speeds up response when it does collapse. Not needed for every node, but especially effective for low-confidence or high-impact nodes.

MAY: Record hypothesis verification results in nodes. States like "verified", "unverified", "refuted" make the health of the entire tree visible.

MAY: Describe non-functional requirements with Why / What / How too. Non-functional requirements like performance and security can be handled with the same structure (e.g., Why "users leave if it's slow" → What "fast response" → How "target 200ms with cache"). If achievement criteria are unclear, set the confidence level lower.
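
Confidence levels and verification states from the rules above could be tracked with something as small as the following. The names, the 0.0 to 1.0 scale, and the 0.5 limit are assumptions. Only the three states "verified", "unverified", and "refuted" come from the rules.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    UNVERIFIED = "unverified"
    VERIFIED = "verified"
    REFUTED = "refuted"


@dataclass
class Hypothesis:
    name: str
    confidence: float  # assumed scale: 0.0 to 1.0
    status: Status = Status.UNVERIFIED


def tree_health(nodes: list[Hypothesis]) -> dict[str, int]:
    """Count nodes per verification state, including states with zero nodes."""
    counts = Counter(n.status for n in nodes)
    return {s.value: counts[s] for s in Status}


def verify_next(nodes: list[Hypothesis], limit: float = 0.5) -> list[str]:
    """Names of unverified nodes below `limit`, least confident first."""
    weak = [n for n in nodes if n.status is Status.UNVERIFIED and n.confidence < limit]
    return [n.name for n in sorted(weak, key=lambda n: n.confidence)]


nodes = [
    Hypothesis("users leave if it's slow", 0.7, Status.VERIFIED),
    Hypothesis("a cache is enough to hit 200ms", 0.3),
    Hypothesis("users want offline access", 0.4),
]
print(tree_health(nodes))
print(verify_next(nodes))
```

Sorting by confidence puts the shakiest hypotheses first, matching the advice to actively verify low-confidence nodes.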

Rules for Unification

MUST: Only reuse when Why / What / How all match. If even one differs, manage as separate nodes.

MUST: Don't unify things with different Whys. Even if How looks the same, different Whys mean different trees. Unifying means when one Why changes, the other gets dragged along.

SHOULD: When tempted to unify, first check the match level of Why / What / How. "Similar" and "same" are different. Build the habit of checking structure before unifying.

MAY: If unification isn't possible, decompose How further to partially unify. Finer granularity can sometimes separate the truly common parts from the unique parts.
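
The unification rules reduce to a three-way comparison, sketched below. Exact string equality is only a stand-in for "matches": in practice a human or an LLM would judge whether two descriptions mean the same thing. The `Node` and `unification_advice` names and the example texts are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    why: str
    what: str
    how: str


def unification_advice(a: Node, b: Node) -> str:
    """Apply the rule: unify only when Why, What, and How all match."""
    if a == b:
        return "unify"
    if a.how == b.how and a.why != b.why:
        return "keep separate: same How, different Why"
    return "keep separate"


memo_search = Node("find old memos quickly", "search memos by keyword", "full-text index")
task_search = Node("find overdue tasks quickly", "search tasks by keyword", "full-text index")
memo_search_copy = Node("find old memos quickly", "search memos by keyword", "full-text index")

print(unification_advice(memo_search, task_search))
print(unification_advice(memo_search, memo_search_copy))
```

The middle branch is the case the MUST rule warns about: the means looks reusable, but a change to either Why would drag the other node along.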

Checklist: When Creating a Node

Items to check when creating a new node:

☐ Is Why written? A node without motivation has unclear reason to exist
☐ Is Why's motivation derived from the parent's What, not a rephrasing of parent's How?
☐ Are constraints explicitly stated? Especially easy to forget at Layer 1
☐ Does What naturally follow from the motivation? Any logical jumps?
☐ When multiple Whats exist, is it noted whether they're AND (all required) or OR (alternatives)?
☐ Were multiple How candidates considered? At least two?
☐ Are unchosen candidates and reasons recorded?
☐ Does the node mix multiple Whys? If so, split it
☐ What's the confidence level? Verified, or unverified hypothesis?
☐ For important nodes, have risks that could collapse the hypothesis been noted?

Checklist: When Changing a Node

Items to check when modifying an existing node:

☐ What changed? Why (motivation/constraints) / What / How — which one?
☐ Has the impact on child nodes been checked? If motivation changed, What is affected; if constraints changed, How is affected
☐ Need to review the entire subtree, or just part of it?
☐ Has the switch-or-stay decision been made considering sunk costs?
☐ If unified nodes exist, what's the impact on other usage points?
☐ Is the reason for the change recorded? So "why was this changed?" can be traced later

Repository (Added 2026/03/20)

The Sprouted framework definition (SKILL.md), AI assistant system prompt, new project templates, and Sprouted's own hypothesis tree (working example) are publicly available.

👉 gitlab.com/akapersonal/sprouted