Depth versus breadth in hierarchies

Every hierarchy forces you to choose: drill down or spread out

You have a bookshelf with a hundred books. You could organize it into three categories (Fiction, Nonfiction, Reference) and then subdivide Fiction into Literary, Genre, Short Stories, then Genre into Science Fiction, Fantasy, Mystery, Thriller, then Science Fiction into Hard SF, Space Opera, Cyberpunk. That is depth. Four levels of categories before you reach a single book.

Or you could organize that same shelf into twenty categories, all at the same level: Science Fiction, Fantasy, Mystery, Biography, History, Philosophy, Psychology, Economics, Programming, Design, Management, Poetry, Essays, Travel, Cooking, Health, Reference, Humor, Graphic Novels, Unread. That is breadth. One level and you are already looking at books.

These are not equivalent layouts with different aesthetics. They are different strategies that impose different costs, reward different behaviors, and break down in different ways. The deep bookshelf makes it trivial to find the right neighborhood once you know the path, but impossible to browse casually. The broad bookshelf makes it trivial to scan everything at a glance, but impossible to make fine distinctions within any single area.

This is the structural tradeoff at the center of every hierarchy you will ever build, navigate, or think inside of.

The algorithmic skeleton: DFS versus BFS

Computer science formalized this tradeoff decades ago with two fundamental tree-traversal algorithms: depth-first search (DFS) and breadth-first search (BFS).

DFS picks a branch and follows it as far as it will go before backtracking and trying the next branch. It uses a stack — last in, first out. It requires very little memory because it only needs to remember the current path, not every node at the current level. But it can get lost. In an infinite or very deep tree, DFS may descend forever along one branch, never finding a solution that sits two levels down a different branch.

BFS examines every node at the current level before descending to the next. It uses a queue — first in, first out. It guarantees finding the shallowest solution. But it is expensive. At every level, BFS must hold all nodes in memory before proceeding. In a tree with a branching factor of 10, level three already contains a thousand nodes.
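The memory gap between the two strategies can be put in numbers. The sketch below is a back-of-the-envelope model, not a measurement: it assumes a complete tree in which every node has the same number of children, counts the root as level zero, and assumes a stack-based DFS that pushes every child of the node it expands. The function names are made up for this illustration.

```python
def bfs_frontier(branching, level):
    """Nodes BFS must hold when it reaches `level` of a complete tree (root = level 0)."""
    return branching ** level


def dfs_stack(branching, level):
    """Peak stack size for DFS descending to `level`.

    Each expansion pushes `branching` children and pops one, so every level
    on the current path leaves (branching - 1) siblings waiting on the stack.
    """
    return level * (branching - 1) + 1


for level in range(1, 5):
    print(level, bfs_frontier(10, level), dfs_stack(10, level))
```

At level three with a branching factor of 10, this model gives 1,000 nodes for BFS against 28 for DFS. BFS memory grows exponentially with depth, while DFS memory grows linearly.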

The tradeoffs are precise:

DFS is cheap per step but risky per outcome. You go deep fast, but you might be going deep in the wrong branch.
BFS is expensive per step but safe per outcome. You never miss a shallow solution, but you pay a memory and time tax at every level.
DFS finds a path. BFS finds the shortest path, measured in steps from the starting point. If you need any solution quickly and can accept one that is not optimal, go deep. If you need the best available solution, go wide first.
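A minimal sketch of the two traversals makes the difference concrete. The tree, the node names, and the `search` helper are invented for illustration. The only thing that changes between strategies is which end of the frontier the next path is taken from.

```python
from collections import deque

# Hypothetical tree for illustration: node -> children, left to right.
TREE = {
    "root": ["A", "B"],
    "A": ["A1", "A2"],
    "A1": ["A1a"],
    "A1a": ["goal_deep"],
    "A2": [],
    "B": ["goal_shallow"],
    "goal_deep": [],
    "goal_shallow": [],
}


def search(tree, start, is_goal, strategy):
    """Return the path to the first goal found, or None.

    strategy="dfs" treats the frontier as a stack (last in, first out);
    strategy="bfs" treats it as a queue (first in, first out).
    No visited set is kept because the input is assumed to be a tree.
    """
    frontier = deque([[start]])  # each entry is the full path to a node
    while frontier:
        path = frontier.pop() if strategy == "dfs" else frontier.popleft()
        node = path[-1]
        if is_goal(node):
            return path
        children = tree[node]
        if strategy == "dfs":
            # Push in reverse so the leftmost child is popped first.
            children = reversed(children)
        for child in children:
            frontier.append(path + [child])
    return None


def is_goal(node):
    return node.startswith("goal")


dfs_path = search(TREE, "root", is_goal, "dfs")
bfs_path = search(TREE, "root", is_goal, "bfs")
print(dfs_path)  # ['root', 'A', 'A1', 'A1a', 'goal_deep']
print(bfs_path)  # ['root', 'B', 'goal_shallow']
```

DFS commits to the first branch and returns the goal four steps down. BFS returns the goal two steps down, but only after examining every node at levels one and two.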

These are not just algorithmic properties. They describe the structural cost of organizing and navigating any hierarchy — your file system, your knowledge base, your understanding of a field, or your company.

The same tradeoff in how you build expertise

The algorithmic logic maps directly onto how you develop knowledge and skill. Going deep means pursuing resolution within a narrow domain. Going wide means pursuing coverage across many domains. Both have real costs.

James March's landmark 1991 paper "Exploration and Exploitation in Organizational Learning" formalized this as the central tension in any adaptive system. Exploitation means refining and deepening what you already know — it produces reliable, immediate returns. Exploration means searching broadly for new possibilities — it produces uncertain, delayed returns. March demonstrated that adaptive systems tend to exploit faster than they explore, which makes them effective in the short run and brittle in the long run. Organizations (and individuals) that over-exploit get stuck in local optima. They get very good at the wrong thing.

David Epstein's synthesis of the research in Range (2019) brought this into the domain of human expertise. Epstein documents that in "wicked" learning environments — where the rules change, feedback is delayed, and patterns do not repeat cleanly — breadth consistently outperforms depth. Nobel laureates are more likely than average scientists to have serious hobbies outside their field. LinkedIn's analysis of half a million profiles found that one of the strongest predictors of reaching an executive role was the number of different job functions a person had worked across, not how long they spent in any single function. In environments of genuine complexity, the person with range sees patterns that the specialist cannot, because the specialist's depth is also a kind of tunnel.

But Epstein is careful: in "kind" learning environments, where rules are stable, feedback is immediate, and patterns repeat, depth wins decisively. Chess, classical music, specific surgical procedures. When the structure of the problem is fixed and well-defined, going deep early produces mastery faster. The question is never "depth or breadth?" in the abstract. The question is "what kind of problem am I organizing knowledge for?"

The T-shaped skill model, used internally at McKinsey since the 1980s for recruiting and developing consultants, encodes this tradeoff explicitly: broad general capability across many domains (the horizontal bar) combined with deep expertise in one or two (the vertical stroke). The shape itself is a statement about hierarchy — a wide, shallow layer of coverage supporting a narrow, deep channel of resolution.

Information foraging: how depth and breadth shape navigation

Peter Pirolli and Stuart Card's information foraging theory, developed at Xerox PARC in the late 1990s, studied how people navigate hierarchical information structures — particularly websites and knowledge bases. Their model draws on optimal foraging theory from ecology: just as an animal decides where to hunt based on the density of prey relative to the cost of travel, a person decides where to click based on "information scent" — the degree to which local cues predict the value of deeper content.

The research points to a concrete consequence for depth versus breadth in navigation: as a hierarchical structure gets deeper, the cost of a wrong turn grows with it. At depth level one, a wrong click costs you one backtrack. At depth level five, a wrong click might cost you five backtracks, and the cognitive load of maintaining your position in the tree accumulates at every level.

This is why usability research consistently recommends broader, shallower navigation structures. Katz and Byrne (2003) found that increasing either information scent or breadth significantly increased users' ability to find what they were looking for. The practical recommendation: make your hierarchy broader (more choices at each level) and less deep (fewer levels to traverse) — unless the user already knows the exact path.

That "unless" is critical. Experts navigating a familiar domain perform better with deep structures because they already know the path. The depth does not confuse them — it provides resolution they need. Novices navigating an unfamiliar domain perform better with broad structures because they need to survey their options before committing. The breadth does not overwhelm them — it provides orientation they lack.

Depth rewards those who already know where they are going. Breadth rewards those who are still figuring out where to go.
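The arithmetic behind broad versus deep navigation can be sketched with a toy model. It assumes a uniform hierarchy in which every level offers the same number of choices and the final choice selects the item itself. `levels_needed` is a hypothetical helper, and "options scanned" is a rough stand-in for reading effort, not a measured quantity.

```python
def levels_needed(items, breadth):
    """Smallest number of levels such that breadth ** levels >= items."""
    levels, capacity = 0, 1
    while capacity < items:
        capacity *= breadth
        levels += 1
    return levels


# One hundred items, as on the bookshelf, arranged at different breadths.
for breadth in (3, 10, 20, 100):
    clicks = levels_needed(100, breadth)
    scanned = breadth * clicks  # labels read along one correct path, worst case
    print(f"breadth={breadth:3d}  clicks={clicks}  options scanned={scanned}")
```

Under these assumptions, a breadth of 3 needs five clicks but shows only 15 labels along the way, while a breadth of 100 needs one click and shows all 100 at once. Every click is also a chance for a wrong turn, which is the cost the deep layout adds.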

Organizational hierarchy: span versus layers

The same tradeoff governs how organizations structure authority and communication. A "tall" organization has many hierarchical layers — small span of control (few direct reports per manager), but long chains of command. A "flat" organization has few layers — wide span of control (many direct reports), but short communication paths.
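The relationship between span and layers is geometric, and a small sketch shows how quickly it moves. It assumes a perfectly uniform organization in which every manager has exactly the same number of direct reports, which no real company does. `management_layers` and the headcount of 10,000 are illustrative choices.

```python
def management_layers(headcount, span):
    """Layers below the top person needed to cover `headcount` people,
    assuming every manager has exactly `span` direct reports."""
    layers, covered, level_size = 0, 1, 1
    while covered < headcount:
        level_size *= span      # people at the next layer down
        covered += level_size   # running total including all layers so far
        layers += 1
    return layers


for span in (5, 10, 20):
    print(f"span={span:2d}  layers={management_layers(10_000, span)}")
```

In this model an organization of 10,000 people needs six layers below the top at a span of 5, and four layers at a span of 10 or 20. The layer count is the length of the chain a front-line signal has to climb.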

Tall structures (deep hierarchies) provide clear specialization. Each layer has a defined scope, and the narrowing at each level means decisions get progressively more specific. But information degrades as it travels through layers. A signal from a front-line employee passes through five managers before reaching the executive who could act on it. By then the signal is distorted, delayed, or dead.

Flat structures (broad hierarchies) provide fast communication. Fewer layers mean fewer translation steps between the person who sees the problem and the person who can solve it. But each manager is responsible for more direct reports, which means less oversight, less specialization, and more cognitive load at every node. Studies of large U.S. firms report that the average CEO's span of control roughly doubled, from about five direct reports in the mid-1980s to about ten by the mid-2000s, a substantial flattening driven in part by the speed demands of digital business.

Research from Wharton examined whether flatter organizations are more innovative. The answer is nuanced: flatter structures surface more ideas (breadth of input), but deeper structures are better at executing on specific ideas (depth of focus). The organizations that performed best at innovation were not purely flat or purely deep — they were selectively deep, maintaining shallow structures for discovery and deep structures for implementation.

This is the pattern. In every domain — algorithms, expertise, navigation, organizations — the answer is not "depth is better" or "breadth is better." The answer is that depth and breadth are different tools with different cost profiles, and the structure you choose should match the function you need.

The cognitive cost: why your brain struggles with both

Working memory sets a hard constraint on both strategies. George Miller's 1956 estimate of "seven plus or minus two" items has been refined by Nelson Cowan's research down to approximately three to five active items. This matters differently for depth and breadth.

Deep hierarchies tax your sequential memory. At level five of a nested structure, you need to hold the path you took (where you are relative to the root), the context at each level (why this branch and not another), and the goal you are pursuing (what you are looking for). That is at least three active items just to maintain orientation, leaving almost no working memory for actually evaluating what you find.

Broad hierarchies tax your parallel comparison capacity. When you face twenty options at the same level, you need to compare them against each other and against your criteria. Sheena Iyengar's research on choice overload, best known for the jam study, demonstrated that more options do not always produce better decisions. Participants presented with 24 jam varieties were one-tenth as likely to purchase as those presented with 6. The cost of breadth is choice paralysis: when every option is visible and none is obviously wrong, the system stalls.

The practical implication: your cognitive architecture has a sweet spot between depth and breadth. Hierarchies that are too deep exhaust your ability to hold position. Hierarchies that are too wide exhaust your ability to choose. Chunking — Miller's actual contribution — is the mechanism that bridges the two: by grouping items into meaningful clusters, you convert breadth into navigable depth. A list of twenty items is overwhelming. Four groups of five items, each group labeled, is manageable. You have converted a flat, wide structure into a shallow, grouped one — and your working memory can handle it.
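The mechanical half of chunking, turning one flat list into a few short groups, can be sketched in a couple of lines. This version groups by position only. Real chunking groups by meaning and gives each group a label, which no generic function can do for you. The `chunk` helper and the placeholder item names are assumptions for illustration.

```python
def chunk(items, size):
    """Split a flat list into consecutive groups of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


flat = [f"item {n}" for n in range(1, 21)]  # twenty items at one level
groups = chunk(flat, 5)                     # four groups of five

print(len(flat), "options at one level")
print(len(groups), "groups, then", len(groups[0]), "options inside each")
```

The total number of items is unchanged. What changes is that no single decision asks you to compare more than five things.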

What AI changes about this tradeoff

When you use AI as a thinking partner, the depth-versus-breadth constraint shifts. Within the limits of its context window, an LLM can hold a deep hierarchy in view without the working memory degradation that limits your biological cognition. You can ask it to traverse ten levels of nested categories, recall where you are in the structure, and surface connections between branches. All of these operations would overwhelm unaided human working memory.

This means AI amplifies your ability to work with deeper structures without losing orientation. You can build knowledge hierarchies that go deeper than you could navigate alone, because the AI serves as an external position-tracker and branch-comparer. But the tradeoff does not disappear — it shifts. The new bottleneck becomes your ability to verify what the AI surfaces. Depth without comprehension is not depth at all. It is the illusion of resolution masking the absence of understanding.

The practical move: use AI to explore broadly (BFS-style) when you are in discovery mode, letting it surface options across many branches you would not have time to check yourself. Then use AI to exploit deeply (DFS-style) when you have identified a promising branch, letting it trace implications and connections further than your working memory allows. Match the traversal strategy to your current phase: exploring or exploiting.

Protocol: choosing your traversal strategy

When you are building, navigating, or restructuring any hierarchy — a knowledge base, a project plan, a mental model, an organizational structure — ask three questions:

1. What is the function of this structure? If the function is retrieval (finding a specific thing you already know exists), depth serves you — it narrows to exactly what you need. If the function is discovery (surveying what exists to decide what to pursue), breadth serves you — it shows you the landscape.

2. Who is navigating this structure? If the navigator is an expert with a clear goal, depth is efficient — they know the path. If the navigator is a novice or someone exploring, breadth is forgiving — they can see options without committing to a branch.

3. What is the cost of a wrong turn? In deep structures, a wrong turn at level two means wasted traversal at levels three, four, and five. The cost compounds with depth. In broad structures, a wrong choice at the top level is corrected immediately — you just pick a different option. If wrong turns are expensive (irreversible decisions, long implementation cycles), prefer breadth at the top to reduce early errors. If wrong turns are cheap (easily reversed, low stakes), prefer depth to gain resolution faster.

These three questions convert an unconscious structural preference into a deliberate design choice.
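The three questions can be written down as a small lookup, which is useful mainly as a checklist. The answer labels below ("retrieval", "expert", "cheap", and so on) are simplifications of the prose. Combining the answers by majority vote is an assumption of this sketch, since the protocol itself does not say how to weigh them against each other.

```python
# Each question maps an answer to the structure it favors.
RULES = {
    "function": {"retrieval": "depth", "discovery": "breadth"},
    "navigator": {"expert": "depth", "novice": "breadth"},
    "wrong_turn_cost": {"cheap": "depth", "expensive": "breadth"},
}


def recommend(function, navigator, wrong_turn_cost):
    """Return (recommendation, votes) for one hierarchy.

    Majority vote is an illustrative choice; with three questions
    there is never a tie. Unknown answers raise KeyError.
    """
    answers = {
        "function": function,
        "navigator": navigator,
        "wrong_turn_cost": wrong_turn_cost,
    }
    votes = [RULES[question][answer] for question, answer in answers.items()]
    return max(("depth", "breadth"), key=votes.count), votes


print(recommend("retrieval", "expert", "cheap"))
print(recommend("discovery", "novice", "expensive"))
print(recommend("retrieval", "novice", "expensive"))
```

The third call is the interesting case: the function favors depth, but the navigator and the cost of error both favor breadth. The vote list shows where the tension is, which matters more than the single-word answer.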

Bridge: what happens at the bottom

Whether you go deep or wide, every hierarchy eventually terminates. The branches end. You reach the items that have no children — the leaf nodes.

This is not a minor detail. The entire purpose of a hierarchy is to organize access to its terminal elements. Categories, subcategories, and nested layers are navigation infrastructure. The actual value — the actions, the specific knowledge, the concrete decisions — lives at the leaves.
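A short sketch shows the split between navigation infrastructure and terminal elements. It assumes a hierarchy stored as nested dictionaries, where a dictionary is a category and a list holds the items filed under it. The shelf contents and the `leaves` helper are invented for illustration.

```python
# Hypothetical shelf: dicts are categories, lists hold the actual items.
SHELF = {
    "Fiction": {
        "Genre": {
            "Science Fiction": ["book A", "book B"],
            "Mystery": ["book C"],
        },
        "Short Stories": ["book D"],
    },
    "Reference": ["book E"],
}


def leaves(node, path=()):
    """Yield (path, item) for every terminal element under `node`."""
    if isinstance(node, dict):
        # A category: nothing to act on here, only a route to follow.
        for name, child in node.items():
            yield from leaves(child, path + (name,))
    else:
        # A list of items: these are the leaves.
        for item in node:
            yield path, item


for path, item in leaves(SHELF):
    print(" > ".join(path), "->", item)
```

Every category name appears only as part of a path. The five books are the only things the traversal actually returns.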

In L-0267, you will examine why leaf nodes are where action happens: why the most concrete level of any hierarchy is where implementation occurs, and why confusing a structural layer with an action layer is one of the most common mistakes in both personal knowledge management and organizational design.

Sources

March, J. G. (1991). "Exploration and Exploitation in Organizational Learning." Organization Science, 2(1), 71-87.
Epstein, D. (2019). Range: Why Generalists Triumph in a Specialized World. Riverhead Books.
Pirolli, P. & Card, S. (1999). "Information Foraging." Psychological Review, 106(4), 643-675.
Cowan, N. (2001). "The Magical Number 4 in Short-Term Memory: A Reconsideration of Mental Storage Capacity." Behavioral and Brain Sciences, 24(1), 87-114.
Miller, G. A. (1956). "The Magical Number Seven, Plus or Minus Two: Some Limits on Our Capacity for Processing Information." Psychological Review, 63(2), 81-97.
Katz, M. A. & Byrne, M. D. (2003). "Effects of Scent and Breadth on Use of Site-Specific Search on E-Commerce Web Sites." ACM Transactions on Computer-Human Interaction, 10(3), 198-220.
Shapira, Z. & Tucci, C. (2020). "Are Flatter Organizations More Innovative? Hierarchical Depth and the Importance of Ideas." Wharton Mack Institute Working Paper.
McKinsey & Company. "Ops 4.0 — The Human Factor: A Class Size of 1." McKinsey Operations Blog.