Duplication signals missing abstraction

The same insight keeps showing up in different clothes

You open your notes and find a passage about how great leaders stay composed during conflict. A few folders away, there is a note about how experienced therapists maintain neutrality with distressed clients. In a third location, you wrote about how elite athletes perform under pressure by separating sensation from reaction.

Three notes. Three domains. One idea.

The idea is stimulus-response decoupling: the capacity to insert a gap between what happens to you and what you do about it. But because you encountered it in different contexts — leadership, therapy, sport — you recorded it three times, in three different vocabularies, without noticing you were saying the same thing.

This is duplication. And duplication is never just a storage problem. It is a signal that you have not yet named the pattern your repeated instances share.

The DRY principle: knowledge, not just code

Andy Hunt and Dave Thomas formalized this insight in The Pragmatic Programmer (1999) as the DRY principle — Don't Repeat Yourself. Their original formulation is more precise than most people realize: "Every piece of knowledge must have a single, unambiguous, authoritative representation within a system." Note the word knowledge, not "code." Hunt and Thomas were explicit that DRY extends to documentation, database schemas, build systems, test plans — any place where a piece of knowledge could be stated twice and therefore drift into contradiction.

Dave Thomas later said that DRY is "probably one of the most misunderstood parts of the book," because developers narrowed it to code-level deduplication. But the principle operates at the level of what you know and claim, not at the level of syntax. Two functions with identical code might represent genuinely different concepts that happen to share an implementation today. Two paragraphs with different words might express identical claims that should live in one place.
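
A minimal sketch can make the distinction concrete. Everything in it is an assumption made for illustration (the function names, the 20 percent rate, the age thresholds); none of it comes from Hunt and Thomas. The first pair of functions states one piece of knowledge twice. The last pair shares code today but encodes two independent rules.

```python
# Hypothetical example: all names and values are illustrative assumptions.

# One piece of knowledge (the tax rate) stated twice. If the rate changes,
# one copy can be updated while the other is forgotten.
def invoice_total_duplicated(net_cents: int) -> int:
    return net_cents * 120 // 100

def quote_total_duplicated(net_cents: int) -> int:
    return net_cents * 120 // 100

# The same knowledge with a single authoritative representation.
VAT_PERCENT = 20

def with_vat(net_cents: int) -> int:
    """Add VAT to a net amount expressed in integer cents."""
    return net_cents * (100 + VAT_PERCENT) // 100

def invoice_total(net_cents: int) -> int:
    return with_vat(net_cents)

def quote_total(net_cents: int) -> int:
    return with_vat(net_cents)

# Identical code, different knowledge. These two rules happen to agree
# today but can change independently, so merging them would not be DRY.
MIN_VOTING_AGE = 18
MIN_CONTRACT_AGE = 18

def can_vote(age: int) -> bool:
    return age >= MIN_VOTING_AGE

def can_sign_contract(age: int) -> bool:
    return age >= MIN_CONTRACT_AGE
```

Collapsing the two age checks into one function would remove repeated text while coupling two facts that are free to diverge, which is the misreading Thomas describes.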

The parallel to personal knowledge systems is direct. When you capture the same insight in multiple notes without linking them to a shared abstraction, you create the epistemic equivalent of duplicated code. Each copy is a liability: when your understanding of the pattern evolves, you update one note and forget the others. Your knowledge base becomes internally inconsistent — different parts of your own thinking contradict each other silently.

How the brain detects missing abstractions

The cognitive machinery for recognizing duplication is real and well documented. Pattern recognition in human cognition is commonly described as matching incoming information against templates stored in long-term memory. When the same structure recurs across different contexts, the prefrontal cortex (a region associated with higher-order reasoning and abstract thought) is thought to help flag that something generalizable is present.

This is not a metaphor. Research in cognitive science shows that the hippocampus, essential for memory formation, enables the brain to recognize patterns based on past experiences and to anticipate future occurrences. When you read your third note about staying calm under pressure and feel a flicker of recognition — "I have said this before" — that flicker is your pattern recognition system detecting structural repetition beneath surface variation.

Eleanor Rosch's prototype theory (1973) describes how categories form in the mind. Rather than relying on strict Aristotelian definitions — a bird must have feathers, flight, and a beak — people categorize by resemblance to a central prototype. A robin is a more "bird-like" bird than a penguin. What Rosch demonstrated is that abstraction is not a top-down logical operation. It emerges bottom-up from exposure to multiple specific instances that share a family resemblance. You encounter enough examples and the category crystallizes.

This is exactly what happens when duplication signals a missing abstraction. You do not start with the abstract pattern and deduce instances. You accumulate instances — three notes about staying calm — and the pattern emerges from their overlap. The moment of recognition ("these are all the same thing") is the moment the abstraction becomes available to be named.

Mathematics: where abstraction is the entire game

No field demonstrates this dynamic more clearly than mathematics. The history of mathematical progress is largely the history of noticing that different-looking problems share identical structure — and then naming that structure.

Group theory emerged because Évariste Galois, studying polynomial equations, noticed that certain symmetry operations across different equations shared a common algebraic structure. Those concrete "groups of permutations" soon gave rise to abstract group theory: the study of any set with an operation that satisfies a few axioms (closure, associativity, an identity element, and inverses), regardless of what the elements are. The abstraction did not add new information. It named the pattern that the specific cases already shared.

Category theory takes this one level further. Introduced by Samuel Eilenberg and Saunders Mac Lane in the 1940s, category theory is sometimes called "the mathematics of mathematics." Its power lies in identifying that constructions like products, limits, and adjunctions appear across algebra, topology, logic, and computation — different fields, different objects, same structural pattern. As the Stanford Encyclopedia of Philosophy notes, "general statements about categories apply to each specific concrete category of mathematical structures." The abstraction unifies what duplication revealed.

For your knowledge system, the lesson is the same. When you see the same structural pattern appearing in notes about leadership, parenting, and negotiation, you are looking at the epistemic equivalent of a mathematical structure waiting to be abstracted. Name it. Give it one home. Let the specific instances reference the general pattern.

The Rule of Three: when to extract

Not all repetition warrants immediate abstraction. Martin Fowler, in Refactoring (1999), presents duplicated code as the first and most important "code smell," a surface indicator of a deeper structural problem. The chapter on smells, which he wrote with Kent Beck, opens its catalog with it.

But Fowler also endorsed the Rule of Three, attributed to Don Roberts: "The first time you do something, you just do it. The second time you do something similar, you wince at the duplication, but you do the duplicate thing anyway. The third time you do something similar, you refactor."

The wisdom here is about timing, not tolerance. Two instances of similar code, or two notes expressing a similar insight, may look alike for accidental reasons. They might diverge as your understanding deepens. Premature abstraction, meaning the extraction of a shared pattern before you truly understand what the instances have in common, produces vague, over-general categories that obscure more than they clarify. The heuristic, widely repeated in software engineering and usually credited to Sandi Metz, is that "duplication is far cheaper than the wrong abstraction."

But by the third instance, you have evidence. You can see what varies and what stays constant. You can design an abstraction that captures the invariant core while allowing the specific instances to retain their contextual differences. In your knowledge system, this means: the first time you write about staying calm under pressure, write it. The second time, notice the overlap. The third time, stop, extract the shared pattern into its own note, name it precisely, and link the three contexts to it.
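
The write, notice, extract sequence can be sketched as a small counter. This is only an illustration of the timing rule: the class name `PatternTracker`, the action labels, and the pattern key are hypothetical, and deciding that two notes belong to the same candidate pattern remains a human judgment that the code takes as given.

```python
from collections import defaultdict

# Hypothetical helper: the names and action labels are illustrative
# assumptions, not part of Fowler's text.
EXTRACT_AT = 3  # the "three" in the Rule of Three

class PatternTracker:
    """Counts notes that seem to express the same candidate pattern."""

    def __init__(self) -> None:
        self._sightings = defaultdict(list)  # pattern label -> note titles

    def record(self, pattern: str, note_title: str) -> str:
        """Register a sighting and return the suggested next action."""
        notes = self._sightings[pattern]
        notes.append(note_title)
        if len(notes) == 1:
            return "write"    # first time: just write it
        if len(notes) < EXTRACT_AT:
            return "notice"   # second time: note the overlap, do not merge yet
        return "extract"      # third time onward: name the pattern, link the notes

    def notes_for(self, pattern: str) -> list[str]:
        """Titles recorded so far for a pattern (empty if never seen)."""
        return list(self._sightings.get(pattern, []))


tracker = PatternTracker()
for title in ("Leaders in conflict", "Therapist neutrality", "Athletes under pressure"):
    action = tracker.record("staying calm under pressure", title)
    print(f"{title}: {action}")
```

When the third sighting returns "extract", `notes_for` gives the list of notes to link to the new canonical note.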

Single source of truth for your thinking

The database world calls this principle Single Source of Truth (SSOT): every data element should be mastered in exactly one place, with all other references pointing to that authoritative location. When this principle is violated — when the same customer's address lives in three different tables — updates to one copy leave the others stale. The system becomes internally inconsistent, and no one knows which version to trust.

Your knowledge base follows the same physics. If your understanding of "stimulus-response decoupling" is spread across three notes with no shared root, an update to your understanding — say, you learn about the neuroscience of the amygdala hijack — gets applied to whichever note you happen to open that day. The other two fall behind. Over months and years, your own thinking contradicts itself silently across different contexts, and you cannot diagnose why your notes feel less useful over time.

The fix is structural: one canonical note per concept, with contextual applications linked to it rather than duplicating it. When your understanding of the pattern evolves, you update one place. Every linked context inherits the update. This is not just tidiness. It is the difference between a knowledge base that compounds in value and one that decays through invisible inconsistency.
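
As a rough sketch of that structure, the example below keeps one canonical note and lets contextual notes point to it by id. The `Note` class, its field names, and the note ids are hypothetical and do not reflect the format of any particular note-taking tool.

```python
from dataclasses import dataclass, field

# Hypothetical note model: class, field names, and ids are illustrative
# assumptions, not the format of any real tool.
@dataclass
class Note:
    note_id: str
    body: str
    refs: list[str] = field(default_factory=list)  # ids of canonical notes

def render(note: Note, vault: dict[str, Note]) -> str:
    """Return the note's own text followed by the current text of every
    canonical note it links to."""
    parts = [note.body]
    for ref in note.refs:
        parts.append(f"[{ref}] {vault[ref].body}")
    return "\n".join(parts)

vault = {
    "decoupling": Note("decoupling", "Insert a gap between stimulus and response."),
    "leadership": Note("leadership", "Leaders stay composed in conflict.", ["decoupling"]),
    "therapy": Note("therapy", "Therapists stay neutral with distressed clients.", ["decoupling"]),
}

# Update the concept in exactly one place...
vault["decoupling"].body += " The amygdala hijack is what closes the gap."

# ...and every linked context shows the new wording when rendered.
for note_id in ("leadership", "therapy"):
    print(render(vault[note_id], vault))
    print()
```

Because the contextual notes store a reference instead of a copy, there is no second version of the concept that could fall behind.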

Your Third Brain: AI as duplication detector

This is where AI transforms the practice. The hardest part of eliminating duplication in a knowledge base is finding it. You wrote the three notes about staying calm months apart, using different vocabulary, filed in different contexts. You would need to read your entire archive to spot the overlap.

AI does not have this limitation. Vector embeddings (the numerical representations that modern language models use to encode meaning) place semantically similar text close together in a high-dimensional space, largely independent of surface vocabulary. "Leaders who maintain composure under pressure" and "athletes who separate sensation from reaction" use entirely different words, but a good embedding model will tend to place them near each other because they describe the same underlying structure.

This capability — semantic deduplication — is the knowledge management equivalent of what databases call entity resolution: determining that two records that look different actually refer to the same entity. Traditional text search finds literal matches. Embedding-based search finds conceptual matches, which is precisely what you need to detect duplication that hides behind different words.

The practical workflow: when you create a new note, run a semantic similarity search against your existing notes. If the search returns notes that express substantially the same insight in different terms, you have found duplication. Now you can make a deliberate choice — extract the shared abstraction, merge the notes, or link them as related-but-distinct. The AI does not make the structural decision. It surfaces the candidates that your own memory would miss.
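
The search step of this workflow might look like the sketch below. It assumes each note already has an embedding vector. The three-dimensional vectors here are invented by hand for illustration, where a real system would obtain much longer ones from an embedding model. The 0.85 threshold and the function name `find_candidates` are likewise assumptions to tune, not recommended values.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def find_candidates(new_vec, existing, threshold=0.85):
    """Return (title, score) pairs at or above the threshold, best first."""
    scored = [(title, cosine_similarity(new_vec, vec)) for title, vec in existing.items()]
    hits = [(title, score) for title, score in scored if score >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)

# Toy vectors, invented for illustration only. They stand in for the output
# of an embedding model, which this sketch does not call.
existing_notes = {
    "Leaders stay composed in conflict": [0.9, 0.1, 0.2],
    "Therapists maintain neutrality": [0.8, 0.2, 0.3],
    "Sourdough hydration ratios": [0.1, 0.9, 0.1],
}
new_note_vec = [0.85, 0.15, 0.25]  # stands for "Athletes separate sensation from reaction"

for title, score in find_candidates(new_note_vec, existing_notes):
    print(f"{score:.2f}  {title}")
```

The function only ranks candidates. Whether to extract, merge, or link the notes it returns is still the decision described above.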

As your knowledge base grows, this becomes not optional but essential. A system of 50 notes is small enough to hold in your head. A system of 500 is not. Without semantic deduplication as a regular practice, every knowledge system of sufficient size drifts toward internal inconsistency — the same ideas repeated with slight variations, none authoritative, all slowly diverging.

From duplication to structure

The primitive for this lesson — "when you write the same idea twice you have not yet named the pattern they share" — is a diagnostic tool. It converts a vague feeling ("I think I have said this before") into a specific action: find the shared structure, name it, give it one home.

This matters because abstraction is not a luxury of formal thinking. It is how knowledge compounds. Every time you extract a shared pattern and name it, you create a concept that can be referenced, refined, challenged, and composed with other concepts. The three scattered notes about staying calm were inert — each useful only in its original context. The named abstraction "stimulus-response decoupling" is active — it connects to emotional regulation, to Kahneman's System 1 and System 2, to meditation practice, to any future context where the same structural pattern appears.

Duplication is not a flaw in your thinking. It is evidence that your thinking has already identified something worth naming. The only failure is leaving the pattern unnamed — letting the repetition persist without extracting the structure it reveals.

In the next lesson, "Atomic does not mean isolated," we explore what happens after you have extracted your atoms and abstractions: they connect. Atomicity is about self-containment, not loneliness. The named patterns you extract from duplication become the nodes in a network, and it is the network, not the individual nodes, where the real cognitive power lives.
