1.3 billion websites work because every page has an address
The internet hosts over 1.3 billion websites and somewhere north of 50 billion indexed pages. Every single one of them is reachable because it has a URL — a Uniform Resource Locator — that points to exactly one resource. Type it in, and you get that page. Not a similar page. Not a page with the same title. That exact page.
Roy Fielding formalized this principle in his 2000 doctoral dissertation at UC Irvine, defining the REST architectural style that now underpins virtually every web application you use. The core constraint: every resource must be identified by a unique URI. No unique address, no reliable reference. No reliable reference, no links. No links, no web.
This is not a technical curiosity. It is the foundational pattern behind every system that manages knowledge at scale — from the web itself to databases to the most productive personal knowledge systems ever built. And it is the pattern most people violate every day in their own thinking tools.
The collision problem: when two things share a name
In L-0021, you learned that each container should hold exactly one idea. But atomicity alone is not enough. You can have a thousand perfectly atomic notes, and the system still breaks if you cannot point to a specific one without ambiguity.
This is the collision problem. It happens in every knowledge system that relies on names alone:
Wikipedia maintains over 370,000 disambiguation pages — pages that exist solely because multiple concepts share the same name. "Mercury" could be a planet, an element, a Roman god, a record label, or a NASA program. Without disambiguation, a link to "Mercury" is meaningless. Wikipedia solved this by giving each concept a unique page with a unique URL, and creating explicit disambiguation pages that route you to the right one.
Databases enforce this at the schema level. Every well-designed table in a relational database has a primary key — a column (or set of columns) that uniquely identifies each row. Without primary keys, you cannot create foreign keys. Without foreign keys, you cannot establish referential integrity. And without referential integrity, your data relationships are unreliable. A customer order that references "John Smith" instead of customer ID 4,829 will eventually point to the wrong John Smith. Edgar Codd established these principles in his 1970 relational model, and every production database built since then depends on them.
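A minimal sketch of that guarantee, using Python's built-in sqlite3 module (the customers and orders tables, the column names, and the specific IDs are all illustrative): the database happily stores two customers named "John Smith", but refuses an order that references a customer ID which does not exist.

```python
# A sketch of referential integrity with Python's built-in sqlite3 module.
# Table names, column names, and IDs are illustrative, not a prescribed schema.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

conn.execute("""
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,   -- the unique, stable address
        name        TEXT NOT NULL          -- the ambiguous, human-readable label
    )
""")
conn.execute("""
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(customer_id)
    )
""")

conn.execute("INSERT INTO customers VALUES (4829, 'John Smith')")
conn.execute("INSERT INTO customers VALUES (7113, 'John Smith')")  # same name, different entity

conn.execute("INSERT INTO orders VALUES (1, 4829)")  # resolves to exactly one customer
try:
    conn.execute("INSERT INTO orders VALUES (2, 9999)")  # no customer 9999 exists
except sqlite3.IntegrityError as err:
    print("rejected:", err)  # FOREIGN KEY constraint failed
```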
Your note system almost certainly has this problem right now. Search for "meeting notes" or "project plan" or "ideas" in whatever tool you use. Count the results. If you have more than one hit with a similar title, you have a collision. And every collision is a potential source of confusion — a link that points to the wrong thing, a reference that retrieves outdated information, a decision made on the basis of the wrong document.
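If you want to measure the problem rather than guess at it, a few lines of Python will do. This sketch assumes a folder of Markdown or plain-text notes whose filenames stand in for titles; adjust the path and extensions to your own setup.

```python
# List note titles that occur more than once in a notes folder.
# Assumes filenames stand in for titles; adjust the path and suffixes to your tool.
from collections import Counter
from pathlib import Path

notes_dir = Path("~/notes").expanduser()  # hypothetical location
titles = [p.stem.lower() for p in notes_dir.rglob("*") if p.suffix in {".md", ".txt"}]

for title, count in Counter(titles).most_common():
    if count > 1:
        print(f"{count}x  {title}")  # every line printed here is a collision
```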
Luhmann's solution: addresses that encode structure
Niklas Luhmann, the German sociologist who produced 50 books and over 550 articles during his career, built his output on a Zettelkasten (slip-box) of approximately 90,000 index cards maintained over four decades. The system's power was not the cards themselves — it was the addressing scheme.
Luhmann assigned each card a unique alphanumeric identifier using an alternating number-letter scheme. The first card was 1. A continuation of that idea was 1a. A branch from 1a was 1a1. A new idea unrelated to the first was 2. This meant every card had a permanent, unique address. But more than that, the address itself encoded the card's position in the intellectual structure — you could see, from the identifier alone, that card 21/3d7a was a specific branch within a specific line of thought.
This addressing scheme enabled two things that no other note system of his era could do:
Cross-referencing at scale. Because every card had a unique address, Luhmann could write references on any card that pointed to any other card, regardless of where it was stored. His system contained tens of thousands of these cross-references — links between ideas in different fields, connections that would have been invisible in a system organized by topic folders.
Insertion without reorganization. When a new idea emerged that related to an existing one, Luhmann could branch off a new card at the precise point of connection — card 15a3b slots in between 15a3a and 15a4 — without renumbering anything. The identifier was stable. The address never changed. Every existing reference to 15a3a still worked.
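Here is why insertion never forces renumbering: identifiers in this scheme sort by structure, not by creation order. The sketch below assumes IDs that strictly alternate numbers and letters, as Luhmann's did; a new branch such as 15a3b falls between its neighbours without any existing address changing.

```python
# Sort Luhmann-style identifiers by structure rather than creation order.
# Assumes IDs strictly alternate numbers and letters (1, 1a, 1a1, 15a3b, ...).
import re

def luhmann_key(identifier: str):
    """Split '15a3b' into (15, 'a', 3, 'b') so numeric parts compare as numbers."""
    parts = re.findall(r"\d+|[a-z]+", identifier.lower())
    return tuple(int(p) if p.isdigit() else p for p in parts)

cards = ["15a4", "15a3a", "1", "2", "15a3b", "1a1", "1a"]
print(sorted(cards, key=luhmann_key))
# ['1', '1a', '1a1', '2', '15a3a', '15a3b', '15a4']
```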
The Zettelkasten Forum captures the underlying principle succinctly: "Only with a unique identifier can you address Zettel individually. Only with that capability can you create a web of thoughts." The identifier is not a convenience. It is the mechanism that turns a collection of notes into a network of ideas.
Naming is a cognitive act, not an administrative one
Assigning a unique identifier to an idea is not bureaucratic overhead. It is a cognitive operation that changes how you process the idea itself.
Lupyan and Thompson-Schill (2012) proposed the Label-Feedback Hypothesis: when you assign a verbal label to a concept, that label feeds back into your perceptual and cognitive processing of the concept itself. In their experiments, hearing a word like "dog" led to faster recognition of subsequently presented images of dogs than hearing an equally familiar non-verbal cue like a bark. The label did not merely tag the concept — it actively shaped how the brain processed it.
The implications for knowledge work are direct. When you give an idea a unique name — not a vague title like "Thoughts on architecture" but a precise identifier like "ADR-003: Adopt event-driven architecture for payment processing" — you are doing more than filing. You are sharpening the idea. The act of naming forces you to determine what the idea is versus what it is not, where its boundaries fall, and how it differs from the three other ideas that initially seemed similar.
Andy Matuschak makes this explicit in his principle that evergreen note titles are like APIs: "When Evergreen notes are factored and titled well, those titles become an abstraction for the note itself. The entire note's ideas can then be referenced using that handle." A good identifier functions exactly like a well-designed function signature in code — it tells you what the thing does, it distinguishes it from everything else, and it provides a stable interface that other parts of the system can depend on.
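The comparison is worth making literal. Both functions below are invented for illustration; the point is that the second name works as a handle you can depend on without reading the body, which is exactly the role a precise note title plays.

```python
# Invented examples: a name that forces you to read the body vs. a name that
# works as a stable handle on its own.
def process(data):
    ...

def reconcile_invoice_totals(invoices: list[dict], ledger: dict) -> list[str]:
    """Return the IDs of invoices whose totals disagree with the ledger."""
    ...
```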
This is not about being obsessively organized. It is about making your ideas addressable — available for linking, combining, challenging, and building upon. An idea without an identifier is like a function without a name. It might exist, but nothing else in the system can call it.
Three identifier schemes that work
You do not need to invent a system from scratch. Three schemes have been proven at scale:
Sequential IDs. Assign each note a number: NOTE-001, NOTE-002, NOTE-003. Simple, collision-proof, and easy to implement in any tool. The downside: the number tells you nothing about the content. Luhmann's system improved on this by encoding structure into the sequence, but pure sequential IDs work fine if your note titles carry the descriptive load.
Timestamp IDs. Use the creation date and time as the identifier: 202602221430 (February 22, 2026, 2:30 PM). This scheme makes collisions all but impossible (you would have to create two notes within the same minute, and adding seconds to the format closes even that gap), preserves chronological information, and requires zero maintenance. Many Zettelkasten practitioners use this approach. The downside: timestamps are not human-memorable, so you rely on search and links rather than recall.
Descriptive slugs. Use a short, URL-style descriptor: architecture-decision-event-driven-rewrite. This is what the web uses. It is human-readable, search-friendly, and self-documenting. The downside: you must enforce uniqueness yourself, and slugs sometimes need updating when your understanding of the idea evolves. The fix is to keep the slug stable as the canonical address, even if you update the display title.
Each scheme solves the same problem: ensuring that a reference to idea X retrieves idea X and nothing else, today and in five years.
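For concreteness, here is a minimal sketch of all three schemes in Python. The formatting rules, especially the slug cleanup, are simplified assumptions rather than any standard.

```python
# Three identifier schemes, sketched minimally; the slug cleanup rules are
# simplified assumptions rather than any standard.
import itertools
import re
from datetime import datetime
from typing import Optional

_counter = itertools.count(1)

def sequential_id() -> str:
    """Sequential IDs: NOTE-001, NOTE-002, ..."""
    return f"NOTE-{next(_counter):03d}"

def timestamp_id(now: Optional[datetime] = None) -> str:
    """Timestamp IDs, to the minute; append seconds if you capture faster than that."""
    return (now or datetime.now()).strftime("%Y%m%d%H%M")

def slug_id(title: str) -> str:
    """Descriptive slugs: lowercase, hyphen-separated; uniqueness is still on you."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")

print(sequential_id())                                # NOTE-001
print(timestamp_id(datetime(2026, 2, 22, 14, 30)))    # 202602221430
print(slug_id("Architecture decision: event-driven rewrite"))
# architecture-decision-event-driven-rewrite
```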
The AI multiplier: why identifiers matter more now
Every advance in AI-augmented knowledge work amplifies the importance of unique identifiers.
Vector databases, the storage layer behind modern AI retrieval systems, store ideas as high-dimensional embeddings — mathematical representations of meaning. When you query a vector database, it returns the most semantically similar entries. But "similar" is not "identical." If your knowledge base contains three notes about "architecture decisions" without unique identifiers, the AI system has no reliable way to distinguish between them. It will retrieve the one whose embedding is closest to your query, which may or may not be the one you need. Entity resolution — the process of determining that two references point to the same real-world entity — is one of the hardest problems in NLP, and it becomes trivially easy when every entity has a unique ID.
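A toy illustration of the gap between similar and identical: the three-dimensional vectors below stand in for real embeddings, which have hundreds or thousands of dimensions, and the note IDs, descriptions, and numbers are invented. Similarity search proposes candidates; the attached IDs are what make the final reference unambiguous.

```python
# Toy example: 3-dimensional vectors stand in for real embeddings, and the
# note IDs, descriptions, and numbers are invented for illustration.
import numpy as np

notes = {
    "ADR-001": np.array([0.9, 0.1, 0.0]),  # architecture decision: monolith split
    "ADR-003": np.array([0.8, 0.2, 0.1]),  # architecture decision: event-driven payments
    "MTG-017": np.array([0.1, 0.9, 0.3]),  # meeting notes: payments team sync
}

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

query = np.array([0.85, 0.15, 0.05])  # "that architecture decision about payments"
ranked = sorted(notes, key=lambda note_id: cosine(query, notes[note_id]), reverse=True)
print(ranked)  # two near-identical candidates at the top; the ID is what disambiguates
```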
Knowledge graphs make this explicit. In a knowledge graph, each node is an entity with a unique identifier, and edges represent typed relationships between entities. The graph can answer questions like "What does ADR-003 depend on?" or "Which decisions were made before the team adopted Kubernetes?" only because each node has a stable address. Without unique IDs, the graph collapses into a set of ambiguous labels connected by ambiguous edges.
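The smallest possible knowledge graph is just a list of edges whose endpoints are IDs. The node names and relationship types below are made up, but the query pattern is the point: exact match on a stable identifier, not fuzzy match on a label.

```python
# Edges as (subject_id, relation, object_id) triples; every endpoint is an ID.
# Node names and relation types are invented for illustration.
edges = [
    ("ADR-003", "depends_on", "ADR-001"),
    ("ADR-003", "depends_on", "NOTE-042"),
    ("ADR-005", "supersedes", "ADR-002"),
]

def neighbours(node_id: str, relation: str) -> list[str]:
    """Answer 'what does <node_id> <relation>?' by exact ID match."""
    return [obj for subj, rel, obj in edges if subj == node_id and rel == relation]

print(neighbours("ADR-003", "depends_on"))  # ['ADR-001', 'NOTE-042']
```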
Retrieval-Augmented Generation (RAG), the dominant pattern for grounding AI responses in your own knowledge, depends entirely on retrieving the right document for the right query. If your documents are ambiguously labeled, the retrieval step introduces noise — and the AI generates confident, well-written answers grounded in the wrong source. Unique identifiers are the mechanism that makes RAG retrieval precise rather than approximate.
The practical implication: if you plan to use AI as a thinking partner — to surface connections, challenge assumptions, or synthesize across your notes — your knowledge base needs to be as precisely addressed as a well-designed database. Every note an ID. Every ID unique. Every reference resolvable.
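"Every reference resolvable" is also something you can check mechanically. A sketch, assuming wiki-style [[ID]] links in a folder of Markdown notes whose filenames are their IDs:

```python
# Report wiki-style [[ID]] links that point to notes which do not exist.
# Assumes one .md file per note, with the filename (minus extension) as its ID.
import re
from pathlib import Path

notes_dir = Path("~/notes").expanduser()  # hypothetical location
note_ids = {p.stem for p in notes_dir.rglob("*.md")}

for path in notes_dir.rglob("*.md"):
    for target in re.findall(r"\[\[([^\]|]+)", path.read_text(encoding="utf-8")):
        if target.strip() not in note_ids:
            print(f"{path.name}: unresolved reference to [[{target}]]")
```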
The cost of not doing this
The failure mode here is subtle. Nobody wakes up and thinks "my knowledge system failed because I didn't use unique identifiers." Instead, the cost accumulates invisibly:
You link to the wrong version of a document and make a decision based on outdated information. You search for an idea you know you wrote down and find four candidates, losing ten minutes determining which one you meant. You try to build on a previous insight but cannot locate it precisely, so you reconstruct it from memory — lossy, incomplete, and disconnected from the context that made the original insight valuable.
Over months, these micro-failures compound. Your knowledge system becomes a search problem instead of a navigation problem. You stop linking between notes because you cannot trust that the link will resolve to the right target. You stop building on previous ideas because retrieval is unreliable. The system degrades from a network into a pile.
The fix is not retroactive — going back and assigning identifiers to 500 existing notes is possible but painful. The fix is prospective: from this point forward, every idea you capture gets a unique address before you write a single word of content.
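In practice, "prospective" can be as small as a capture helper that mints the address before any content exists. The sketch below assumes the timestamp scheme and a notes folder; substitute whichever scheme you chose earlier.

```python
# Mint the address first, then open the file for writing; timestamp scheme assumed.
from datetime import datetime
from pathlib import Path

def capture(title: str, notes_dir: Path = Path("~/notes").expanduser()) -> Path:
    notes_dir.mkdir(parents=True, exist_ok=True)
    note_id = datetime.now().strftime("%Y%m%d%H%M")
    path = notes_dir / f"{note_id} {title}.md"  # slugify the title if it contains slashes
    path.write_text(f"id: {note_id}\ntitle: {title}\n\n", encoding="utf-8")
    return path  # the address exists before a single word of content does
```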
From addressing to decomposition
You now have two foundational practices from Phase 2: atomicity (one idea per container, L-0021) and addressing (one unique identifier per idea, this lesson). Together, they give you a system where each idea is self-contained and independently referenceable.
This sets up the next move. In L-0023, you will practice decomposition — taking a complex idea and breaking it into atomic parts. Decomposition without addressing produces fragments you cannot reassemble. You break "our onboarding process" into seven components, but if those components do not have unique identifiers, you cannot link them back together, cannot reference component 4 from component 7, cannot build a map of how the parts relate.
Addressing is the infrastructure that makes decomposition productive. Without it, breaking things apart is just making a mess. With it, every piece you extract becomes a node in a network — findable, linkable, composable.
Assign the identifier first. Then write the idea. That is the order. That is the practice.