The simplest connection you can make
You have two notes. One is about a conversation where your manager gave you feedback you disagreed with. The other is about a book passage on how confirmation bias distorts self-assessment. These notes live in different folders, were written weeks apart, and share no obvious structural connection. But they are about the same thing — the difficulty of hearing information that contradicts your self-image.
A tag makes that invisible relationship visible. Add #self-assessment to both notes, and they are now connected. Not through a folder hierarchy, not through a complex linking protocol, not through a database schema. Through a single shared label.
This is what a tag does at its most fundamental level: it declares that two or more atoms share something in common. A tag is not a category. It is not a filing system. It is a lightweight relationship — the minimum viable assertion that two pieces of knowledge belong in the same conversation.
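Concretely, a tag layer reduces to an inverted index from label to atoms. Here is a minimal sketch in Python, with hypothetical note IDs standing in for whatever identifiers your tool uses:

```python
from collections import defaultdict

# A tag layer is just an inverted index: label -> set of note IDs.
# The note IDs here are hypothetical; any stable identifier works.
tag_index: dict[str, set[str]] = defaultdict(set)

def tag(note_id: str, *labels: str) -> None:
    """Assert that a note participates in each labeled conversation."""
    for label in labels:
        tag_index[label].add(note_id)

tag("2024-03-manager-feedback", "self-assessment")
tag("2024-04-confirmation-bias", "self-assessment")

# The connection now exists: both notes share one label.
print(tag_index["self-assessment"])
# e.g. {'2024-03-manager-feedback', '2024-04-confirmation-bias'}
```

That is the entire mechanism. No hierarchy, no schema; one shared key in a map.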
The reason this matters for your epistemic infrastructure is that most of the valuable connections between your ideas are not hierarchical. They are lateral, associative, and surprising. A tag is the tool that captures those connections with the lowest possible friction.
From Ranganathan to folksonomies: why one hierarchy is never enough
The problem with folders — and with any single hierarchy — is that knowledge does not organize itself along one axis. A note about decision fatigue in sprint planning is simultaneously about team dynamics, cognitive psychology, meeting design, and your Tuesday afternoon. Which folder does it belong in?
The Indian mathematician and librarian S. R. Ranganathan solved this problem in the 1930s with his Colon Classification system — the first faceted classification scheme. Ranganathan recognized that a book about, say, the history of rice cultivation in Japan cannot be placed in a single category. It belongs to agriculture, history, botany, and Japanese studies simultaneously. His system assigned values from multiple independent facets (Personality, Matter, Energy, Space, Time) that could be combined to describe any subject from multiple angles at once.
Faceted classification was a radical insight: any item can be described by multiple independent dimensions, and no single dimension is primary. Tags are the personal knowledge management equivalent of Ranganathan's facets. When you tag an atom with #decision-fatigue and #sprint-planning and #cognitive-load, you are applying three independent facets to a single piece of knowledge. The atom does not need to choose where it lives. It can participate in multiple conversations simultaneously.
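Seen this way, faceted retrieval is just set intersection: a query names several facets and returns the atoms that participate in all of them. A small sketch, with invented notes:

```python
# Faceted retrieval: each tag is an independent facet, and a query
# combines facets by set intersection. Note IDs are hypothetical.
tag_index: dict[str, set[str]] = {
    "decision-fatigue": {"note-a", "note-b", "note-c"},
    "sprint-planning":  {"note-a", "note-d"},
    "cognitive-load":   {"note-a", "note-c", "note-e"},
}

def facet_query(*facets: str) -> set[str]:
    """Return the notes that participate in every named facet."""
    sets = [tag_index.get(f, set()) for f in facets]
    return set.intersection(*sets) if sets else set()

# Which atoms are about decision fatigue in the context of sprint planning?
print(facet_query("decision-fatigue", "sprint-planning"))  # {'note-a'}
```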
This principle scaled explosively when the internet made tagging social. In 2004, Thomas Vander Wal coined the term "folksonomy" — a portmanteau of "folk" and "taxonomy" — to describe what was happening on platforms like Delicious and Flickr: millions of users applying their own tags to shared content, with no central authority dictating the vocabulary. Vander Wal was careful to distinguish folksonomies from taxonomies. A taxonomy is designed top-down by an expert. A folksonomy emerges bottom-up from usage. The resulting structure is messy, redundant, and inconsistent — and it works at scale precisely because it demands nothing from the individual user except one low-friction action: type a word.
Scott Golder and Bernardo Huberman, studying Delicious bookmark tagging in 2006, found something remarkable: despite the apparent chaos of millions of independent taggers using whatever words they wanted, tag distributions stabilized into predictable power-law patterns. A small number of tags dominated, a long tail of niche tags persisted, and the proportions reached equilibrium. Order emerged from uncoordinated individual actions — not because anyone designed it, but because shared language and shared context naturally converge.
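In quantitative terms, a power-law distribution means the frequency of the r-th most popular tag falls off roughly as r raised to a negative exponent, so rank plotted against frequency is close to a straight line on log-log axes. A toy check with invented counts (real Delicious data showed the same shape at vastly larger scale):

```python
import math
from collections import Counter

# Invented tag counts with a roughly power-law shape.
counts = Counter({"design": 900, "web": 450, "tools": 300, "css": 225,
                  "reference": 180, "zettelkasten": 150, "rice-history": 129})

# Rank-frequency: a power law is a near-straight line in log-log space.
for rank, (name, freq) in enumerate(counts.most_common(), start=1):
    print(f"{rank}  #{name:13s} log(rank)={math.log(rank):.2f}  "
          f"log(freq)={math.log(freq):.2f}")
```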
The hashtag proof: lightweight metadata at planetary scale
If you want evidence that tags create real connections, look at what happened when Chris Messina posted a tweet on August 23, 2007: "how do you feel about using # (pound) for groups. As in #barcamp [msg]?" Twitter's leadership initially dismissed the idea as "too nerdy." The company did not even hyperlink hashtags until mid-2009, nearly two years later.
It did not matter. Users adopted the hashtag organically. When wildfires swept through San Diego in October 2007, people tagged their tweets with #sandiegofire, creating a real-time information channel that no one had designed or moderated. A lightweight metadata convention — a word prefixed with a symbol — produced emergent community, real-time coordination, and a navigable information stream. The hashtag eventually became infrastructure for political movements (#MeToo, #BlackLivesMatter), professional communities (#BuildInPublic, #DevOps), and every cultural moment in between.
The lesson is not that hashtags are magic. The lesson is that the cost of creating a connection determines how many connections get created. When the cost is "type a # and a word," people tag everything. When the cost is "decide which folder in a three-level hierarchy this belongs to," people defer, misfile, or abandon the system entirely. Tags succeed because they are cheap. And cheap connections, at scale, produce structure that expensive connections never reach.
The tag trap: when lightweight becomes weightless
Tags have a well-documented failure mode. Tiago Forte, creator of the PARA method for personal knowledge management, spent years arguing against tags before arriving at a more nuanced position. His critique identifies five pitfalls: tags create a memory burden (you cannot remember which tags you have used), they cause decision fatigue (choosing the right tag for each note is a micro-decision), they reward cataloging over action, they tend toward abstraction (tagging something #productivity tells you almost nothing), and perfect tagging systems collapse when the user stops maintaining them.
Forte's argument is not that tags are useless — it is that tags should track what you do with a note, not what a note is about. He advocates tagging by action and deliverable rather than by conceptual category. Instead of #psychology, tag with #blog-draft or #client-presentation. Instead of building an abstract taxonomy, let tags reflect the concrete contexts in which you actually use your knowledge.
This is a useful corrective, but it misses the epistemic function of tags. If your only goal is productivity — getting the right note into the right project at the right time — then action-based tagging is sufficient. But if you are building a knowledge graph that surfaces unexpected connections over months and years, you need tags that describe what an idea is about, not just what you plan to do with it. The note about decision fatigue in sprint planning needs #cognitive-load because three months from now, when you write a note about why you cannot focus after 3 PM, you want the system to surface the connection.
The real discipline is not avoiding tags. It is avoiding premature taxonomy. Niklas Luhmann, who maintained 90,000 notes in his Zettelkasten over 40 years, kept a keyword index of only a few thousand terms. Crucially, his register was not a tagging system — individual notes were not tagged. The index served as an entry point into clusters of linked notes. Once inside a cluster, Luhmann navigated by following direct links between notes, not by browsing tags. His keyword index was sparse by design. He indexed only "significant" words and typically listed just one or two note references per keyword.
The principle translates directly to personal tagging: your tags are entry points, not a classification system. You do not need a tag for every concept. You need tags for the concepts you expect to revisit, recombine, or trace across contexts. A tag that appears on only one note is not doing work. A tag that connects five notes from three different months is earning its keep.
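That heuristic can be checked mechanically. Here is a minimal audit sketch, assuming a hypothetical schema in which each note carries a creation date and a tag set; the thresholds are illustrative, not prescriptive:

```python
from datetime import date

# Hypothetical schema: note ID -> (creation date, set of tags).
notes = {
    "note-a": (date(2024, 1, 12), {"cognitive-load", "sprint-planning"}),
    "note-b": (date(2024, 2, 3),  {"cognitive-load"}),
    "note-c": (date(2024, 3, 28), {"cognitive-load", "one-off-idea"}),
}

usage: dict[str, list[date]] = {}
for created, tags in notes.values():
    for t in tags:
        usage.setdefault(t, []).append(created)

for t, dates in sorted(usage.items()):
    months = {(d.year, d.month) for d in dates}
    if len(dates) == 1:
        print(f"#{t}: used once -- not doing work yet")
    elif len(months) >= 3:
        print(f"#{t}: {len(dates)} notes across {len(months)} months -- earning its keep")
```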
The alignment problem: your tags are not my tags
There is a deeper challenge that anyone building a personal tagging system will eventually encounter: your vocabulary is idiosyncratic. What you call #decision-debt, someone else calls #deferred-choices. What you tag #mental-models, a colleague tags #frameworks. The concepts overlap but the labels diverge.
In computer science, this is called the ontology alignment problem — the challenge of establishing correspondences between concepts in independently developed classification systems. Organizations spend significant resources mapping "client" in one system to "customer" in another, resolving whether "revenue" in the finance database means the same thing as "revenue" in the sales dashboard. The problem is real at every scale, from enterprise data integration to two people trying to share a note library.
For personal knowledge management, the alignment problem has a practical implication: your tagging vocabulary must evolve. The tags you create in month one will not serve you in month twelve. Terms that seemed precise will blur. Distinctions that seemed important will collapse. New concepts will arrive that do not fit your existing vocabulary. This is not a sign of failure. It is a sign that your thinking is developing, and your metadata needs to keep pace.
The practice is periodic tag hygiene: review your tag list, merge synonyms, retire tags that no longer reflect how you think, and split tags that have become too broad. This is not organizational busywork. It is a form of metacognition — examining the categories through which you organize your knowledge and asking whether those categories still serve you.
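The merge step, at least, is mechanical once the mapping is decided. A sketch assuming a hand-written synonym map and a simple note store; choosing the mappings is the metacognitive work, applying them is not:

```python
# Hand-written synonym map: old tag -> canonical tag. Deciding these
# correspondences is the metacognition; applying them is bookkeeping.
merge_map = {
    "deferred-choices": "decision-debt",
    "frameworks": "mental-models",
}

# Hypothetical store: note ID -> set of tags.
notes = {
    "note-a": {"deferred-choices", "cognitive-load"},
    "note-b": {"decision-debt"},
    "note-c": {"frameworks"},
}

for note_id, tags in notes.items():
    notes[note_id] = {merge_map.get(t, t) for t in tags}

print(notes["note-a"])  # {'decision-debt', 'cognitive-load'} (order may vary)
```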
Tags as infrastructure for AI retrieval
Tags become dramatically more powerful when AI enters the system. Modern retrieval systems — including the vector databases and embedding models that power tools like Obsidian's AI plugins, Notion AI, and RAG (Retrieval-Augmented Generation) pipelines — work by converting text into high-dimensional vectors that capture semantic meaning. When you ask an AI assistant "what have I written about cognitive load?", the system finds notes whose vector representations are close to the vector for "cognitive load," even if those notes never use that exact phrase.
Tags enhance this process in two ways. First, they provide explicit semantic anchors. A note tagged #cognitive-load will be retrieved for that query even if the note's body text discusses the concept only indirectly — through phrases like "I could not hold all the variables in my head" or "the meeting left me unable to think." The tag bridges the gap between how you describe an experience in natural language and how you categorize it conceptually.
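Here is a hedged sketch of how a tag can act as an anchor alongside similarity search. The embed function below is a toy bag-of-words stand-in for a real embedding model, and the 0.5 boost is an arbitrary illustrative weight, not how any particular tool scores:

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy stand-in for an embedding model: a bag-of-words vector.
    # Real systems use dense neural embeddings; the retrieval logic is the same.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Hypothetical notes: ID -> (body text, tags).
notes = {
    "note-a": ("I could not hold all the variables in my head", {"cognitive-load"}),
    "note-b": ("Notes on the sprint retro format", {"meeting-design"}),
}

def retrieve(query: str, query_tag: str | None = None):
    q = embed(query)
    scored = []
    for nid, (body, tags) in notes.items():
        score = cosine(q, embed(body))
        # Explicit semantic anchor: an exact tag match boosts the note even
        # when its body never uses the query's vocabulary.
        if query_tag and query_tag in tags:
            score += 0.5  # illustrative weight
        scored.append((score, nid))
    return sorted(scored, reverse=True)

print(retrieve("what have I written about cognitive load", query_tag="cognitive-load"))
# note-a ranks first on the strength of its tag, not its wording.
```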
Second, tags create clusters that AI can reason about. When five notes share the tag #decision-debt, an AI assistant can identify patterns across those notes — recurring contexts, escalating consequences, potential interventions — that you might not see by reading them individually. The tag does not just connect the notes for you. It connects them for any system that can read your knowledge base.
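In practice, the cluster step can be as simple as gathering everything under one tag into a single context block for a model, or for you, to scan. The note store and framing here are hypothetical:

```python
# Hypothetical store: note ID -> (body, tags).
notes = {
    "2024-01-vendor": ("Postponed the vendor decision again.", {"decision-debt"}),
    "2024-02-hiring": ("Still no call on the backend hire.", {"decision-debt"}),
    "2024-03-stack":  ("Framework choice deferred a third sprint.", {"decision-debt"}),
}

def cluster(tag: str) -> str:
    """Concatenate all notes sharing a tag into one context block."""
    bodies = [f"- [{nid}] {body}" for nid, (body, tags) in sorted(notes.items())
              if tag in tags]
    return f"Notes tagged #{tag}:\n" + "\n".join(bodies)

# Hand this block to an LLM (or read it yourself) and ask for recurring
# contexts, escalating consequences, or candidate interventions.
print(cluster("decision-debt"))
```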
This is the trajectory of tagging: from a personal convenience (finding your own notes) to a structural layer that enables machine-augmented thinking. The tags you create today are not just for you-today. They are for you-plus-AI-tomorrow, navigating a knowledge base that has grown beyond what any human can hold in working memory.
From tags to sequences
A tag declares that atoms share something in common. But it does not say anything about order. The notes tagged #decision-debt are connected, but they are a set: unordered, equal in weight, with no note coming before another. For many purposes, that is exactly what you want. A tag-based cluster lets you see all instances of a pattern without imposing a narrative.
But some ideas have a natural progression. One insight builds on another. A chain of reasoning moves from premise to evidence to conclusion. When you notice that your tagged atoms are not just related but sequential — when note A must be understood before note B makes sense — you have discovered something a tag cannot express. You have discovered a sequence.
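The distinction is visible in the data structures themselves. A tag yields a set; a sequence is a list whose order carries meaning:

```python
# A tag connects atoms as an unordered set: membership, nothing more.
tagged: set[str] = {"note-a", "note-b", "note-c"}

# A sequence connects the same atoms as an ordered chain: note-a must
# be understood before note-b makes sense.
sequence: list[str] = ["note-a", "note-b", "note-c"]

assert set(sequence) == tagged  # same atoms, but only one carries order
```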
That is where we go next: how ordered series emerge from atoms linked together, not written as one long document.