Every binary hides a spectrum
You're either with us or against us. You're either a leader or a follower. The project is either on track or behind schedule. The code either works or it doesn't.
These statements feel clean. Decisive. But every one of them destroys information in the act of simplifying. The person who's "against you" might agree with 80% of your position and object to one specific detail. The project that's "behind schedule" might be three days late on documentation while the core functionality shipped early. The code that "doesn't work" might pass 97 out of 100 test cases.
Binary categories — dividing the world into exactly two buckets — are the most aggressive compression a classification system can perform. They reduce every phenomenon to a single bit of information: zero or one, yes or no, in or out. And while that compression is sometimes useful at the point of final action, it is almost always destructive during the process of understanding.
This lesson is about recognizing when you've compressed too early, and what that compression costs you.
The structure of the false dichotomy
Logicians have studied this pattern for centuries. The false dichotomy (also called the false dilemma or either-or fallacy) is an informal logical fallacy that presents two options as the only possibilities when additional alternatives exist. The error doesn't live in the reasoning structure — it lives in the premise. You accept a disjunction ("either A or B") that excludes real options, then reason validly from that flawed starting point.
The technical distinction matters: in formal logic, two propositions can be contradictories (one must be true and the other false, like "alive" and "not alive") or contraries (both can be false, like "brilliant" and "terrible"). The law of excluded middle — articulated by Aristotle in his Metaphysics — guarantees that for any proposition, either it or its negation is true. But the false dichotomy exploits this by treating contraries as if they were contradictories. "You're either brilliant or terrible at this" smuggles in the structure of a logical contradiction while applying it to a graded spectrum where most people land somewhere in the middle.
Aristotle himself recognized the limits. In On Interpretation, Book 9, he questioned whether the law of excluded middle applies to future contingents — his famous "sea battle" problem. If "there will be a sea battle tomorrow" must be either true or false right now, then the future seems already determined. The father of binary logic saw, 2,400 years ago, that forcing everything into two values creates problems that the framework itself cannot resolve.
What gets lost: three domains of damage
1. Cognitive: all-or-nothing thinking
In the 1960s, psychiatrist Aaron Beck identified a pattern in his depressed patients that he called all-or-nothing thinking — the tendency to evaluate experiences in extreme, black-and-white categories with no middle ground. A single mistake means "I'm a total failure." One awkward conversation means "nobody likes me." David Burns later popularized this as the first item on his list of ten cognitive distortions, describing it as "the tendency to evaluate your personal qualities in extreme, black-or-white categories."
The clinical evidence is stark. All-or-nothing thinking correlates with increased anxiety, depression, and a chronic sense of being on edge. When every outcome must be either perfect success or catastrophic failure, the entire space between those poles — where most of actual life occurs — becomes invisible. Cognitive behavioral therapy addresses this directly through a technique called cognitive restructuring: learning to place experiences on a continuum of 0 to 100 rather than cramming them into a binary of 0 or 1.
This isn't a clinical curiosity. It's the default mode for most untrained thinkers. "Either I'm productive today or I wasted the day." "Either this relationship is working or it's broken." "Either I understand this concept or I don't." Each of these binaries erases a landscape of partial progress, mixed signals, and evolving understanding.
2. Political: the left-right collapse
Few domains illustrate binary information loss better than political discourse. The left-right spectrum — originally referring to the seating arrangement in the French National Assembly of 1789 — has become the dominant frame through which billions of people categorize political positions. But research consistently shows it distorts more than it reveals.
Pew Research Center's longitudinal studies on political polarization demonstrate that self-reported labels like "Democrat" or "Republican" obscure the mechanics of what's actually happening. A phenomenon called sorting drives much of perceived polarization: voters increasingly align their party identity with pre-existing views without actually changing those views. The binary label shifts; the underlying positions stay complex.
The deeper problem is dimensional collapse. A person might be fiscally conservative, socially progressive, hawkish on foreign policy, and libertarian on drug policy. The left-right binary compresses these four (or more) dimensions into one. Political scientists have noted that contemporary polarization depends less on a single left-right axis and increasingly on divisions like religious versus secular, nationalist versus globalist, or urban versus rural — each of which is itself a spectrum, not a binary.
Fernbach, Rogers, Fox, and Sloman's 2013 research revealed something instructive: when people were asked to explain their policy preferences in mechanistic detail rather than simply declaring a position, their views consistently became more moderate. The binary framing ("I'm for it" / "I'm against it") was sustaining the extremity. Forcing richer articulation dissolved it. The binary wasn't describing their actual position — it was creating it.
3. Technical: when boolean isn't enough
Software engineers encounter binary information loss as a concrete design problem. A boolean flag — true or false — is the programming equivalent of a binary category. And experienced engineers learn, often painfully, that booleans almost always need to become enums.
Consider an approved field on a document record. What does false mean? "Not yet reviewed"? "Reviewed and rejected"? "Temporarily suspended pending additional information"? The boolean conflates at least three distinct states into one bucket. Nothing in the code prevents you from treating a pending review the same as an explicit rejection — and that conflation will eventually produce a bug that's hard to trace because the data model itself lost the information.
The multiple-boolean problem compounds this. When you track a process with isStarted, isCompleted, and isCancelled, you get eight possible combinations of true and false — but only three or four of them represent valid states. isStarted: false, isCompleted: true makes no sense, but nothing prevents it. An enum type — NotStarted | InProgress | Completed | Cancelled — makes invalid states unrepresentable. It carries more information in the type system itself, and the compiler catches errors that the boolean version silently allows.
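The boolean-versus-enum tradeoff can be sketched in TypeScript. All names here are hypothetical illustrations, not from any particular codebase:

```typescript
// Boolean version: 2^3 = 8 combinations of flags, most of them nonsense.
interface TaskFlags {
  isStarted: boolean;
  isCompleted: boolean;
  isCancelled: boolean;
}

// This contradictory state compiles without complaint — the type
// system has no way to know it should be impossible.
const impossible: TaskFlags = { isStarted: false, isCompleted: true, isCancelled: false };

// Enum version: exactly four states; invalid combinations are unrepresentable.
type TaskStatus = "NotStarted" | "InProgress" | "Completed" | "Cancelled";

function describe(status: TaskStatus): string {
  // The compiler checks that every member of the union is handled,
  // so adding a fifth state later forces this function to be updated.
  switch (status) {
    case "NotStarted": return "queued, safe to edit";
    case "InProgress": return "running, lock edits";
    case "Completed": return "done, archive";
    case "Cancelled": return "stopped, needs review";
  }
}
```

The point isn't the specific syntax; it's that the enum carries more than one bit per field, so the "not started but completed" bug class disappears at compile time rather than surfacing in production data.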
Lawrence Kesteloot's widely-cited engineering principle captures it directly: "Prefer enums over booleans." Not because enums are fancier, but because booleans lose information that the system needs, and you pay for that loss in bugs, confusion, and code that fights you when requirements change.
Why binaries are seductive anyway
If binaries lose so much information, why does every domain default to them?
Speed. Binary decisions are fast. Your brain's pattern-matching machinery can produce a yes/no judgment in milliseconds. Adding categories adds decision time. In genuinely urgent situations — a server is down, a patient is coding, a car is veering toward you — binary response ("act / don't act") is appropriate because the cost of delay exceeds the cost of lost information.
Social coordination. Votes are binary. Legal verdicts are binary. Committee decisions are binary. When a group must converge on a single action, binary framing forces resolution. The problem isn't the binary output — it's when binary framing colonizes the deliberation process that precedes the vote.
Cognitive ease. Holding a spectrum in working memory is harder than holding a binary. If your working memory can handle roughly 3 to 5 items (per Nelson Cowan's research), a binary uses one slot. A five-point scale uses one slot plus the effort of calibration. Binaries feel cleaner because they cost less cognitive overhead — but that savings comes directly from discarding the information that would make your judgment more accurate.
Beyond binary: what the alternatives look like
The history of formal systems is, in many ways, the history of escaping binary constraints.
In 1965, mathematician Lotfi Zadeh introduced fuzzy set theory, which assigns a degree of membership between 0 and 1 rather than forcing elements into "in the set" or "not in the set." A temperature of 68 degrees might have 0.3 membership in "cold" and 0.7 membership in "comfortable" — simultaneously belonging to both categories to different degrees. Fuzzy logic now powers everything from washing machine controllers to medical diagnostic systems, precisely because real-world phenomena resist binary classification.
In cognitive science, Eleanor Rosch's prototype theory (1973-1975) demonstrated that human categories have graded structure. A robin is a more "typical" bird than a penguin. A chair is a more "typical" piece of furniture than a floor lamp. Categories aren't defined by sharp boundaries but by proximity to a central prototype, with membership fading gradually at the edges. George Lakoff extended this in Women, Fire, and Dangerous Things (1987), arguing that all human categorization works through radial structures radiating outward from prototypes — not through binary membership tests.
In developmental psychology, William Perry's scheme of intellectual development (1970) traces a progression from dualism (there are right answers and wrong answers, authorities know which is which) through multiplicity (there are multiple perspectives, no one knows for sure) to relativism (knowledge is contextual, arguments must be evaluated within frameworks) and finally to commitment within relativism (you choose positions knowing they're contextual, and commit to them anyway). Michael Basseches extended this further into dialectical thinking — the capacity to hold two opposing positions and synthesize them into something neither position alone could produce. The trajectory of intellectual maturation is literally a movement away from binary categorization.
AI and the Third Brain: binary classifiers vs. richer outputs
If you use AI as part of your thinking infrastructure — what this curriculum calls the Third Brain — the binary problem has direct practical consequences.
In machine learning, binary classification assigns inputs to one of two classes: spam or not spam, positive sentiment or negative sentiment, tumor or no tumor. It's the simplest classification architecture, and it works well when the domain genuinely has two states. But collapsing a multi-class problem into a binary one destroys information. A patient evaluation system that outputs only "at risk" or "not at risk" hides the distinction between cardiovascular risk, metabolic risk, and psychological risk — each of which demands a different intervention.
Multi-class classification preserves categorical distinctions. Regression preserves continuous values. A model that outputs a probability distribution across five categories carries far more information than one that outputs a single binary label. The same principle applies when you prompt an LLM: asking "Is this a good strategy?" invites a binary. Asking "Rate this strategy on five dimensions — feasibility, impact, risk, alignment, and reversibility — using a 1-to-10 scale for each" produces output you can actually reason with.
The general principle: every time you compress a multi-dimensional input into a binary output, you make a lossy conversion. Sometimes that compression is the right call — you do eventually need to decide "ship or don't ship." But the compression should happen as late as possible, after you've preserved and examined the richer signal.
The protocol: defer binary compression
Here's the actionable pattern:
1. Notice the binary. When you catch yourself (or a system, or a colleague) framing something as either/or, pause. Name the binary explicitly: "We're treating this as X or not-X."
2. Ask what's hiding in each bucket. For each side of the binary, list at least two things that would end up in the same category but for different reasons. If "no hire" contains both "skills gap" and "compensation mismatch," the binary is hiding actionable distinctions.
3. Identify the actual dimensions. What independent axes of variation does the binary collapse? A "good/bad" project review might compress timeline adherence, technical quality, team morale, and stakeholder satisfaction into one bit. Name each dimension separately.
4. Preserve dimensions through deliberation. Use scales, multi-point rubrics, or categorical breakdowns throughout your reasoning process. Let each dimension carry its own signal.
5. Compress to binary only at the point of action. When you must decide — hire or don't hire, ship or don't ship, accept or reject — make the compression explicitly, knowing what you're discarding. Document the richer signal alongside the binary outcome so future you (or future colleagues) can recover what the binary lost.
This is the difference between a thinker who uses binary categories as a crutch and one who uses them as a deliberate, final-stage compression tool. Both end up with a yes or no. But the second one knows what the yes or no is made of.
Where this leads
The predecessor to this lesson — explicit categories beat implicit categories — established that naming your categories gives you the power to evaluate and revise them. This lesson adds a specific failure mode: when your explicit categories number exactly two, you've likely compressed too aggressively. The successor — spectrum thinking preserves nuance — introduces the primary alternative: modeling phenomena as positions on a continuum rather than members of discrete groups.
The progression is deliberate. First you make your categories visible. Then you notice when they're too coarse. Then you learn to use gradients instead of boundaries. Each step recovers information that the previous one revealed you were losing.