Your notes are full of conclusions disguised as observations
Open your last journal entry, your most recent meeting notes, or the last message you sent someone describing an event. Read it carefully. How much of what you wrote is something a camera could have recorded, and how much is your interpretation of what happened?
Most people, when asked to describe what they observed, immediately produce conclusions. "The client was frustrated." "The meeting went badly." "She wasn't listening." These feel like observations because they arrived quickly, without deliberate thought. But none of them are things you saw. They are things you inferred from things you saw — and the gap between the two is where most of your thinking errors live.
The discipline of recording observations before conclusions is one of the oldest and most powerful practices in science, ethnography, and qualitative research. It is also one of the hardest habits to build, because your brain is wired to skip the observation step entirely. This lesson teaches you why that matters and how to interrupt the pattern.
The scientific tradition of separating seeing from interpreting
Charles Darwin kept meticulous field notebooks during the voyage of the Beagle from 1831 to 1836. His system was revealing: he maintained separate sections for raw observations and for theoretical speculation. He kept thirty to forty large portfolios in labelled cabinets, organized by topic, where he filed detached references and memoranda. He wrote out separate abstracts and maintained indices of facts. The observation infrastructure came first; the theory of natural selection emerged years later from the accumulated weight of carefully separated data (Chancellor, "Darwin's Beagle Field Notebooks," Darwin Online).
Darwin understood something that beginning naturalists consistently struggle with: the difficulty of distinguishing between what you actually observed and what you think it means. The Natural History Institute has documented this problem extensively in educational settings — students who are asked to keep field notes routinely mix description with interpretation until they are explicitly trained to separate them. Their recommended practice requires an "Interpretations" section physically separated from the descriptive record, forcing the writer to complete one before beginning the other (Natural History Institute, "Assessing Field Notes to Promote Deeper Levels of Observation").
Jane Goodall's research at Gombe set the standard for what meticulous observation looks like in practice. Her field notes recorded specific behaviors in granular detail — how chimpanzees formed social groups, which individuals interacted, what tools were fashioned and from what materials, the sequence of grooming and feeding behaviors. She documented what she saw so thoroughly that her observations remained scientifically useful decades later, even as interpretive frameworks evolved. Researchers at Gombe still follow her methodology: daily "focal follows" that record the group's composition, foods eaten, and interactions between individuals, 365 days a year (Jane Goodall Institute, "Our Legacy of Science").
The pattern across these cases is consistent. The scientists who produced the most enduring work were not the ones with the best theories. They were the ones who recorded observations so carefully that their data outlived their initial interpretations.
Why your brain skips the observation step
Your brain does not naturally produce raw observations. It produces interpretations — fast, automatic, and confident. Daniel Kahneman's dual-process model explains why: System 1 generates immediate narrative explanations for everything you encounter. "He's angry." "She's disengaged." "This project is failing." These arrive pre-packaged, feeling like facts about the world rather than stories your brain constructed.
The problem is compounded by working memory constraints. Nelson Cowan's research established that your central cognitive workspace holds roughly four chunks at a time (Cowan, 2001, "The Magical Number 4 in Short-Term Memory"). When you experience an event — a difficult conversation, a surprising email, a team dynamic you can't quite name — your working memory cannot simultaneously hold the raw sensory data, generate an interpretation, and evaluate whether that interpretation is warranted. Something has to go. What goes is the raw data. You keep the conclusion and discard the observations that produced it.
This is why writing matters. When you externalize observations onto a page, you free working memory from the burden of holding raw data. The page becomes an extension of your cognitive workspace — the observations persist there while your mind does the interpretive work. Without externalization, you are forced to interpret in real time and discard the evidence. With it, you can observe first and interpret later, with the full record in front of you.
The two-column method: observation on the left, interpretation on the right
The most practical technique for building this discipline is the double-entry format. Draw a vertical line down the center of a page. The left column is for observations only — what a camera or microphone would have captured. The right column is for interpretations — what you think it means.
The rules for the left column are strict:
- Sensory-level facts only. "He raised his voice" not "he was angry." "She checked her phone three times during the presentation" not "she was disengaged." "The response time increased from 200ms to 1.4s" not "the system is degrading."
- Sequence matters. Record what happened in order. The sequence often reveals causation that your interpretation would have skipped.
- Exhaust observation before interpreting. Fill the left column completely before writing a single word in the right column. This is the hard part. Your brain will generate interpretations constantly while you're observing. Notice them, but do not write them down yet.
The right column is where interpretation lives — and where its limits become visible. When you have a full left column of observations, you frequently discover that multiple interpretations are equally plausible. "He raised his voice after the third question" could mean frustration, passion, hearing difficulty, or cultural communication norms. With only the interpretation ("he was angry"), you never see the alternatives.
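The two-column discipline can be sketched as a small data structure. This is an illustrative sketch, not any existing library — the class and method names are invented here. The one rule it enforces in code is the one that matters most on paper: the left column must be explicitly closed before the right column accepts a single interpretation.

```python
# A minimal sketch of the double-entry format as a data structure.
# Names (DoubleEntryNote, observe, interpret) are illustrative, not from
# any library. It enforces one rule: exhaust observation before interpreting.

class DoubleEntryNote:
    def __init__(self, event: str):
        self.event = event
        self.observations: list[str] = []    # left column: camera-level facts
        self.interpretations: list[str] = [] # right column: what you think it means
        self._left_column_closed = False

    def observe(self, fact: str) -> None:
        if self._left_column_closed:
            raise RuntimeError("Left column is closed.")
        # Order is preserved: sequence often reveals causation.
        self.observations.append(fact)

    def close_left_column(self) -> None:
        self._left_column_closed = True

    def interpret(self, meaning: str) -> None:
        if not self._left_column_closed:
            raise RuntimeError("Exhaust observation before interpreting.")
        self.interpretations.append(meaning)


note = DoubleEntryNote("Tuesday client call")
note.observe("He raised his voice after the third question.")
note.observe("She checked her phone three times during the presentation.")
note.close_left_column()
note.interpret("Possibly frustration, but could be passion or a hearing issue.")
```

The point of the guard in `interpret` is the same as the paper version: your brain will generate interpretations constantly while you observe; the structure refuses to record them until the observation record is complete.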
This technique has roots in qualitative research methodology, where field researchers are trained to maintain separate descriptive and reflective sections in their notes. The guidance is explicit: record vivid descriptions of actions as they occur in the field rather than your interpretation of them. This is particularly important early in the research process, because immediately trying to interpret events can lead to premature conclusions that prevent later insight (J-PAL, "Qualitative Observations").
Clifford Geertz, the anthropologist who developed the concept of "thick description," would push back on a clean separation — he argued that observation and interpretation are deeply intertwined in ethnographic work. But his own method actually reinforces the discipline: thick description requires exhaustive recording of behavior, context, and meaning-layers before any interpretive claim is made. The thickness of the description is what gives the interpretation its warrant (Geertz, 1973, "Thick Description: Toward an Interpretive Theory of Culture"). Even Geertz needed the observations first.
What Pennebaker's research reveals about writing before analyzing
James Pennebaker's research program on expressive writing — spanning hundreds of studies since 1986 — provides direct evidence for why recording observations before conclusions matters for cognitive clarity.
Pennebaker found that people who benefited most from writing about difficult experiences showed a specific linguistic pattern: they used relatively few causal and insight words ("because," "realize," "understand") at the beginning and progressively more of them over subsequent writing sessions. The cognitive organization emerged from the writing process itself, not from arriving at the page with a pre-formed conclusion (Pennebaker, 2018, "Expressive Writing in Psychological Science").
Critically, Pennebaker's first study found that people who wrote only about the facts of a trauma without emotional engagement did not improve — but neither did people who wrote only about emotions without cognitive structure. The benefit came from beginning with observation (what happened) and allowing interpretation (what it means) to develop through the writing process. The observation record provided the raw material that cognitive processing could organize into coherent meaning.
This maps directly to the two-column practice. The left column is not the end point — it is the foundation. You are not trying to avoid interpretation. You are trying to ground interpretation in actual observed evidence, and the act of writing observations first is what makes that grounding possible.
The engineering parallel: logs are not alerts
Software engineers work with this same distinction every day, though they use different vocabulary. In observability practice, the three pillars are logs, metrics, and traces. Logs are immutable, timestamped records of discrete events — what actually happened in the system. Metrics are aggregated numerical summaries, and traces follow a single request across system boundaries. Alerts, built on top of these, are triggered interpretations: "something is wrong."
The discipline of incident investigation mirrors the two-column method precisely. When a production system fails, the first step is not to explain why. The first step is to gather the raw data — the logs, the request traces, the deployment timeline, the metric graphs. Teams that skip this step and jump straight to "the database was overloaded" or "the deploy was bad" routinely misdiagnose incidents, because their interpretation skipped observations that would have pointed to a different root cause.
As IBM's observability framework puts it: logs are essential for deep investigation and root cause analysis, while metrics and alerts are better for detecting that something needs attention. Conflating the two — treating an alert as an explanation — is one of the most common failure modes in incident response (IBM, "Three Pillars of Observability"). The fix is the same as in personal observation: record what happened first, completely, before you start explaining why.
Postmortem culture in mature engineering organizations enforces this separation structurally. A good incident timeline reads like a left column: "14:32 — deploy v2.3.1 reached 100% of production traffic. 14:35 — p99 latency increased from 180ms to 2.1s. 14:37 — error rate exceeded 5% threshold. 14:38 — on-call engineer paged." The analysis section comes after the timeline is complete. The two are never mixed.
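That structural separation can be sketched in code. The event data below is taken from the example timeline above; the field names and `Postmortem` structure are illustrative, not from any real incident-management tool.

```python
# Sketch: keep an incident's raw timeline separate from its analysis.
# Structure and names are illustrative, not from any incident tool.

from dataclasses import dataclass, field

@dataclass
class TimelineEvent:
    time: str   # "HH:MM", as in a postmortem timeline
    fact: str   # what a log or dashboard recorded -- no "why"

@dataclass
class Postmortem:
    timeline: list[TimelineEvent] = field(default_factory=list)
    analysis: list[str] = field(default_factory=list)  # written only after the timeline

    def ordered_timeline(self) -> list[TimelineEvent]:
        # Sequence matters: sort by timestamp before anyone starts explaining.
        return sorted(self.timeline, key=lambda e: e.time)

pm = Postmortem()
pm.timeline += [
    TimelineEvent("14:35", "p99 latency increased from 180ms to 2.1s"),
    TimelineEvent("14:32", "deploy v2.3.1 reached 100% of production traffic"),
    TimelineEvent("14:38", "on-call engineer paged"),
    TimelineEvent("14:37", "error rate exceeded 5% threshold"),
]
first = pm.ordered_timeline()[0]
print(first.time, first.fact)  # → 14:32 deploy v2.3.1 reached 100% of production traffic
```

Notice that the events were entered out of order — the record does not depend on the recorder's memory of sequence, only on the timestamps the system captured.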
Using AI as an observation partner
Large language models default to mixing observation and conclusion. Ask an AI to "analyze this meeting transcript" and it will produce a blend of what happened and what it thinks it means, presented with equal confidence. This mirrors the exact cognitive failure this lesson addresses — except at machine speed.
The fix is to prompt for separation explicitly. Instead of "What happened in this meeting?", try:
"Read this transcript. In the first section, list only observable facts: who spoke, what they said, what actions were taken. Do not interpret tone, intent, or meaning. In a second section, offer interpretations of these observations."
This structured prompting approach forces the model to show its observational work before generating conclusions. OpenAI's own guidance on reasoning models recommends a similar pattern: provide clear structure, separate the evidence-gathering step from the synthesis step, and ask the model to explain its reasoning process rather than jumping to answers (OpenAI, "Reasoning Best Practices").
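If you analyze transcripts often, the two-section prompt is worth generating programmatically. The helper below is plain string construction — the wording mirrors the prompt above, and nothing here assumes any particular model or API.

```python
# Sketch: build a two-section "observe first, interpret second" prompt
# for any transcript. Plain string construction -- no model API assumed.

def observation_first_prompt(transcript: str) -> str:
    return (
        "Read this transcript.\n\n"
        "Section 1 -- OBSERVATIONS ONLY: list observable facts: who spoke, "
        "what they said, what actions were taken. Do not interpret tone, "
        "intent, or meaning.\n\n"
        "Section 2 -- INTERPRETATIONS: only after Section 1 is complete, "
        "offer interpretations of those observations, noting plausible "
        "alternatives for each.\n\n"
        f"Transcript:\n{transcript}"
    )

prompt = observation_first_prompt("Alice: We're behind schedule.\nBob: [sighs]")
```

Asking for alternatives in the second section exploits the same effect as the two-column page: once the observations are listed, multiple readings usually become visible.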
You can also use AI to audit your own observation practice. Paste your two-column notes into a model and ask: "Review my left column. Flag any entries that contain interpretation rather than pure observation." The model is surprisingly good at catching conclusions disguised as observations — "the team seemed disengaged" versus "three of five team members were looking at their laptops during the presentation."
The key insight: the discipline you build in your own observation practice directly improves how you prompt and evaluate AI outputs. A person who cannot distinguish their own observations from their own conclusions will not notice when an AI blends the two.
The protocol
Daily practice (5 minutes):
- Choose one event from today that produced a reaction — positive, negative, or confusing.
- Open a blank page with a vertical line down the middle.
- Left column: write only what a camera would have captured. Specific behaviors, specific words, specific sequences.
- Right column: only after the left column is complete, write what you think it means.
- Count: how many different interpretations could the same observations support?
Weekly review (10 minutes):
- Review your week's entries. Look for patterns in the left column that your right-column interpretations missed on the day.
- Notice which observations you consistently skip — these are your observational blind spots.
The litmus test for any entry in the left column: Could a stranger with no context verify this from a recording? If yes, it is observation. If no, it belongs in the right column.
Where this leads
Recording observations before conclusions is a mechanical skill. It does not require wisdom or special insight — it requires a page, a line down the middle, and the discipline to fill the left side first. But what it produces is the foundation for everything that follows in this phase.
In the next lesson — Judgment is useful after observation is complete — you'll learn that the point is not to eliminate interpretation. Judgment is powerful and necessary. The point is to ensure your judgments are grounded in actual evidence rather than in stories your System 1 generated before you finished looking.
The scientists whose observations endured for centuries were not people who avoided conclusions. They were people who refused to let conclusions arrive before the observations were complete.
Sources
- Chancellor, G. "Introduction to Darwin's Beagle Field Notebooks (1831-1836)." Darwin Online. darwin-online.org.uk
- Natural History Institute. "Assessing Field Notes to Promote Deeper Levels of Observation." naturalhistoryinstitute.org
- Cowan, N. (2001). "The Magical Number 4 in Short-Term Memory: A Reconsideration of Mental Storage Capacity." Behavioral and Brain Sciences, 24(1), 87-114. pubmed.ncbi.nlm.nih.gov
- Geertz, C. (1973). "Thick Description: Toward an Interpretive Theory of Culture." The Interpretation of Cultures. people.ucsc.edu
- Pennebaker, J.W. (2018). "Expressive Writing in Psychological Science." Perspectives on Psychological Science, 13(2), 226-229. journals.sagepub.com
- IBM. "Three Pillars of Observability: Logs, Metrics and Traces." ibm.com
- Jane Goodall Institute. "Our Legacy of Science." janegoodall.org