Where the Table Lives
What pockets of reducibility tell us about why context engineering works, why small models are enough for most tasks, and why the creative leap still needs the whole stack.
Three Views of a Table
There is a table in front of you. You can describe it three ways.
You can describe it as roughly ten to the twenty-seventh atoms arranged in a quasi-stable lattice, each atom vibrating, each bond storing a tiny amount of energy, the whole arrangement decaying so slowly it might as well be permanent on any timescale you care about.
You can describe it as four legs, a top, some joinery, screws, a finish. Wood from somewhere. Made by someone.
You can describe it as the surface where the laptop goes, the place the coffee sits, the thing that determines whether the room feels open or crowded.
All three descriptions are correct. None of them is more real than the others. But for any task you might actually have, exactly one of them is useful, and the other two are noise. Drag the slider below and watch the description change.
The interior designer does not need the atoms. The materials engineer does not need the room. The carpenter sits somewhere between. Each task lives at a layer, and getting that layer right is the difference between an answer and noise dressed up as one.
This essay is about what physics tells us about why those layers exist, why large language models seem to operate at all of them at once, and why the practical craft of working with these models comes down to picking which layer you actually want.
Pockets of Reducibility
The interesting question is not why we can describe a table at multiple layers. That is just language. The interesting question is why the higher-level description works. Why can the interior designer ignore the atoms? Why does the atomic chaos not leak through and make the table behave unpredictably from across the room?
The short answer is that most of the microscopic detail cancels. The atoms in the table vibrate, but the vibrations average out. What survives the averaging is a small number of stable, large-scale properties: this thing is rigid, this thing is here, this thing supports weight. The table is a pocket of reducibility, a region of the world where the high-level description captures everything the low-level description would have told you about anything you care about.
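The cancellation can be sketched in a few lines. This is a toy model under loud assumptions, not physics: each "atom" is just an independent random number drawn from a standard normal distribution, and the function names are invented for illustration. What it shows is the averaging step only: the more atoms you average over, the less the average moves from one trial to the next.

```python
import random
import statistics

def mean_displacement(n_atoms: int, seed: int) -> float:
    """Average of n_atoms independent random 'vibrations' (toy model, not physics)."""
    rng = random.Random(seed)
    return statistics.fmean([rng.gauss(0.0, 1.0) for _ in range(n_atoms)])

def spread_of_means(n_atoms: int, trials: int = 100) -> float:
    """How far the average wanders from one random trial to the next."""
    means = [mean_displacement(n_atoms, seed) for seed in range(trials)]
    return statistics.pstdev(means)

if __name__ == "__main__":
    # Each atom is noisy, but the spread of the average falls as n_atoms grows.
    for n_atoms in (100, 10_000):
        print(n_atoms, spread_of_means(n_atoms))
```

With a hundred times more atoms, the spread of the average should shrink roughly tenfold, since it falls with the square root of the count. A real table has vastly more atoms than this sketch, which is part of why the view from across the room is so steady.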
So what does a layer actually contain? The answer is simpler than it sounds. Each layer is a vocabulary, a list of words for the stable configurations the world admits at that scale. A token is one word in one of these vocabularies. The model has learned all of them, every layer, every word. The act of using the model is choosing which word at which layer applies to the question in front of you.
The visual above shows what a layer contains. The next one shows how layers connect. The same physical thing exists at every layer simultaneously, just under different names, and the move that produces each higher-layer name is the same move: compression of what sits below into a single handle. Slide through the next visual to watch the move happen at two adjacent layers at once.
Most of the world is full of these pockets. A cell is one. A market is one most of the time. A company is one when it is functioning normally. Expertise is partly a memorized map of the pockets in a domain, and a sense of when each one is reliable enough to lean on as a single handle.
But the pockets are not everywhere, and this is where the picture gets stranger than it looks. Stephen Wolfram has spent decades on the opposite case: a large class of systems is computationally irreducible. Turbulence is one. The weather past two weeks is one. A market in a crisis is one. The pockets of reducibility are the lucky exceptions, the corners of the world where some accident of physics or biology left a shortcut. Intelligence lives in those pockets, and the hard problems live on the boundaries between them.
Strip away the jargon and irreducibility is a claim about prediction. A pocket of reducibility is a corner of the world whose future you can shortcut to: a formula, a rule of thumb, a compression that hands you the answer without living through every step in between. Irreducible means the opposite. There is no formula and no shortcut. The only way to know the future is to run the whole thing, step by step, in full. The simulation is the prediction, and it takes as long as the thing itself.
The clearest way to feel this is to ask of anything: can you call where it ends up before it gets there? Two pairs make the point. First a ball you throw in the open against a ball you drop through pegs. Then your money over thirty years against a single stock next week.
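The contrast can be made concrete with a minimal sketch. The reducible half uses fixed-rate compounding as a stand-in for "your money over thirty years", an assumption, since real returns are not fixed. The irreducible half uses Rule 30, Wolfram's standard example of a cellular automaton with no known shortcut; its irreducibility is a conjecture, not a proven result. All function names here are illustrative.

```python
def balance_closed_form(principal: float, rate: float, years: int) -> float:
    """Reducible: one formula jumps straight to the answer."""
    return principal * (1 + rate) ** years

def balance_by_steps(principal: float, rate: float, years: int) -> float:
    """The same answer, reached the long way, one year at a time."""
    balance = principal
    for _ in range(years):
        balance *= 1 + rate
    return balance

def rule30_step(cells: list[int]) -> list[int]:
    """One update of Rule 30: new cell = left XOR (center OR right)."""
    n = len(cells)
    return [cells[(i - 1) % n] ^ (cells[i] | cells[(i + 1) % n]) for i in range(n)]

def rule30_center_column(rows: int) -> list[int]:
    """Center cell for the first `rows` rows, starting from a single live cell.

    No shortcut formula is known, so this runs every step in full.
    """
    width = 2 * rows + 3  # wide enough that the edges never reach the center
    cells = [0] * width
    cells[width // 2] = 1
    column = []
    for _ in range(rows):
        column.append(cells[width // 2])
        cells = rule30_step(cells)
    return column

if __name__ == "__main__":
    print(balance_closed_form(10_000, 0.05, 30), balance_by_steps(10_000, 0.05, 30))
    print(rule30_center_column(5))  # [1, 1, 0, 1, 1]
```

The two balance functions agree, which is what reducibility means: the loop is optional. For Rule 30 the loop is, as far as anyone knows, the only way to learn what the center cell does at row one million.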
Two things matter for the rest of the essay. The pockets are stacked, not strung along a line: the same table is a valid pocket at the atomic layer, the parts layer, the object layer, the room layer, all at once. And the skill is not just working well inside a layer. It is sensing when a layer has stopped being enough and you need to drop to the one below, or when no layer short of brute simulation will save you at all.
Tokens Are Compressed Layers
Now bring this to language models. A large model does not represent the world. It represents tokens, and tokens are learned pockets of reducibility, pieces of the world the model has decided are compressible enough to be worth a handle.
The word "table" is a token because the model has learned that whenever this concept appears, a predictable cluster of properties travels with it. Flat surface. Legs. Indoor. Supports objects. The model does not carry the atoms when it uses the word. It carries the pocket.
The argument the essay needs is this: the same physical object has tokens at every layer. The model has learned "oak" and "wood grain" and "mortise and tenon" and "dining room" and "open floor plan" and "urban density". Each is its own pocket. Each compresses a different slice of the world. A larger model with more pretraining has learned more of these pockets, at finer granularity, with more connections between layers. A smaller model has fewer.
This connects directly to the dormant capability argument in Reach. A large pretrained model has token layers latent inside it at every scale, most of them unused at any given moment. The question is not whether the model has the layer. For frontier models, it almost certainly does. The question is which layer gets activated by the context you put in front of it.
The Right Grain
So far the picture is vertical. Atoms below, parts above, then objects, then rooms, then cities. Each a different layer of compression. The interior designer works at the room layer, the carpenter at the parts layer, and so on. This is most of the story but not all of it.
Inside any one layer, tokens still come in different sizes. Stay at the parts layer of the table. Tabletop is a token. Leg is a token. Joinery is a token. All three live at the same layer. None is more abstract than the others. But for the task of moving the table, only the coarse tokens matter, where the surfaces and load-bearing pieces are. For the task of fixing a wobble, only the joinery matters, and a model working at the whole-object grain is looking at the wrong thing.
This matters separately from layer selection, because most context-engineering failures are not layer failures. Nobody confuses atoms with rooms. They go wrong by being too fine or too coarse inside the right layer. The analyst who summarizes a quarterly report line by line chose too fine a grain. The one who says "earnings were up" chose too coarse. The right grain sits between, and it depends entirely on what the reader will do with the answer.
The model has all three grains. Table, tabletop, mortise and tenon are all in there, learned from millions of examples. Which one fires depends entirely on what you put in the context. Ask where the furniture should go and you want the whole-object token. Ask why the table wobbles and you want the joint. The model is not choosing the grain for you. You are.
Context Engineering Is Layer Selection
The interior designer does not need atoms. They need the table-and-chair layer, plus enough of the room layer to make placement decisions, plus a sense of where the morning light enters from. Hand them the atomic positions of every object in the room and you have technically given them more information and made them strictly worse at their job.
This is the part that becomes obvious only after you have seen it once. A well-constructed prompt with personas does not beat a prompt with raw demographic statistics because personas contain more information. It wins because personas are the right pocket of reducibility for questions about human behavior. Statistics live one layer too low. Atoms-of-the-town would live two layers too low and would produce worse answers despite containing strictly more data.
The visual below is a worked example from the kind of thing I actually look at. One mid-market company whose margins slipped. You can show it to a model three ways: as financial aggregates, as its individual customers, or as the relationships between them. Pick a question and watch which lens can answer it.
The pattern generalizes. The analyst who dumps every line item from a 10-K into the model and asks for a summary is stuck at the statistics layer. Moving up to the customers themselves is better. But the concentration risk in the example above only appeared at the relationship layer, the one most people never bother to assemble because it takes real work to build. The first analyst drowns in numbers. The last one finds the thing that actually moves the deal.
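Here is a small sketch of why the relationship layer can show what the other two hide. Every number and name below is hypothetical and made up for illustration; none of it comes from the company described above. The point is only structural: the same revenue looks well diversified by customer and concentrated once customers are rolled up to a shared parent.

```python
from collections import defaultdict

# Hypothetical revenue by customer, in $M. Illustrative numbers only.
REVENUE = {
    "Acme Retail": 12, "Acme Logistics": 11, "Acme Foods": 10,
    "Birch": 14, "Cedar": 13, "Dune": 12, "Elm": 14, "Fir": 14,
}

# Hypothetical relationship layer: which customers share a parent company.
PARENT = {
    "Acme Retail": "Acme Holdings",
    "Acme Logistics": "Acme Holdings",
    "Acme Foods": "Acme Holdings",
}

def top_share(revenue: dict[str, float]) -> float:
    """Share of total revenue held by the single largest name."""
    return max(revenue.values()) / sum(revenue.values())

def roll_up_to_parent(revenue: dict[str, float], parent_of: dict[str, str]) -> dict[str, float]:
    """Re-aggregate customer revenue under parent companies where one is known."""
    rolled: dict[str, float] = defaultdict(float)
    for name, amount in revenue.items():
        rolled[parent_of.get(name, name)] += amount
    return dict(rolled)

if __name__ == "__main__":
    print("aggregate layer:", sum(REVENUE.values()))
    print("customer layer, top share:", top_share(REVENUE))
    print("relationship layer, top share:", top_share(roll_up_to_parent(REVENUE, PARENT)))
```

In this made-up data the largest single customer is 14 percent of revenue, which looks safe. Rolled up to the parent, one counterparty is 33 percent. Nothing was added to the data except the relationships.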
Context engineering, in the version that actually matters, is not about adding more. It is about picking the right pocket and excluding the rest.
Small Models, Big Models, and When Each Is Enough
The practical claim follows directly. Most tasks live inside a single, well-defined pocket of reducibility. Routing a customer email. Classifying a support ticket. Summarizing a meeting. Extracting fields from a contract. For these, the relevant token layer is narrow and stable. A small model that has that layer well-represented is sufficient. You do not need a four-hundred-billion-parameter model to know what "follow up next week" means.
A large model's advantage does not show up on these tasks. It shows up on tasks that require moving between layers. Recognizing that the email is not a routing question but a customer-relationship question. Noticing that the contract clause does not parse at the legal layer because it is referencing a tax structure that lives at a different layer. Knowing when the current layer has broken down and a drop to the layer below is required.
This makes the small-versus-large argument mechanical. Single-pocket tasks are cheap, and a model with the right pocket fires correctly regardless of size. Cross-pocket tasks are expensive, and they require a model with the latent breadth to hold multiple pockets at once and switch between them.
It also makes the orchestration architecture concrete. The associate moving from manual model-building to agent orchestration is not just delegating busywork. They are routing each task to the model whose pocket fits: small specialized models for the pieces that live in one pocket, a large model as the orchestrator that knows which pocket a sub-task belongs to and when the sub-task needs reframing because it is sitting at the wrong layer. The Manager and the Partner do not run the room because they are smarter than the Analysts. They run it because they can see all the layers at once.
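The routing rule can be written down as a minimal sketch. Everything in it is an assumption made for illustration: the `Task` type, the layer labels, and the two model names are hypothetical, and the hard part, deciding which layers a task actually touches, is assumed to have been done already by the orchestrator.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Task:
    description: str
    layers: frozenset[str]  # hypothetical labels for the pockets the task touches

def route(task: Task) -> str:
    """Send single-pocket tasks to a small model, cross-pocket tasks to a large one."""
    if not task.layers:
        raise ValueError("a task must touch at least one layer")
    if len(task.layers) == 1:
        return "small-specialist"   # hypothetical model name
    return "large-orchestrator"     # hypothetical model name

if __name__ == "__main__":
    ticket = Task("Classify a support ticket", frozenset({"support"}))
    clause = Task("Clause references a tax structure", frozenset({"legal", "tax"}))
    print(route(ticket))  # small-specialist
    print(route(clause))  # large-orchestrator
```

The rule itself is trivial. The judgment lives in filling in `layers`, which is the part the essay assigns to the model, or the person, who can see the whole stack.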
The Creative Leap
There is one last regime, and it is the one that ties this essay back to Reach.
Sometimes the right answer does not live in the obvious layer. Sometimes it does not live in any of the obvious layers. It lives in a completely different domain, and finding it requires reaching sideways to something that, on the face of it, has nothing to do with the problem.
An interior designer who designs a great room mostly works at the room layer. An interior designer who creates a new kind of room pulled something from somewhere else. Maybe they were reading about coral reefs. Maybe they noticed that healthy coral reef systems generate enormous structural complexity from a small set of growth rules, and that this complexity supports a density of life that engineered structures cannot match. And maybe they started asking what a room would look like if it were grown rather than designed.
This is biophilic architecture, and the layer the idea came from is not in any architecture textbook. It came from marine biology. The pocket of reducibility that contained the answer was three domains over.
The model on its own does not make this leap. Most prompts pin it to a single domain, and that is correct, because most of the time you want a conventional answer. The leap requires either a person doing the cross-domain reaching themselves and then handing the model a context that already mixes the layers, or a context broad enough that the model is invited to look across pockets it would not normally connect.
This is what a large pretrained model gives you that a small one cannot. Not raw capability on the task you are pointing it at. Latent access to every pocket at once. Most of the time, that latent access is wasted. The model is answering an email. It does not need the coral reefs. But on the rare task where the answer requires a cross-layer reach, only a model that has the layers can leap between them.
Most of the work of using these models well is narrowing. Pick the layer. Exclude the rest. Get a clean answer. But occasionally, the task is not to answer the question that was asked. It is to find the layer where the answer actually lives, and that layer might be three domains over. Both regimes use the same hardware. Knowing which regime you are in is the craft.
The table on your desk is many things at once. It is atoms, it is parts, it is an object, it is a piece of a room, it is a node in the city. All of these are real. None of them is more real than the others. The question is never which description is true. The question is always which one you actually need.
So with the models in front of us. They contain the whole stack. The work is in knowing which layer to wake.
The Compression Canopy is the broader 2,000 ft essay behind this one: tokens as compressed handles on reality, and intelligent space as the geometry that forms between them.
Reach develops the dormant-capability argument from a different angle: which parts of a large model's latent breadth show up in your career depends on the layers your work forces you to traverse.
The Stable Orbits applies the pocket-of-reducibility idea at the firm scale: viable business shapes are basins in the coordination-cost landscape, and AI is reshaping the terrain.