May 4, 2026

The Rocks in the River

Taiichi Ohno¹ used to walk the floor of Toyota's plants and pick fights with his managers about their inventory piles.

The managers had reasons. The piles were safety stock — protection against a machine breaking down, a supplier running late, a worker calling in sick, a batch coming back defective. The piles let production keep moving when something went wrong. That's what they were for.

Ohno disagreed. He thought the inventory wasn't protecting the system. It was hiding the fact that the system was broken. Your machines do break down. Your suppliers are late. Your batches are defective. The piles let you pretend none of that was true, because there was always enough buffer to absorb it.

He had a metaphor for this that has stuck around for seventy years: the boat on the river. The water is your inventory. The rocks under the surface are the real problems. As long as the water is high, the boat sails fine and nobody looks down. Lower the water — and you see the rocks.

That was the actual point of the Toyota Production System². Not efficiency. Not speed. Diagnosis. Pull the inventory out, force the rocks into view, and then make the organization deal with them. Just-in-time wasn't a logistics technique. It was a diagnostic instrument disguised as one.

The plants that copied the surface and got worse

Through the 80s and 90s, Western manufacturers watched Toyota eat their lunch and decided to copy what they saw.

What they saw was the visible stuff. Less inventory. Leaner lines. Faster cycle times. So they cut the inventory and ran leaner and tried to ship faster. Some of them pulled it off. The NUMMI joint venture in Fremont³ is the famous case: a plant GM had shuttered for poor performance reopened under Toyota's management, with many of the same workers, and got completely different results.

But many didn't. They removed the buffer without building the thing that was supposed to replace it. Toyota's water was low because the rocks had been dealt with — surfaced, named, fixed, one at a time, over years. The imitators lowered the water and discovered they still had every rock the Japanese plants had spent a decade removing. They had less buffer and the same broken processes. The defects didn't get absorbed anymore. They compounded.

Lower water, same rocks, faster boat.

Code was the buffer

When code was expensive, engineering was always the visible bottleneck. Everyone knew it. Everyone built around it. And because everyone built around it, every other dysfunction in the organization had somewhere to hide.

Slow shipping hid unclear strategy. "We can't build that this quarter" hid "we don't actually know what we want." Long review queues hid that nobody was sure who owned the decision. Quarterly planning was a forum for arguing about engineering capacity, which meant it was usually not a forum for arguing about whether the thing was worth building.

The cost of code wasn't just a constraint. It was doing work — the same way Toyota's inventory was doing work. It was forcing coordination by making people slow down enough to coordinate. It was a buffer.

Then AI pulled the buffer out.

What the rocks look like

A few show up everywhere. None of them are new — they were always there. They just used to be underwater.

Review capacity as the actual bottleneck. When coding took hours or days, review took minutes by comparison and nobody noticed. Now writing code takes minutes and review is the rate-limiter. The orgs that treated review as something you squeeze in between your real work — a tax, an interruption, a chore — discover that review was the real work all along. Reading code carefully, asking what it's doing, catching what's missing: that's how quality enters the system, and now it's the only place it can.
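A toy model makes the shift concrete. The sketch below assumes a fixed number of PRs opened and reviewed each day; the function name and every number in it are invented for illustration, not measurements from a real team.

```python
def review_backlog(prs_opened_per_day: int, reviews_per_day: int, days: int) -> list[int]:
    """Return the number of PRs still waiting for review at the end of each day."""
    backlog = 0
    history = []
    for _ in range(days):
        # New PRs join the queue, reviewers drain it, and it cannot go below zero.
        backlog = max(0, backlog + prs_opened_per_day - reviews_per_day)
        history.append(backlog)
    return history


# Hypothetical team: review capacity stays at 6 PRs a day in both cases.
before = review_backlog(prs_opened_per_day=4, reviews_per_day=6, days=5)
after = review_backlog(prs_opened_per_day=15, reviews_per_day=6, days=5)

print(before)  # [0, 0, 0, 0, 0]
print(after)   # [9, 18, 27, 36, 45]
```

Review capacity is identical in both runs. The only thing that changed is how fast work arrives, and that alone turns an invisible step into a queue that grows every day.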

Dev infrastructure breaking under PR load. Flaky tests you used to retry once a week now fail twelve times a day. CI pipelines sized for human throughput cave under agentic throughput. None of this is new — the infra was always like this. There just wasn't enough traffic to expose it.
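The arithmetic behind that is worth a rough sketch. It assumes flaky tests fail independently of each other and that any one failure fails the whole pipeline run. The flake rate, test count, and run volumes are hypothetical.

```python
def pipeline_flake_probability(per_test_flake_rate: float, flaky_test_count: int) -> float:
    """Chance that at least one flaky test fails in a single pipeline run,
    assuming the tests fail independently."""
    return 1 - (1 - per_test_flake_rate) ** flaky_test_count


def expected_spurious_failures(runs_per_day: int, flake_probability: float) -> float:
    """Expected number of pipeline runs per day that fail for no real reason."""
    return runs_per_day * flake_probability


# Hypothetical suite: 20 tests that each flake once in a thousand runs.
p = pipeline_flake_probability(per_test_flake_rate=0.001, flaky_test_count=20)

human_pace = expected_spurious_failures(runs_per_day=10, flake_probability=p)
agent_pace = expected_spurious_failures(runs_per_day=600, flake_probability=p)

print(f"{p:.3f}")           # about 0.020, i.e. roughly 2% of runs
print(f"{human_pace:.1f}")  # roughly 0.2 a day, about one a working week
print(f"{agent_pace:.1f}")  # roughly 11.9 a day
```

The suite is exactly as flaky in both cases. A 2% failure rate is a weekly annoyance at ten runs a day and a standing outage at six hundred.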

Backlogs that have to be argued about. The backlog used to be effectively infinite, which made prioritization easy: we'll never get to most of it, so we don't have to choose. Now the math has shifted enough that "we don't have capacity" stops being a satisfying answer to every request. The org has to start saying out loud what it wants and what it doesn't.

Ownership gaps that didn't matter when changes were rare. When a service changed once a month, fuzzy ownership was a survivable problem — somebody figured it out before the next change landed. When a service changes twelve times a week, "I'm not sure who owns this" becomes a 2am pager question with no answer.

Dysfunctional orgs don't get faster with AI. They get worse.

This is the same pattern as the lean-imitator plants in the 90s. Less buffer plus the same broken processes equals defects compounding instead of being absorbed. More variants shipped in parallel. More drift between systems that were already drifting. More conflicting changes landing in the same week because nobody had to slow down enough to notice the conflict.

The friction was load-bearing. It was forcing coordination by being expensive. Remove the cost without replacing the coordination, and the chaos accelerates.

The orgs adopting Cursor and Claude Code as a productivity play, without asking what the friction was protecting them from, are the ones discovering this the hard way right now.

What Toyota actually built

The visible parts of TPS — kanban, andon cord, takt time — were the easy parts. They're the parts you can put on a poster.

The hard parts were cultural. A relentless commitment to surfacing problems instead of routing around them. The authority for any line worker to stop production when something looked wrong, and the expectation that they would use it. Kaizen as a daily practice rather than an annual offsite. The whole machine was built so that the rocks couldn't stay rocks for long once they came into view.

The software version of this isn't a tool stack. It's explicit ownership written down somewhere people can find it. It's deciding who decides before you start generating five versions of the thing. It's a willingness to stop the line when something is unclear, instead of engineering around it and shipping the ambiguity downstream.
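"Written down" can be as small as a mapping that a script is able to check. This is a minimal sketch, not a recommendation of any particular tool: the service names, team names, and the `find_unowned` helper are all hypothetical.

```python
def find_unowned(services: list[str], owners: dict[str, str]) -> list[str]:
    """Return, in sorted order, every service with no owner or a blank owner."""
    return sorted(s for s in services if not owners.get(s, "").strip())


# Hypothetical inventory: what exists versus what somebody has claimed.
services = ["billing", "search", "notifications"]
owners = {
    "billing": "payments-team",
    "search": "",  # an entry exists, but nobody's name is on it
}

unowned = find_unowned(services, owners)
print(unowned)  # ['notifications', 'search']
```

Run as a check before changes merge, a non-empty result is one way to stop the line: the gap gets named while it is still a question, not a page.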

The point was never to go faster

Ohno understood something that most of his imitators didn't.

The point of lowering the water was never to make the boat go faster. It was to make the rocks impossible to ignore, and then to actually deal with them. The speed came later, as a consequence. If you lowered the water without doing the second part, you didn't get a faster boat. You got a wrecked one.

AI is doing the first half for us right now, whether we asked it to or not. The water is dropping across the industry. The rocks are coming up.

The second half is the work. Get to it.

Footnotes

  1. Taiichi Ohno (1912–1990) was a Toyota industrial engineer and executive credited as the primary architect of the Toyota Production System. His book Toyota Production System: Beyond Large-Scale Production (1978) is the foundational text.

  2. The Toyota Production System (TPS) is the manufacturing philosophy developed at Toyota through the mid-20th century. It became the basis for lean manufacturing and influenced production systems across industries.

  3. New United Motor Manufacturing, Inc. (NUMMI) was a joint venture between GM and Toyota that operated in Fremont, California from 1984 to 2010. It took over the plant GM had previously shuttered for poor performance and brought back many of the same workers, and Toyota ran it using TPS. Quality and productivity improved dramatically. The story is told in detail in a 2010 This American Life episode.

Copyright 2017-2026 Gergely Nemeth