Context windows got huge. Attention spans didn't.

Models can now hold your entire codebase in one shot. The hard part was never storage.


Context windows have gotten genuinely large. A million tokens is enough to hold a full-length novel, a substantial codebase, or several months of customer conversations. The technical constraint that forced AI products to work with fragments of information, to chunk, summarize, and discard, has largely been lifted. This is a real capability improvement.

It has not solved the actual problem.

The problem was never that models could not hold enough information. The problem was that having information and knowing what to do with it are completely different capabilities. A library with perfect recall and no librarian is not useful. You still need something to navigate, filter, prioritize, and act. The context window got bigger. The judgment layer did not automatically follow.

Here is a concrete version of this: you can now feed an AI system every customer support ticket your company has received in the past two years. The model can hold all of it. What most products have not figured out is how to get from 'the model has read all of this' to 'the model identifies the three patterns that predict churn and surfaces them without being asked.' The first half of that is a storage problem. The second half is a reasoning and prioritization problem. Larger context windows helped with the first half only.

The design mistake that follows from this is building features that are impressive in a demo but underwhelming in actual use. A product that says 'upload your entire knowledge base and ask questions' sounds powerful. In practice, what users get is a retrieval system that sometimes finds the right thing and sometimes does not, with no clear explanation of why and no clear way to improve it. The context window made the demo possible. It did not make the underlying problem easier.

There is also a user attention problem that few products address. Even if the model uses a million-token context perfectly, the output still has to fit within the user's attention budget. A model that reads your entire customer history and produces a five-page summary is not useful if the user needed a one-paragraph decision. The compression problem shifted from input to output. Bigger context windows on the input side made the output compression problem worse, because the model now has far more it could say.
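To put rough numbers on that attention budget, here is a back-of-the-envelope sketch. Every constant in it is an assumption chosen for illustration (about 500 words per page, a reading speed of about 240 words per minute, a 100-word paragraph, a 30-second budget), and the function names are hypothetical. Swap in whatever figures match your own users.

```python
# Back-of-the-envelope reading-time arithmetic.
# All constants are illustrative assumptions, not measurements.
WORDS_PER_PAGE = 500      # assumed words on one page of summary
READING_WPM = 240         # assumed silent reading speed, words per minute
PARAGRAPH_WORDS = 100     # assumed length of a one-paragraph decision
BUDGET_SECONDS = 30       # assumed time the user will give the output


def reading_seconds(words: int, wpm: int = READING_WPM) -> float:
    """Seconds needed to read `words` words at `wpm` words per minute."""
    return words * 60 / wpm


def fits_budget(words: int, budget_seconds: float = BUDGET_SECONDS) -> bool:
    """True if the output can be read within the attention budget."""
    return reading_seconds(words) <= budget_seconds


if __name__ == "__main__":
    five_page_summary = 5 * WORDS_PER_PAGE
    for label, words in [("five-page summary", five_page_summary),
                         ("one-paragraph decision", PARAGRAPH_WORDS)]:
        print(f"{label}: {reading_seconds(words):.0f}s to read, "
              f"fits budget: {fits_budget(words)}")
```

Under these assumptions the five-page summary costs the reader more than ten minutes and the paragraph costs under half a minute. The exact figures matter less than the ratio: the input grew by orders of magnitude while the time a user will spend on the output stayed the same.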

The companies doing interesting work here are not the ones with the biggest context windows. They are the ones that figured out what the model should produce given the context. That requires strong opinions about what users need, which requires a deep understanding of the workflow, which in turn requires product judgment that is orthogonal to the AI capability itself.

This is also a useful frame for evaluating AI features. When a product ships a 'read all your data' feature, the interesting question is not whether the model can hold all the data. That is a given. The interesting question is: what specific decision or action does this feature enable that the user could not take before? If the answer is fuzzy, the feature is not ready. The context window being big enough is a necessary condition. It is nowhere near sufficient.

The honest version of the technical progress here is this: the models got dramatically better at holding context and only somewhat better at reasoning over it, and the gap between those two rates of improvement is where most AI products are struggling right now. Good products are being built by teams that acknowledged that gap and designed around it instead of pretending it did not exist.

What specific decision do you want your user to make after your AI processes their data, and can you describe in one sentence exactly how your product gets them there?