I Have Used Every Major AI Model. The Most Impressive Thing Is the Fundraising.
I Have Been Using AI Since Before It Was Cool. Here Is What I Actually Think Now.
By the time you read this, Claude has probably shipped another agent. OpenAI has announced another Microsoft partnership. Some account on X with an AI-generated avatar has posted a thread about how you can save your job with AI in five steps. And a new benchmark has dropped that apparently proves one model is now seven percent smarter than the last one, and seventeen people on Twitter are very excited about it.
The reality remains a little more complicated than any of that. And I have been watching it long enough now to have some opinions worth writing down.
Where This Actually Started For Me
I was one of the earliest users of ChatGPT. Not in the "I read about it in a newsletter" sense, but in the sense of genuinely using it from the first weeks. I still remember using GPT to scrape through a language exam in my first semester of college. It worked. I also remember trying to use it on a take-home calculus quiz shortly after and watching it confidently produce answers that were completely, embarrassingly wrong.
That was my first real lesson with AI. It could sound certain. Certainty was not accuracy. The model had no idea it was wrong. Neither did I, until the grade came back.
But even with that, the momentum was impossible to ignore. Every few months something new appeared. GPT updates, then Bard, then Claude. Then came the constant cycle: one company would briefly dominate the conversation until the next one released something with a marginally better score on a benchmark nobody outside of AI Twitter had heard of, and the discourse would shift entirely. If you remember the Bard days and how chaotic that whole launch was, you know exactly the energy I am describing.
For a while this felt like healthy competition. Then it started feeling like theater.
I kept going deeper anyway. I moved from casual user to genuinely invested, spending real time testing models, understanding their differences, reading the technical writeups. I was among the early users testing Opus 4.5, which I still believe was one of the most impressive model launches of this era, not because of what the benchmarks said but because of how it actually felt to use it on complex, multi-step problems. There was a qualitative leap there that was hard to explain but impossible to miss if you had been paying close attention.
Then came the coding agents. Codex, then the wave of AI-native development tools, then the point where I looked around and realized I did not know a single senior engineer in my circle who had written a meaningful amount of code by hand in the past several months. Not because they cannot. Because they genuinely do not need to. The tooling crossed some threshold and the behavior changed on the other side of it.
That is when my curiosity shifted. Not toward what these models could do next. But toward what the companies behind them were actually building, and why, and for whom, and whether any of it made the kind of sense that holds up under pressure.
The Subscription Was Always The Point
Here is something that becomes obvious once you have watched the pricing evolve in real time. The $20 plan was never a business model. It was onboarding.
Both OpenAI and Anthropic came out cheap. Get the product into as many hands as possible, build the habit, find the power users, normalize the behavior. Free tiers, low entry plans, generous API access for developers. The implicit message was: come in, get comfortable, let this become part of how you work.
And it worked. Brilliantly. Because habit formation in software is one of the most durable moats that exists. Once a tool is embedded into how you think and work, switching costs are not just financial. They are cognitive.
Then, just as investors started nudging leadership to justify potential hundred-billion-dollar valuations, the pricing architecture started shifting. The $100 plans. The $200 plans. The enterprise tiers. The API costs that climbed as usage scaled. OpenAI shutting down or scaling back certain capabilities to manage compute costs. The free tier getting thinner. The message shifting from "try everything" to "upgrade for the good stuff."
This is not cynicism. This is just how the playbook works. If you read my pricing article, you already know this. The cheap entry was friction reduction. The real revenue was always going to come once the behavior was locked in. The mistake is being surprised by it.
Benchmark Theater
I want to talk about the benchmarks because I think this is where the gap between what is being marketed and what is actually happening becomes most visible.
Every few weeks a new evaluation drops. One model scores a few percentage points higher on some reasoning task, or a coding challenge, or a math benchmark. The announcement goes out. The AI Twitter accounts pick it up. The comparisons get posted. Someone declares a new winner.
And then you use the model on your actual work and the difference is, if you are being honest, pretty marginal.
This is not to say the models are not improving. They are. But there is a difference between improving on the metrics that get measured and improving in the ways that actually change what you can do with the tool. The early leaps, GPT-3 to GPT-4, felt genuinely transformative in use. Something qualitatively changed. The recent cycles feel more like incremental refinement being packaged as revolution, because the market demands a narrative of continuous progress and the companies need to justify the capital they are consuming.
The curve has flattened. The expectations have not. That gap is where a lot of the current hype lives.
The Capital Loop That Is Running Right Now
Here is the part that I find genuinely fascinating and slightly unsettling as a market observer.
The AI infrastructure game right now is essentially a closed loop of commitments. Microsoft invests in OpenAI. OpenAI commits to Azure infrastructure. Oracle signs massive compute deals with AI labs. Nvidia sells GPUs to everyone at extraordinary margins. The valuations go up based on the size of the commitments. The commitments justify the valuations. Around and around it goes.
Nvidia is printing money in a way that very few companies in history have managed. When you are the dominant supplier of the pickaxes in a gold rush, it does not matter who finds gold. You win either way. Nvidia's margins through this cycle have been obscene in the best possible sense for their shareholders.
But zoom out slightly and the structure looks a little precarious. These are not revenue-backed valuations in the traditional sense. OpenAI's valuation is built substantially on projected future revenue, on the assumption that the current land grab for AI habit formation will eventually convert into durable enterprise contracts and consumer subscriptions at scale. The infrastructure commitments between these companies are, in many cases, circular. They are promising each other money that will come from the AI revenue that will be enabled by the infrastructure they are promising to each other.
This does not mean it collapses. It means it requires the underlying premise to be true, that AI becomes genuinely indispensable at enterprise scale in a way that justifies the spend. Which might happen. The question is the timeline, and whether the capital can sustain the patience required to get there.
How This Ends
I think there are a few ways this resolves, and none of them are the clean victory lap that the current narrative suggests.
The most likely outcome in the medium term is infrastructure consolidation. The model layer commoditizes. Running a frontier model stops being a differentiator because the capability gap between the top players narrows enough that enterprise buyers stop caring which model they are on and start caring about price, reliability, and integration. This already happened with cloud. AWS, Azure, and GCP do not compete primarily on innovation anymore. They compete on ecosystem lock-in, pricing, and support. AI infrastructure follows the same gravity.
The second possibility is that the real value migrates to the application layer. Models become the electricity. Whoever builds the best product on top of that electricity wins, not whoever generates it. This would be genuinely bad news for the labs that are spending hundreds of billions on compute, because it means their moat is not as durable as their capital commitments assume.
The third possibility is that regulatory friction slows everything down, shortening the runway and forcing a reckoning with the valuations earlier than the capital structure can absorb. The pressure points are data sovereignty, energy consumption, and national security concerns around frontier AI. Governments are moving slower than the technology, but they are moving.
What I am fairly confident about is that the current moment, where every company is raising at extraordinary valuations based on projected futures that nobody can fully model, does not persist in its current form. At some point the commitments have to convert to revenue that makes the math work. And that pressure changes behavior, the same way it changed the behavior of every company I described in the pricing and crab articles.
Where I Actually Land
I started using AI because it felt powerful and slightly magical, which it genuinely was in those early months.
I kept using it because it became functionally necessary. The productivity delta for the kind of work I do is real. I am not going to pretend otherwise to seem contrarian.
But I watch it now with a different set of eyes. I see the capital structure and I wonder about the timeline. I see the benchmark announcements and I discount them by default. I see the pricing shifts and I recognize the playbook. I see the infrastructure commitments and I think about what happens when the music slows down.
None of this means AI is not important or that the technology is not real. It clearly is. The question I keep sitting with is simpler than that.
We built an industry on the premise that intelligence could be scaled like software. And maybe it can. But software companies usually have to show the revenue at some point.
I am watching for that part.