Even if Opus 4.5 is the limit, it’s still a massively useful tool. I don’t believe it’s the limit, though, for the simple reason that a lot could be done by creating more specialized models for each subdomain — e.g. they’ve focused mostly on web-based development, but could do the same for any other paradigm.
That's a massive shift in the claim though... I don't think anyone is disputing that it's a useful tool; just the implication that, because it's a useful tool that has seen rapid improvement, they're going to "get all the way there," so to speak.
Personally, I'm not against LLMs or AI itself, but considering how these models are built and trained, I refuse to use tools built on others' work without or against their consent (esp. GPL/LGPL/AGPL, Non-Commercial / No-Derivatives CC licenses, and source-available licenses).
Of course, the tech will be useful and ethical if these problems are solved (or a genuine decision is made to solve them) the right way.
We just need to tax the hell out of the AI companies (assuming they are ever profitable) since all their gains are built on plundering the collective wisdom of humanity.
Linear progression feels slower (and thus more like a plateau) to me than the end of 2022 through end of 2024 period.
The question in my mind is where we are on the s-curve. Are we just now entering hyper-growth? Or are we starting to level out toward maturity?
It seems like it must still be hyper-growth, but it feels less that way to me than it did a year ago. I think in large part my sense is that there are two curves happening simultaneously, but at different rates. There is the growth in capabilities, and then there is the growth in adoption. I think it's the first curve that seems to me to have slowed a bit. Model improvements seem both amazing and also less revolutionary to me than they did a year or two ago.
But the other curve is adoption, and I think that one is way further from maturity. The providers are focusing more on the tooling now that the models are good enough. I'm seeing "normies" (that is, non-programmers) starting to realize the power of Claude Code in their own workflows. I think that's gonna be huge and is just getting started.
Odin’s design is informed by simplicity, performance and joy and I hope it stays that way. Maybe it needs to stay a niche language under one person’s control in order to do so since many people can’t help but try to substitute their own values when touring through a different language.
Going all in on AI generated code has taught me more about project management than I learned in the last decade. I also have a much better perspective on what it’s like to be a client contracting a developer to build an app for them. The best part is the AI actually follows all of the processes that you ask it to. Today was the first day that I wrote code in over a week and I still ended up asking for some review from Opus and it went perfectly. At this point no one has an excuse for shipping slop if they can afford $20 a month.
It’s all in how you use it. If you want to learn, you can tell it to walk you through the code, write a tutorial with examples and exercises, give you programming problems to solve, use the Socratic method, recommend the best human-written tutorials and books, review your code and suggest more idiomatic techniques, help you convert a program from one language or paradigm to another, and a million other things.
I like the AI-written tutorial method; both Opus 4.5 and Gemini 3 are good at this. You just have to put the effort in to copy-type, make changes, ask questions, and put what you’ve learnt into practice. AI code review is also great for discovering alternatives you don’t know about.
A session limit that resets five hours after the first message you sent. Most people I’ve seen report between one and two hours of dev time using Opus 4.5 on the Pro plan before hitting it, unless you’re feeding in huge files and doing a bad job of managing your context.
Yeah, it’s really not too bad, but it does get frustrating when you hit the session limit in the middle of something. I also add $20 of extra usage so I can finish the work in progress cleanly and have Opus write some notes so we can resume when the session renews. You have to be careful with extra usage, though, because you can burn through it quickly if the context is getting full, so it’s best to work in small independent chunks and clear the context after each one. It’s more work, but it helps with usage, and Opus performs better when you aren’t pushing the context window to the max.
I half agree, but it should be called “Hobbyist”, since that’s what it’s good for. Ten minutes is hyperbolic; I average 1h30m even when using plan mode first and front-loading the context with dev diaries, git history, milestone documents, and important excerpts from previous conversations. Something tells me your modules might be too big and need refactoring. That said, it’s a pain having to wait hours between sessions and jump when the window opens to make sure I stay on schedule and can get three sessions in a day, but that works OK for hobby projects since I can do other things in between. I would agree that if you’re using it for work you absolutely need Max, so that should be what’s called the Pro plan, but what can you do? They chose the names, so now we just have to add disclaimers.
I actually get more mileage out of Claude using a GitHub Copilot subscription. Regular Claude Pro gives me an hour, or 90 minutes at most, before it reaches the cap. The GitHub version has a monthly limit for Claude requests (100 "premium requests"), which I find much easier to manage. I was about to switch to the Max plan, but this setup (both Claude Pro and GitHub Copilot, $30 a month together) was just enough for my needs. With the bonus that I can try some of the other model offerings as well.
In practice, how does switching between Claude and GitHub Copilot work?
1. Do you start off using the Claude Code CLI, then when you hit limits, you switch to the GitHub Copilot CLI to finish whatever it is you are working on?
2. Or, you spend most of your time inside VSCode so the model switching happens inside an IDE?
3. Or, you are more of a strict browser-only user, like antirez :)?
I always start in the Claude CLI. Once I hit the token limit, I can do one of two things: either use Copilot's Claude to finish the job, or pick up something completely different and let the first task wait until the token limit resets. Most importantly, I'm never blocked waiting for the cap.
Good to hear that’s working. When I was using Copilot before Opus 4.5 came out, I found it didn’t perform as well as Claude Code, but maybe it works better now with 4.5 and the latest improvements to VSCode. I’ll have to try it again.
Imagine if we refused to publish any material or exhibit recreations of dinosaurs because the only evidence we have are fossilized skeletons and a few skin texture impressions.
Dinosaurs in the first Jurassic Park were fairly well represented considering what we knew in the late 80s. But our knowledge of dinosaurs has grown since, with feathers being the most emblematic change. Yet the current Jurassic Park movies steadfastly refuse to put feathers on their 3D monsters, because viewers do not expect feathers on the T-Rex.
We might be at that point with repainted statues. Museum visitors are now starting to expect the ugly garish colours.
I've not seen the latest Jurassic Park movie, but I've seen a clip with velociraptors with feathers, and maybe Quetzalcoatlus too? Along with colourful skin on e.g. Compsognathus.
They seem to have moved on a bit; they're balancing audience expectations with the latest research, I expect.
My knowledge of dinosaurs is a few decades old, really. Any good sources for a summary of T-Rex developments in particular, or dinosaurs more generally?
I could imagine there are some great videos out there? I'd be keen to have a scientific basis given, rather than speculative artwork.
Both Anthropic and Google have clear directions. Anthropic is trying to corner the software-developer market and succeeding; Google is doing deep integration with their existing products. There’s also DeepSeek, who seem hell-bent on making the cheapest SotA models and supplying the models people can use for research on applications. Even Grok is fairly mission-focused, with its X integration.