<i>You’re absolutely right! You astutely observed that 2025 was a year with many LLMs and this was a selection of waypoints, summarized in a helpful timeline.</i><p>That’s what most non-tech people’s year in LLMs looked like.<p>Hopefully 2026 will be the year where companies realize that implementing intrusive chatbots can’t make for better ::waving hands:: ya know… <i>UX</i> or whatever.<p>For some reason, they think it’s helpful to distractingly pop up chat windows on their site because their customers need textual kindergarten handholding to … I don’t know… find the ideal pocket comb for their unique pocket/hair situation, or because they had an unlikely question about that aerosol pan release spray that a chatbot could actually answer. Well, my dog also thinks she’s helping me by attacking the vacuum when I’m trying to clean. Both ideas are equally valid.<p>And spending a bazillion dollars implementing it doesn’t mean your customers won’t hate it. And forcing your customers into pathways they hate because of your sunk-costs mindset means it will never stop costing you more money than it makes.<p>I just hope companies start being honest with themselves about whether these things are good, bad, or absolutely abysmal for the customer experience, and cut their losses when it makes sense.
They need to be intrusive and shoved in your face. This way, they can say they have a lot of people using them, which is a good and useful metric.
> For some reason, they think its helpful to distractingly pop up chat windows on their site...<p>Companies have been doing this "live support" nonsense far longer than LLMs have been popular.
Indeed. I don't understand why Hacker News is so dismissive about the coming of LLMs. Maybe HN readers are going through the five stages of grief?<p>But LLMs are certainly a game changer; I can see them delivering an impact bigger than the internet itself. Both require a lot of investment.
Pretty much a whole year of nothing really. Just coming up with a bunch of abstractions and ideas trying to solve an unsolvable problem: getting reliable results from an unreliable process while assuming the process is reliable.<p>At least when herding cats, you can be sure that if the cats are hungry, they will try to get to where the food is.
This is extremely dismissive. Claude Code helps me make a majority of changes to our codebase now, particularly small ones, and is an insane efficiency boost. You may not have the same experience for one reason or another, but plenty of devs do, so "nothing happened" is absolutely wrong.<p>2024 was a lot of talk, a lot of "AI could hypothetically do this and that". 2025 was the year where it genuinely started to enter people's workflows. Not everything we've been told would happen has happened (I still make my own presentations and write my own emails) but coding agents certainly have!
And this is one of the <i>vague</i> "AI helped me do more" comments.<p>Here's me touting Emacs:<p><i>Emacs was a great plus for me over the last year. Its integration with various tooling, via comint (REPL integration), compile (build and report tools), and the TUI (through eat or ansi-term), gave me a unified experience through Emacs's buffer paradigm. Using the same set of commands boosted my editing process, and the easy addition of new commands makes it simple to fit the editor to my development workflow.</i><p>That's how easy it is to write a non-vague "tool X helped me", and I'm not even a native English speaker.
Did you ship more in 2025 than in 2024?
I’m not sure how to tell you how obvious it is you haven’t actually used these tools.
This comment is legitimately hilarious to me. I thought it was satire at first. The list of what has happened in this field in the last twelve months is <i>staggering</i> to me, while you write it off as essentially nothing.<p>Different strokes, but I’m getting so much more done and mostly enjoying it. Can’t wait to see what 2026 holds!
People who dislike LLMs are generally insistent that they're useless for everything and have infinitely negative value, regardless of facts they're presented with.<p>Anyone that believes that they are completely useless is just as deluded as anyone that believes they're going to bring an AGI utopia next week.
Not in this review: it was also a record year for intelligent systems aiding and prompting human users into fatal self-harm.<p>Will 2026 fare better?
The people working on this stuff have convinced themselves they're on a religious quest so it's not going to get better: <a href="https://x.com/RobertFreundLaw/status/2006111090539687956" rel="nofollow">https://x.com/RobertFreundLaw/status/2006111090539687956</a>
I really hope so.<p>The big labs are (mostly) investing a lot of resources into reducing the chance their models will trigger self-harm and AI psychosis and suchlike. See the GPT-4o retirement (and resulting backlash) for an example of that.<p>But the number of users is exploding too. If they make things 5x less likely to happen but sign up 10x more people it won't be good on that front.
Also essential self-fulfilment.<p>But that one doesn't make headlines ;)
Sure -- but that's fair game in engineering. I work on cars. If we kill people with safety faults I expect it to make more headlines than all the fun roadtrips.<p>What I find interesting with chat bots is that they're "web apps" so to speak, but with safety engineering aspects that type of developer is typically not exposed to or familiar with.
Remember, back in the day, when a year of progress was like, oh, they voted to add some syntactic sugar to Java...
I'm curious how all of the progress will be seen if it does indeed result in mass unemployment (but not eradication) of professional software engineers.
I nearly added a section about that. I wanted to contrast the thing where many companies are reducing junior engineering hires with the thing where Cloudflare and Shopify are hiring 1,000+ interns. I ran out of time and hadn't figured out a good way to frame it though so I dropped it.
My prediction: If we can successfully get rid of most software engineers, we can get rid of most knowledge work. Given the state of robotics, manual labor is likely to outlive intellectual labor.
"Given the state of robotics" reminds me a lot of what was said about LLMs and image/video models over the past 3 years. Considering how much LLMs have improved, how long can robotics stay in this state?<p>I have to think that 3 years from now we will be having the same conversation about robots doing real physical labor.<p>"This is the worst they will ever be" feels more apt.
Forgot to mention the first murder-suicide instigated by ChatGPT.
Great summary of the year in LLMs. Is there a predictions (for 2026) blogpost as well?
Nothing about the severe impact on the environment, and the hand-waviness about water usage hurt to read. The referenced post missed every single point about the issue by making it global instead of local. And it's not as if data center buildouts are properly planned and dimensioned for existing infrastructure…<p>Add to this that all the hardware is already old, and the amount of waste we're producing right now is mind-boggling. And for what, fun tools for the use of one?<p>I don't live in the US, but the amount of tax money being siphoned to a few tech bros should have heads rolling, and I really don't want to see it happening in Europe.<p>But I guess we got a new version number on a few models and some blown-up benchmarks, so that's good. Oh, and of course the SVG images we will never use for anything.
"Nothing about the severe impact on the environment"<p>I literally said:<p>"AI data centers continue to burn vast amounts of energy and the arms race to build them continues to accelerate in a way that feels unsustainable."<p>AND I linked to my coverage from last year, which is still true today (hence why I felt no need to update it): <a href="https://simonwillison.net/2024/Dec/31/llms-in-2024/#the-environmental-impact-got-better" rel="nofollow">https://simonwillison.net/2024/Dec/31/llms-in-2024/#the-envi...</a>
These are excellent every year, thank you for all the wonderful work you do.
> The (only?) year of MCP<p>I'd like to believe that, but MCP is quickly turning into an enterprise thing, so I think it will stick around for good.
What amazing progress in such a short time. The future is bright! Happy New Year y'all!
> Vendor-independent options include GitHub Copilot CLI, Amp, OpenHands CLI, and Pi<p>...and the best of them all, OpenCode[1] :)<p>[1]: <a href="https://opencode.ai" rel="nofollow">https://opencode.ai</a>
I don't know why you're downvoted; OpenCode is by far the best.
Good call, I'll add that. I think I mentally scrambled it with OpenHands.
How did I miss this until now! Thank you for sharing.
> The year of YOLO and the Normalization of Deviance #<p>On the topic of AI agents deleting home folders: I was able to run agents in Firejail by isolating vscode (most of my agents are vscode-based, like Kilo Code).<p>I wrote a little guide on how I did it: <a href="https://softwareengineeringstandard.com/2025/12/15/ai-agents-firejail-sandbox/" rel="nofollow">https://softwareengineeringstandard.com/2025/12/15/ai-agents...</a><p>It took a bit of tweaking, with vscode crashing a bunch of times because it couldn't read its config files, but I got there in the end. Now it can only write to my projects folder. All of my projects are backed up in git.
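For anyone who wants the gist without the full guide, a minimal sketch of that kind of Firejail confinement might look like the following. The flags (`--noprofile`, `--read-only`, `--read-write`) are standard Firejail options, but the specific directory list is my own assumption, not the commenter's exact setup:

```shell
#!/bin/sh
# Sketch: launch vscode with the whole home directory mounted read-only,
# then re-enable writes only where the editor and the agent actually
# need them. Directory choices here are guesses; adjust to taste.
firejail \
  --noprofile \
  --read-only="${HOME}" \
  --read-write="${HOME}/projects" \
  --read-write="${HOME}/.config/Code" \
  --read-write="${HOME}/.vscode" \
  code --wait
```

With something like this, an agent that decides to `rm -rf ~` can only damage what's under `~/projects`, which git can restore. The tweaking the commenter describes is presumably finding exactly which dotfile directories vscode needs write access to before it stops crashing.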
2025: The Year in LLMs<p>I will never stop treating hallucinations as inventions. I dare you to stop me. I double dog dare you.