94 comments

  • daxfohl 1 day ago
    I worry about the "brain atrophy" part, as I've felt this too. And not just atrophy, but even more so I think it's evolving into "complacency".

    Like there have been multiple times now where I wanted the code to look a certain way, but it kept pulling back to the way it wanted to do things. Like if I had stated certain design goals recently it would adhere to them, but after a few iterations it would forget again and go back to its original approach, or mix the two, or whatever. Eventually it was easier just to quit fighting it and let it do things the way it wanted.

    What I've seen is that after the initial dopamine rush of being able to do things that would have taken much longer manually, a few iterations of this kind of interaction has slowly led to a disillusionment with the whole project, as AI keeps pushing it in a direction I didn't want.

    I think this is especially true if you're trying to experiment with new approaches to things. LLMs are, by definition, biased by what was in their training data. You can shock them out of it momentarily, which is awesome for a few rounds, but over time the gravitational pull of what's already in their latent space becomes inescapable. (I picture it as working like a giant Sierpinski triangle).

    I want to say the end result is very akin to doom scrolling. Doom tabbing? It's like, yeah I could be more creative with just a tad more effort, but the AI is already running and the bar to seeing what the AI will do next is so low, so....
    • aswegs8 5 hours ago
      "For this invention will produce forgetfulness in the minds of those who learn to use it, because they will not practice their memory. Their trust in writing, produced by external characters which are no part of themselves, will discourage the use of their own memory within them. You have invented an elixir not of memory, but of reminding; and you offer your pupils the appearance of wisdom, not true wisdom, for they will read many things without instruction and will therefore seem [275b] to know many things, when they are for the most part ignorant and hard to get along with, since they are not wise, but only appear wise." - Socrates on Writing and Reading, Phaedrus 370 BC
      • kelnos 5 minutes ago
        It's unclear if you've presented this quote in order to support or criticize the idea that new technologies make us dumber. (Perhaps that's intentional; if so, bravo).

        To me, this feels like support. I was never an adult who could not read or write, so I can't check my experience against Socrates' specific concern. But speaking to the idea of memory, I now "outsource" a lot of my memory to my smartphone.

        In the past, I would just remember my shopping list, and go to the grocery store and get what I needed. Sure, sometimes I'd forget a thing or two, but it was almost always something unimportant, and rarely was a problem. Now I have my list on my phone, but on many occasions where I don't make a shopping list on my phone, when I get to the grocery store I have a lot of trouble remembering what to get, and sometimes finish shopping, check out, and leave the store, only to suddenly remember something important, and have to go back in.

        I don't remember phone numbers anymore. In college (~2000) I had the campus numbers (we didn't have cell phones yet) of at least two dozen friends memorized. Today I know my phone number, my wife's, and my sister's, and that's it. (But I still remember the phone number for the first house I lived in, and we moved out of that house when I was five years old. Interestingly, I don't remember the area code, but I suppose that makes sense, as area codes weren't required for local dialing in the US back in the 80s.)

        Now, some of this I will probably ascribe to age: I expect our memory gets more fallible as we get older (I'm in my mid 40s). I used to have all my credit/debit card numbers, and their expiration dates and security codes, memorized (five or six of them), but nowadays I can only manage to remember two of them. (And I usually forget or mix up the expiration dates; fortunately many payment forms don't seem to check, or are lax about it.) But maybe that *is* due to new technology to some extent: most/all sites where I spend money frequently remember my card for me (and at most only require me to enter the security code). And many also take Paypal or Google Pay, which saves me from having to recall the numbers.

        So I think new technology making us "dumber" is a very real thing. I'm not sure if it's a good thing or a bad thing. You could say that, in all of my examples, technology serving the place of memory has freed up mental cycles to remember more important things, so it's a net positive. But I'm not so sure.
      • mikemarsh 3 hours ago
        Presenting this quote without additional commentary is an interesting Rorschach test.

        Thankfully more and more people are seriously considering the effects of technology on true wisdom and getting off the "all technological progress clearly is great, look at all these silly unenlightened naysayers from the past" train.
        • runarberg 9 minutes ago
          Socrates was right about the effects. Writing did indeed cause us to lose the talent of memorizing. Where he was wrong though (or rather where this quote without context is wrong) is that it turned out that memorizing was for the most part not the important skill to have.

          When the same warnings are applied to LLMs, however, he may be correct both about the effect and about the importance of the skill being lost. If we lose the ability to think and solve various problems, we may indeed be losing a very important skill of our humanity.
      • ericmcer 1 hour ago
        That is interesting because your mental abilities seem to be correlated with orchestrating a bunch of abstractions you have previously mastered. Are these tools making us stupid because we no longer need to master any of these things? Or are they making us smarter because the abstraction is just trusting AI to handle it for us?
      • beepbooptheory 3 hours ago
        If one reads the dialogue, Socrates is not the one "saying" this, but he is telling a story of what King Thamus said to the Egyptian god Theuth, who is the inventor of writing. He is asking the king to give out the writing, but the king is unsure about it.

        It's what is known as one of the Socratic "myths," and really just contributes to a web of concepts that leads the dialogue to its ultimate terminus of *aporia* (being a relatively early Plato dialogue). Socrates, characteristically, doesn't really give *his* take on writing. In the text, he is just trying to help his friend write a horny love letter/speech!

        I can't bring it up right now, but the end of the dialogue has a rather beautiful characterization of writing in the positive, saying that perhaps *logos* can grow out of writing, like a garden.

        I think if pressed Socrates/Plato *would* say that LLMs are merely *doxa* machines, incapable of *logos*. But I am just spitballing.
        • dempedempe 1 hour ago
          https://standardebooks.org/ebooks/plato/dialogues/benjamin-jowett/text/phaedo#phaedo-text
      • direwolf20 2 hours ago
        He was right. It did.
      • sifar 4 hours ago
        Well, the wisdom part is true.
      • specialist 2 hours ago
        Yup.

        My personal counterpoint is Norman's thesis in Things That Make Us Smart.

        I've long tried, and mostly failed, to consider the tradeoffs, to be ever mindful that technologies are never neutral (winners & losers), per Postman's Technopoly.
      • throw10920 4 hours ago
        Writing/reading and AI are so categorically different that the only way you could compare them is if you fundamentally misunderstand how *both* of them work.

        And "other people in the past predicted doom about something like this and it didn't happen" is a fallacious non-argument even when the things *are* comparable.
        • ppseafield 3 hours ago
          The argument Socrates is making is specifically that writing isn't a substitute for thinking, but it will be used as such. People will read things "without instruction" and claim to understand those things, even if they do not. This is a trade-off of writing. And the same thing is happening with LLMs in a widespread manner throughout society: people are having ChatGPT generate essays, exams, legal briefs and filings, analyses, etc., and submitting them as their own work. And many of these people don't understand what they have generated.

          Writing's invention is presented as an "elixir of memory", but it doesn't transfer memory and understanding directly - the reader must still think to understand and internalize information. Socrates renames it an "elixir of reminding", that writing only tells readers what other people have thought or said. It can facilitate understanding, but it can also enable people to take shortcuts around thinking.

          I feel that this is an apt comparison between, for example, someone who has only ever vibe-coded and an experienced software engineer. The skill of reading (in Socrates's argument) is not equivalent to the skill of understanding what is read. Which is why, I presume, the GP posted it in response to a comment regarding fear of skill atrophy - they are practicing code generation but are spending less time thinking about what all of the produced code is doing.
        • wjSgoWPm5bWAhXB 4 hours ago
          yes, but people just really like to predict dooms and they also like to be convinced that they live in some special era in human history
          • throw10920 3 hours ago
            It takes about 30 seconds of thinking and/or searching the Internet to realize that people also predict doom when it actually happens - e.g. with people correctly predicting that TikTok will shorten people's attention spans.

            It's then quite obvious that the fact that someone, somewhere, predicts a bad thing happening has ~zero bearing on whether it actually happens, and so the claim that "someone predicted doom in the past and it didn't happen then so someone predicting doom now is also wrong" is absurd. Calling that idea "intellectually lazy" is an insult to smart-but-lazy people. This is more like intellectually *incapable*.

            The fact that people will unironically say such a thing in the face of not only widespread personal anecdotes from well-respected figures, but *scientific evidence*, is depressing. Maybe people who say these things are heavy LLM users?
            • jrowen 1 hour ago
              There is always some set of people predicting all sorts of dooms though. The saying about the broken clock comes to mind.

              With the right cherry picking, it can always be said that [some set of] the doomsayers were right, or that they were wrong.

              As you say, someone predicting doom has no bearing on whether it happens, so why engage in it? It's just spreading FUD and dwelling on doom. There's no expected value to the individual or to others.

              Personally, I don't think "TikTok will shorten people's attention spans" qualifies as doom in and of itself.
          • jatari 4 hours ago
            We are very clearly living through a moment in history that will be studied intensely for thousands of years.
            • direwolf20 2 hours ago
              Because of the collapsing empire, mind you, not because of the LLMs.
              • jatari 2 hours ago
                Creation of the internet, social media, everyone on the planet getting a pocket sized supercomputer, beginning of the AI boom, Trump/beginning of the end of the US, are all reasons people will study this period of time.
                • jrowen 43 minutes ago
                  This is really interesting because I wholeheartedly believe the original sentiment that everyone thinks their generation is special, and that "now this time they've really screwed it all up" is quite myopic -- and that human nature and the human experience are relatively constant throughout history while the world changes around us.

                  But, it is really hard to escape the feeling that digital technology and AI are a huge inflection point. In some ways these couple of generations might be the singularity. Trump and contemporary geopolitics in general is a footnote, a silly blip that will pale in comparison over time.
        • grogenaut 4 hours ago
          I know managers who can read code just fine, they're just not able/willing to code it. Though the AI helps with that too. I've had a few managers dabble back into coding, especially scripts and whatnot, where I want them to be pulling unique data and doing one-off investigations.
        • andy_ppp 4 hours ago
          I read the grandparent comment as saying people have been claiming that the sky is falling forever… AI will be both good for learning and development and bad. It's always up to the individual if it benefits them or atrophies their minds.
          • oblio 3 hours ago
            I'm not a big fan of LLMs, but while using them for day to day tasks, I get the same feeling I had when I first started using the internet (I was lucky to start with broadband internet).

            That feeling was one of empowerment: I was able to satisfy my curiosity about a lot of topics.

            LLMs can do the same thing and save me a lot of time. It's basically a super charged Google. For programming it's a super charged auto complete coupled with a junior researcher.

            My main concern is independence. LLMs in the hands of just a bunch of unchecked corporations are extremely dangerous. I kind of trusted Google, and even that trust is eroding, and LLMs can be extremely personal. The lack of trust ranges from risk of selling data and general data leaks, to intrusive and worse, hidden ads, etc.
            • runarberg 3 hours ago
              When I first started using the internet, I was able to instant text message (IRC) random strangers, using a fake name, and lie about my age. My teacher had us send an email to our ex-classmate who had moved to Australia, and she replied the next day. I was able to download the song I just heard on the radio and play it as many times as I wanted on my Winamp.

              These capabilities simply didn't exist before the Internet. Apart from the email to Australia (which was possible with a fax machine, but much more expensive), LLMs don't give you any new capabilities. It just provides a way for you to do what you already can (and should) do with your brain, without using your brain. It is more like replacing your social interaction with Facebook than it is like experiencing an instant message group chat for the first time.
              • oblio 31 minutes ago
                Before LLMs it was incredibly tedious or expensive or both to get legal guidance for stuff like taxes, where I live. Now I can orient myself much better before I ask an actual tax expert pointed questions, saving a lot of time and money.

                The list of things they can provide is endless.

                They're not a creator, they're an accelerator.

                And time matters. My interests are myriad but my capacity to pass the entry bar manually is low because I can only invest so much time.
                • runarberg 23 minutes ago
                  If this resembles the feeling you had when you first used the internet, it is drastically different from when I first used the internet.

                  When I first used the internet, it was not about doing things faster, it was about doing things which were previously simply unavailable to me. A 12 year old me was never gonna fax my previous classmate who moved to Australia, but I certainly emailed her.

                  We are not talking about a creator nor an accelerator, we are talking about an avenue (or a road if you will). When I use the internet, I am the creator, and the internet is the road that gets me there.

                  When I use an LLM it is doing something I can already do, but now I can do it without using my brain. So the feeling is much closer to doomscrolling on social media where previously I could just read a book or meet my pals at the pub. Doomscrolling Facebook is certainly faster than reading a book, or socializing at the pub. But it is a poor replacement for either.
        • whistle650 4 hours ago
          To understand the impact on computer programming per se, I find it useful to imagine that the first computer programs I had encountered were, somehow, expressed in a rudimentary natural language. That (somewhat) divorces the consideration of AI from its specific impact on programming. Surely it would have pulled me in certain directions. Surely I would have had less direct exposure to the mechanics of things. But, it seems to me that’s a distinction of degree, not of kind.
    • striking 22 hours ago
      It's not *just* brain atrophy, I think. I think part of it is that we're actively making a tradeoff to focus on learning how to use the model rather than learning how to use our own brains and work with each other.

      This would be fine if not for one thing: the meta-skill of learning to use the LLM depreciates too. Today's LLM is gonna go away someday, the way you have to use it will change. You will be on a forever treadmill, always learning the vagaries of using the new shiny model (and paying for the privilege!)

      I'm not going to make myself dependent, let myself atrophy, run on a treadmill forever, for something I happen to rent and can't keep. If I wanted a cheap high that I didn't mind being dependent on, there's more fun ones out there.
      • raducu 12 hours ago
        > let myself atrophy, run on a treadmill forever, for something

        You're lucky to afford the luxury not to atrophy.

        It's been almost 4 years since my last software job interview and I know the drills about preparing for one.

        Long before LLMs, my skills were already atrophying naturally in my day job.

        I remember the good old days of J2ME, of writing everything from scratch. Or writing some graph editor for university, or some speculative Huffman coding algorithm.

        That kept me sharp.

        But today I feel like I'm living in that Netflix series about people being in Hell and the Devil tricking them into believing they're in Heaven and tormenting them: how on planet Earth do I keep sharp with java, streams, virtual threads, rxjava, tuning the jvm, react, kafka, kafka streams, aws, k8s, helm, jenkins pipelines, CI-CD, ECR, istio issues, in-house service discovery, hierarchical multi-regions, metrics and monitoring, autoscaling, spot instances and multi-arch images, multi-az, reliable and scalable yet as cheap as possible, yet as cloud native as possible, hazelcast and distributed systems, low level postgresql performance tuning, apache iceberg, trino, various in-house frameworks and idioms over all of this? Oh, and let's not forget the business domain, coding standards, code reviews, mentorships and organizing technical events. Also, it's 2026 so nobody hires QA or scrum masters anymore, so take on those hats as well.

        So LLMs it is, the new reality.
        • aftergibson 11 hours ago
          This is a very good point. Years ago working in a LAMP stack, the term LAMP could fully describe your software engineering, database setup and infrastructure. I shudder to think of the acronyms for today's tech stacks.
          • oldandboring 3 hours ago
            And yet many of the same people who lament the tooling bloat of today will, in a heartbeat, make lame jokes about PHP. Most of them aren't even old enough to have ever done anything serious with it, or seen it in action beyond Wordpress or some spaghetti-code one-pager they had to refactor at their first job. Then they show up on HN with a vibe-coded side project or blog post about how they achieved a 15x performance boost by inventing server-side rendering.
        • carimura 8 hours ago
          Ya I agree it's totally crazy.... but, do most app deployments need even half that stuff? I feel like most apps at most companies can just build an app and deploy it using some modern paas-like thing.
          • KronisLV 7 hours ago
            > I feel like most apps at most companies can just build an app and deploy it using some modern paas-like thing.

            Most companies (in the global, not SV sense) would be well served by an app that runs in a Docker container in a VPS somewhere and has PostgreSQL and maybe Garage, RabbitMQ and Redis if you wanna get fancy, behind Apache2/Nginx/Caddy.

            But obviously that's not Serious Business™ and won't give you zero downtime and high availability.

            Though tbh most mid-size companies would also be okay with Docker Swarm or Nomad and the same software clustered and running behind HAProxy.

            But that wouldn't pad your CV so yeah.
            • ryandrake 3 hours ago
              > Most companies (in the global, not SV sense) would be well served by an app that runs in a Docker container in a VPS somewhere and has PostgreSQL and maybe Garage, RabbitMQ and Redis if you wanna get fancy, behind Apache2/Nginx/Caddy.

              That's still too much complication. Most companies would be well served by a native .EXE file they could just run on their PC. How did we get to the point where applications by default came with all of this shit?
              • danans 2 hours ago
                > That's still too much complication. Most companies would be well served by a native .EXE file they could just run on their PC

                I doubt that.

                As software has grown from solving simple personal computing problems (write a document, create a spreadsheet) to solving organizational problems (sharing and communication within and without the organization), it has necessarily spread beyond the .exe file and local storage.

                That doesn't give a pass to overly complex applications doing a simple thing - that's a real issue - but to think most modern company problems could be solved with just a local executable program seems off.
              • direwolf20 2 hours ago
                When I was in primary school, the librarian used a computer this way, and it worked fine. However, she had to back it up daily or weekly onto a stack of floppy disks, and if she wanted to serve the students from the other computer on the other side of the room, she had to restore the backup on there, and remember which computer had the latest data, and only use that one. When doing a stock-take (scanning every book on the shelves to identify lost books), she had to bring that specific computer around the room in a cart. Such inconveniences are not insurmountable, but they're nice to get rid of. You don't need to back up a cloud service and it's available everywhere, even on smaller devices like your phone.

                There's an intermediate level of convenience. The school did have an IT staff (of one person) and a server and a network. It would be possible to run the library database locally in the school but remotely from the library terminals. It would then require the knowledge of the IT person to administer, but for the librarian it would be just as convenient as a cloud solution.
                • badsectoracula 21 minutes ago
                  I think the 'more than one user' alternative to a 'single EXE on a single computer' isn't the multilayered pie of things that KronisLV mentioned, but a PHP script[0] on an apache server[0] you access via a web browser. You don't even need a dedicated DB server as SQLite will do perfectly fine.

                  [0] or similarly easy to get running equivalent
              • KronisLV 2 hours ago
                > How did we get to the point where applications by default came with all of this shit?

                Because when you give your clients instructions on how to set up the environment, they will ignore some of them and then they install OracleJDK while you have tested everything under OpenJDK and you have no idea why the application is performing so much worse in their environment: https://blog.kronis.dev/blog/oracle-jdk-and-openjdk-compatibility-is-broken

                It's not always trivial to package your entire runtime environment unless you wanna push VM images (which is in many ways worse than Docker), so Docker is like the sweet spot for the real world that we live in - a bit more foolproof, the configuration can be ONE docker-compose.yml file, it lets you manage resource limits without having to think about cgroups, as well as storage and exposed ports, custom hosts records and all the other stuff the human factor in the process inevitably fucks up.

                And in my experience, shipping a self-contained image that someone can just run with docker compose up is infinitely easier than trying to get a bunch of Ansible playbooks in place.

                If your app can be packaged as an AppImage or Flatpak, or even a fully self contained .deb then great... unless someone also wants to run it on Windows or vice versa or any other environment that you didn't anticipate, or it has more dependencies than would be "normal" to include in a single bundle, in which case Docker still works at least somewhat.

                Software packaging and dependency management sucks, unless we all want to move over to statically compiled executables (which I'm all for). Desktop GUI software is another can of worms entirely, too.
          • oldandboring 3 hours ago
            When I come into a new project and I find all this... "stuff" in use, often what I later find is actually happening with a lot of it is:

            - nobody remembers why they're using it

            - a lot of it is pinned to old versions or the original configuration because the overhead of maintaining so much tooling is too much for the team and not worth the risk of breaking something

            - new team members have a hard time getting the "complete picture" of how the software is built and how it deploys and where to look if something goes wrong.
        • dullcrisp 2 hours ago
          That was on NBC.
      • daxfohl 21 hours ago
        Businesses too. For two years it's been "throw everything into AI." But now that shit is getting real, are they *really* feeling so coy about letting AI run ahead of their engineering team's ability to manage it? How long will it be until we start seeing outages that just don't get resolved because the engineers have lost the plot?
        • scorpioxy 15 hours ago
          From what I am seeing, no one is feeling coy, simply because of the cost savings that management is able to show the higher-ups and shareholders. At that level, there's very little understanding of anything technical, and outages or bugs will simply get a "we've asked our technical resources to work on it". But everyone understands that spending $50 when you were spending $100 is a great achievement. That's if you stop there and don't think about any downsides. Said management will then take the bonuses and disappear before the explosions start, with their resume glowing about all the cost savings and team leadership achievements. I've experienced this first hand very recently.
          • bgilroy26 3 hours ago
            There really ought to be a class of professionals like forensic accountants who can show up in a corrupted organization and do a post mortem on their management of technical debt
          • daxfohl 15 hours ago
            Of all the looming tipping points whereby humans could destroy the fabric of their existence, this one has to be the stupidest. And therefore the most likely.
        • throwup238 18 hours ago
          How long until “the LLM did it” is just as effective as “AWS is down, not my fault”?
          • sarchertech 6 hours ago
            Never because the only reason that works with Amazon is that everyone is down at the exact same time.
            • direwolf20 2 hours ago
              Everyone will suffer from slop code at the same time.
          • draxil 8 hours ago
            This to me is the point: LLMs can't be responsible for things. It sits with a human.
            • taylorius 3 hours ago
              Why can LLMs not be responsible for things? (genuine question - I'm not certain myself).
              • pvab3 2 hours ago
                because it doesn't have any skin in the game and can't be punished, and can't be rewarded for succeeding. Its reputation, career, and dignity are nonexistent.
                • direwolf20 2 hours ago
                  This doesn't seem to have stopped anyone before.
                  • pvab3 1 hour ago
                    Stopped anyone from doing what? Assigning responsibility to someone with nothing to lose, no dignity or pride, and immune from financial or social injury?
          • shaftoe 7 hours ago
            If you’re just a gladhander for an algorithm, what are you really needed for?
      • Aurornis 4 hours ago
        > the meta-skill of learning to use the LLM depreciates too. Today's LLM is gonna go away someday, the way you have to use it will change. You will be on a forever treadmill, always learning the vagaries of using the new shiny model (and paying for the privilege!)

        I haven't found this to be true at all, at least so far.

        As models improve I find that I can start dropping old tricks and techniques that were necessary to keep old models in line. Prompts get shorter with each new model improvement.

        It's not really a cycle where you're re-learning all the time or the information becomes outdated. The same prompt structure techniques are usually portable across LLMs.
        • rubenflamshep 2 hours ago
          Interesting, I've experienced the opposite in certain contexts. CC is so hastily shipped that new versions often imbalance existing workflows. E.g. people were raving about the new user prompt tools that CC used to get more context, but they messed up my simple git slash commands.
      • pards 7 hours ago
        > I happen to rent and can't keep

        This is my fear - what happens if the AI companies can't find a path to profitability and shut down?
        • thevillagechief 5 hours ago
          Don't threaten us with a good time.
        • satvikpendem 4 hours ago
          This is why local models are so important. Even if the non-local ones shut down, and even if you can't run local ones on your own hardware, there will still be inference providers willing to serve your requests.
        • MillionOClock 4 hours ago
          Recently I was thinking about how some (expensive) consumer electronics like the Mac Studio can run pretty powerful open source models with pretty efficient power consumption, that could pretty easily run on private renewable energy, and that are on most (all?) fronts much more powerful than the original ChatGPT, especially if connected to a good knowledge base. Meaning that aside from very extreme scenarios I think it is safe to say that there will always be a way not to go back to how we used to code, as long as we can offer the correct hardware and energy. Of course personally I think we will never need to go to such extreme ends... despite knowing of people who seem to seriously think developed countries will run out of electricity one day, which, while I reckon there might be tensions, seems like a laughable idea IMHO.
      • infecto 5 hours ago
        I think you have to be aware of how you use any tool but I don’t think this is a forever treadmill. It’s pretty clear to me since early on that the goal is for you the user to not have to craft the perfect prompt. At least for my workflow it’s pretty darn close to that for me.
      • prettyblocks 5 hours ago
        In my experience all technology has been like this though. We are on the treadmill of learning the new thing with or without LLMs. That's what makes tech work so fun and rewarding (for me anyway).
      • rurp 14 hours ago
        I have deliberately moderated my use of AI in large part for this reason. For a solid two years now I've been constantly seeing claims of "*this* model/IDE/Agent/approach/etc is the future of writing code! It makes me 50x more productive, and will do the same for you!" And inevitably those have all fallen by the wayside and been replaced by some new shiny thing. As someone who doesn't get intrinsic joy out of chasing the latest tech fad I usually move along and wait to see if whatever is being hyped really starts to take over the world.

        This isn't to say LLMs won't change software development forever, I think they will. But I doubt anyone has any idea what kind of tools and approaches everyone will be using 5 or 10 years from now, except that I really doubt it will be whatever is being hyped up at this exact moment.
        • apercu 7 hours ago
          HN is where I keep hearing the “50× more productive” claims the most. I’ve been reading 2024 annual reports and 2025 quarterlies to see whether any of this shows up on the other side of the hype.<p>So far, the only company making loud, concrete claims backed by audited financials is Klarna and once you dig in, their improved profitability lines up far more cleanly with layoffs, hiring freezes, business simplification, and a cyclical rebound than with Gen-AI magically multiplying output. AI helped support a smaller org that eliminated more complicated financial products that have edge cases, but it didn’t create a step-change in productivity.<p>If Gen-AI were making tech workers even 10× more productive at scale, you’d expect to see it reflected in revenue per employee, margins, or operating leverage across the sector.<p>We’re just not seeing that yet.
          • laserlight 6 hours ago
            I have friends who make such 50x productivity claims. They are correct if we define productivity as creating untested apps and games and their features that will never ship --- or be purchased, even if they were to ship. Thus, “productivity” has become just another point of contention.
            • apercu 3 hours ago
              100% agree. There are far more half-baked, incomplete "products" and projects out there now that it is easier to generate code. Generously, that doesn't necessarily equate to productivity.

              I agree that the last 10% of a project is the hardest part, and that's the part that Gen-AI sucks at (hell, maybe the last 30%).
          • sarchertech 6 hours ago
            > If Gen-AI were making tech workers even 10× more productive at scale, you'd expect to see it reflected in revenue per employee, margins, or operating leverage across the sector.

            If we're even just talking a 2x multiplier, it should show up in some externally verifiable numbers.
            • apercu 3 hours ago
              I agree, and we might be seeing this but there is so much noise, so many other factors, and we're in the midst of capital re-asserting control after a temporary loss of leverage which might also be part of a productivity boost (people are scared so they are working harder).

              The issue is that I'm not a professional financial analyst and I can't spend all day on comps so I can't tell through the noise yet if we're seeing even 2x related to AI.

              But, if we're seeing 10x, I'd be finding it in the financials. Hell, a blind squirrel would, and it's simply not there.
      • Kostic 7 hours ago
        I assume you're living in a city. You're already renting a lot of things from others (security, electricity, water, food, shelter, transportation); what is different with white collar work?
        • bondarchuk 6 hours ago
          > *the city gets destroyed*

          vs.

          > *a company goes bankrupt or pivots*

          I can see a few differences.
      • locknitpicker 12 hours ago
        > It's not just brain atrophy, I think. I think part of it is that we're actively making a tradeoff to focus on learning how to use the model rather than learning how to use our own brains and work with each other.

        I agree with the sentiment but I would have framed it differently. The LLM is a tool, just like code completion or a code generator. Right now we focus mainly on how to use a tool, the coding agent, to achieve a goal. This takes place at a strategic level. Prior to the inception of LLMs, we focused mainly on how to write code to achieve a goal. This took place at a tactical level, and required making decisions and paying attention to a multitude of details. With LLMs our focus shifts to a higher-level abstraction. Also, operational concerns change. When writing and maintaining code yourself, you focus on architectures that help you simplify some classes of changes. When using LLMs, your focus shifts to building context and helping the model effectively implement its changes. The two goals seem related, but are radically different.

        I think a fairer description is that with LLMs we stop exercising some skills that are only required or relevant if you are writing your code yourself. It's like driving with an automatic transmission vs manual transmission.
        • bandrami 11 hours ago
          Previous tools have been deterministic and understandable. I write code with emacs and can at any point look at the source and tell you why it did what it did. But I could produce the same program with vi or vscode or whatever, at the cost of some frustration. But they all ultimately transform keystrokes to a text file in largely the same way, and the compiler I'm targeting changes that to asm and thence to binary in a predictable and visible way.

          An LLM is always going to be a black box that is neither predictable nor visible (the unpredictability is necessary for how the tool functions; the invisibility is not but seems too late to fix now). So teams start cargo culting ways to deal with specific LLMs' idiosyncrasies and your domain knowledge becomes about a specific product that someone else has control over. It's like learning a specific office suite or whatever.
          • TeMPOraL 10 hours ago
            > *An LLM is always going to be a black box that is neither predictable nor visible (the unpredictability is necessary for how the tool functions; the invisibility is not but seems too late to fix now)*

            So basically, like a co-worker.

            That's why I keep insisting that anthropomorphising LLMs is to be embraced, not avoided, because it gives much better high-level, first-order intuition as to where they belong in a larger computing system, and where they shouldn't be put.
            • bandrami 10 hours ago
              > So basically, like a co-worker.

              Arguably, though I don't particularly need another co-worker. Also co-workers are not tools (except sometimes in the derogatory sense).
            • draxil 7 hours ago
              Sort of, except it seems the more the co-worker does the job, the more my ability to understand atrophies. So soon we'll all be that annoyingly ignorant manager saying, "I don't know, I want the button to be bigger". Yay?
          • ryanjshaw 9 hours ago
            > and can at any point look at the source and tell you why it did what it did

            Even years later? Most people can't unless there are good comments and design. Which AI can replicate, so if we need to do that anyway, how is AI especially worse than a human looking back at code written poorly years ago?
            • bandrami 8 hours ago
              I mean, Emacs's oldest source files are like 40 years old at this point, and yes they are in fact legible? I'm not sure what you're asking -- you absolutely can (and if you use it long enough, will) read the source code of your text editor.
              • draxil 7 hours ago
                Well especially the lisp parts!
        • koiueo 9 hours ago
          The little experience I have with LLMs confidently shows that LLMs are much better at navigating and modifying a well structured code base. And they struggle, sometimes to a point where they can't progress at all, if tasked to work on bad code. I mean, the kind of bad you always get after multiple rounds of unsupervised vibe coding.
    • nemothekid 20 hours ago
      I think I should write more about this, but I have been feeling very similar. I've recently been exploring using Claude Code/Codex as the "default", so I've decided to implement a side project.

      My gripe with AI tools in the past is that the kind of work I do is large and complex, and with previous models it just wasn't efficient to either provide enough context or deal with context rot when working on a large application - especially when that application doesn't have a million examples online.

      I've been trying to implement a multiplayer game with server authoritative networking in Rust with Bevy. I specifically chose Bevy as the latest version was after Claude's cutoff, it had a number of breaking changes, and there aren't a lot of deep examples online.

      Overall it's going well, but one downside is that I don't really understand the code "in my bones". If you told me tomorrow that I had to optimize latency or if there was a 1 in 100 edge case, not only would I not know where to look, I don't think I could tell you how the game engine works.

      In the past, I could not have ever gotten this far without really understanding my tools. Today, I have a semi functional game and, truth be told, I don't even know what an ECS is and what advantages it provides. I really consider this a huge problem: if I had to maintain this in production, if there was a SEV0 bug, am I confident enough I could fix it? Or am I confident the model could figure it out? Or is the model good enough that it could scan the entire code base and intuit a solution? One of these three questions has to be answered or else brain atrophy is a real risk.
      • bedrio 14 hours ago
        I'm worried about that too. If the error is reproducible, the model can eventually figure it out from experience. But a ghost bug that I can't find a pattern for? The model ends up in a "you're absolutely right" loop as it incorrectly guesses different solutions.
        • mattmanser 11 hours ago
          Are ghost bugs even real?

          My first job had the devs working front-line support years ago. Due to that, I learnt an important lesson in bug fixing.

          Always be able to re-create the bug first.

          There is no such thing as ghost bugs, you just need to ask the reporter the right questions.

          Unless your code is multi-threaded, to which I say, good luck!
          • chickensong 4 hours ago
            They're real at scale. Plenty of bugs don't surface until you're running under heavy load on distributed infrastructure. Often the culprit is low in the stack. Asking the reporter the right questions may not help in this case. You have full traces, but can't reproduce in a test environment.

            When the cause is difficult to source or fix, it's sometimes easier to address the effect by coding around the problem, which is why mature code tends to have some unintuitive warts to handle edge cases.
          • yencabulator 3 hours ago
            > Unless your code is multi-threaded, to which I say, good luck!

            What isn't multi-threaded these days? Kinda hard to serve HTTP without concurrency, and practically every new business needs to be on the web (or to serve multiple mobile clients; same deal).

            All you need is a database and web form submission and now you have a full distributed system in your hands.
            • direwolf20 2 hours ago
              nginx is single-threaded, but you're absolutely right — any concurrency leads to the same ghost bugs.
              • yencabulator 2 hours ago
                nginx is also from the era when fast static file serving was still a huge challenge, and "enough to run a business" for many purposes -- most software written has more mutable state, and much more potential for edge cases.
            • mattmanser 1 hour ago
              Only superficially so, await/async isn't usually like the old spaghetti multi-threaded code people used to write.
              • yencabulator 1 hour ago
                You mean in a single-threaded context like Javascript? (Or with Python GIL giving the impression of the same.) That removes some memory corruption races, but leaves all the logical problems in place. The biggest change is that you only have fixed points where interleaving can happen, limiting the possibilities -- but in either scenario, the number of possible paths is so big it's typically not human-accessible.

                Webdevs not aware of race conditions -> complex page fails to load. They're lucky in how the domain sandboxes their bugs into affecting just that one page.
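
                A minimal sketch of that kind of logical race, using a hypothetical shared counter and Python asyncio purely for illustration: both handlers read the same value, yield at an await point, and one update is lost.

                    import asyncio

                    counter = 0  # shared state, e.g. a row that two request handlers both update

                    async def handle_request():
                        global counter
                        current = counter        # read
                        await asyncio.sleep(0)   # await point: the other handler interleaves here
                        counter = current + 1    # write back a now-stale value

                    async def main():
                        await asyncio.gather(handle_request(), handle_request())
                        print(counter)  # prints 1, not 2: one update was silently lost

                    asyncio.run(main())

                Single-threaded async removes the memory-corruption class of races but keeps this logical class; the same read-modify-write shape against a database behind a web form loses updates the same way.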
          • SpicyLemonZest 11 hours ago
            Historically I would have agreed with you. But since the rise of LLM-assisted coding, I've encountered an increasing number of things I'd call clear "ghost bugs" in single threaded code. I found a fun one today where invoking a process four times with a very specific access pattern would cause a key result of the second invocation to be overwritten. (It is not a coincidence, I don't think, that these are exactly the kind of bugs a genAI-as-a-service provider might never notice in production.)
      • mh2266 16 hours ago
        > I've been trying to implement a multiplayer game with server authoritative networking in Rust with Bevy. I specifically chose Bevy as the latest version was after Claude's cutoff, it had a number of breaking changes, and there aren't a lot of deep examples online.

        I am interested in doing something similar (Bevy, not multiplayer).

        I had the thought that you ought to be able to provide a cargo doc or rust-analyzer equivalent over MCP? This... must exist?

        I'm also curious how you test if the game is, um... fun? Maybe it doesn't apply so much for a multiplayer game, I'm thinking of stuff like the enemy patterns and timings in a soulslike, Zelda, etc.

        I did use ChatGPT to get some rendering code for a retro RCT/SimCity-style terrain mesh in Bevy and it basically worked, though several times I had to tell it "yeah uh nothing shows up", at which point it said "of course! the problem is..." and then I learned about mesh winding, fine, okay... felt like I was in over my head and decided to go to a 2D game instead so didn't pursue that further.
        • nemothekid 15 hours ago
          > *I had the thought that you ought to be able to provide a cargo doc or rust-analyzer equivalent over MCP? This... must exist?*

          I've found that there are two issues that arise that I'm not sure how to solve. You can give it docs and point to it and it can generally figure out syntax, but the next issue I see is that without examples, it kind of just brute forces problems like a 14 year old.

          For example, the input system originally just let you move left and right, and it popped it into an observer function. As I added more and more controls, it became littered with more and more code, until it was a ~600 line function responsible for a large chunk of game logic.

          While trying to parse it I then had it refactor the code - but I don't know if the current code is idiomatic. What would be the cargo doc or rust-analyzer equivalent for good architecture?

          I'm running into this same problem when trying to use Claude Code for internal projects. Some parts of the codebase just have really intuitive internal frameworks and Claude Code can rip through them and provide great idiomatic code. Others are bogged down by years of tech debt and performance hacks and Claude Code can't be trusted with anything other than multi-paragraph prompts.

          > *I'm also curious how you test if the game is, um... fun?*

          Lucky enough for me this is a learning exercise, so I'm not optimizing for fun. I guess you could ask Claude Code to inject more fun.
          • azrazalea_debt 5 hours ago
            > What would be the cargo doc or rust-analyzer equivalent for good architecture?

            Well, this is where you still need to know your tools. You should understand what ECS is and why it is used in games, so that you can push the LLM to use it in the right places. You should understand idiomatic patterns in the languages the LLM is using. Understand YAGNI, SOLID, DDD, etc etc.

            Those are where the LLMs fall down, so that's where you come in. The individual lines of code after being told what architecture to use and what is idiomatic is where the LLM shines.
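
            For what it's worth, the core ECS idea is small: entities are just IDs, components are plain data keyed by those IDs, and systems are functions that run over every entity carrying a given set of components. A rough sketch in Python, purely illustrative (not Bevy's actual API):

                from dataclasses import dataclass

                @dataclass
                class Position:
                    x: float
                    y: float

                @dataclass
                class Velocity:
                    dx: float
                    dy: float

                class World:
                    def __init__(self):
                        self.next_id = 0
                        self.components = {}  # component type -> {entity id -> instance}

                    def spawn(self, *components):
                        eid = self.next_id
                        self.next_id += 1
                        for c in components:
                            self.components.setdefault(type(c), {})[eid] = c
                        return eid

                    def query(self, *types):
                        # yield (entity id, components...) for entities holding all requested types
                        stores = [self.components.get(t, {}) for t in types]
                        for eid in list(stores[0]) if stores else []:
                            if all(eid in s for s in stores):
                                yield (eid, *(s[eid] for s in stores))

                def movement_system(world, dt):
                    # a "system" is just a function over every (Position, Velocity) pair
                    for _, pos, vel in world.query(Position, Velocity):
                        pos.x += vel.dx * dt
                        pos.y += vel.dy * dt

                world = World()
                world.spawn(Position(0.0, 0.0), Velocity(1.0, 2.0))
                movement_system(world, dt=0.016)

            The advantage is that data layout and behavior stay decoupled: adding a new behavior is a new system over existing components, not a change to a class hierarchy.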
            • nemothekid 2 hours ago
              What you describe is how I use LLM tools today, but the reason I am approaching my project in this way is that I feel I need to brace myself for a future where developers are expected to "know your tools".

              When I look around today, it's clear more and more people are diving head first into fully agentic workflows, and I simply don't believe they can churn out 10k+ lines of code today and be intimately familiar with the code base. Therefore you are left with two futures:

              * Agentic-heavy SWEs will eventually blow up under the weight of all their tech debt

              * Coding models are going to continue to get better, to the point where tech debt won't matter.

              If the answer is (1), then I do not need to change anything today. If the answer is (2), then you need to prepare for a world where almost all code is written by an agent, but almost all responsibility is shouldered by you.

              In kind of an ignorant way, I'm actually avoiding trying to properly learn what an ECS is and how the engine is structured, as sort of a handicap. If in the future I'm managing a team of engineers (however that looks) who are building a metaphorical Tower of Babel, I'd like to develop a heuristic for navigating that mountain.
      • storystarling 9 hours ago
        I ran into similar issues with context rot on a larger backend project recently. I ended up writing a tool that parses the AST to strip out function bodies and only feeds the relevant signatures and type definitions into the prompt.

        It cuts down the input tokens significantly, which is nice for the monthly bill, but I found the main benefit is that it actually stops the model from getting distracted by existing implementation details. It feels a bit like overengineering but it makes reasoning about the system architecture much more reliable when you don't have to dump the whole codebase into the context window.
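
        A minimal version of that idea, sketched with Python's standard ast module (a toy illustration, not the parent's actual tool): parse a module, drop every function body, and unparse just the signatures plus docstrings.

            import ast

            def strip_bodies(source: str) -> str:
                """Return source with function/method bodies replaced by '...',
                keeping signatures, class structure, and docstrings."""
                tree = ast.parse(source)
                for node in ast.walk(tree):
                    if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
                        new_body = []
                        if ast.get_docstring(node) is not None:
                            new_body.append(node.body[0])  # keep the docstring statement
                        new_body.append(ast.Expr(value=ast.Constant(value=...)))
                        node.body = new_body
                return ast.unparse(tree)  # requires Python 3.9+

            # hypothetical usage: feed the stripped skeleton into the prompt
            print(strip_bodies(open("some_module.py").read()))

        Run over the files a task touches, this hands the model the shape of the codebase (signatures, types, class structure) without the implementation noise.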
      • jv22222 4 hours ago
        > I don't really understand the code "in my bones".

        Man, I absolutely hate this feeling.
    • krupan 20 hours ago
      I've been thinking along these lines. LLMs seem to have arrived right when we were all getting addicted to reels/TikToks/whatever. For some reason we love to swipe, swipe, swipe, until we get something funny/interesting/shocking, that gives us a short-lasting dopamine hit (or whatever chemical it is) that feels good for about 1 second, and we want MORE, so we keep swiping.

      Using an LLM is almost exactly the same. You get the occasional "wow! I've never seen it do that before!" moments (whether that thing it just did was even useful or not), get a short hit of feel goods, and then we keep using it trying to get another hit. It keeps providing them at just the right intervals to keep people going, just like TikTok does.
      • neves 6 hours ago
        It's exactly the argument here:

        https://www.fast.ai/posts/2026-01-28-dark-flow/
    • CharlieDigital 19 hours ago
      I ran into a new problem today: "reading atrophy".

      As in if the LLM doesn't know about it, some devs are basically giving up and not even going to RTFM. I literally had to explain to someone today how something works by... reading through the docs and linking them the docs with screenshots and highlighted paragraphs of text.

      Still got push back along the lines of "not sure if this will work". It's. Literally. In. The. Docs.
      • finaard 12 hours ago
        That's not really a new thing now, it just shows differently.

        15 years ago I was working in an environment where they had lots of Indians as cheap labour - and the same thing will show up in any environment where you go for hiring a mass of cheap people while looking more at the cost than at qualifications: you pretty much need to trick them into reading stuff that is relevant.

        I remember one case where one had a problem they couldn't solve, and couldn't give me enough info to help remotely. In the end I was sitting next to them, and made them read anything showing up on the screen out loud. Took a few tries where they were just closing dialog boxes without reading them, but eventually we had that under control enough that they were able to read the error messages to me, and then went "Oh, so _that's_ the problem?!"

        Overall, interacting with an LLM feels a lot like interacting with one of them back then, even down to the same excuses ("I didn't break anything in that commit, that test case was never passing") - and my expectation for what I can get out of it is pretty much the same as back then, and my approach to interacting with it is pretty similar. It's pretty much an even cheaper unskilled developer, you just need to treat it as such. And you don't pair it up with other unskilled developers.
      • acessoproibido 5 hours ago
        As someone working in technical support for a long time, this has always been the case.

        You can have as many extremely detailed and easy-to-parse guides, references, etc. as you want; there will always be a portion of customers who refuse to read them.

        Never could figure out why, because they aren't stupid or anything.
      • globular-toast 11 hours ago
        The mere existence of the phrase "RTFM" shows that this phenomenon was already a thing. LLMs are the worst thing to happen to people who couldn't read before. When HR type people ask what my "superpower" is I'm so tempted to say "I can read", because I honestly feel like it's the only difference between me and people who suck at working independently.
    • overfeed 16 hours ago
      > Eventually it was easier just to quit fighting it and let it do things the way it wanted.

      I wouldn't have believed it a few tears ago if you told me the industry would one day, in lockstep, decide that shipping more tech-debt is awesome. If the unstated bet doesn't pay off, that is, AI development will outpace the rate it generates cruft, then there will be hell to pay.
      • ithkuil 15 hours ago
        Don't worry. This will create the demand for even more powerful models that are able to untangle the mess created by previous models.

        Once we realize the kind of mess _those_ models created, well, we'll need even more capable models.

        It's a variation on the theme of Kernighan's insight about how the more "clever" you are while coding, the harder it will be to debug.

        EDIT: Simplicity is a way out, but it's hard under normal circumstances; now, with this kind of pressure to ship fast because the colleague with the AI chimp can outperform you, aiming at simplicity will require some widespread understanding.
        • bandrami 10 hours ago
          "That's the brilliant part: when the winter comes the apes freeze to death!"
      • scorpioxy 15 hours ago
        As someone who's been commissioned many times before to work on or salvage "rescue projects" with huge amounts of tech debt, I welcome that day. Still not there yet, though I am starting to feel the vibes shifting.

        This isn't anything new of course. Previously it was with projects built by looking for the cheapest bidder and letting them loose on an ill-defined problem. And you can just imagine what kind of code that produced. Except the scale is much larger.

        My favorite example of this was a project that simply stopped working due to the amount of bugs generated from layers upon layers of bad code that was never addressed. That took around 2 years of work to undo. Roughly 6 months to un-break all the functionality and 6 more months to clean up the core and then start building on top.
        • sally_glance12 hours ago
          Are you not worried that the sibling comment is right and the solution to this will be "more AI" in the future? So instead of hiring a team of human experts to clean up, management might just dump more money into some specialized AI refactoring platform or hire a single AI coordinator... Or maybe they skip straight to rebuilding with AI, because AI is good at greenfield. Then they only need a specialized migration AI to automate the regular switchovers.

          I used to be unconcerned, but I admit to being a little frightened of the future now.
          • scorpioxy10 hours ago
            Well, in general worrying about the future is not useful. Regardless of what you think, it is always uncertain. I specifically stay away from taking part in such speculative threads here on HN.<p>What&#x27;s interesting to me though is that very similar promises were being made about AI in the 80s. Then came the &quot;AI Winter&quot; after the hype cycle and promises got very far from reality. Generative AI is the current cycle and who knows, maybe it can fulfill all the promises and hype. Or maybe not.<p>There&#x27;s a lot of irrationality currently and until that settles down, it is difficult to see what is real and useful and what is smoke and mirrors.
      • e12e9 hours ago
        &gt; ... few tears ago<p>Brilliant. Even if it was a typo.
      • TeMPOraL11 hours ago
        The industry decided that <i>decades ago</i>. We may like to talk about quality and forethought, but when you actually go to work, you quickly discover it doesn&#x27;t matter. Small companies tell you &quot;we gotta go fast&quot;, large companies demand clear OKRs and focusing on actually delivering impact - either way, no one cares about tech debt, because they see it as unavoidable fact of life. Even more so now, as ZIRP went away and no one can afford to pay devs to polish the turd ad infinitum. The mantra is, ship it and do the next thing, clean up the old thing if it ever becomes a problem.<p>And guess what, I&#x27;m finally convinced they&#x27;re right.<p>Consider: it&#x27;s been that way for decades. We may tell ourselves good developers write quality code given the chance, but the truth is, the median programmer is a junior with &lt;5 years of experience, and they cannot write quality code to save their life. That&#x27;s purely the consequence of <i>rapid growth of software industry itself</i>. ~all production code in the past few decades was written by juniors, it continues to be so today; those who advance to senior level end up mostly tutoring new juniors instead of coding.<p>Or, all that put another way: tech debt is not <i>wrong</i>. It&#x27;s a tool, a trade-off. It&#x27;s perfectly fine to be loaded with it, if taking it lets you move forward and earn enough to afford paying installments when they&#x27;re due. Like with housing: you&#x27;re better off buying it with lump payment, or off savings in treasury bonds, but few have that money on hand and life is finite, so people just get a mortgage and move on.<p>--<p>Edited to add: There&#x27;s a silver lining, though. LLMs make tech debt <i>legible and quantifiable</i>.<p>LLMs are affected by tech debt <i>even more</i> than human devs are, because (currently) they&#x27;re dumber, they have less cognitive capability around abstractions and generalizations[0]. They make up for it by working much faster - which is a curse in terms of amplifying tech debt, but also a blessing, because you can <i>literally see them slowing down</i>.<p>Developer productivity is hard to measure in large part because the process is invisible (happens in people&#x27;s heads and notes), and cause-and-effect chains play out over weeks or months. LLM agents compress that to <i>hours to days</i>, and the process itself is laid bare in the chat transcript, easy to inspect and analyze.<p>The way I see it, LLMs will finally allow us to turn software development at tactical level from art into an engineering process. Though it might be too late for it to be of any use to human devs.<p>--<p>[0] - At least the out-of-distribution ones - quirks unique to particular codebase and people behind it.
      • daxfohl15 hours ago
        > unstated bet

        (except where it's been stated, championed, enforced, and ultimated in no uncertain terms by every executive in the tech industry)
        • overfeed14 hours ago
          I've yet to encounter an AI bull who admits to the LLM tendency toward creating tech debt, outside of footnotes stating it can be fixed by better prompting (with no examples) or solved by whatever tool they are selling.
      • naasking4 hours ago
        > I wouldn't have believed it a few tears ago if you told me the industry would one day, in lockstep, decide that shipping more tech-debt is awesome.

        It's not debt if you never have to pay it back. If a model can regenerate a whole reliable codebase in minutes from a spec, then your assessment of "tech debt" in that output becomes meaningless.
    • nonethewiser1 hour ago
      > Like there have been multiple times now where I wanted the code to look a certain way, but it kept pulling back to the way it wanted to do things. Like if I had stated certain design goals recently it would adhere to them, but after a few iterations it would forget again and go back to its original approach, or mix the two, or whatever. Eventually it was easier just to quit fighting it and let it do things the way it wanted.

      Absolutely. At a certain level of usage, you just have to let it do its thing.

      People are going to take issue with that. You absolutely don't have to let it do its thing. In that case you have to be way more in the loop. Which isn't necessarily a bad thing.

      But assuming you want it to basically do everything while you direct it, it becomes pointless to manage certain details. One thing in my experience is that Claude always wants to use ReactRouter. My personal preference is TanStack Router, so I asked it to use that initially. That never really created any problems, but after like the 3rd time of realizing I forgot to specify it, I also realized that it's totally pointless. ReactRouter works fine and Claude uses it fine; it's pointless to specify otherwise.
    • gritspants21 hours ago
      My disillusionment comes from the feeling I am just cosplaying my job. There is nothing to distinguish one cosplayer from another. I am just doordashing software, at this point, and I&#x27;m not in control.
      • FitchApps4 hours ago
        100% there... it's getting to a point where a project manager reports a bug AND also pastes a response from Claude (he ran Claude against our codebase) on how to fix the bug. Like I'm just copying what Claude said and making sure the thing compiles (.NET). What lets me sleep at night, for now, is the fact that Claude isn't handling 9pm deployments or AWS infra support... it's already writing code but not supporting it yet...
      • phito9 hours ago
        What kind of software are you writing? Are you just a &quot;code monkey&quot; implementing perfectly described Jira tickets (no offense meant)? I cannot imagine feeling this way with what I&#x27;m working on, writing code is just a small part of it, most of the time is spent trying to figure out how to integrate the various (undocumented and actively evolving) external services involved together in a coherent, maintainable and resilient way. LLMs absolutely cannot figure this out themselves, I have to figure it out myself and then write it all in its context, and even then it mostly comes up with sub-par, unmaintainable solutions if I wasn&#x27;t being precise engouh.<p>They are amazing for side projects but not for serious code with real world impact where most of the context is in multiple people&#x27;s head.
        • gritspants4 hours ago
          No, I am not a code monkey. I have an odd role working directly for an exec in a highly regulated industry, managing their tech pursuits&#x2F;projects. The work can range from exciting to boring depending on the business cycle. Currently it is quite boring, so I&#x27;ve leaned into using AI a bit more just to see how I like it. I don&#x27;t think that I do.
      • solumunus12 hours ago
        I don't get this at all. I'm using LLMs all day and I'm constantly having to make smart architectural choices that other, less experienced devs won't be making. Are you just prompting and going with whatever the initial output is, letting the LLM make decisions? Every moderately sized task should start with a plan. I can spend hours planning, going off and thinking, coming back to the plan and adding/changing things, etc. Sometimes it will be days before I tell the LLM to "go". I'm also constantly optimising the context available to the LLM, and making more specific skills to improve results. It's very clear to me that knowledge and effort are still crucial to good long term output... Not everyone will get the same results; in fact everyone is NOT getting the same results, as you can see by reading the wildly different feedback on HN. To some, LLMs are a force multiplier, while others claim they can't get a single piece of decent output...

        I think the way you're using these tools that makes you feel this way is a choice. You're choosing to not be in control and to do as little as possible.
        • Otterly997 hours ago
          Exactly.

          Once you start using it intelligently, the results can be really satisfying and helpful. People complaining about 1000 lines of code being generated? Ask it to generate functions one at a time and make small implementations. People complaining about having to run a linter? Ask it to automatically run it after each code execution. People complaining about losing track? Have it log every modification in a file.

          I think you get my point. You need to treat it as a super powerful tool that can do so many things that you have to guide it if you want a result that conforms to what you have in mind.
        • rustyhancock10 hours ago
          One challenge is: are those decisions making tangible differences?

          We won't know until the code being produced, especially greenfield code, hits any kind of maturity, which is at least 5+ years out.
          • mlrtime8 hours ago
            It's not that challenging; the answer is, it depends.

            It's like a junior dev writing features for a product every day vs a principal engineer. The junior might be adding a feature with O(n^2) performance while the principal has seen this before and writes it O(log n).

            If the feature never reaches significance, the "better" solution doesn't matter, but it might!

            The principal may write it once and it is solid and never touched, but the junior might be good enough to never need coming back to; same with an LLM and the right operator.
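            A minimal, hypothetical Python sketch of the kind of difference I mean (not anyone's real code), checking IDs against a big list:

              import bisect

              ids = list(range(0, 1_000_000, 3))  # stand-in for a big list of known IDs

              # Junior-style: linear scan, O(n) per lookup (O(n*q) over q lookups).
              def has_id_linear(target: int) -> bool:
                  for i in ids:
                      if i == target:
                          return True
                  return False

              # Principal-style: sort once, then binary search, O(log n) per lookup.
              sorted_ids = sorted(ids)

              def has_id_bisect(target: int) -> bool:
                  pos = bisect.bisect_left(sorted_ids, target)
                  return pos < len(sorted_ids) and sorted_ids[pos] == target

              # Same answers either way; the difference only matters once the list
              # (or the number of lookups) gets big.
              print(has_id_linear(999_999), has_id_bisect(999_999))

            Both are "correct"; whether the second was worth the extra thought depends entirely on whether the feature ever sees real load.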
            • rustyhancock1 hour ago
              There's that, but I actually think LLMs are becoming very good at not making the bad simple choice.

              What they're worse at is the bits I can't easily see.

              An example: I was recently working on a project building a library with Claude. The code, in pieces, all looked excellent.

              When I wrote some code making use of it, several functions which were conceptually similar had signatures that were subtly mismatched.

              Different programmers might have picked any of those patterns, and would probably have applied them consistently across the various projects they worked on.

              To an LLM they are just happenstance; it feels no friction.

              A real project with real humans writing the code would notice the mismatch, even if they weren't working on those parts at the same time, just from working on it across, say, a weekend.

              But how many more decisions do we make convenient only for us meat bags that an LLM doesn't notice?
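              To make that concrete, here is an invented Python flavour of the mismatch (not from the actual project), where each function looks fine on its own:

                from dataclasses import dataclass
                from typing import Optional

                @dataclass
                class User:
                    id: str
                    name: str

                _USERS = {"u1": User("u1", "Ada")}
                _ORDERS = {"o1": {"id": "o1", "total": 42}}
                _INVOICES = {"i1": {"id": "i1", "amount": 42}}

                def get_user(user_id: str) -> Optional[User]:
                    # Convention 1: return None when missing.
                    return _USERS.get(user_id)

                def get_order(order_id: str) -> dict:
                    # Convention 2: raise KeyError when missing.
                    return _ORDERS[order_id]

                def fetch_invoice(invoice_id: str, default: Optional[dict] = None) -> Optional[dict]:
                    # Convention 3: caller-supplied fallback, and "fetch" instead of "get".
                    return _INVOICES.get(invoice_id, default)

              Each piece reads as excellent code in review, but a caller touching all three now needs three different error-handling styles for the same conceptual operation.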
          • solumunus8 hours ago
            What? Of course it makes a difference when I direct it away from a bad solution towards a good solution. I know as soon as I review the output whether it has done what I asked, or it hasn't and I make a correction. Why would I need to wait 5 years? That makes no sense; I can see the output.

            If you're using LLMs and you don't know what good/bad output looks like then of course you're going to have problems, but such a person would have the same problems without the LLM...
            • rustyhancock2 hours ago
              The problem is that LLMs are exceptionally good at producing output that appears good.

              That's what they've ultimately been tuned to do.

              The way I see this play out is output that satisfies me but that I would not have produced myself.

              Over a large project that adds up, and it's typically glaringly obvious to everyone but the person who was using the LLM.

              My only guess as to why is that most of what we do, and why we do it, we're not conscious of. The threshold at which we'd intervene is higher than the original effort it takes to do the right thing.

              If these things don't apply to you, then I think you're coming up on a golden era.
    • InfinityByTen10 hours ago
      I find the atrophy and zoning out or context switching problematic, because it takes a few seconds/minutes in "thinking" and then BAM! I have 500 lines of all sorts of buggy and problematic code to review, and a sycophantic, not-mature-enough entity to coax into correcting it.

      At some point, I find myself needing to disconnect out of overwhelm and frustration. Faster responses aren't necessarily better. I want more observability in the development process so that I can be a party to it. I have really felt that I need to orchestrate multiple agents working in tandem, playing sort of bad-cop, good-cop, with maybe a third trying to moderate that discussion and a fourth to effectively incorporate a human in the mix. But that's too much to integrate in my day job.
    • ekropotin2 hours ago
      The solution for brain atrophy I personally arrived at is to use coding agents at work, where, let's be honest, velocity is a top priority and code purity doesn't matter that much. Since we use a stack I'm super familiar with, I can verify the produced code quite fast and tweak it if needed.

      However, for hobby projects where I purposely use tech I'm not very familiar with, I force myself not to use LLMs at all, even as a chat. Thus, operating the old way - writing code manually, reading documentation, etc. - brings the joy of learning back and, hopefully, establishes new neuron connections.
    • dkubb2 hours ago
      You could probably combat this somewhat with a skill that references examples of the code you don't want and the code you do. Then, each time you tell it to correct the code, you ask it to put that example into the references.

      You then tell your agent to always run that skill prior to moving on. If the examples are pattern-matchable you can even have the agent write custom lints, if your linter supports extension, or even write a poor man's linter using ast-grep.

      I usually have a second session running that is mainly there to audit the code and help me add and adjust skills, while I keep the main session on the task of working on the feature. I've found it far easier to stay engaged this way than by context switching between unrelated tasks.
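      If the codebase is Python, the same "poor man's linter" idea also works with nothing but the stdlib ast module; a rough sketch (the banned names here are just placeholders for whatever patterns you keep rejecting):

        import ast
        import sys

        BANNED_CALLS = {"eval", "exec"}  # stand-ins for the patterns the agent keeps reaching for

        def check(path: str) -> list[str]:
            # Walk the syntax tree and flag direct calls to any banned name.
            source = open(path, encoding="utf-8").read()
            tree = ast.parse(source, filename=path)
            problems = []
            for node in ast.walk(tree):
                if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
                    if node.func.id in BANNED_CALLS:
                        problems.append(f"{path}:{node.lineno}: avoid {node.func.id}()")
            return problems

        if __name__ == "__main__":
            issues = [msg for p in sys.argv[1:] for msg in check(p)]
            print("\n".join(issues))
            sys.exit(1 if issues else 0)

      Wire that (or the ast-grep equivalent) into the skill's "always run before moving on" step and the agent gets immediate, mechanical feedback instead of you relying on it remembering your preferences.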
    • amluto15 hours ago
      I’ve actually found the tool that inspires the most worry about brain atrophy to be Copilot. Vscode is full of flashing suggestions all over. A couple days ago, I wanted to write a very quick program, and it was basically impossible to write any of it without Copilot suggesting a whole series of ways to do what it thought I was doing. And it seems that MS wants this: the obvious control to turn it off is actually just “snooze.”<p>I found the setting and turned it off for real. Good riddance. I’ll use the hotkey on occasion.
      • mlrtime8 hours ago
        Yes! I spent more time trying to figure out how to turn off those garbage Copilot suggestions than I did editing this 5-year-old Python program.

        I use Claude daily, no problems with it. But vscode + Copilot suggestions was garbage!
    • SenHeng3 hours ago
      Another thing I've experienced is scope creep into the average. Both Claude and ChatGPT keep making recommendations and suggestions that turn the original request into something that resembles other typical features. Sometimes that's a good thing, because it means I've missed something. A lot of the time, especially when I'm just riffing on ideas, it turns into something mundane and ordinary and I'll have lost my earlier train of thought.

      A quick example is trying to build a simple expenses app with it. I just want to store a list of transactions with it. I've already written the types and data model and just need the AI to give me the plumbing. And it will always end up inserting recommendations about double-entry bookkeeping.
      • fragmede3 hours ago
        yeah but that&#x27;s like recommending a webserver for your Internet facing website. If you want to give an example of scope creep, you need a better example than double entry book keeping for an accounting app.
        • SenHeng2 hours ago
          You’ve just illustrated exactly the problem. You assumed I was building an accounting app. I’ve experienced the same issue with building features for calculating the brightness of a room, or 3D visualisations of brightness patterns, managing inventory and cataloguing lighting fixtures and so on.<p>It’s great for churning out stuff that already exists, but that also means it’ll massage your idea into one of them.
    • chickensong3 hours ago
      &gt; AI keeps pushing it in a direction I didn&#x27;t want<p>The AI definitely has preferences and attention issues, but there are ways to overcome them.<p>Defining code styles in a design doc, and setting up initial examples in key files goes a long way. Claude seems pretty happy to follow existing patterns under these conditions unless context is strained.<p>I have pretty good results using a structured workflow that runs a core loop of steps on each change, with a hook that injects instructions to keep attention focused.
    • sosomoxie18 hours ago
      I&#x27;ve gone years without coding and when I come back to it, it&#x27;s like riding a bike! In each iteration of my coding career, I have become a better developer, even after a large gap. Now I can &quot;code&quot; during my gap. Were I ever to hand-code again, I&#x27;m sure my skills would be there. They don&#x27;t atrophy, like your ability to ride a bike doesn&#x27;t atrophy. Yes you may need to warm back up, but all the connections in your brain are still there.
      • Ronsenshi11 hours ago
        You might still have the skillset to write code, but depending on length of the break your knowledge of tools, frameworks, patterns would be fairly outdated.<p>I used to know a person like that - high in the company structure who would claim he was a great engineer, but all the actual engineers would make jokes about him and his ancient skills during private conversations.
        • withinboredom10 hours ago
          I’d push back on this framing a bit. There&#x27;s a subtle ageism baked into the assumption that someone who stepped away from day-to-day coding has &quot;ancient skills&quot; worth mocking.<p>Yes, specific frameworks and tooling knowledge atrophy without use, and that’s true for anyone at any career stage. A developer who spent three years exclusively in React would be rusty on backend patterns too. But you’re conflating current tool familiarity with engineering ability, and those are different things.<p>The fundamentals: system design, debugging methodology, reading and reasoning about unfamiliar code, understanding tradeoffs ... those transfer. Someone with deep experience often ramps up on new stacks faster than you’d expect, precisely because they’ve seen the same patterns repackaged multiple times.<p>If the person you’re describing was genuinely overconfident about skills they hadn’t maintained, that’s a fair critique. But &quot;the actual engineers making jokes about his ancient skills&quot; sounds less like a measured assessment and more like the kind of dismissiveness that writes off experienced people before seeing what they can actually do.<p>Worth asking: were people laughing because he was genuinely incompetent, or because he didn’t know the hot framework of the moment? Because those are very different things.
          • Ronsenshi10 hours ago
            This has nothing to do with ageism. This applies to any person of any age who has an ego big enough to think that their knowledge of the industry is still relevant after a prolonged break, and who is socially inept enough to brag about how they are still "in".

            I don't disagree with your point about fundamentals, but in an industry where there seems to be a new JS framework any time somebody sneezes, the latest tools are very much relevant too. And of course the big thing is language changes. The events I'm describing happened in the late 00s-early 10s, when language updates picked up steam: Python, JS, PHP, C++. Somebody who used C++98 can't claim to have up-to-date knowledge of C++ in 2015.

            So to answer your question: people were laughing at his ego, not the fact that he didn't know some hot new framework.
            • withinboredom9 hours ago
              I beg to differ. I started with C in the 90s, then C# in &#x27;05, then PHP in &#x27;12, then Go in &#x27;21. The things I learned in C still apply to Go, C#, and PHP. And I even started contributing to open source C projects in &#x27;24 ... all my skills and knowledge were still relevant. This sounds exactly like ageism to me, but I clearly have a different perspective than you.
              • Ronsenshi6 hours ago
                Yes, we clearly have different perspectives. I observed an arrogant person who despite their multi-year break from engineering of any kind strongly believed that they still were as capable as engineers who remained in the field during that time.<p>Maybe you had to be there.
        • sosomoxie5 hours ago
          I code in Vim, use Linux... all of those tools are pretty constant. New frameworks are easy to pick up. I&#x27;ve been able to become productive with very little downtime after multi-year breaks several times.
      • runarberg13 hours ago
        Have you ever learnt a foreign language (say Mongolian, or Danish) and then never spoken it, nor even read anything in it, for over 10 years? It is not like riding a bike, it doesn't just come back like that. You have to actually relearn the language, practice it, and you will suck at it for months. Comprehension comes first (within weeks) but you will be speaking with grammatical errors, mispronunciations, etc. for much longer. You won't have to learn the language from scratch, second time around is much easier, but you *will* have to put in the effort. And if you use Google Translate instead of your brain, you won't relearn the language at all. You will simply forget it.
        • sosomoxie5 hours ago
          I have not and I&#x27;m actually really bad at learning human languages, but know a dozen programming languages. You would think they would be similar, but for some reason it&#x27;s really easy for me to program in any language and really hard for me to pick up a human language.
          • Miraste3 hours ago
            Learning human languages is not a similar process to learning programming languages at all. I&#x27;ve never been sure why so many people think it is.
            • runarberg2 hours ago
              I provided it as a counter-example to the learning-how-to-bike myth.

              Learning how to ride a bike requires only a handful of skills, most of them located in the motor control centers of your brain (mostly the cerebellum), which is known to retain skills much better than any other part of your brain. Your programming skills are comprised of thousands of separate skills which mostly live in your cortex (largely the frontal and temporal lobes), and learning a foreign language is basically that, but more (like 10x more).

              So while a foreign language is not the perfect analogy (nothing is), I think it is a reasonable analogy as a counter-example to the bicycle myth.
        • tayo4212 hours ago
          Anecdotally, I burned out pretty hard and basically didn't open a text editor for half a year (unemployed too). Eventually I got an itch to write code again and it didn't really feel like I was any worse. Maybe it wasn't long enough for atrophy, but code doesn't seem to work quite like language, in my experience.
          • Ronsenshi11 hours ago
            Six months is definitely not long enough of a break for skills to degrade. But it&#x27;s not just skills, as I wrote in another comment, the biggest thing is knowledge of new tools, new versions of language and its features.<p>I&#x27;d say there&#x27;s at most around 2 years of knowledge runtime (maybe with all this AI stuff this is even shorter). After that period if you don&#x27;t keep your knowledge up to date it fairly quickly becomes obsolete.
            • runarberg4 hours ago
              I would imagine there is probably some reverse S-curve of skill loss going on. The first year you may retain like 90% (and the 10% you lose are obscure words, rare grammar structures, expressions, etc.), then over the next 2 years you lose more and more every year, and by the 3rd year you've lost like 50% of the language, including some common words and useful grammar structures, while retaining common greetings, basic structures, etc. Then after about year 5 the regression starts to slow down, and by year 10 you may still know 20%, but it is the most basic stuff, and you won't be able to use the language in any meaningful way.
    • kitd5 hours ago
      I think this is where tools like OpenSpec [1] may help. The deterioration in quality happens because the context is degrading, often due to incomplete or ambiguous requests from the coder. With a more disciplined way of creating *and persisting locally* the specs for the work, especially if the agent got involved in creating them too, you'll have a much better chance of keeping the agent focused and aligned.

      [1] - https://openspec.dev/
    • zamalek19 hours ago
      &gt; I worry about the &quot;brain atrophy&quot; part, as I&#x27;ve felt this too. And not just atrophy, but even moreso I think it&#x27;s evolving into &quot;complacency&quot;.<p>Not trusting the ML&#x27;s output is step one here, that keeps you intellectually involved - but it&#x27;s still a far cry from solving the majority of problems yourself (instead you only solve problems ML did a poor job at).<p>Step two: I delineate interesting and uninteresting work, and Claude becomes a pair programmer <i>without keyboard access</i> for the latter - I bounce ideas off of it etc. making it an intelligent rubber duck. [Edit to clarify, a caveat is that] I do not bore myself with trivialities such as retrieving a customer from the DB in a REST call (but again, I do verify the output).
      • bandrami10 hours ago
        &gt; I do not bore myself with trivialities such as retrieving a customer from the DB in a REST call<p>Genuine question, why isn&#x27;t your ORM doing that? I see a lot of use cases for LLMs that seem to be more expensive ways to do snippets and frameworks...
    • freediver20 hours ago
      My experience is the opposite - I haven't used my brain this much in a while. Typing characters was never what developers were valued for anyway. The joy of building is back too.
      • mlrtime8 hours ago
        100% the same. I had brain fog before the LLMs; I got tired of reading new docs over and over again for new languages. I became a manager and lost it all.

        Now back to IC with 25+ years of experience + LLM = god mode, and it's fun again.
      • swader99920 hours ago
        Same. I feel I need to be way more into the domain and what the user is trying to do than ever before.
    • alansaber7 hours ago
      "I wanted the code to look a certain way, but it kept pulling back to the way it wanted to do things."

      I would argue this is OK for front-end. For back-end? Very, very bad; if you can't get usable output, do it by hand.
      • phrotoma7 hours ago
        &quot;rip it out&quot; is a phrase I&#x27;ve been saying more often to the robots.
    • epolanski20 hours ago
      &gt; Like if I had stated certain design goals recently it would adhere to them, but after a few iterations it would forget again and go back to its original approach, or mix the two, or whatever.<p>Context management, proper prompting and clear instructions, proper documentation are still relevant.
    • abm539 hours ago
      My advice: keep it on a tight leash.<p>In the happy case where I have a good idea of the changes necessary, I will ask it to do small things, step by step, and examine what it does and commit.<p>In the unhappy case where one is faced with a massive codebase and no idea where to start, I find asking it to just “do the thing” generates slop, but enough for me to use as inspiration for the above.
    • polytely19 hours ago
      I feel like I'm still a couple of steps behind my lead in skill level and am trying to gain more experience, so I do wonder if I am shooting myself in the foot if I rely too much on AI at this stage. The senior engineer I'm trying to learn from can use AI very effectively because he has very good judgement of code quality; I feel like if I use AI too much I might lose out on the chance to improve my own judgement. It's a hard dilemma.
    • seer16 hours ago
      Honestly, this seems very much like the jump from being an individual contributor to being an engineering manager.

      The time it happened for me was rather abrupt, with no training in between, and the feeling was eerily similar.

      You know _exactly_ what the best solution is, you talk to your reports, but they have minds of their own, as well as egos, and they do things … their own way.

      At some point I stopped obsessing over details and was just giving guidance and direction, only in the cases where it really mattered or when asked, and let people make their own mistakes.

      Now, LLMs don't really learn on their own or anything, but the feeling of "letting go of small trivial things" is sort of similar. You concentrate on the bigger picture, and if it chose to do an iterative for loop instead of using a functional approach the way you like it … well, the tests still pass, don't they.
      • Ronsenshi11 hours ago
        The only issue is that as an engineering manager you reasonably expect that the team learns new things, improve their skills, in general grow as engineers. With AI and its context handling you&#x27;re working with a team where each member has severe brain damage that affects their ability to form long term memories. You can rewire their brain to a degree teaching them new &quot;skills&quot; or giving them new tools, but they still don&#x27;t actually learn from their mistakes or their experiences.
        • mlrtime8 hours ago
          As a manager I would encourage them to use the LLM tools. I would also encourage unit tests, e2e testing, test coverage, CI pipelines automating the testing, automatic PR reviewing, etc.

          It's also about peeking at the big/impactful changes and ignoring the small ones.

          Your job isn't to make sure they don't have "brain damage"; it's to keep them productive and not shipping mistakes.
        • dysoco6 hours ago
          Being optimistic (or pessimistic heh), if things keep the trend then the models will evolve as well and will probably be quite better in one year than they are now.
    • keeganpoppen3 hours ago
      yeah, because the thing is: at the end of the day: laying things out the way LLMs can understand is becoming more important than doing them the “right” way— a more insidious form of the same complacency. and one in which i am absolutely complicit.
    • SpaceL10n5 hours ago
      LLMs are yet another layer between us and the end result. I remain wary of this distance and am super grateful I learned coding the hard way.
    • > I want to say it's very akin to doom scrolling. Doom tabbing? It's like, yeah I could be more creative with just a tad more effort, but the AI is already running and the bar to seeing what the AI will do next is so low, so....

      Yeah, exactly. We are just waiting for it to finish, and after it finishes, then what? We ask it to do new things again.

      Just as with doom scrolling: we watch something for a minute, then scroll down and watch something new again.

      The whole notion of progress feels completely fake with this. I guess I was in a bubble of time where I had always ended up using AI in the web browser (ever since ChatGPT 3 came out), and my workflow didn't change because it was free; it only changed recently when some new free services dropped.

      "Doom-tabbing", or completely out-of-the-loop agentic AI programming, just feels really weird to me and sucks the joy out, and I wouldn't even consider myself someone particularly interested in writing code, since I had been using AI to write code for a long time.

      I think the problem for me is that I always considered myself a computer tinkerer before a coder. So when AI came for coding, my tinkering skills were given a boost (I could make projects of curiosity I couldn't before), but now, with AI agents working in this autonomous-esque way, it has come for my tinkering too. I feel replaced, or at least feel like my ability to tinker, my interests, my knowledge, and my experience just aren't taken into account if an AI agent will write the whole codebase across multiple files, run commands, and then deploy it straight to a website.

      My point is that tinkering was an active hobby; now it's becoming a passive one. Doom-tinkering? I feel like I caught on to this feeling a bit early, just going by vibe, but is it just me who feels this?

      What could be a name for what I feel?
    • mupuff123413 hours ago
      He didn&#x27;t say &quot;brain atrophy&quot;, he was talking about coding abilities.
    • nathias8 hours ago
      it&#x27;s not about brain atrophy, it&#x27;s skill atrophy
    • stuaxo22 hours ago
      LLMs have some terrible patterns. Don't know what to do? Just chuck a class named Service in.

      You have to really look out for the crap.
  • atonse1 day ago
    &gt; LLM coding will split up engineers based on those who primarily liked coding and those who primarily liked building.<p>I’ve always said I’m a builder even though I’ve also enjoyed programming (but for an outcome, never for the sake of the code)<p>This perfectly sums up what I’ve been observing between people like me (builders) who are ecstatic about this new world and programmers who talk about the craft of programming, sometimes butting heads.<p>One viewpoint isn’t necessarily more valid, just a difference of wiring.
    • ryandrake1 day ago
      I noticed the same thing, but wasn&#x27;t able to put it into words before reading that. Been experimenting with LLM-based coding just so I can understand it and talk intelligently about it (instead of just being that grouchy curmudgeon), and the thought in the back of my mind while using Claude Code is always:<p>&quot;I got into programming because I like programming, not whatever this is...&quot;<p>Yes, I&#x27;m building stupid things faster, but I didn&#x27;t get into programming because I wanted to build tons of things. I got into it for the thrill of defining a problem in terms of data structures and instructions a computer could understand, entering those instructions into the computer, and then watching victoriously while those instructions were executed.<p>If I was intellectually excited about telling something to do this for me, I&#x27;d have gotten into management.
      • viccis20 hours ago
        Same. This kind of coding feels like it got rid of the building aspect of programming that always felt nice, and it replaced it entirely with business logic concerns, product requirements, code reviews, etc. All the stuff I can generally take or leave. It&#x27;s like I&#x27;m always in a meeting.<p>&gt;If I was intellectually excited about telling something to do this for me, I&#x27;d have gotten into management.<p>Exactly this. This is the simplest and tersest way of explaining it yet.
        • zigman17 hours ago
          Maybe I don't entirely get it, but what is stopping you from just continuing to code?
          • nfgrep5 hours ago
            Speaking for myself, speed. I’d be noticeably slower than my peers if I was crafting code by hand all day.
        • taytus9 hours ago
          Because you are not coding, you are building. I&#x27;ve been coding since I was 7 years old, now I&#x27;m building.
          • mlrtime8 hours ago
            I&#x27;d go one step higher, we&#x27;re not builders, we&#x27;re problem solvers.<p>Sometimes the problem needs building, sometimes not.<p>I&#x27;m an Engineer, I see a problem and want to solve it. I don&#x27;t care if I have to write code, have a llm build something new, or maybe even destroy something. I want to solve the problem for the business and move to the next one, most of the time it is having a llm write code though.
      • nunez20 hours ago
        Same same. Writing the actual code is always a huge motivator behind my side projects. Yes, producing the outcome is important, but the journey taken to get there is a lot of fun for me.

        I used Claude Code to implement an OpenAI 4o-vision powered receipt scanning feature in an expense tracking tool I wrote by hand four years ago. It did it in two or three shots while taking my codebase into account.

        It was very neat, and it works great [^0], but I can't latch onto the idea of writing code this way. Powering through bugs while implementing a new library or learning how to optimize my test suite in a new language is thrilling.

        Unfortunately (for me), it's not hard at all to see how the "builders" that see code as a means to an end would LOVE this, and businesses want builders, not crafters.

        In effect, knowing the fundamentals is getting devalued at a rate I've never seen before.

        [^0] Before I used Claude to implement this feature, my workflow for processing receipts looked like this: tap iOS Shortcut, enter the amount, snap a pic of the receipt, type up the merchant, amount and description for the expense, then have the shortcut POST that to my expenses tracking toolkit, which then POSTs it into a Google Sheet. This feature removed the need for me to enter the merchant and amount. Unfortunately, it often took more time to confirm that the merchant, amount and date details OpenAI provided were correct (and correct them when they were wrong, which was most of the time) than it did to type out those details manually, so I just went back to my manual workflow. However, the temptation to just glance at the details and tap "This looks correct" was extremely high, even when the info it generated was completely wrong! It's the perfect analogue to what I've been witnessing throughout the rise of the LLMs.
      • polishdude2022 hours ago
        What I have enjoyed about programming is being able to get the computer to do exactly what I want. The possibilities are bounded by only what I can conceive in my mind. I feel like with AI that can happen faster.
        • testaccount2822 hours ago
          &gt; get the computer to do exactly what I want.<p>&gt; with AI that can happen faster.<p>well, not <i>exactly</i> that.
          • polishdude2019 hours ago
            For simple things it can. But then for more complex things, that's where I step in.
        • chrisjj10 hours ago
          Have you an example of getting a coding chatbot to do exactly what you want?
          • simonw9 hours ago
            https://gisthost.github.io/?a41ce6304367e2ced59cd237c576b817/page-001.html - which built https://github.com/datasette/datasette-transactions exactly the way I wanted it to be built
            • thefaux1 hour ago
              The examples that you and others provide are always fundamentally uninteresting to me. Many, if not most, are some variant of a CRUD application. I have yet to see a single AI-generated thing that I personally wanted to use and/or spend time with. I also can't help but wonder what we might have accomplished if we had devoted the same amount of resources to developing better tools, languages and frameworks for developers instead of automating the generation of boilerplate and selling developers' own skills back to them. Imagine if open source maintainers had instead been flooded with billions of dollars in capital. What might be possible?

              And also, the capacities of LLMs are almost beside the point. I don't use LLMs, but I have no doubt that for any arbitrary problem that can be expressed textually and is computable in finite time, in the limit as time goes to infinity, an LLM will be able to solve it. The more important and interesting questions are what we _should_ build with LLMs and what we should _not_ build with them. These arguments about capacity distract from those more important questions.
              • simonw1 hour ago
                Considering how much time developers spend building uninteresting CRUD applications I would argue that if all LLMs can do is speed that process up they&#x27;re already worth their weight in bytes.<p>The impression I get from this comment is that <i>no example</i> would convince you that LLMs are worthwhile.
            • audience_mem9 hours ago
              The problem with replying to the proof-demanders is that they&#x27;ll always pick it apart and find some reason it doesn&#x27;t fit their definition. You must be familiar with that at this point.
              • chrisjj8 hours ago
                Worse, they might even attempt to verify your claims, e.g. "When AI 'builds a browser,' check the repo before believing the hype" https://www.theregister.com/2026/01/26/cursor_opinion/
            • chrisjj8 hours ago
              &gt; exactly the way I wanted it to be built<p>You verified each line?
              • simonw4 hours ago
                I looked closely enough to confirm there were no architectural mistakes or nasty gotchas. It&#x27;s code I would have been happy to write myself, only here I got it written on my phone while riding the BART.
              • mlrtime8 hours ago
                What? Why would you want to?<p>See this is a perfect example of OPs statement! I don&#x27;t care about the lines, I care about the output! It was never about the lines of code.<p>Your comment makes it very clear there are different viewpoints here. We care about problem-&gt;solution. You care about the actual code more than the solution.
                • chrisjj7 hours ago
                  &gt; I don&#x27;t care about the lines, I care about the output! It was never about the lines of code.<p>&gt; Your comment makes it very clear there are different viewpoints here.<p>Agreed.<p>I care that code output not include leaked secrets, malware installation, stealth cryptomining etc.<p>Some others don&#x27;t.
          • audience_mem9 hours ago
            Is this a joke? Are you genuinely implying that no one has ever got an LLM to write code that does exactly what they want?
            • chrisjj8 hours ago
              No. Mashing up other peoples&#x27; code scraped from the web is not what I&#x27;d call writing code.
              • audience_mem8 hours ago
                Can you not see how you truly, deep down, are afraid you might be wrong?<p>It&#x27;s clouding your vision.
      • smhinsey16 hours ago
        This gets at the heart of the quality-of-results issues a lot of people are talking about elsewhere here. Right now, if you treat them as a system where you tell it what you want and it does it for you, you're building a sandcastle. If, instead, you also describe the correct data structures and appropriate algorithms to use against them, as well as the particulars of how you want the problem solved, it's a different situation altogether. Like most systems, the quality of output is in some way determined by the quality of input.

        There is a strange insistence on not helping the LLM arrive at the best outcome in the subtext to this question a lot of the time. I feel like we are living through the John Henry legend in real time.
      • thepasch11 hours ago
        &gt; I got into it for the thrill of defining a problem in terms of data structures and instructions a computer could understand, entering those instructions into the computer, and then watching victoriously while those instructions were executed.<p>You can still do that with Claude Code. In fact, Claude Code works best the more granular your instructions get.
        • chrisjj10 hours ago
          &gt; Claude Code works best the more granular your instructions get.<p>So best feed it machine code?
      • atonse22 hours ago
        Funny you say that. Because I have never enjoyed management as much as being hands on and directly solving problems.<p>So maybe our common ground is that we are direct problem solvers. :-)
        • Ronsenshi11 hours ago
          For some reason this makes me think of a jigsaw puzzle. People usually complete these puzzles because they enjoy the process, and at the end you get a picture that you can frame if you want to. Some people just seem to want the resulting picture, with no interest in the process at all.

          I guess that's the same people who went to all those coding camps during their heyday because they heard about software engineering salaries. They just want the money.
          • direwolf207 hours ago
            When I last bought a Lego Technic set because I wanted to play with making mechanisms with gears and stuff, I assembled it according to the instructions, which was fun, and then the final result was also cool and I couldn&#x27;t bear to dismantle it.
    • lelanthran52 minutes ago
      &gt; &gt; LLM coding will split up engineers based on those who primarily liked coding and those who primarily liked building.<p>&gt; I’ve always said I’m a builder even though I’ve also enjoyed programming (but for an outcome, never for the sake of the code)<p>&gt; This perfectly sums up what I’ve been observing between people like me (builders) who are ecstatic about this new world and programmers who talk about the craft of programming, sometimes butting heads.<p>That&#x27;s one take, sure, but it&#x27;s a specially crafted one to make you feel good about your position in this argument.<p>The counter-argument is that LLM coding splits up engineers based on those who primarily like engineering and those who like managing.<p>You&#x27;re obviously one of the latter. I, OTOH, prefer engineering.
    • addisonj1 day ago
      IMO, this isn't entirely a "new world" either; it is just a new domain where the conversation amplifies the opinions even more (weird how that is happening in a lot of places).

      What I mean by that: you had compiled vs interpreted languages, you had typed vs untyped, testing strategies; all of that, at least in some part, was a conversation about the tradeoffs between moving fast/shipping and maintainability.

      But it isn't just tech, it is also in the methodologies and the words we use, from "build fast and break things" and "yagni" to "design patterns" and "abstractions".

      As you say, it is a different viewpoint... but my biggest concern with where we are as an industry is that these are not just "equally valid" viewpoints on how to build software... they are quite literally different stages of software that, AFAICT, pretty much all successful software has to go through.

      Much of my career has been spent in teams at companies with products that are undergoing the transition from "hip app built by scrappy team" to "profitable, reliable software" and it is *painful*. Going from something where you have 5 people who know all the ins and outs and can fix serious bugs or ship features in a few days to something that has easy clean boundaries to scale to 100 engineers of a wide range of familiarities with the tech, the problem domain, skill levels, and opinions is just really hard. I am not convinced yet that AI will solve the problem, and I am also unsure it doesn't risk making it worse (at least in the short term).
      • dpflan22 hours ago
        “””

        Much of my career has been spent in teams at companies with products that are undergoing the transition from "hip app built by scrappy team" to "profitable, reliable software" and it is painful. Going from something where you have 5 people who know all the ins and outs and can fix serious bugs or ship features in a few days to something that has easy clean boundaries to scale to 100 engineers of a wide range of familiarities with the tech, the problem domain, skill levels, and opinions is just really hard. I am not convinced yet that AI will solve the problem, and I am also unsure it doesn't risk making it worse (at least in the short term)

        “””

        This perspective is crucial. Scale is the great equalizer / demoralizer: scale of the org and scale of the systems. Systems become complex quickly, and verifiability of correctness and function becomes harder. For companies built from day one with AI, with AI influencing them as they scale, where does complexity begin to run up against the limitations of AI and cause regression? Or, if all goes well, amplification?
    • coffeeaddict123 hours ago
      But how can you be a responsible builder if you don't trust the LLMs to do the "right thing"? Suppose you're the head of a software team where you've picked the best candidates for a given project; in that scenario I can see how one is able to *trust* the team members to orchestrate the implementation of your ideas and intentions, without you being intimately familiar with the details. Can we place the same trust in LLM agents? I'm not sure. Even if one could somehow prove that LLMs are very reliable, the fact that AI agents aren't *accountable* beings renders the whole situation vastly different from the human equivalent.
      • handoflixue12 hours ago
        Trust but verify:<p>I test all of the code I produce via LLMs, usually doing fairly tight cycles. I also review the unit test coverage manually, so that I have a decent sense that it really is testing things - the goal is less perfect unit tests and more just quickly catching regressions. If I have a lot of complex workflows that need testing, I&#x27;ll have it write unit tests and spell out the specific edge cases I&#x27;m worried about, or setup cheat codes I can invoke to test those workflows out in the UI&#x2F;CLI.<p>Trust comes from using them often - you get a feeling for what a model is good and bad at, and what LLMs in general are good and bad at. Most of them are a bit of a mess when it comes to UI design, for instance, but they can throw together a perfectly serviceable &quot;About This&quot; HTML page. Any long-form text they write (such as that About page) is probably trash, but that&#x27;s super-easy to edit manually. You can often just edit down what they write: they&#x27;re actually decent writers, just very verbose and unfocused.<p>I find it similar to management: you have to learn how each employee works. Unless you&#x27;re in the Top 1%, you can&#x27;t rely on every employee giving 110% and always producing perfect PRs. Bugs happen, and even NASA-strictness doesn&#x27;t bring that down to zero.<p>And just like management, some models are going to be the wrong employee for you because they think your style guide is stupid and keep writing code how they think it should be written.
      • inerte23 hours ago
        You don't simply put a body in a seat and get software. There are entire systems enabling this trust: college, resumes, samples, referrals, interviews, tests and CI, monitoring, mentoring, and performance feedback.

        And accountability can still exist. Is the engineer that created or reviewed a Pull Request using Claude Code less accountable than one that used PICO?
        • coffeeaddict123 hours ago
          &gt; And accountability can still exist? Is the engineer that created or reviewed a Pull Request using Claude Code less accountable then one that used PICO?<p>The point is that in the human scenario, you can hold the human agents accountable. You cannot do that with AI. Of course, you as the orchestrator of agents will be accountable to someone, but you won&#x27;t have the benefit of holding your &quot;subordinates&quot; accountable, which is what you do in a human team. IMO, this renders the whole situation vastly different (whether good or bad I&#x27;m not sure).
          • polishdude2021 hours ago
            You can switch to another LLM provider or stop using them altogether. It&#x27;s even easier than firing a developer.
            • ipaddr21 hours ago
              It is as easy as getting rid of Microsoft Teams at your org.
        • chrisjj19 hours ago
          Of course he is - because he invested so much less.
    • giancarlostoro5 hours ago
      Yeah, I think this is a bit of insight I had not realized / been able to word correctly yet. There are developers who can let Claude go at it and be fearless about it, like me (though I mostly do it for side projects, but WOW), and then there are developers who will use it like a hammer or axe to help cut down or mold whatever is in their path.

      I think both approaches are okay; the biggest thing for me is that the former needs to test way more and review the code more. As developers we don't read code enough; with the "prompt and forget" approach we have a lot of free time we could spend reading the code and asking the model to refactor and refine it. I am shocked when I hear about hundreds of thousands of lines in some projects. I've rebuilt Beads from the ground up and I'm under 10 lines of code.

      So we're going to have various levels of AI Code Builders, if you will: Junior, Mid, Senior, Architect. I don't know if models will pick up the slack for Juniors any time soon. We would need massive context windows for models, and who will pay for that? We need a major AI breakthrough where the cost goes down drastically before that becomes profitable.
    • chrisjj10 hours ago
      > > LLM coding will split up engineers based on those who primarily liked coding and those who primarily liked building.

      This is much less significant than the fact that LLMs split engineers into those who primarily like quality v. those who primarily like speed.
      • chickensong2 hours ago
        That split has always existed. LLMs can be used on either side of the divide.
        • chrisjj1 hour ago
          We see a ton of &quot;AI let me code a program X faster than ever before.&quot;<p>We see almost no &quot;AI let me code a program X better than ever before.&quot;
          • chickensong10 minutes ago
            I can&#x27;t argue that. The scale was already imbalanced as well, and vibe coding has lowered the bar even more, so the gap will continue to grow for now.<p>I&#x27;m just saying that LLMs aren&#x27;t causing the divide. Accelerating yes, but I think simply equating AI usage to poor quality is wrong. Craftsmen now have a powerful tool as well, to analyze, nitpick, and refactor in ways that were previously difficult to justify.<p>It also seems premature for so many devs to jump to hardline &quot;AI bad&quot; stances. So far the tech is improving quite well. We may not be able to 1-shot much of quality yet, but it remains to be seen if that will hold.<p>Personally, I have hopes that AI will eventually push code quality much higher than it&#x27;s ever been. I might be totally wrong of course, but to me it feels logical that computers would be very good at writing computer programs once the foundation is built.
          • Philpax35 minutes ago
            See this episode of Oxide and Friends, where they discuss just that: <a href="https:&#x2F;&#x2F;oxide-and-friends.transistor.fm&#x2F;episodes&#x2F;engineering-rigor-in-the-llm-age" rel="nofollow">https:&#x2F;&#x2F;oxide-and-friends.transistor.fm&#x2F;episodes&#x2F;engineering...</a>
    • senderista21 hours ago
      Maybe there&#x27;s an intermediate category: people who like designing software? I personally find system design more engaging than coding (even though I enjoy coding as well). That&#x27;s different from just producing an opaque artifact that seems to solve my problem.
    • nfgrep5 hours ago
      I’ve heard something similar: “there are people who enjoy the process, and people who enjoy the outcome”. I think this saying comes moreso from artistic circles.<p>I’ve always considered myself a “process” person, I would even get hung-up on certain projects because I enjoyed crafting them so much.<p>LLM’s have taken a bit of that “process” enjoyment from me, but I think have also forced some more “outcome” thinking into my head, which I’m taking as a positive.
    • concats11 hours ago
      I remember leaving university going into my first engineering job, thinking &quot;Where is all the engineering? All the problem solving and building complex systems? All the math and science? Have I been demoted to a lowly programmer?&quot;<p>Took me a few years to realize that this wasn&#x27;t a universal feeling, and that many others found the programming tasks more fulfilling than any challenging engineering. I suppose this is merely another manifestation of the same phenomenon.
    • mkozlows1 day ago
      I think he&#x27;s really getting at something there. I&#x27;ve been thinking about this a lot (in the context of trying to understand the persistent-on-HN skepticism about LLMs), and the framing I came up with[1] is top-down vs. bottom-up dev styles, aka architecting code and then filling in implementations, vs. writing code and having architecture evolve.<p>[1] <a href="https:&#x2F;&#x2F;www.klio.org&#x2F;theory-of-llm-dev-skepticism&#x2F;" rel="nofollow">https:&#x2F;&#x2F;www.klio.org&#x2F;theory-of-llm-dev-skepticism&#x2F;</a>
      • jamauro13 hours ago
        I like this framing. Nice typography btw, a pleasure to read.
    • verdverm1 day ago
      I think the division is more likely tied to writing. You have to fundamentally change how you do your job, from one of writing a formal language for a compiler to one of writing natural language for a junior-goldfish-memory-allstar-developer, closer to management than to contributor.<p>This distinction to me separates the two primary camps.
    • bjackman9 hours ago
      There&#x27;s more to it than just coding Vs building though.<p>For a long time in my career now I&#x27;ve been in a situation where I&#x27;d be able to build more if I was willing to abstract myself and become a slide-merchant&#x2F;coalition-builder. I don&#x27;t want to do this though.<p>Yet, I&#x27;m still quite an enthusiastic vibe-coder.<p>I think it&#x27;s less about coding Vs building and more about tolerance for abstraction and politics. And I don&#x27;t think there are that many people who are so intolerant of abstraction that they won&#x27;t let agents write a bunch of code for them.
    • codyb16 hours ago
      I think there&#x27;s a place for both.<p>We have services deployed globally serving millions of customers where rigor is really important.<p>And we have internal users who&#x27;re building browser extensions with AI that provide valuable information about the interface they&#x27;re looking at, including links to the internal record management and key metadata that&#x27;s affecting content placement.<p>These tools could be handed out on Zip drives in the street and it would just show our users some of the metadata already being served up to them, but it&#x27;s amazing to strip out 75% of the process for certain things and just have our user (in this case it&#x27;s one user who is driving all of this, so it does take some technical inclination) build out these tools that save our editors so much time. Doing this before would have meant months and months of discovery, coordination, and designs that probably wouldn&#x27;t actually be as useful in the end, after the wants of the user are diluted through 18 layers of process.
    • netcraft5 hours ago
      Agree completely. I used to be (and still would love to be) a process person, enjoying hand-writing bulletproof artisanal code. Switching to startups many years ago gave me a whole new perspective, and it&#x27;s been interesting to watch the struggle between writing code and shipping, especially when you don&#x27;t know how long the code you are writing will actually live. LLMs are fantastic in that space.
    • asimovDev11 hours ago
      To me this is similar to car enthusiasm. Some people absolutely love to build their project car; it&#x27;s a major part of the hobby for them. Others just love the experience of driving, so they buy ready-made cars or just pay someone to work on the car.
      • stevenhuang10 hours ago
        Alternatively, others just want to get to their destination.
    • jimbokun1 day ago
      The new LLM centered workflow is really just a management job now.<p>Managers and project managers are valuable roles and have important skill sets. But there&#x27;s really very little connection with the role of software development that used to exist.<p>It&#x27;s a bit odd to me to include both of these roles under a single label of &quot;builders&quot;, as they have so little in common.<p>EDIT: this goes into more detail about how coding (and soon other kinds of knowledge work) is just a management task now: <a href="https:&#x2F;&#x2F;www.oneusefulthing.org&#x2F;p&#x2F;management-as-ai-superpower&#x2F;" rel="nofollow">https:&#x2F;&#x2F;www.oneusefulthing.org&#x2F;p&#x2F;management-as-ai-superpower...</a>
      • simianwords22 hours ago
        I don&#x27;t disagree. At some point LLMs might become good enough that we wouldn&#x27;t need exact technical expertise.
    • slaymaker19071 day ago
      I enjoy both and have ended up using AI a lot differently than vibe coders. I rarely use it for generating implementations, but I use it extensively for helping me understand docs&#x2F;apis and more importantly, for debugging. AI saves me so much time trying to figure out why things aren’t working and in code review.<p>I deliberately avoid full vibe coding since I think doing so will rust my skills as a programmer. It also really doesn’t save much time in my experience. Once I have a design in mind, implementation is not the hard part.
    • greenie_beans6 hours ago
      makes sense if you are a data scientist where people need to be boxed into tidy little categories. but some people probably fall into both categories.
    • monkaiju17 hours ago
      So far I haven&#x27;t seen it actually be effective at &quot;building&quot; in a work context with any complexity, and this despite some on our team desperately trying to make that the case.
      • FeepingCreature12 hours ago
        I have! You have to be realistic about the projects. The more irreducible local context it needs, the less useful it will be. Great for greenfield code, oneshots, write once read once run for months.
      • barrell16 hours ago
        Agreed. I don’t care for engineering or coding, and would gladly give it up the moment I can. I’m also running a one-man business where every hour counts (and where I’m responsible for maintaining every feature).<p>The fact of the matter is LLMs produce lower quality at higher volumes in more time than it would take to write it myself, and I’m a very mediocre engineer.<p>I find this separation of “coding” vs “building” so offensive. It’s basically just saying some people are only concerned with “inputs”, while others with “outputs”. This kind of rhetoric is so toxic.<p>It’s like saying LLM art is separating people into people who like to scribble, and people who like to make art.
    • globular-toast11 hours ago
      I like building, but I don&#x27;t fool myself into thinking it can be done by taking shortcuts. You could build something that looks like a house for half the cost but it won&#x27;t be structurally sound. That&#x27;s why I care about the details. Someone has to.
    • &gt; I enjoy both and have ended up using AI a lot differently than vibe coders. I rarely use it for generating implementations, but I use it extensively for helping me understand docs&#x2F;apis and more importantly, for debugging. AI saves me so much time trying to figure out why things aren’t working and in code review.<p>I had felt like this and still do, but man, at some point the management churn feels real &amp; I feel like I am suffering from a new problem.<p>Suppose I actually end up having services literally deployed from a single prompt, nothing else. Earlier I used to have AI write code but I was interested in the deployment and everything around it; now there are services which do that really neatly for you (I also really didn&#x27;t give into the agent hype and mostly used browser LLMs).<p>Like on one hand you feel more free to build projects, but the whole joy of the project got <i>completely</i> reduced.<p>I mean, I guess I am one of the junior devs, so to me AI writing code on topics I didn&#x27;t know&#x2F;prototyping felt awesome.<p>I mean I was still involved in, say, copy-pasting or looking at the code it generates, seeing the errors and sometimes trying things out myself. If AI is doing all that too, idk.<p>For some reason, recently I have been disinterested in AI. I have used it quite a lot for prototyping, but this <i>complete</i> out-of-the-loop programming just feels very off to me with recent services.<p>I also feel like there is this sense that if I pay for some AI thing, I have to maximally extract &quot;value&quot; out of it.<p>I guess the issue could be that I can give vague terms or a very small text file as input (like just do X alternative in Y lang), and I am now unable to understand the architectural decisions, and feel overwhelmed by it.<p>It&#x27;s probably gonna take either spec-driven development where I clearly define the architecture, or development like something I saw primagen do recently, where the AI will only manipulate the code of that particular function (I am imagining it for a file as well), and somehow I feel like it&#x27;s something that I could enjoy more, because right now it feels like I don&#x27;t know what I have built at times.<p>When I prototype with single-file projects using, say, the browser for funsies&#x2F;any idea, I get some idea of what the code uses, with its dependencies and function names from start&#x2F;end, even if I didn&#x27;t look at the middle.<p>A bit of a ramble I guess, but the thing which is making me feel this is that I was talking to somebody and showcasing some service where AI + a server is there, and they asked for something in a prompt and I wrote it.
Then I let it do its job, but I was also thinking how I would architect it (it was some detect-food-and-then-find-BMR thing, and I was thinking first to use any api, but then I thought that meh, it might be hard, why not use AI vision models, okay what&#x27;s the best, gemini seems good&#x2F;cheap)<p>and I went to the coding thing to see what it did, and it actually went even beyond by using the free tier of gemini (which I guess didn&#x27;t end up working, could be some rate limit on my own key, but honestly it would&#x27;ve been the thing I would&#x27;ve tried too).<p>So like, I used to pride myself on the architectural decisions I make even if AI could write code faster, but now that is taken away as well.<p>I really don&#x27;t want to read AI code, so honestly at this point I might as well write code myself and learn hands-on, but I have a problem with the build-fast-in-public attitude that I have &amp; I am just not finding it fun.<p>I feel like I should do a more active job in my projects &amp; I am really just figuring out what&#x27;s the perfect way to use AI in such contexts &amp; when to use how much.<p>Thoughts?
  • jermberj21 minutes ago
    &gt; The most common category is that the models make wrong assumptions on your behalf and just run along with them without checking. They also don&#x27;t manage their confusion, they don&#x27;t seek clarifications, they don&#x27;t surface inconsistencies, they don&#x27;t present tradeoffs, they don&#x27;t push back when they should, and they are still a little too sycophantic.<p>Does this not undercut everything going on here. Like, what?
    • awsanswers14 minutes ago
      It&#x27;s predictable, so you run defense around it with prompting, validation and model tuning. It generates volumes of working code in seconds from natural language prompts, so it&#x27;s extremely business efficient. We&#x27;re talking about tools that generate correct code for 95% of a solution; the follow-up human and automated test review, and a second coding pass to fix the remaining 5%, are a non-issue.
  • markb13911 hours ago
    I retired from paid sw dev work in 2020 when COVID arrived. I’ve worked on my small projects since, with all development by hand. I’d followed the rise of AI, but not used it. Late last year I started a project that included reverse engineering some firmware that runs on an Intel 8096-based embedded processor. I’d never worked on that processor before. There are tools available, but they cost many $. So, I started to think about a simple disassembler. Two weeks ago we decided to try Claude to see what it could do. We now have a disassembler, assembler and a partially working emulator. No doubt there are bugs and missing features and the code is a bit messy, but boy has it sped up the work. One thing did occur to me: vendors of small utilities could be in trouble. For example, I needed to cut out some pages from a PDF. I could have found a tool online (I’m sure there are several) or written one myself. However, Claude quickly performed the task.
    • gyomu4 hours ago
      &gt; Vendors of small utilities could be in trouble<p>This is a mix of the “in the future, everyone will have a 3D printer at home and just 3D print random parts they need” and “anyone can trivially build Dropbox with rsync themselves” arguments.<p>Tech savvy users who know how to use LLMs aren’t how vendors of small utilities stay in business.<p>They stay in business because they sell things to users who are truly clueless with tech (99% of the population, which can’t even figure out the settings app on their phone), and solid distribution&#x2F;marketing is how you reach those users and can’t really be trivially hacked because everyone is trying to hack it.<p>Or they stay in business because they offer some sort of guarantee (whether legal, technical, or other) that the users don’t want to burden themselves with because they have other, more important stuff to worry about.
      • CamperBob220 minutes ago
        I don&#x27;t know. It&#x27;s one thing to tell Joe or Jane User to &quot;Get an FTP account, mount it locally with curlftpfs, and then use SVN or CVS on the mounted filesystem.&quot; But if Joe or Jane can just cut-and-paste that advice into a prompt and get their own personal Dropbox...
    • TeMPOraL10 hours ago
      &gt; <i>Vendors of small utilities could be in trouble. For example I needed to cut out some pages from a pdf. I could have found a tool online(I’m sure there are several), write one myself. However, Claude quickly performed the task.</i><p>Definitely. Making small, single-purpose utilities with LLMs is almost as easy these days as googling for them on-line - much easier, in fact, if you account for time spent filtering out all the malware, adware, &quot;to finish the process, register an account&quot; and plain broken &quot;tools&quot; that dominate SERP.<p>Case in point, last time my wife needed to generate a few QR codes for some printouts for an NGO event, I just had LLM make one as a static, single-page client-side tool and hosted it myself -- because that was the fastest way to guarantee it&#x27;s fast, reliable, free of surveillance economy bullshit, and doesn&#x27;t employ URL shorteners (surprisingly common pattern that sometimes becomes a nasty problem down the line; see e.g. a high-profile case of some QR codes on food products leading to porn sites after shortlink got recycled).
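      The whole thing really is tiny, too. A rough sketch of the same idea done locally in Python (the actual tool was a static client-side HTML page; this version assumes the third-party &quot;qrcode&quot; package with Pillow installed, just to show the scale of the task):<p><pre><code>import qrcode

# Generate the QR code entirely locally: no URL shortener, no tracking,
# no third-party service that can disappear or get repurposed later.
img = qrcode.make("https://example.org/event-signup")  # placeholder URL
img.save("event-signup-qr.png")
</code></pre>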
      • Antibabelic8 hours ago
        Whatever happened to just typing &quot;apt install qrencode&quot;? It&#x27;s definitely &quot;fast, reliable, free of surveillance economy bullshit, and doesn&#x27;t employ URL shorteners&quot;.
        • senko5 hours ago
          You need to know &quot;qrencode&quot; exists under that exact name. Claude already knows about it and how to use it.
          • Antibabelic5 hours ago
            Sure, but that&#x27;s entirely different from vibe-coding a tool, which sounds like a colossal waste of resources.
            • simonw2 hours ago
              Having an LLM spit out a few hundred lines of HTML and JavaScript is not a colossal waste of resources, it&#x27;s equivalent to running a microwave for a couple of seconds.
            • agos4 hours ago
              as long as that waste and the associated cost are heavily subsidized as they are today, nobody will care
        • direwolf202 hours ago
          Users can&#x27;t use command-line tools. They just can&#x27;t. It has to be user-friendly or it doesn&#x27;t exist.
        • simonw2 hours ago
          A &quot;static, single-page client-side tool&quot; is <i>so much better</i> than &quot;Step 1: install Linux...&quot;
  • jedberg19 hours ago
    &gt; You realize that stamina is a core bottleneck to work<p>There has been a lot of research that shows that grit is far more correlated to success than intelligence. This is an interesting way to show something similar.<p>AIs have endless grit (or at least as endless as your budget). They may outperform us simply because they don&#x27;t ever get tired and give up.<p>Full quote for context:<p>Tenacity. It&#x27;s so interesting to watch an agent relentlessly work at something. They never get tired, they never get demoralized, they just keep going and trying things where a person would have given up long ago to fight another day. It&#x27;s a &quot;feel the AGI&quot; moment to watch it struggle with something for a long time just to come out victorious 30 minutes later. You realize that stamina is a core bottleneck to work and that with LLMs in hand it has been dramatically increased.
    • djeastm8 hours ago
      &gt;They never get tired, they never get demoralized, they just keep going and trying things where a person would have given up long ago to fight another day.<p>&quot;Listen, and understand! That Terminator is out there! It can&#x27;t be bargained with. It can&#x27;t be reasoned with. It doesn&#x27;t feel pity, or remorse, or fear. And it absolutely will not stop... ever, until you are dead!&quot;
    • Loeffelmann12 hours ago
      If you ever work with LLMs you know that they quite frequently give up.<p>Sometimes it&#x27;s a<p><pre><code> &#x2F;&#x2F; TODO: implement logic </code></pre> or a<p>&quot;this feature would require extensive logic and changes to the existing codebase&quot;.<p>Sometimes they just declare their work done. Ignoring failing tests and builds.<p>You can nudge them to keep going but I often feel like, when they behave like this, they are at their limit of what they can achieve.
      • wongarsu10 hours ago
        If I tell it to implement something it will sometimes declare their work done before it&#x27;s done. But if I give Claude Code a verifiable goal like making the unit tests pass it will work tirelessly until that goal is achieved. I don&#x27;t always like the solution, but the tenacity everyone is talking about is there
        • koiueo8 hours ago
          &gt; but the tenacity everyone is talking about is there<p>I always double-check that it didn&#x27;t simply exclude the failing test.<p>The last time I had this, I discovered it later in the process. When I pointed this out to the LLM, it responded that it had acknowledged the fact of ignoring the test in CLAUDE.md, and that this was justified because [...]. In other words, &quot;known issue, fuck off&quot;.
        • jpnc7 hours ago
          tenacity == while loop
      • mlrtime8 hours ago
        Nope, not for me, unless I tell it to.<p>Context matters for an LLM, just like for a person. When I write code I add TODOs because we can&#x27;t context-switch to every other problem we notice.<p>You can keep the agent fixated on the task AND have it create these TODOs, but ultimately it is your responsibility to find and fix them (with another agent).
      • jedberg11 hours ago
        &gt; If you ever work with LLMs you know that they quite frequently give up.<p>If you try to single shot something perhaps. But with multiple shots, or an agent swarm where one agent tells another to try again, it&#x27;ll keep going until it has a working solution.
        • alansaber7 hours ago
          Yeah, exactly, this is a scope problem; actual input&#x2F;output size is always limited. I am 100% sure CC etc. are using multiple LLM calls for each response, even though from the response streaming it looks like just one.
      • energy12311 hours ago
        Using LLMs to clean those up is part of the workflow that you&#x27;re responsible for (... for now). If you&#x27;re hoping to get ideal results in a single inference, forget it.
    • ryanjshaw9 hours ago
      I realized a long time ago that I’m better at computer stuff not because I’m smarter but because I will sit there all day and night to figure something out while others will give up. I always thought that was my superpower in the job industry but now I’m not so sure if it will transfer to getting AI to do what I need done…
      • mlrtime8 hours ago
        Same, I barely made it through Engineering school, but would stay up all night figuring out everything a computer could do (before the internet).<p>I did it because I enjoyed it, and still do. I just do it with LLMs now. There is more to figure out than ever before and things get created faster than I have time to understand them.<p>LLM should be enabling this, not making it more depressing.
        • Schlagbohrer4 hours ago
          Me three. I was not as smart as many of my peers in uni but I freakin <i>LOVE</i> the subject matter and I also love studying and feeling that progress of learning, which led me to put in the huge number of hours necessary to be successful and have a positive attitude the whole time.
    • michalsustr9 hours ago
      The tenacity aspect makes me worried about the paper clip AI misalignment scenario more than before.
    • dust4211 hours ago
      &gt; AIs have endless grit (or at least as endless as your budget).<p>That is the only thing he doesn&#x27;t address: the money it costs to run the AI. If you let the agents loose, they easily burn north of 100M tokens per hour. Now at $25&#x2F;1M tokens that gets quickly expensive. At some point, when we are all drug^W AI dependent, the VCs will start to cash in on their investments.
    • AnimalMuppet3 hours ago
      But even tenacity is not enough. You also need an internal timer. &quot;Wait a minute, this is taking too long, it shouldn&#x27;t be this hard. Is my overall approach completely wrong?&quot;<p>I&#x27;m not sure AIs have that. Humans do, or at least the good ones do. They don&#x27;t quit on the <i>problem</i>, but they know when it&#x27;s time to consider quitting on the <i>approach</i>.
    • gregjor9 hours ago
      LLMs do not have grit or tenacity. Tenacity doesn&#x27;t describe a machine that doesn&#x27;t need sleep or experience tiredness or stress. Grit doesn&#x27;t describe a chatbot that will tirelessly spew out answers and code because it has no stake or interest in the result, never perceives that it doesn&#x27;t know something, and never reflects on its shortcomings.
  • netcraft5 hours ago
    &gt; Tenacity. It&#x27;s so interesting to watch an agent relentlessly work at something. They never get tired, they never get demoralized, they just keep going and trying things where a person would have given up long ago to fight another day.<p>This is true to an extent for sure and they will go much longer than most engineers without getting &quot;tired&quot;, but I&#x27;ve def seen both sonnet and opus give up multiple times. They&#x27;ve updated code to skip tests they couldn&#x27;t get to pass, given up on bugs they couldn&#x27;t track down, etc. I literally had it ask &quot;could we work on something else and come back to this&quot;
    • lucianbr4 hours ago
      The glorified autocomplete. Why would the LLM &quot;work on something else then get back to this&quot;? Is its subconscious going to solve the problem during that time?<p>But because people say it, it says it too. Making sense is optional.
      • havefunbesafe4 hours ago
        I&#x27;ve found that clearing the context and getting back to it later actually DOES work. When you restart, your personal context is cleared and you might be better at describing the problem you are solving in a more informationally dense way.
      • Davidzheng1 hour ago
        not impossible right? the new context can provide some needed hints, etc...
    • Schlagbohrer4 hours ago
      Reminiscent of a time just a year or two ago where the LLMs would get downright frustrated and sassy
    • manbash3 hours ago
      Oh, definitely. Also, they end up getting stuck in a loop, adding and removing the same code endlessly.<p>And then someone comes and &quot;improves&quot; their agent with additional &quot;do not repeat yourself&quot; prompts scattered all over the place, to no avail.<p>&quot;Asinine&quot; describes my experience perfectly.
  • 0xbadcafebee22 hours ago
    &gt; What happens to the &quot;10X engineer&quot; - the ratio of productivity between the mean and the max engineer? It&#x27;s quite possible that this grows <i>a lot</i>.<p>I was thinking about this the other day as relates to the DevOps movement.<p>The DevOps movement started as a way to accelerate and improve the results of dev&lt;-&gt;ops team dynamics. By changing practices and methods, you get acceleration and improvement. That creates &quot;high-performing teams&quot;, which is the team form of a 10x engineer. Whether or not you believe in &#x27;10x engineers&#x27;, a high-performing team is real. You really can make your team deploy faster, with fewer bugs. You have to change how you all work to accomplish it, though.<p>To get good at using AI for coding, you have to do the same thing: continuous improvement, changing workflows, different designs, development of trust through automation and validation. Just like DevOps, this requires learning brand new concepts, and changing how a whole team works. This didn&#x27;t get adopted widely with DevOps because nobody wanted to learn new things or change how they work. So it&#x27;s possible people won&#x27;t adapt to the &quot;better&quot; way of using AI for coding, even if it would produce a 10x result.<p>If we want this new way of working to stick, it&#x27;s going to require education, and a change of engineering culture.
    • virgilp6 hours ago
      This is an interesting thing that I&#x27;m contemplating. I also do believe that (perhaps with very few exceptions) there are no &quot;10x engineers&quot; by themselves, but engineers that thrive 10x more in one context than another (like, I&#x27;m sure Jeff Dean is an absolutely awesome engineer - but if you took him out of Google and plugged him into IBM, would he have had the same impact?)<p>With that in mind, I think one very unexplored area is &quot;how to make mixed AI-human teams successful&quot;. Like, I&#x27;m fairly convinced AI changes things, but to get to the industrialization of our craft (which is what management seems to want - and, TBH, something that makes sense from an economic pov), I feel that some big changes need to happen, and nobody is talking about that too much. What are the changes that need to happen? How do we change things, if we are to attempt such industrialization?
  • borroka54 minutes ago
    I am developing a web application for a dictionary that translates words from the national language into the local dialect.<p>Vibe coding and other tools, such as Google Vision, helped me download images published online, compile a PDF, perform OCR (Tesseract and Google Vision), and save everything in text format.<p>The OCR process was satisfactory for a first draft, but the text file has a lot of errors, as you&#x27;d expect when the dictionary has about 30,000 entries: Diacritical marks disappear, along with typographical marks and dashes, lines are moved up and down, and parts of speech (POS) are written in so many different ways due to errors that it is necessary to identify the wrong POS&#x27;s one by one.<p>If the reasoning abilities of LLM-derived coding agents were as advanced as some claim, it would be possible for the LLM to derive the rules that must be applied to the entire dictionary from a sufficiently large set of “gold standard” examples.<p>If only that were the case. Every general rule applied creates other errors that propagate throughout the text, so that for every problem partially solved, two more emerge. What is evident to me is not clear to the LLM, in the sense that it is simple for me, albeit long and tedious, to do the editing work manually.<p>To give an example, if trans.v. (for example) indicates a transitive verb, it is clear to me that .trans.v. is a typographical error. I can tell the coding tool (I used Gemini, Claude, and Codex, with Codex being the best) that, given a standard POS, if there is a “.” before it, it must be deleted because it is a typo. The generalization that comes easily to me but not to the coding agent is that if not one but two periods precede the POS, it means there are two typos, not to delete just one of the two dots.<p>This means that almost all rules have to be specified, whereas I expected the coding agent to generalize from the gigantic corpus on which it was trained (it should “understand” what the POS are, typical typos, the language in which the dictionary is written, etc.).<p>The transition from text to json to webapp is almost miraculous, but what is still missing from the mix is human-level reasoning and common sense (in part, I still believe that coding agents are fantastic, to be clear).
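    To make it concrete, this is roughly the kind of rule I end up writing by hand, and the generalization I hoped the agent would infer on its own (the POS list and the entry text here are made up for illustration):<p><pre><code>import re

# Hypothetical list of standard POS tags used in the dictionary.
POS_TAGS = ["trans.v.", "intrans.v.", "n.", "adj."]

def strip_stray_dots(entry):
    """Remove ANY run of stray periods that OCR placed directly before a
    known POS tag, so both '.trans.v.' and '..trans.v.' become 'trans.v.'."""
    for tag in POS_TAGS:
        entry = re.sub(r"\.+" + re.escape(tag), tag, entry)
    return entry

print(strip_stray_dots("volar ..trans.v. to fly"))  # prints: volar trans.v. to fly
</code></pre>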
  • daxfohl1 hour ago
    Now that it&#x27;s real, is there a minimum bar of non-AI-generated code that should be required in any production product? Like if 100% of the code is AI generated (or even doom-tabbed) and something goes wrong in prod, (crash, record corruption, data leak, whatever) <i>then what</i>? 99%? 50%? What&#x27;s the bar where the risk starts outweighing the reward? When do we look around and say &quot;maybe we should start slowing down before we do something that destroys our company&quot;?<p>Granted it&#x27;s not a one-size-fits-all problem, but I&#x27;m curious if any teams have started setting up additional concrete safeguards or processes to mitigate that specific threat. It feels like a ticking time bomb.<p>It almost begs the question, what even <i>is</i> the reward? A degradation of your engineering team&#x27;s engineering fundamentals, in return for...are we actually shipping faster?
    • cagenut1 hour ago
      obviously you&#x27;re not a devops eng, I think you&#x27;re wildly under-estimating how much of business critical code pre-ai is completely orphaned anyway.<p>the people who wrote it were contractors long gone, or employees that have moved companies&#x2F;departments&#x2F;roles, or of projects that were long since wrapped up, or of people who got laid off, or the people who wrote it simply barely understood it in the first place and certainly don&#x27;t remember what they were thinking back then now.<p>basically &quot;what moron wrote this insane mess... oh me&quot; is the default state of production code anyway. there&#x27;s really no quality bar already.
      • daxfohl1 hour ago
        I am a devops engineer and understand your point. But there&#x27;s a huge difference: legacy code doesn&#x27;t change. Yeah occasionally something weird will happen and you&#x27;ve got to dig into it, but it&#x27;s pretty rare, and usually something like an expired certificate, not a logic bug.<p>What we&#x27;re entering, if this comes to fruition, is a whole new era where massive amounts of code changes that engineers are vaguely familiar with are going to be deployed at a much faster pace than anything we&#x27;ve ever seen before. That&#x27;s a whole different ballgame than the management of a few legacy services.
  • bob10297 hours ago
    I would agree that OAIs GPT-5 family of models is a phase change over GPT-4.<p>In the ChatGPT product this is not immediately obvious and many people would strongly argue their preference for 4. However, once you introduce several complex tools and make tool calling mandatory, the difference becomes stark.<p>I&#x27;ve got an agent loop that will fail nearly every time on GPT-4. It works sometimes, but definitely not enough to go to production. GPT-5 with reasoning set to minimal works 100% of the time. $200 worth of tokens and it still hasn&#x27;t failed to select the proper sequence of tools. It sometimes gets the arguments to the tools incorrect, but it&#x27;s always holding the right ones now.<p>I was very skeptical based upon prior experience but flipping between the models makes it clear there has been recent stepwise progress.<p>I&#x27;ll probably be $500 deep in tokens before the end of the month. I could barely go $20 before I called bullshit on this stuff last time.
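    For reference, the shape of the loop is roughly this (a sketch, not the production code; it assumes the OpenAI Python client, and the model and tool names are placeholders):<p><pre><code>import json
from openai import OpenAI

client = OpenAI()

# One placeholder tool definition; the real loop carries several complex ones.
tools = [{
    "type": "function",
    "function": {
        "name": "search_orders",
        "description": "Look up recent orders for a customer id",
        "parameters": {
            "type": "object",
            "properties": {"customer_id": {"type": "string"}},
            "required": ["customer_id"],
        },
    },
}]

def run_tool(name, args):
    # Dispatch to real implementations here; stubbed out for the sketch.
    return json.dumps({"orders": []})

messages = [{"role": "user", "content": "What did customer 42 order last week?"}]
for _ in range(5):  # bounded instead of looping forever
    resp = client.chat.completions.create(
        model="gpt-5",            # placeholder; whatever the harness targets
        messages=messages,
        tools=tools,
        tool_choice="required",   # the "mandatory tool calling" part
    )
    msg = resp.choices[0].message
    messages.append(msg)
    for call in msg.tool_calls or []:
        result = run_tool(call.function.name, json.loads(call.function.arguments))
        messages.append({"role": "tool", "tool_call_id": call.id, "content": result})
    # A real harness relaxes tool_choice (or breaks) once it has what it needs.
</code></pre>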
    • alansaber7 hours ago
      Pretty sure there wasn&#x27;t extensive training on tooling beforehand. I mean, god, during GPT-3 even getting a reliable json output was a battle and there were dedicated packages for json inference.
  • jimbokun1 day ago
      I&#x27;m pretty happy with Copilot in VS Code. Type what change I want Claude to make in the Copilot panel, and then use the VS Code in-context diffs to accept or reject the proposed changes, while being able to make other small changes on my own.<p>So I think this tracks with Karpathy&#x27;s defense of IDEs still being necessary?<p>Has anyone found it practical to forgo IDEs almost entirely?
    • everfrustrated18 hours ago
      I&#x27;ve found copilot chat is able to do everything I need. I tried the Claude plugin for vscode and it was a noticeably worse experience for me.<p>Mind you copilot has only supported agent mode relatively recently.<p>I really like the way copilot does changes in such a way you can accept or reject and even revert to point in time in the chat history without using git. Something about this just fits right with how my brain works. Using Claude plugin just felt like I had one hand tied behind my back.
      • thunfischtoast12 hours ago
        I find Claude Code in VS Code is sometimes horribly inefficient. I tell it to replace some print-statements with proper logging in the one file I have open and it first starts burning tokens to understand the codebase for the 13th time today, despite not needing to and having it laid out in the CLAUDE.md already.
    • vmbm1 day ago
      I have been assigning issues to copilot in Github. It will then create a pull request and work on and report back on the issue in the PR. I will pull the code and make small changes locally using VSCode when needed.<p>But what I like about this setup is that I have almost all the context I need to review the work in a single PR. And I can go back and revisit the PR if I ever run into issues down the line. Plus you can run sessions in parallel if needed, although I don&#x27;t do that too much.
    • simonw22 hours ago
      Are you letting it run your tests and run little snippets of code to try them out (like &quot;python -c &#x27;import module; print(module.something())&#x27;&quot;) or are you just using it to propose diffs for you to accept or reject?<p>This stuff gets a whole lot more interesting when you let it start making changes and testing them by itself.
    • maxdo1 day ago
        Copilot is not on par with CC or Cursor, even.
      • jimbokun1 day ago
        I use it to access Claude. So what&#x27;s the difference?
        • nsingh223 hours ago
          This stuff is a little messy and opaque, but the performance of the same model in different harnesses depends a lot on how context is managed. The last time I tried Copilot, it performed markedly worse for similar tasks compared to Claude Code. I suspect that Copilot was being very aggressive in compressing context to save on token cost, but I&#x27;m not 100% certain about this.<p>Also note that with Claude models, Copilot might allocate a different number of thinking tokens compared to Claude Code.<p>Things may have changed now compared to when I tried it out, these tools are in constant flux. In general I&#x27;ve found that harnesses created by the model providers (OpenAI&#x2F;Codex CLI, Anthropic&#x2F;Claude Code, Google&#x2F;Gemini CLI) tend to be better than generalist harnesses (cheaper too, since you&#x27;re not paying a middleman).
        • walthamstow22 hours ago
          Different harnesses and agentic environments produce different results from the same model. Claude Code and Cursor are the best IME and Copilot is by far the worst.
      • WA1 day ago
        Why not? You can select Opus 4.5, Gemini 3 Pro, and others.
        • spaceman_20201 day ago
          Claude Code is a CLI tool which means it can do complete projects in a single command. Also has fantastic tools for scaffolding and harnessing the code. You can define everything from your coding style to specific instructions for designing frontpages, integrating payments, etc.<p>It&#x27;s not about the model. It&#x27;s about the harness
          • binarycrusader21 hours ago
            <i>Claude Code is a CLI tool which means it can do complete projects in a single command</i><p><a href="https:&#x2F;&#x2F;github.com&#x2F;features&#x2F;copilot&#x2F;cli&#x2F;" rel="nofollow">https:&#x2F;&#x2F;github.com&#x2F;features&#x2F;copilot&#x2F;cli&#x2F;</a>
          • piker22 hours ago
            This would make some sense if VS Code didn&#x27;t have a terminal built into it. The LLMs have the same bash capabilities in either form.
          • sandos9 hours ago
            Huh? There is nothing stopping Copilot from doing an entire project in one go.<p>I&#x27;ve done it tens of times.
        • maxdo1 day ago
          it&#x27;s not a model limit anymore, it&#x27;s tools, skills, background agents, etc. It&#x27;s an entire agentic environment.
          • illnewsthat1 day ago
            Github copilot has support for this stuff as well. Agent skills, background&#x2F;subagents, etc.
            • Miraste3 hours ago
              Implementation differences do matter. I haven&#x27;t found Copilot to have as many issues as people say it does, but they are there. Their Gemini implementation is unusable, for example, and it&#x27;s not because of the underlying models. They work fine in other harnesses.
  • rubzah4 hours ago
    The Slopocalypse - an unexpected variant of Gray Goo:<p><a href="https:&#x2F;&#x2F;en.wikipedia.org&#x2F;wiki&#x2F;Gray_goo" rel="nofollow">https:&#x2F;&#x2F;en.wikipedia.org&#x2F;wiki&#x2F;Gray_goo</a>
    • AnimalMuppet3 hours ago
      Well, it may consume the AI environment. Maybe even the internet. It&#x27;s not going to consume a PC with g++, though (at least if the PC doesn&#x27;t update g++ any more once g++ starts accepting AI contributions).<p>There may come a point where having a &quot;survivor machine&quot; with auto-update turned off may be a really good idea.
      • Applejinx38 minutes ago
        I already do this, in the form of survivor machines made to do initial coding on a retro platform so the result will translate across all possible platforms. Got to, as I&#x27;m an Apple coder primarily, so if I want to target older machines I can only do it through a survivor machine: support is always pruned out of Xcode and it would be insane to try and patch it to keep everything in scope.
    • direwolf201 hour ago
      Aslopalypse, a slop ellipse.
  • 1970-01-012 hours ago
    &gt;Tenacity<p>I&#x27;ve seen the exact opposite with Claude. It literally ditched my request mid-analysis when doing a root cause analysis. It decided I was tired of the service failing and then gave me some restart commands to &#x27;just get it working&#x27;
  • jwilliams16 hours ago
    &gt; It&#x27;s so interesting to watch an agent relentlessly work at something. They never get tired, they never get demoralized, they just keep going and trying things where a person would have given up long ago to fight another day. It&#x27;s a &quot;feel the AGI&quot; moment to watch it struggle with something for a long time just to come out victorious 30 minutes later.<p>This is true... Equally I&#x27;ve seen it dive into a rabbit hole, make some changes that probably aren&#x27;t the right direction... and then keep digging.<p>This is way more likely with Sonnet, Opus seems to be better at avoiding it. Sonnet would happily modify every file in the codebase trying to get a type error to go away. If I prompt &quot;wait, are you off track?&quot; it can usually course correct. Again, Opus seems way better at that part too.<p>Admittedly this has improved a lot lately overall.
    • gregjor9 hours ago
      I don&#x27;t understand why anyone finds it interesting that a machine, or chatbot, never tires or gets demoralized. You have to anthropomorphize the LLM before you can even think of those possibilities. A tractor never tires or gets demoralized either, because it can&#x27;t. Chatbots don&#x27;t &quot;dive into a rabbit hole ... and then keep digging&quot; because they have superhuman tenacity; they do it because that&#x27;s what software does. If I ask my laptop to compute the millionth Fibonacci number it doesn&#x27;t sigh and complain, and I don&#x27;t think it shows any special qualities unless I compare it to a person given the same job.
      • akoboldfrying8 hours ago
        You&#x27;re a machine. You&#x27;re literally a wet, analog <i>device</i> converting some forms of energy into other forms just like any other machine as you work, rest, type out HN comments, etc. There is nothing special about the carbon atoms in your body -- there&#x27;s no metadata attached to them marking them out as belonging to a Living Person. Other living-person-machines treat &quot;you&quot; differently than other clusters of atoms only because evolution has taught us that doing so is a mutually beneficial social convention.<p>So, since you&#x27;re just a machine, any text you generate should be uninteresting to me -- correct?<p>Alternatively, could it be that a <i>sufficiently complex and intricate</i> machine can be interesting to observe in its own right?
        • gregjor29 minutes ago
          Wrong level of abstraction. And not the definition of machine.<p>I might feel awe or amazement at what human-made machines can do -- the reason I got into programming. But I don&#x27;t attribute human qualities to computers or software, a category error. No computer ever looked at me as interesting or tenacious.
        • spopejoy2 hours ago
          Humans and all other organisms are &quot;literally&quot; not machines or devices by the simple fact that those terms refer to works made for a purpose.<p>Even as an analogy &quot;wet machine&quot; fails again and again to adequately describe anything interesting or useful in life sciences.
        • suddenlybananas7 hours ago
          If humans are machines, they are still a subset of machines and they (among other animals) are the only ones who can be demotivated and so it is still a mistake to assume an entirely different kind of machine would have those properties.<p>&gt;Other living-person-machines treat &quot;you&quot; differently than other clusters of atoms only because evolution has taught us that doing so is a mutually beneficial social convention<p>Evolution doesn&#x27;t &quot;teach&quot; anything. It&#x27;s just an emergent property of the fact that life reproduces (and sometimes doesn&#x27;t). If you&#x27;re going to have this radically reductionist view of humanity, you can&#x27;t also treat evolution as having any kind of agency.
          • sponaugle3 hours ago
            &quot;If humans are machines, they are still a subset of machines and they (among other animals) are the only ones who can be demotivated and so it is still a mistake to assume an entirely different kind of machine would have those properties.&quot;<p>Yet.
            • suddenlybananas3 hours ago
              Sure, but the entire context of the discussion is surprise that they <i>don&#x27;t</i>.
              • sponaugle3 hours ago
                Agreed - There is no guarantee of what will happen in the future. I&#x27;m not for or against the outcome, but certainly curious to see what it is.
  • strogonoff1 day ago
    LLM coding splits up engineers based on those who primarily like building and those who primarily like code reviews and quality assessment. I definitely don’t love the latter (especially when reviewing decisions not made by a human with whom I can build long-term personal rapport).<p>After certain experience threshold of making things from scratch, “coding” (never particularly liked that term) has always been 99% building, or <i>architecture</i>, and I struggle to see how often a well-architected solution today, with modern high-level abstractions, requires so much code that you’d save significant time and effort by not having to just type, possibly with basic deterministic autocomplete, exactly what you mean (especially considering you would have to also spend time and effort reviewing whatever was typed for you if you used a non-deterministic autocomplete).
    • OkayPhysicist1 day ago
      See, I don&#x27;t take it that extreme: LLMs make <i>fantastic</i>, never-before-seen quality autocompletes. I hacked together a Neovim plugin that prompts an LLM to &quot;finish this function&quot; on command, and it&#x27;s a big time saver for the menial plumbing-type operations. Think things like &quot;this api I use expects JSON that encodes some subset of SQL, I want all the dogs with Ls in their name that were born on a Tuesday&quot;. Given an example of such an API (or if the documentation ended up in its training), LLMs will consistently one-shot stuff like that.<p>Asking it to do entire projects? Dumb. You end up with spaghetti, unless you hand-hold it to a point that you might as well be using my autocomplete method.
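      The plugin itself is basically one prompt plus an insert-at-cursor. A rough sketch of the idea in Python rather than the actual plugin code (the client and model name here are assumptions):<p><pre><code>from openai import OpenAI

client = OpenAI()

def finish_function(buffer_text, stub):
    """Given the surrounding buffer as context and the half-written function
    under the cursor, ask the model to finish just that function. The editor
    plugin inserts whatever comes back."""
    prompt = (
        "Complete the body of this function. Return only code, no prose.\n\n"
        "File context:\n" + buffer_text + "\n\n"
        "Function to finish:\n" + stub
    )
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content
</code></pre>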
      • gverrilla16 hours ago
        Depends on the scope of the project. If it&#x27;s small, and you direct it correctly, it can one-shot yes. Or 2-3-shot.
    • cmrdporcupine4 hours ago
      <i>&quot;those who primarily like code reviews and quality assessment&quot;</i> -- I don&#x27;t <i>love</i> those. In fact I find it tedious and love it when I can work on my own without them.<p><i>Except</i> after 25 years of working I know how imperative they are, how easily a project can disintegrate into confused silos, and am frustrated as heck with these tools being pushed without attention to this problem.
  • kshri2417 hours ago
    Agree with Karpathy&#x27;s take. Finally a down to Earth analysis from a respected source in the AI space. I guess I&#x27;ll be using slopocalypse a lot more now :)<p>&gt; I am bracing for 2026 as the year of the slopacolypse across all of github, substack, arxiv, X&#x2F;instagram, and generally all digital media<p>It has arrived. Github will be most affected thanks to git-terrorists at Apna College refusing to take down that stupid tutorial. IYKYK.
    • ActorNightly13 hours ago
      The respect is unwarranted.<p>He ran Tesla&#x27;s ML division, but still doesn&#x27;t know what a simple Kalman filter is (in the sense that he claimed that lidar would be hard to integrate with cameras).
      • akoboldfrying7 hours ago
        The Kalman filter examples I&#x27;ve seen always involve estimating a very simple quantity, like the location of a single 3D point, from noisy sensors. It&#x27;s clear how multiple estimates can be combined into a new estimate.<p>I&#x27;d guess that cameras on a self-driving car are trying to estimate something much more complex, something like 3D <i>surfaces</i> labeled with categories (&quot;person&quot;, &quot;traffic light&quot;, etc.). It&#x27;s not obvious to me how estimates of such things from multiple sensors and predictions can be sensibly and efficiently combined to produce a better estimate. For example, what if there is a near red object in front of a distant red background, so that the camera estimates just a single object, but the lidar sees two?
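        For what it&#x27;s worth, the simple-case fusion step really is tiny. A minimal 1D sketch, with made-up numbers, of the inverse-variance weighting a Kalman update performs:<p><pre><code>def fuse(mean_a, var_a, mean_b, var_b):
    """Combine two independent Gaussian estimates of the same scalar
    (say, a range from camera depth and a range from lidar) by
    inverse-variance weighting -- the core of a Kalman update."""
    k = var_a / (var_a + var_b)            # Kalman gain
    mean = mean_a + k * (mean_b - mean_a)  # pulled toward the more certain sensor
    var = (1 - k) * var_a                  # fused estimate beats either input alone
    return mean, var

# Camera thinks 10.0 m with high variance; lidar says 9.5 m with low variance.
print(fuse(10.0, 1.0, 9.5, 0.01))  # roughly (9.505, 0.0099)
</code></pre>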
  • noisy_boy5 hours ago
    &gt; I am bracing for 2026 as the year of the slopacolypse across all of github, substack, arxiv, X&#x2F;instagram, and generally all digital media.<p>2026 is just when it picks up - it&#x27;ll get exponentially worse.<p>I think 2026 is the year of Business Analysts who were unable to code. Now CC et al. are good enough that they can realize the vision as long as one knows the requirements exactly (software design not that important). Programmers who didn&#x27;t know business could get by so far. Not anymore, because with these tools, the guy who knows business can now code fairly well.
    • sponaugle3 hours ago
      &quot;I think 2026 is the year of Business Analysts who were unable to code.&quot; This is interesting - I have seen far more BAs losing jobs as a result of the &#x27;work&#x27; they did being replaced by tools (both AI and AI-generated). I logically see the connection from AI tools giving BAs far more direct ability to produce something, but I don&#x27;t see it actually happening. It is possible it is too early in the AI curve for the quality of a BA built product to be sufficient. CC and Opus45 are relatively new.<p>It could also be BAs being lazy and not jumping ahead of the train that is coming towards them. It feels like in this race the engineer who is willing to learn business will still have an advantage over the business person who learns tech. At least for a little while.
    • HugoDz5 hours ago
      Agree here, the code barrier (creating software) was hiding the real mountain: creating software <i>business</i>. The two are very different beasts.
    • kitd5 hours ago
      <i>with these tools, the guy who knows business can now code fairly well.</i><p>... until CC doesn&#x27;t get it quite right and the guy who knows business doesn&#x27;t know code.
      • rubzah4 hours ago
        The future of the programmer profession: This AI-generated mess of a codebase does 80% of what I want. Now fix the last 20%, should be easy, right?
        • AnimalMuppet3 hours ago
          Apart from the &quot;AI-generated mess&quot; part, that&#x27;s too often been the past of the programmer profession, too.
  • gloosx12 hours ago
    So what is he even coding there all the time?<p>Does anybody have any info on what he is actually working on besides all the vibe-coding tweets?<p>There seems to be zero output from the guy for the past 2 years (except tweets).
    • ayewo10 hours ago
      &gt; There seems to be zero output from they guy for the past 2 years (except tweets)<p>Well, he made Nanochat public recently and has been improving it regularly [1]. This doesn&#x27;t preclude that he might be working on other projects that aren&#x27;t public yet (as part of his work at Eureka Labs).<p>1: <a href="https:&#x2F;&#x2F;github.com&#x2F;karpathy&#x2F;nanochat" rel="nofollow">https:&#x2F;&#x2F;github.com&#x2F;karpathy&#x2F;nanochat</a>
      • gloosx7 hours ago
        So, it&#x27;s generative pre-trained transformers again?
    • originalvichy9 hours ago
      Helper scripts for APIs for applications and tools I know well. LLMs have made my work bearable. Many software providers expose great apis, but expert use cases require data output&#x2F;input that relies on 50-500 line scripts. Thanks to the models post gpt4.5 most requirements are solvable in 15 minutes when they could have taken multiple workdays to write and check by hand. The only major gap is safe ad-hoc environments to run these in. I provide these helper functions for clients that would love to keep the runtime in the same data environment as the tool, but not all popular software support FaaS style environments that provide something like a simple python env.
    • beng-nl9 hours ago
      He&#x27;s building Eureka Labs[1], an AI-first education company (can&#x27;t wait to use it). He&#x27;s both a strong researcher[2] and an unusually gifted technical communicator. His recent videos[3] are excellent educational material.<p>More broadly though: someone with his track record sharing firsthand observations about agentic coding shouldn&#x27;t need to justify it by listing current projects. The observations either hold up or they don&#x27;t.<p>[1] <a href="https:&#x2F;&#x2F;x.com&#x2F;EurekaLabsAI" rel="nofollow">https:&#x2F;&#x2F;x.com&#x2F;EurekaLabsAI</a><p>[2] PhD in DL, early OpenAI, founding head of AI at Tesla<p>[3] <a href="https:&#x2F;&#x2F;www.youtube.com&#x2F;@AndrejKarpathy&#x2F;videos" rel="nofollow">https:&#x2F;&#x2F;www.youtube.com&#x2F;@AndrejKarpathy&#x2F;videos</a>
      • direwolf201 hour ago
        If LLM coding is a 10x productivity enhancer, why aren&#x27;t we seeing 10x more software of the same quality level, or 100x as much shitty software?
    • ruszki11 hours ago
      I don’t know, but it’s interesting that he and many others come up with this “we should act like LLMs are junior devs”. There is a reason why most junior devs work on fairly separate parts of products, most of the time parts which can be removed or replaced easily, and not on an integral part of products: their code is usually quite bad. Every few lines contain issues and suboptimal solutions, and the whole thing is full of architectural problems. You basically never trust junior devs with core product features. Yet we should pretend that an “LLM junior dev” is somehow different. These claims just signal to me that these people don’t work on serious code.
    • augment_me11 hours ago
      This is the first question I ask, and every time I get the answer of some monolith that supposedly solves something. Imo, this is completely fine for any personal thing: I am happy when someone says they made an API to compare weekly shopping prices from the stores around them, or some recipe; this makes sense.<p>However, more often than not, someone is just building a monolithic construction that will never be looked at again. For example, someone found that the HuggingFace dataloader was slow for some type of file size in combination with some disk. What does this warrant? A 300000+ line non-reviewed repo to fix this issue. Not a 200-line PR to HuggingFace; no, you need to generate 20% of the existing repo and then slap your thing on there.<p>For me this is puzzling, because what is this for? Who is this for? Usually people build these things for practice, but now it&#x27;s generated, so it&#x27;s not for practice because you made very little effort on it. The only thing I can see is that it&#x27;s some type of competence signaling, but here again, if the engineer&#x2F;manager looking knows that this is generated, it does not have the type of value that would come with such signaling. Either I am naive and people still look at these repos and go &quot;whoa this is amazing&quot;, or it&#x27;s some kind of induced egotrip&#x2F;delusion where the LLM has convinced you that you are the best builder.
  • oxag3n20 hours ago
    &gt; Atrophy. I&#x27;ve already noticed that I am slowly starting to atrophy my ability to write code manually...<p>&gt; Largely due to all the little mostly syntactic details involved in programming, you can review code just fine even if you struggle to write it.<p>Until you struggle to review it as well. A simple exercise to prove it: ask an LLM to write a function in a familiar programming language, but in an area you didn&#x27;t invest in learning and coding yourself. Try reviewing some code involving embedding&#x2F;SIMD&#x2F;FPGA without learning it first.
    • sleazebreeze20 hours ago
      People would struggle to review code in a completely unfamiliar domain or part of the stack even before LLMs.
      • piskov19 hours ago
        That’s why you need to write code to learn it.<p>No-one has ever learned a skill just by reading&#x2F;observing.
        • sponaugle3 hours ago
          &quot;No-one has ever learned a skill just by reading&#x2F;observing&quot; - Except, of course, all of those people in Cosmology who, you know, observe.
          • direwolf202 hours ago
            what skill do they have? making stars? no they are skilled at observing, which is what they do.
            • sponaugle2 hours ago
              I think understanding stellar processes and then using that understanding to theorize about other observations is a skill. My point was that observing can be a fantastic way to build a skill.. not all skills, but certainly some skills. Learning itself is as much an observation as a practice.
      • AstroBen17 hours ago
        How would you find yourself in that situation before AI?
      • chrisjj19 hours ago
        No, because they wouldn&#x27;t be so foolish as to try it.
  • einrealist1 day ago
    &gt; It&#x27;s so interesting to watch an agent relentlessly work at something. They never get tired, they never get demoralized, they just keep going and trying things where a person would have given up long ago to fight another day. It&#x27;s a &quot;feel the AGI&quot; moment to watch it struggle with something for a long time just to come out victorious 30 minutes later.<p>Somewhere, there are GPUs&#x2F;NPUs running hot. You send all the necessary data, including information that you would never otherwise share. And you most likely do not pay the actual costs. It might become cheaper or it might not, because reasoning is a sticking plaster on the accuracy problem. You and your business become dependent on this major gatekeeper. It may seem like a good trade-off today. However, the personal, professional, political and societal issues will become increasingly difficult to overlook.
    • cyode21 hours ago
      This quote stuck out to me as well, for a slightly different reason.<p>The “tenacity” referenced here has been, in my opinion, the key ingredient in the secret sauce of a successful career in tech, at least in these past 20 years. Every industry job has its intricacies, but for every engineer who earned their pay with novel work on a new protocol, framework, or paradigm, there were 10 or more providing value by putting the myriad pieces together, muddling through the ever-waxing complexity, and crucially never saying die.<p>We all saw others weeded out along the way for lacking the tenacity. Think the boot camp dropouts or undergrads who changed majors when first grappling with recursion (or emacs). The sole trait of stubbornness to “keep going” outweighs analytical ability, leetcode prowess, soft skills like corporate political tact, and everything else.<p>I can’t tell what this means for the job market. Tenacity may not be enough on its own. But it’s the most valuable quality in an employee in my mind, and Claude has it.
      • noosphr19 hours ago
        There is an old saying back home: an idiot never tires, only sweats.<p>Claude isn&#x27;t tenacious. It is an idiot that never stops digging because it lacks the meta cognition to ask &#x27;hey, is there a better way to do this?&#x27;. Chain of thought&#x27;s whole raison d&#x27;etre was so the model could get out of the local minima it pushed itself in. The issue is that after a year it still falls into slightly deeper local minima.<p>This is fine when a human is in the loop. It isn&#x27;t what you want when you have a thousand idiots each doing a depth first search on what the limit of your credit card is.
        • Havoc19 hours ago
          &gt; it lacks the meta cognition to ask &#x27;hey, is there a better way to do this?&#x27;.<p>Recently had an AI tell me that code it wrote was a mess and suggest wiping it and starting from scratch with a more structured plan. That seems to hint at the outlines of some metacognition.
          • zzrrt18 hours ago
            Haha, it has the human developer traits of thinking all old code is garbage, failing to identify oneself as the dummy who wrote this particular code, and wanting to start from scratch.
            • dpkirchner18 hours ago
              It&#x27;s like NIH syndrome but instead &quot;not invented here <i>today</i>&quot;. Also a very human thing.
              • globular-toast11 hours ago
                More like NIITS: Not Invented in this Session.
          • rurp15 hours ago
            Perhaps. I&#x27;ve had LLMs tell me that some code is deeply flawed garbage that should be rewritten when the exact same LLM wrote that code minutes before. It could be a sign of deep metacognition, or it might be due to a cognitive gap where it has no idea why it did something a minute ago and suddenly has a different idea.
          • lbrito17 hours ago
            Someone will say &quot;you just need to instruct Claude.md to be more meta and do a wiggum loop on it&quot;
          • teaearlgraycold16 hours ago
            I asked Claude to analyze something and report back. It thought for a while said “Wow this analysis is great!” and then went back to thinking before delivering the report. They’re auto-sycophantic now!
          • hyperadvanced18 hours ago
            Metacognition As A Service, you say?
            • guy426116 hours ago
              Running on the Meta Cognition Protocol server near you.
              • baxtr13 hours ago
                You’ll get sued by Meta for this!
            • r-w16 hours ago
              I think that’s called “consulting”.
          • karlgkk13 hours ago
            lol no it doesn’t. It hints at convincing language models
        • samusiam18 hours ago
          I mean, not always. I&#x27;ve seen Claude step back and reconsider things after hitting a dead end, and go down a different path. There are also workflows, loops that can increase the likelihood of this occurring.
        • cocacolacowboy14 hours ago
          [dead]
      • BeetleB20 hours ago
        This is a major concern for junior programmers. For many senior ones, after 20 (or even 10) years of tenacious work, they realize that such work will always be there, and they long ago stopped growing on that front (i.e. they had already peaked). For those folks, LLMs are a life saver.<p>At a company I worked for, lots of senior engineers become managers <i>because</i> they no longer want to obsess over whether their algorithm has an off by one error. I think fewer will go the management route.<p>(There was always the senior tech lead path, but there are far more roles for management than tech lead).
        • codyb16 hours ago
          I feel like if you&#x27;re really spending a ton of time on off-by-one errors after twenty years in the field, you haven&#x27;t actually grown much and have probably just spent a ton of time in a single space.<p>Otherwise you&#x27;d be in the senior staff to principal range and doing architecture, mentorship, coordinating cross-team work, interviewing, evaluating technical decisions, etc.<p>I got to code this week a bit and it&#x27;s been a tremendous joy! I see many peers at similar and lower levels (and higher) who have more years and less technical experience and still write lots of code, and I suspect that is more what you&#x27;re talking about. In that case, it&#x27;s not so much that you&#x27;ve peaked, it&#x27;s that there&#x27;s not much left to learn and you&#x27;re doing a bunch of the same shit over and over, and that&#x27;s of course tiring.<p>I think it also means that everything you interact with outside your space does feel much harder because of the infrequency with which you have interacted with it.<p>If you&#x27;ve spent your whole career working the whole stack from interfaces to infrastructure, then there&#x27;s really not going to be much that hits you as unfamiliar after a point. Most frameworks recycle the same concepts and abstractions; same thing with programming languages, algorithms, data management, etc.<p>But if you&#x27;ve spent most of your career in one space cranking tickets, those unknown corners are going to be as numerous as the day you started and be much more taxing.
        • rishabhaiover20 hours ago
          That&#x27;s just sad. Right when I found love in what I do, my work has no value anymore.
          • jasonfarnon19 hours ago
            Aren&#x27;t you still better off than the rest of us, who found what we love and invested decades in it before it lost its value? Isn&#x27;t it better to lose your love when you still have time to find a new one?
            • josephg17 hours ago
              I don&#x27;t think so. Those of us who found what we love and invested decades into it got to spend decades getting paid well to do what we love.
            • sponaugle3 hours ago
              &quot;it lost its value&quot;.<p>It has not lost its value yet, but the future will shift that value. All of the past experience you have is an asset for you to move with that shift. The problem will not be you losing value; it will be you not following where the value goes.<p>It might be a bit more difficult to love where the shift goes, but that is no different than loving being an artist, which often shares a bed with loving being poor. What will make you happier?
            • pesus19 hours ago
              Depends on if their new love provides as much money as their old one, which is probably not likely. I&#x27;d rather have had those decades to stash and invest.
              • jasonfarnon18 hours ago
                A lot of pre-FAANG engineers don&#x27;t have the stash you&#x27;re thinking about. What you meant was &quot;right when I found a lucrative job that I love&quot;. What was going on in tech these last 15 years, unfortunately, probably was once in a lifetime.
                • WarmWash17 hours ago
                  It&#x27;s crazy to think back in the 80&#x27;s programmers had &quot;mild&quot; salaries despite programming back then being worlds more punishing. No libraries, no stack exchange, no forums, no endless memory and infinite compute. If you had a challenging bug you better also be proficient in reading schematics and probing circuits.
                  • lurking_swe11 hours ago
                    on the bright side software evolved much more slowly in the 80s. You could go very far by being an expert in 1 thing.<p>People had real offices with actual quiet focus time.<p>User expectations were also much lower.<p>pros and cons i guess?
            • nfredericks18 hours ago
              This is genuinely such a good take
              • dugidugout15 hours ago
                Especially on the topic of value! We are all intuitively aware that value is highly contextual, but get in a knot trying to rationalize value long past genuine engagement!
        • test655420 hours ago
          Imagine a senior dev who just approves PRs, approves production releases, and prioritizes bug reports and feature requests. An LLM watches for errors ceaselessly and reports an issue. The senior dev reviews the issue and assigns a severity to it. Another LLM has a backlog of features and errors to go solve; it makes a fix and submits a PR after running tests and verifying things work on its end.
      • techgnosis18 hours ago
        Why are we pretending like the need for tenacity will go away? Certain problems are easier now. We can tackle larger problems now that also require tenacity.
        • samusiam18 hours ago
          Even right at this very moment where we have a high-tenacity AI, I&#x27;d argue that working with the AI -- that is to say, doing AI coding itself and dealing with the novel challenges that brings requires a lot of stubborn persistence.
      • mykowebhn13 hours ago
        Fittingly, George Hinton toiled away for years in relative obscurity before finally being recognized for his work. I was always quite impressed by his &quot;tenacity&quot;.<p>So although I don&#x27;t think he should have won the Nobel Prize because not really physics, I felt his perseverance and hard work should merit something.
        • direwolf201 hour ago
          ... The person who embezzled from the SDC in 2018? <a href="https:&#x2F;&#x2F;eu.jsonline.com&#x2F;story&#x2F;news&#x2F;investigations&#x2F;2024&#x2F;04&#x2F;19&#x2F;george-hinton-resigns-from-sdc-after-funds&#x2F;73383279007&#x2F;" rel="nofollow">https:&#x2F;&#x2F;eu.jsonline.com&#x2F;story&#x2F;news&#x2F;investigations&#x2F;2024&#x2F;04&#x2F;19...</a>
    • daxfohl23 hours ago
      I still find in these instances there&#x27;s at least a 50% chance it has taken a shortcut somewhere: created a new, bigger bug in something that just happened not to have a unit test covering it, or broke an &quot;implicit&quot; requirement that was so obvious to any reasonable human that nobody thought to document it. These can be subtle because you&#x27;re not looking for them, because no human would ever think to do such a thing.<p>Then even if you do catch it, AI: &quot;ah, now I see exactly the problem. just insert a few more coins and I&#x27;ll fix it for real this time, I promise!&quot;
      • gtowey22 hours ago
        The value extortion plan writes itself. How long before someone pitches the idea that the models explicitly <i>almost</i> keep solving your problem to get you to keep spending? Would you even know?
        • password432119 hours ago
          First time I&#x27;ve seen this idea, I have a tingling feeling it might become reality sooner rather than later.
        • sailfast21 hours ago
          That’s far-fetched. It’s in the interest of the model builders to solve your problem as efficiently as possible token-wise. High value to user + lower compute costs = better pricing power and better margins overall.
          • d0mine20 hours ago
            &gt; far-fetched<p>Remember Google?<p>Once it was far-fetched that they would make the search worse just to show you more ads. Now, it is a reality.<p>With tokens, it is even more direct. The more tokens users spend, the more money for providers.
            • retsibsi16 hours ago
              &gt; Now, it is a reality.<p>What are the details of this? I&#x27;m not playing dumb, and of course I&#x27;ve noticed the decline, but I thought it was a combination of losing the battle with SEO shite and leaning further and further into a &#x27;give the user what you think they want, rather than what they actually asked for&#x27; philosophy.
              • SetTheorist52 minutes ago
                As recently as 15 years ago, Google _explicitly_ stated in their employee handbook that they would NOT, as a matter of principle, include ads in the search results. (Source: worked there at that time.)<p>Now, they do their best to deprioritize and hide non-ad results...
              • supriyo-biswas15 hours ago
                <a href="https:&#x2F;&#x2F;www.wheresyoured.at&#x2F;the-men-who-killed-google&#x2F;" rel="nofollow">https:&#x2F;&#x2F;www.wheresyoured.at&#x2F;the-men-who-killed-google&#x2F;</a>
            • throwthrowuknow19 hours ago
              Only if you are paying per token on the API. If you are paying a fixed monthly fee, then they lose money when you need to burn more tokens, and they lose customers when you can’t solve your problems within that month, max out your session limits, and end up with idle time, which you use to check whether the other providers have caught up with or surpassed your current favourite.
              • layla5alive14 hours ago
                Indeed, an unlimited plan seems like the only arrangement that isn&#x27;t more or less guaranteed to be abused by the provider.
          • lelanthran57 minutes ago
            &gt; It’s in the interest of the model builders to solve your problem as efficiently as possible token-wise. High value to user + lower compute costs = better pricing power and better margins overall.<p>It&#x27;s only in the interests of the model builders to do that <i>IFF</i> the user can actually tell that the model is giving them the best value for a single dollar.<p>Right now you can&#x27;t tell.
            • fragmede45 minutes ago
              Why not? Seems like you&#x27;d just build the same app on each of the models you want to test and judge how they did.
              • lelanthran40 minutes ago
                &gt; Why not? Seems like you&#x27;d just build the same app on each of the models you want to test and judge how they did.<p>I tried that on a few problems; even on the <i>same</i> model the results have too much variation.<p>When comparing different models, repeating the experiment gives you different results.
          • xienze20 hours ago
            &gt; It’s in the interest of the model builders to solve your problem as efficiently as possible token-wise.<p>Unless you’re paying by the token.
        • Fnoord19 hours ago
          I was thinking more of a deliberate backdoor in code. RCE is an obvious example, but another one could be bias. &quot;I&#x27;m sorry ma&#x27;am, the computer says you are ineligible for a bank account.&quot; These ideas aren&#x27;t new. They were there in the 90s already, when we still thought about privacy and accountability regarding technology, and dystopian novels described them long, long ago.
        • fragmede22 hours ago
          The free market proposition is that competition (especially with Chinese labs and grok) means that Anthropic is welcome to do that. They&#x27;re even welcome to illegally collude with OpenAi such that ChatGPT is similarly gimped. But switching costs are pretty low. If it turns out I can one shot an issue with Qwen or Deepseek or Kimi thinking, Anthropic loses not just my monthly subscription, but everyone else&#x27;s I show that too. So no, I think that&#x27;s some grade A conspiracy theory nonsense you&#x27;ve got there.
          • coffeefirst21 hours ago
            It’s not that crazy. It could even happen by accident in pursuit of another unrelated goal. And if it did, a decent chunk of the tech industry would call it “revealed preference” because usage went up.
            • hnuser12345621 hours ago
              LLMs became sycophantic and effusive because those responses were rated higher during RLHF, until it became newsworthy how obviously eager-to-please they got, so yes, being highly factually correct and &quot;intelligent&quot; was already not the only priority.
          • bandrami18 hours ago
            &gt; But switching costs are pretty low<p>Switching costs are <i>currently</i> low. Once you&#x27;re committed to the workflow the providers will switch to prepaying for a year&#x27;s worth of tokens.
          • daxfohl20 hours ago
            To be clear I don&#x27;t think that&#x27;s what they&#x27;re doing intentionally. Especially on a subscription basis, they&#x27;d rather me maximize my value per token, or just not use them. Lulling users into using tokens unproductively is the worst possible option.<p>The way agents work right now though just sometimes feels that way; they don&#x27;t have a good way of saying &quot;You&#x27;re probably going to have to figure this one out yourself&quot;.
          • jrflowers21 hours ago
            This is a good point. For example if you have access to a bunch of slot machines, one of them is guaranteed to hit the jackpot. Since switching from one slot machine to another is easy, it is trivial to go from machine to machine until you hit the big bucks. That is why casinos have such large selections of them (for our benefit).
            • krupan20 hours ago
              &quot;for our benefit&quot; lol! This is the best description of how we are all interacting with LLMs now. It&#x27;s not working? Fire up more &quot;agents&quot; ala gas town or whatever
              • direwolf201 hour ago
                gas is the transaction fees in Ethereum. It&#x27;s a fitting name.
            • robotmaxtron15 hours ago
              last time I was at a casino I checked to see what company built the machines, imagine my surprise that it was (by my observation) a single vendor.
          • thunderfork21 hours ago
            As a rational consumer, how would you distinguish between some intentional &quot;keep pulling the slot machine&quot; failure rate and the intrinsic failure rate?<p>I feel like saying &quot;the market will fix the incentives&quot; handwaves away the lack of information on internals. After all, look at the market response to Google making their search less reliable - sure, an invested nerd might try Kagi, but Google&#x27;s still the market leader by a long shot.<p>In a market for lemons, good luck finding a lime.
            • krupan20 hours ago
              FWIW, kagi is better than Google
              • direwolf202 hours ago
                yes, that was their point. Everyone uses Google anyway.
        • chanux15 hours ago
          Is this from a page of dating apps playbook?
      • wvenable21 hours ago
        &gt; These can be subtle because you&#x27;re not looking for them<p>After any agent run, I&#x27;m always looking at the git comparison between the new version and the previous one. This helps catch things that you might otherwise not notice.
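        A minimal sketch of that habit as a helper, assuming the agent works inside a git checkout; the `git add -N` step just makes brand-new files show up in the diff:

        ```python
        # Sketch: dump everything the agent touched since a known-good ref for review.
        import subprocess

        def agent_diff(base_ref: str = "HEAD") -> str:
            # Record untracked files as "intent to add" so new files appear in the diff.
            subprocess.run(["git", "add", "-N", "."], check=True)
            result = subprocess.run(
                ["git", "diff", base_ref],
                check=True, capture_output=True, text=True,
            )
            return result.stdout

        if __name__ == "__main__":
            print(agent_diff())
        ```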
        • teaearlgraycold15 hours ago
          And after manually coding I often have an LLM review the diff. 90% of the problems it finds can be discounted, but it’s still a net positive.
      • einrealist12 hours ago
        And there is this paradox where it becomes harder to detect the problems as the models &#x27;improve&#x27;.
      • charcircuit22 hours ago
        You are using it wrong, or are using a weak model if your failure rate is over 50%. My experience is nothing like this. It very consistently works for me. Maybe there is a &lt;5% chance it takes the wrong approach, but you can quickly steer it in the right direction.
        • testaccount2822 hours ago
          you are using it on easy questions. some of us are not.
          • meowface13 hours ago
            A lot of people are getting good results using it on hard things. Obviously not perfect, but &gt; 50% success.<p>That said, more and more people seem to be arriving at the conclusion that if you want a fairly large-sized, complex task in a large existing codebase done right, you&#x27;ll have better odds with Codex GPT-5.2-Codex-XHigh than with Claude Code Opus 4.5. It&#x27;s far slower than Opus 4.5 but more likely to get things correct, and complete, in its first turn.
          • mikkupikku21 hours ago
            I think a lot of it comes down to how well the user understands the problem, because that determines the quality of instructions and feedback given to the LLM.<p>For instance, I know some people have had success with getting claude to do game development. I have never bothered to learn much of anything about game development, but have been trying to get claude to do the work for me. Unsuccessful. It works for people who understand the problem domain, but not for those who don&#x27;t. That&#x27;s my theory.
            • samrus20 hours ago
              It works for hard problems when the person has already solved it and just needs the grunt work done.<p>It also works for problems that have been solved a thousand times before, which impresses people and makes them think it is actually solving those problems.
              • daxfohl20 hours ago
                Which matches what they are. They&#x27;re first and foremost pattern recognition engines extraordinaire. If they can identify some pattern that&#x27;s out of whack in your code compared to something in the training data, or a bug that is similar to others that have been fixed in their training set, they can usually thwack those patterns over to your latent space and clean up the residuals. If comparing pattern matching alone, they are superhuman, significantly.<p>&quot;Reasoning&quot;, however, is a feature that has been bolted on with a hacksaw and duct tape. Their ability to pattern match makes reasoning seem more powerful than it actually is. If your bug is within some reasonable distance of a pattern it has seen in training, reasoning can get it over the final hump. But if your problem is too far removed from what it has seen in its latent space, it&#x27;s not likely to figure it out by reasoning alone.
                • charcircuit19 hours ago
                  &gt;&quot;Reasoning&quot;, however, is a feature that has been bolted on with a hacksaw and duct tape.<p>What do you mean by this? Especially for tasks like coding where there is a deterministic correct or incorrect signal it should be possible to train.
                  • direwolf201 hour ago
                    it&#x27;s meant in the literal sense but with metaphorical hacksaws and duct tape.<p>Early on, some advanced LLM users noticed they could get better results by forcing insertion of a word like &quot;Wait,&quot; or &quot;Hang on,&quot; or &quot;Actually,&quot; and then running the model for a few more paragraphs. This would increase the chance of a model noticing a mistake it made.<p>Reasoning is basically this.
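                    A toy sketch of that trick; generate(prompt) is a hypothetical wrapper around whatever model API you use, not a real library call:

                    ```python
                    # Hypothetical generate(prompt) -> str stands in for your LLM API of choice.

                    def answer_with_forced_reflection(question: str, generate) -> str:
                        # First pass: an ordinary chain-of-thought answer.
                        draft = generate(f"{question}\nThink step by step, then give an answer.")
                        # Second pass: splice in "Wait," so the model re-reads its own reasoning
                        # and gets a chance to catch a mistake before committing to an answer.
                        return generate(
                            f"{question}\n"
                            f"Here is a draft solution:\n{draft}\n"
                            "Wait, let me re-check that reasoning carefully and fix any errors "
                            "before giving the final answer."
                        )
                    ```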
                    • charcircuit44 minutes ago
                      It&#x27;s not just force inserting a word. Reasoning is integrated into the training process of the model.
              • thunky17 hours ago
                &gt; It also works for problems that have been solved a thousand times before<p>So you mean it works on almost all problems?
          • baq21 hours ago
            Don’t use it for hard questions like this then; you wouldn’t use a hammer to cut a plank, you’d try to make a saw instead
    • fooker23 hours ago
      &gt; It might become cheaper or it might not<p>If it does not, this is going to be the first technology in the history of mankind that has not become cheaper.<p>(But anyway, it already costs half compared to last year)
      • ctoth22 hours ago
        &gt; But anyway, it already costs half compared to last year<p>You could not have bought Claude Opus 4.5 at any price one year ago I&#x27;m quite certain. The things that were available cost half of what they did then, and there are new things available. These are both true.<p>I&#x27;m agreeing with you, to be clear.<p>There are two pieces I expect to continue: inference for existing models will continue to get cheaper. Models will continue to get better.<p>Three things, actually.<p>The &quot;hitting a wall&quot; &#x2F; &quot;plateau&quot; people will continue to be loud and wrong. Just as they have been since 2018[0].<p>[0]: <a href="https:&#x2F;&#x2F;blog.irvingwb.com&#x2F;blog&#x2F;2018&#x2F;09&#x2F;a-critical-appraisal-of-deep-learning.html" rel="nofollow">https:&#x2F;&#x2F;blog.irvingwb.com&#x2F;blog&#x2F;2018&#x2F;09&#x2F;a-critical-appraisal-...</a>
        • simianwords22 hours ago
          interesting post. i wonder if these people go back and introspect on how incorrect they have been? do they feel the need to address it?
          • fooker22 hours ago
            No, people do not do that.<p>This is harmless when it comes to tech opinions but causes real damage in politics and activism.<p>People get really attached to ideals and ideas, and keep sticking to those after they fail to work again and again.
            • simianwords22 hours ago
              I don&#x27;t think it is harmless; otherwise we are incentivising people to just say whatever they want without any care for truth. People&#x27;s reputations should be attached to their predictions.
          • cogogo21 hours ago
            Some people definitely do, but how do they go and address it? Here&#x27;s a fresh example, one involving pure misinformation: I just screwed up and told some neighbors garbage collection was delayed for a day because of almost 2 ft of snow. Turns out it was just food waste, and I was distracted checking the app and read the notification poorly.<p>I went back to tell them (I don&#x27;t know them at all; everyone is just chattier digging out of a storm) and they were not there. Feel terrible and no real viable remedy. Hope they check themselves and realize I am an idiot. Even harder on the internet.
          • maest17 hours ago
            Do _you_ do that?
        • teaearlgraycold15 hours ago
          As a user of LLMs since GPT-3 there was noticeable stagnation in LLM utility after the release of GPT-4. But it seems the RLHF, tool calling, and UI have all come together in the last 12 months. I used to wonder what fools could be finding them so useful to claim a 10x multiplier - even as a user myself. These days I’m feeling more and more efficiency gains with Claude Code.
          • HNisCIS13 hours ago
            That&#x27;s the thing people are missing: the models plateaued a while ago, still making minor gains to this day, but not huge ones. The difference is that now we&#x27;ve had time to figure out the tooling. I think there&#x27;s still a ton of ground to cover there, and maybe the models will improve given the extra time, but I think it&#x27;s foolish to consider the people who predicted the plateau completely wrong. There are also a lot of mathematical concerns that will cause problems in the near and distant future. Infinite progress is far from a given; we&#x27;re already way behind where all the boosters thought we&#x27;d be by now.
            • teaearlgraycold9 hours ago
              I believe Sam Altman, perhaps the greatest grifter in today’s Silicon Valley, claimed that software engineering would be obsolete by the end of last year.
        • bsder20 hours ago
          &gt; The &quot;hitting a wall&quot; &#x2F; &quot;plateau&quot; people will continue to be loud and wrong. Just as they have been since 2018[0].<p>Everybody who bet against Moore&#x27;s Law was wrong ... until they weren&#x27;t.<p>And AI is the reaction to Moore&#x27;s Law having broken. Nobody gave one iota of damn about trying to make programming easier until the chips couldn&#x27;t double in speed anymore.
          • twoodfin19 hours ago
            This is exactly backwards: Dennard scaling stopped. Moore’s Law has continued and it’s what made training and running inference on these models practical at interactive timescales.
            • bsder18 hours ago
              You are technically correct. The best kind of correct.<p>However, most people don&#x27;t know the difference between the proper Moore&#x27;s Law scaling (the cost of a transistor halves every 2 years) which is still continuing (sort of) and the colloquial version (the speed of a transistor doubles every 2 years) which got broken when Dennard scaling ran out. To them, Moore&#x27;s Law just broke.<p>Nevertheless, you are reinforcing my point. Nobody gave a damn about improving the &quot;programming&quot; side of things until the hardware side stopped speeding up.<p>And rather than try to apply some human brainpower to fix the &quot;programming&quot; side, they threw a hideous number of those free (except for the electricity--but we don&#x27;t mention that--LOL) transistors at the wall to create a broken, buggy, unpredictable machine simulacrum of a &quot;programmer&quot;.<p>(Side note: And to be fair, it looks like even the strong form of Moore&#x27;s Law is finally slowing down, too)
              • twoodfin18 hours ago
                If you can turn a few dollars of electricity per hour into a junior-level programmer who never gets bored, tired, or needs breaks, that fundamentally changes the economics of information technology.<p>And in fact, the agentic looped LLMs are executing much better than that today. They could stop advancing right now and still be revolutionary.
      • peaseagee22 hours ago
        That&#x27;s not true. Many technologies get more expensive over time, as labor gets more expensive or as certain skills fall by the wayside; not everything is mass market. Have you tried getting a grandfather clock repaired lately?
        • willio5822 hours ago
          Repairing grandfather clocks isn&#x27;t more expensive now because it&#x27;s gotten any harder; it&#x27;s because the popularity of grandfather clocks is basically nonexistent compared to anything else to tell time.
          • direwolf201 hour ago
            There doesn&#x27;t need to be any particular reason to disprove the notion that technology only gets cheaper.
        • simianwords22 hours ago
          &quot;Repairing a unique clock&quot; getting costlier doesn&#x27;t mean technology hasn&#x27;t gotten cheaper.<p>Check out whether clocks have gotten cheaper in general. The answer is that they have.<p>There is no economy of scale in repairing a single clock, so it&#x27;s not relevant to bring it up here.
          • ipaddr21 hours ago
            Clock prices have gone up since 2020. Unless a cheaper, better way to make clocks emerges, inflation causes prices to grow.
            • fooker21 hours ago
              Luxury watches have gone up, &#x27;clocks&#x27; as a technology is cheaper than ever.<p>You can buy one for 90 cents on temu.
              • ipaddr20 hours ago
                The landed cost for that 90-cent watch has gone way up. Shipping, and to some degree taxes, have pushed the price higher.
                • pas19 hours ago
                  That&#x27;s not the technology.<p>Of course it&#x27;s silly to talk about manufacturing methods and yield and cost efficiency without an economy to embed all of this into, but... &quot;technology got cheaper&quot; means that we have practical knowledge of how to make cheap clocks (given certain supply chains, given certain volume, and so on).<p>We can make very cheap, very accurate clocks that can be embedded into whatever devices, but it requires the availability of fabs capable of doing MEMS components, supply materials, etc.
            • simianwords21 hours ago
              not true, clocks have gone down after accounting for inflation. verified using ChatGPT.
              • ipaddr20 hours ago
                You can&#x27;t account for inflation because the price increase is inflation.
                • pas19 hours ago
                  you can look at a basket of goods that doesn&#x27;t have your specific product and compare directly<p>but inflation is the general price level increase, this can be used as a deflator to get the price of whatever product in past&#x2F;future money amount to see how the price of the product changed in &quot;real&quot; terms (ie. relative to the general price level change)
                • simianwords20 hours ago
                  this is not true
        • esafak22 hours ago
          Instead of advancing tenuous examples, you could suggest a realistic mechanism by which costs could rise, such as a Chinese advance on Taiwan affecting TSMC, etc.
        • emtel21 hours ago
          Time-keeping is vastly cheaper. People don&#x27;t want grandfather clocks. They want to tell time. And they can, more accurately, more easily, and much cheaper than their ancestors.
        • epidemiology18 hours ago
          Or riding in an uber?
        • groby_b22 hours ago
          No. You don&#x27;t get to make &quot;technology gets more expensive over time&quot; statements for deprecated technologies.<p>Getting a bespoke flintstone axe is also pretty expensive, and has also absolutely no relevance to modern life.<p>These discussions must, if they are to be useful, center in a population experience, not in unique personal moments.
          • ipaddr21 hours ago
            I purchased a 5T drive in 2019 and the price is higher now, despite newer, better drives coming on the market since.<p>Not much has gone down in price over the last few years.
            • groby_b19 hours ago
              Price volatility exists.<p>Meanwhile the overall price of storage has been going down consistently: <a href="https:&#x2F;&#x2F;ourworldindata.org&#x2F;grapher&#x2F;historical-cost-of-computer-memory-and-storage" rel="nofollow">https:&#x2F;&#x2F;ourworldindata.org&#x2F;grapher&#x2F;historical-cost-of-comput...</a>
          • solomonb21 hours ago
            okay how about the Francis Scott Key Bridge?<p><a href="https:&#x2F;&#x2F;marylandmatters.org&#x2F;2025&#x2F;11&#x2F;17&#x2F;key-bridge-replacement-costs-soar-as-high-as-5-2-billion-opening-delayed-to-2030&#x2F;" rel="nofollow">https:&#x2F;&#x2F;marylandmatters.org&#x2F;2025&#x2F;11&#x2F;17&#x2F;key-bridge-replacemen...</a>
            • groby_b19 hours ago
              You will get a different bridge. With very different technology. Same as &quot;I can&#x27;t repair my grandfather clock cheaply&quot;.<p>In general, there are several things that are true for bridges that aren&#x27;t true for most technology:<p>* Technology has massively improved, but most people are not realizing that. (E.g. the Bay Bridge cost significantly more than the previous version, but that&#x27;s because we&#x27;d like to not fall down again in the next earthquake)<p>* We still have little idea how to reason about the cost of bridges in general. (Seriously. It&#x27;s an active research topic)<p>* It&#x27;s a tiny market, with the major vendors forming an oligopoly<p>* It&#x27;s infrastructure, not a standard good<p>* The buy side is almost exclusively governments.<p>All of these mean expensive goods that are completely non-repeatable. You can&#x27;t build the same bridge again. And on top of that, in a distorted market.<p>But sure, the cost of &quot;one bridge, please&quot; has gone up over time.
              • solomonb18 hours ago
                This seems largely the same as any other technology. The prices of new technologies go down initially as we scale up and optimize their production, but as soon as demand fades, due to newer technology or whatever, the cost of that technology goes up again.
              • fooker19 hours ago
                &gt; But sure, the cost of &quot;one bridge, please&quot; has gone up over time.<p>Even if you adjust for inflation?
          • arthurbrown21 hours ago
            Bought any RAM lately? Phone? GPU in the last decade?
            • ipaddr21 hours ago
              The latest iphone has gone down in price? It&#x27;s double. I guess the marketing is working.
              • xnyan19 hours ago
                &quot;Pens are not cheaper, look at this Montblanc&quot; is not a good faith response.<p>&#x27;84 Motorola DynaTAC - ~$12k AfI (adjusted for inflation)<p>&#x27;89 MicroTAC ~$8k AfI<p>&#x27;96 StarTAC ~$2k AfI<p>`07 iPhone ~$673 AfI<p>The current average smartphone sells for around $280. Phones are getting cheaper.
      • InsideOutSanta22 hours ago
        Sure, running an LLM is cheaper, but the way we use LLMs now requires way more tokens than last year.
        • fooker22 hours ago
          10x more tokens today cost less than half of what X tokens cost in ~mid 2024.
        • simianwords22 hours ago
          ok but the capabilities are also rising. what point are you trying to make?
          • oytis22 hours ago
            That it&#x27;s not getting cheaper?
            • jstummbillig22 hours ago
              But it is, capability adjusted, which is the only way it makes sense. You can definitely produce last years capability at a huge discount.
            • simianwords22 hours ago
              you are wrong. <a href="https:&#x2F;&#x2F;epoch.ai&#x2F;data-insights&#x2F;llm-inference-price-trends" rel="nofollow">https:&#x2F;&#x2F;epoch.ai&#x2F;data-insights&#x2F;llm-inference-price-trends</a><p>this is accounting for the fact that more tokens are used.
              • techpression21 hours ago
                The chart shows that they’re right though. Newer models cost more than older models. Sure they’re better but that’s moot if older models are not available or can’t solve the problem they’re tasked with.
                • simianwords21 hours ago
                  this is incorrect. the cost to achieve the same task by old models is way higher than by new models.<p>&gt; Newer models cost more than older models<p>where did you see this?
                  • techpression21 hours ago
                    On the link you shared, 4o vs 3.5 turbo price per 1M tokens.<p>There&#x27;s no such thing as &quot;same task by old model&quot;; you might get comparable results or you might not (and this is why the comparison fails; it&#x27;s not a comparison). The reason you pick the newer models is to increase the chances of getting a good result.
                    • simianwords21 hours ago
                      &gt; The dataset for this insight combines data on large language model (LLM) API prices and benchmark scores from Artificial Analysis and Epoch AI. We used this dataset to identify the lowest-priced LLMs that match or exceed a given score on a benchmark. We then fit a log-linear regression model to the prices of these LLMs over time, to measure the rate of decrease in price. We applied the same method to several benchmarks (e.g. MMLU, HumanEval) and performance thresholds (e.g. GPT-3.5 level, GPT-4o level) to determine the variation across performance metrics<p>This should answer. In your case, GPT-3.5 definitely is cheaper per token than 4o but much much less capable. So they used a model that is cheaper than GPT-3.5 that achieved better performance for the analysis.
                • fooker21 hours ago
                  OpenAI has always priced newer models lower than older ones.
                  • simianwords21 hours ago
                    not true! 4o was costlier than 3.5 turbo
                  • techpression21 hours ago
                    <a href="https:&#x2F;&#x2F;platform.openai.com&#x2F;docs&#x2F;pricing" rel="nofollow">https:&#x2F;&#x2F;platform.openai.com&#x2F;docs&#x2F;pricing</a><p>Not according to their pricing table. Then again I’m not sure what OpenAI model versions even mean anymore, but I would assume 5.2 is in the same family as 5 and 5.2-pro as 5-pro
                    • fooker21 hours ago
                      Check GPT 5.2 vs its predecessor, the &#x27;o&#x27; series of reasoning models.
      • fulafel14 hours ago
        I don&#x27;t think computation is going to become more expensive, but there are techs that have become so: Nuclear power plants. Mobile phones. Oil extraction.<p>(Oil rampdown is a survival imperative due to the climate catastrophe so there it&#x27;s a very positive thing of course, though not sufficient...)
      • root_axis20 hours ago
        Not true. Bitcoin has continued to rise in cost since its introduction (as in the aggregate cost incurred to run the network).<p>LLMs will face their own challenges with respect to reducing costs, since self-attention grows quadratically. These are still early days, so there remains a lot of low hanging fruit in terms of optimizations, but all of that becomes negligible in the face of quadratic attention.
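        As a rough sketch of why that matters: for a context of n tokens and head dimension d, vanilla self-attention forms an n-by-n score matrix, so compute grows with the square of the context length, and doubling the context roughly quadruples attention compute:

        ```latex
        \[
        \text{cost}(n) \;\sim\; O(n^{2} d)
        \qquad\Longrightarrow\qquad
        \frac{\text{cost}(2n)}{\text{cost}(n)} \;\approx\; 4
        \]
        ```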
        • namcheapisdumb15 hours ago
          &gt; bitcoin<p>so close! that is a commodity
        • twoodfin19 hours ago
          For Bitcoin that’s by design!
      • krupan20 hours ago
        There are plenty of technologies that have not become cheaper, or at least not cheap enough, to go big and change the world. You probably haven&#x27;t heard of them because obviously they didn&#x27;t succeed.
      • asadotzler20 hours ago
        Cheaper doesn&#x27;t mean cheap enough to be viable after the bills come due.
      • ak_11121 hours ago
        Concorde?
      • runarberg14 hours ago
        Supersonic jet engines, rockets to the moon, nuclear power plants, etc. etc. all have become more expensive. Superconductors were discovered in 1911, and we have been making them for as long as we have been making transistors in the 1950s, yet superconductors show no sign of becoming cheaper any time soon.<p>There have been plenty of technologies in history which do not in fact become cheaper. LLMs are very likely to become such, as I suspect their usefulness will be superseded by cheaper (much cheaper in fact) specialized models.
    • redox9920 hours ago
      &gt; And you most likely do not pay the actual costs.<p>This is one of the weakest anti AI postures. &quot;It&#x27;s a bubble and when free VC money stops you&#x27;ll be left with nothing&quot;. Like it&#x27;s some kind of mystery how expensive these models are to run.<p>You have open weight models right now like Kimi K2.5 and GLM 4.7. These are very strong models, only months behind the top labs. And they are not very expensive to run at scale. You can do the math. In fact there are third parties serving these models for profit.<p>The money pit is training these models (and not that much if you are efficient like chinese models). Once they are trained, they are served with large profit margins compared to the inference cost.<p>OpenAI and Anthropic are without a doubt selling their API for a lot more than the cost of running the model.
    • bob102915 hours ago
      Humans run hot too. Once you factor in the supply chain that keeps us alive, things become surprisingly equivalent.<p>Eating burgers and driving cars around costs a lot more than whatever # of watts the human brain consumes.
      • bbor13 hours ago
        I mean, “equivalent” is an understatement! There’s a reason Claude Code costs less than hiring a full time software engineer…
    • crazygringo20 hours ago
      &gt; <i>Somewhere, there are GPUs&#x2F;NPUs running hot.</i><p>Running at their designed temperature.<p>&gt; <i>You send all the necessary data, including information that you would never otherwise share.</i><p>I&#x27;ve never sent the type of data that isn&#x27;t already either stored by GitHub or a cloud provider, so no difference there.<p>&gt; <i>And you most likely do not pay the actual costs.</i><p>So? Even if costs double once investor subsidies stop, that doesn&#x27;t change much of anything. And the entire history of computing is that things tend to get cheaper.<p>&gt; <i>You and your business become dependent on this major gatekeeper.</i><p>Not really. Switching between Claude and Gemini or whatever new competition shows up is pretty easy. I&#x27;m no more dependent on it than I am on any of another hundred business services or providers that similarly mostly also have competitors.
    • chasebank14 hours ago
      I don’t understand this POV. Unfortunately, I’d pay $10k&#x2F;mo for my CC sub. I wish I could invest in Anthropic; they’re going to be the most profitable company on earth.
    • moooo9911 hours ago
      My agent struggled for 45 minutes because it tried to do `go run` on a _test.go file; the compiler repeatedly exited with an error message saying that files named like this cannot be executed using the run command.<p>So yeah, that wasted a lot of GPU cycles for a very unimpressive result, but with a renewed superficial feeling of competence.
    • squidbeak7 hours ago
      &gt; you most likely do not pay the actual costs. It might become cheaper or it might not<p>Why would this be the first technology that doesn&#x27;t become cheaper at scale over time?
    • mikeocool20 hours ago
      To me this tenacity is often like watching someone trying to get a screw into board using a hammer.<p>There’s often a better faster way to do it, and while it might get to the short term goal eventually, it’s often created some long term problems along the way.
    • karlgkk13 hours ago
      &gt; And you most likely do not pay the actual costs<p>Oh my lord you absolutely do not. The costs to oai per token inference ALONE are at least 7x. AT LEAST and from what I’ve heard, much higher.
      • tgrowazay13 hours ago
        We can observe how much generic inference providers like deepinfra or together-ai charge for large SOTA models. Since they are not subsidized and they don’t charge 7x of OpenAI, that means OAI also doesn’t have outrageously high per-token costs.
    • hahahahhaah21 hours ago
      It is also amazing seeing Linux kernel work, scheduling threads, proving interrupts and API calls all without breaking a sweat or injuring its ACL.
    • YetAnotherNick22 hours ago
      With optimizations and new hardware, power is almost a negligible cost. You can get 5.5M tokens&#x2F;s&#x2F;MW[1] for kimi k2(=20M&#x2F;KWH=181M tokens&#x2F;$) which is 400x cheaper than current pricing. It&#x27;s just Nvidia&#x2F;TSMC&#x2F;other manufacturers eating up the profit now because they can. My bet is that China will match current Nvidia within 5 years.<p>[1]: <a href="https:&#x2F;&#x2F;developer-blogs.nvidia.com&#x2F;wp-content&#x2F;uploads&#x2F;2026&#x2F;01&#x2F;Figure-38-2-png.webp" rel="nofollow">https:&#x2F;&#x2F;developer-blogs.nvidia.com&#x2F;wp-content&#x2F;uploads&#x2F;2026&#x2F;0...</a>
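      Unpacking those numbers (the electricity price of roughly 11 cents per kWh is an assumption chosen to make the quoted figures line up):

      ```latex
      \[
      5.5\times10^{6}\ \tfrac{\text{tok}}{\text{s}\cdot\text{MW}} \times 3600\ \text{s}
        \;=\; 1.98\times10^{10}\ \tfrac{\text{tok}}{\text{MWh}}
        \;\approx\; 2\times10^{7}\ \tfrac{\text{tok}}{\text{kWh}}
      \]
      \[
      \frac{2\times10^{7}\ \text{tok per kWh}}{\$0.11\ \text{per kWh}}
        \;\approx\; 1.8\times10^{8}\ \text{tok per dollar}
      \]
      ```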
      • storystarling21 hours ago
        Electricity is negligible but the dominant cost is the hardware depreciation itself. Also inference is typically memory bandwidth bound so you are limited by how fast you can move weights rather than raw compute efficiency.
        • YetAnotherNick15 hours ago
          Yes, because the margin is like 80% for Nvidia, and 80% again for manufacturers like Samsung and TSMC. Once fixed costs like R and D are amortized, the same node technology and hardware capacity could cost just a few single-digit percent of what it does now.
    • utopiah13 hours ago
      AI genius discovers brute forcing... what a time to be alive. &#x2F;s<p>Like... bro, that&#x27;s THE foundation of CS. That&#x27;s the principle behind the Bombe in Turing&#x27;s time. One can still marvel at it, but it&#x27;s been with us since the beginning.
  • vinhnx16 hours ago
    Boris Cherny (Claude Code creator) replies to Andrej Karpathy<p><a href="https:&#x2F;&#x2F;xcancel.com&#x2F;bcherny&#x2F;status&#x2F;2015979257038831967" rel="nofollow">https:&#x2F;&#x2F;xcancel.com&#x2F;bcherny&#x2F;status&#x2F;2015979257038831967</a>
  • Macha23 hours ago
    &gt; - What does LLM coding feel like in the future? Is it like playing StarCraft? Playing Factorio? Playing music?<p>Starcraft and Factorio are exactly what it is not. Starcraft has a loooot of micro involved at any level beyond mid level play, despite all the &quot;pro macros and beats gold league with mass queens&quot; meme videos. I guess it could be like Factorio if you&#x27;re playing it by plugging together blueprint books from other people but I don&#x27;t think that&#x27;s how most people play.<p>At that level of abstraction, it&#x27;s more like grand strategy if you&#x27;re to compare it to any video game? You&#x27;re controlling high level pushes and then the units &quot;do stuff&quot; and then you react to the results.
    • kridsdale318 hours ago
      I think the StarCraft analogy is fine, you have to compare it not to macro and micro RTS play, but to INDIVIDUAL UNITS. For your whole career until now, you have been a single Zergling or Probe. Now you are the Commander.
      • TheRoque16 hours ago
        Except that pro StarCraft players still micro-manage every single Zergling or Probe when necessary, while vibe coders just right-click on the enemy base and hope it&#x27;ll go well.
    • zetazzed20 hours ago
      It&#x27;s like the Victoria 3 combat system. You just send an army and a general to a given front and let them get to work with no micro. Easy! But of course some percentage of the time they do something crazy like deciding to redeploy from your existential Franco-Prussian war front to a minor colonial uprising...
  • porise1 day ago
    I wish the people who wrote this would let us know what kind of codebases they are working on. They seem mostly useless in a sufficiently large codebase, especially when it is messy and interactions aren&#x27;t always obvious. I don&#x27;t know how much better Claude is than ChatGPT, but I can&#x27;t get ChatGPT to do much useful with an existing large codebase.
    • CameronBanga1 day ago
      This is an antidotal example, but I released this last week after 3 months of work on it as a &quot;nights and weekends&quot; project: <a href="https:&#x2F;&#x2F;apps.apple.com&#x2F;us&#x2F;app&#x2F;skyscraper-for-bluesky&#x2F;id6754198379">https:&#x2F;&#x2F;apps.apple.com&#x2F;us&#x2F;app&#x2F;skyscraper-for-bluesky&#x2F;id67541...</a><p>I&#x27;ve been working in the mobile space since 2009, though primarily as a designer and then product manager. I work in kinda a hybrid engineering&#x2F;PM job now, and have never been a particularly strong programmer. I definitely wouldn&#x27;t have thought I could make something with that polish, let alone in 3 months.<p>That code base is ~98% Claude Code.
      • bee_rider1 day ago
        I don’t know if “antidotal example” is a pun or a typo but I quite like it.
        • CameronBanga1 day ago
          Lol typing on my phone during lunch and meant anecdotal. But let&#x27;s leave it anyways. :)
        • oasisbob1 day ago
          That is fun.<p>Not sure if it&#x27;s an American pronunciation thing, but I had to stare at that long and hard to see the problem and even after seeing it couldn&#x27;t think of how you could possibly spell the correct word otherwise.
          • bsder20 hours ago
            &gt; Not sure if it&#x27;s an American pronunciation thing<p>It&#x27;s a bad American pronunciation thing, like &quot;Febuwary&quot; and &quot;nuculer&quot;.<p>If you pronounce the syllables correctly (&quot;an-ec-dote&quot;, &quot;Feb-ru-ar-y&quot;, &quot;nu-cle-ar&quot;), the spellings follow.<p>English has its fair share of spelling stupidities, but if people don&#x27;t even pronounce the words correctly, there is no hope.
            • lynguist3 minutes ago
              <a href="https:&#x2F;&#x2F;en.wiktionary.org&#x2F;wiki&#x2F;February" rel="nofollow">https:&#x2F;&#x2F;en.wiktionary.org&#x2F;wiki&#x2F;February</a><p>The pronunciation of the first r with a y sound has always been one of two possible standards, in fact &quot;February&quot; is a re-Latinizing spelling but English doesn’t like the br-r sound so it naturally dissimilates to by-r.
    • CSMastermind14 hours ago
      Are you using Codex?<p>I&#x27;m not sure how big your repos are, but I&#x27;ve been effective working with repos that have thousands of files and tens of thousands of lines of code.<p>If you&#x27;re just prototyping, it will hit a wall when things get unwieldy, but that&#x27;s normally a sign that you need to refactor a bit.<p>Super strict compiler settings, static analysis, comprehensive tests, and documentation help a lot. As does basic technical design. After a big feature is shipped, I do a refactor cycle with the LLM where we do a comprehensive code review and patch things up. This does require human oversight, because the LLMs are still lacking judgement on what makes for good code design.<p>The places where I&#x27;ve seen them be useless are working across repositories or interfacing with things like infrastructure.<p>It&#x27;s also very model-dependent. Opus is a good daily driver, but Codex is much better at writing tests for some reason. I&#x27;ll often also switch to it for hard problems that Claude can&#x27;t solve. Gemini is nice for &#x27;I need a prototype in the next 10 minutes&#x27;, especially for making quick and dirty bespoke front-ends where you don&#x27;t care about the design, just the functionality.
      • madhadron13 hours ago
        &gt; tens of thousands of lines of code<p>Perhaps this is part of it? Tens of thousands of lines of code seems like a very small repo to me.
    • TaupeRanger1 day ago
      Claude and Codex are CLI tools you use to give the LLM context about the project on your local machine or dev environment. The fact that you&#x27;re using the name &quot;ChatGPT&quot; instead of Codex leads me to believe you&#x27;re talking about using the web-based ChatGPT interface to work on a large codebase, which is completely beside the point of the entire discussion. That&#x27;s not the tool anyone is talking about here.
    • danielvaughn1 day ago
      It&#x27;s important to understand that he&#x27;s talking about a specific set of models that were released around November&#x2F;December, and that we&#x27;ve hit a kind of inflection point in model capabilities. Specifically, Anthropic&#x27;s Opus 4.5 model.<p>I never paid any attention to different models, because they all felt roughly equal to me. But Opus 4.5 is really and truly different. It&#x27;s not a qualitative difference; it&#x27;s more like it just finally hit that quantitative edge that allows me to lean much more heavily on it for routine work.<p>I highly suggest trying it out, alongside a well-built coding agent like the one offered by Claude Code, Cursor, or OpenCode. I&#x27;m using it on a fairly complex monorepo and my impressions are much the same as Karpathy&#x27;s.
      • suddenlybananas7 hours ago
        People have said this about every single model release.
        • danielvaughn5 hours ago
          I had the same reaction. So when people were talking about this model back in December, I brushed it off. It wasn&#x27;t until a couple weeks ago that I decided to try it out, and I immediately saw the difference.<p>My opinion isn&#x27;t based on what other people are saying, it&#x27;s my own experience as a fairly AI-skeptical person. Again, I highly suggest you give it an honest try and decide for yourself.
    • keerthiko1 day ago
      Almost always, notes like these are going to be about greenfield projects.<p>Trying to incorporate it in existing codebases (esp when the end user is a support interaction or more away) is still folly, except for closely reviewed and&#x2F;or non-business-logic modifications.<p>That said, it is quite impressive to set up a simple architecture, or just list the filenames, and tell some agents to go crazy to implement what you want the application to do. But once it crosses a certain complexity, I find you need to prompt closer and closer to the weeds to see real results. I imagine a non-technical prompter cannot proceed past a certain prototype fidelity threshold, let alone make meaningful contributions to a mature codebase via LLM without a human engineer to guide and review.
      • reubenmorais1 day ago
        I&#x27;m using it on a large set of existing codebases full of extremely ugly legacy code, weird build systems, tons of business logic and shipping directly to prod at neckbreaking growth over the last two years, and it&#x27;s delivering the same type of value that Karpathy writes about.
      • jjfoooo423 hours ago
        That <i>was</i> true for me, but is no longer.<p>It&#x27;s been especially helpful in explaining and understanding arcane bits of legacy code behavior my users ask about. I trigger Claude to examine the code and figure out how the feature works, then tell it to update the documentation accordingly.
        • chrisjj19 hours ago
          &gt; I trigger Claude to examine the code and figure out how the feature works, then tell it to update the documentation accordingly.<p>And how do you verify its output isn&#x27;t total fabrication?
          • jjfoooo42 hours ago
            I read through it, scanning sections that seem uncontroversial and reading more closely sections that talk about things I&#x27;m less sure about. The output cites key lines of code, which are faster to track down and look at than trying to remember where in a large codebase to look.<p>Inconsistencies also pop up in backtesting, for example if there&#x27;s a point that the llm answers different ways in multiple iterations, that&#x27;s a good candidate to improve docs on.<p>Similar to a coworker&#x27;s work, there&#x27;s a certain amount of trust in the competency involved.
          • _dark_matter_15 hours ago
            Your docs are a contract. You can verify that contract using integration tests.
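            A minimal sketch of what that can look like (sketched in TypeScript; the endpoint and the documented claim are made-up examples, not from a real project):<p><pre><code>
              import { test, expect } from &quot;vitest&quot;;

              &#x2F;&#x2F; One documented claim (&quot;unknown order ids return 404&quot;) pinned down by an
              &#x2F;&#x2F; integration test against a locally running server, so doc drift shows up
              &#x2F;&#x2F; as a test failure. The URL and behaviour here are illustrative assumptions.
              test(&quot;docs: unknown order ids return 404&quot;, async () =&gt; {
                const res = await fetch(&quot;http:&#x2F;&#x2F;localhost:3000&#x2F;orders&#x2F;does-not-exist&quot;);
                expect(res.status).toBe(404);
              });
            </code></pre>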
            • chrisjj8 hours ago
              Contract? These docs are information answering user queries. So if you use a chatbot to generate them, I&#x27;d like to be reasonably sure they aren&#x27;t laden with the fabricated misinformation for which these chatbots are famous.
              • jjfoooo42 hours ago
                It&#x27;s a very reasonable concern. My solution is to have the bot classify what the message is talking about as a first pass, and have a relatively strict filtering about what it responds to.<p>For example, I have it ignore messages about code freezes, because that&#x27;s a policy question that probably changes over time, and I have it ignore urgent oncall messages, because the asker there probably wants a quick response from a human.<p>But there&#x27;s a lot of questions in the vein of &quot;How do I write a query for {results my service emits}&quot;, how does this feature work, where automation can handle a lot (and provide more complete answers than a human can off the top of their head)
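                A rough sketch of that gate (in TypeScript; the category names and helper signatures are assumptions for illustration, not my actual implementation):<p><pre><code>
                  &#x2F;&#x2F; Classify first, then only answer routine usage questions; policy
                  &#x2F;&#x2F; questions and urgent oncall pages are left for a human.
                  type Category = &quot;policy&quot; | &quot;oncall_urgent&quot; | &quot;usage_question&quot; | &quot;other&quot;;

                  async function maybeAnswer(
                    message: string,
                    classify: (m: string) =&gt; Promise&lt;Category&gt;, &#x2F;&#x2F; assumed LLM-backed classifier
                    answer: (m: string) =&gt; Promise&lt;string&gt;,     &#x2F;&#x2F; assumed doc-grounded answerer
                  ): Promise&lt;string | null&gt; {
                    const category = await classify(message);
                    if (category !== &quot;usage_question&quot;) return null;
                    return answer(message);
                  }
                </code></pre>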
                • chrisjj49 minutes ago
                  OK, but little of that applies to this use case, to &quot;then tell it to update the documentation accordingly.&quot;
      • 11235813211 day ago
        These models do well changing brownfield applications that have tests because the constraints on a successful implementation are tight. Their solutions can be automatically augmented by research and documentation.
        • mh226616 hours ago
          I don&#x27;t exactly disagree with this but I have seen models simply <i>deleting the tests</i>, or updating the tests to pass and declaring the failures were &quot;unrelated to my changes&quot;, so it helpfully fixed them
          • 11235813213 hours ago
            I’ve had to deal with this a handful of times. You just have to make it restore the test, or keep trying to pass a suite of explicit red-green method tests it wrote earlier.
          • hnben12 hours ago
            Yes. You have to treat the model like an eager yet incompetent worker, i.e. don&#x27;t go full yolo mode; review everything they do.
    • gwd20 hours ago
      For me, in just the golang server instance and the core functional package, `cloc` reports over 40k lines of code, not counting other supporting packages. I spent the last week having Claude rip out the external auth system and replace it with a home-grown one (and having GPT-codex review its changes). If anything, Claude makes it <i>easier</i> on me as a solo founder with a large codebase. Rather than having to re-familiarize myself with code I wrote a year ago, I describe it at a high level, point Claude to a couple of key files, and then tell it to figure out what it needs to do. It can use grep, the language server, and other tools to poke around and see what&#x27;s going on. I then have it write an &quot;epic&quot; in markdown containing all the key files, so that future sessions already know which files to read.<p>I really enjoyed the process. As TFA says, you have to keep a close eye on it. But the whole process was a lot less effort, and I ended up doing more than I would otherwise have done.
    • ph4te1 day ago
      I don&#x27;t know how big a sufficiently large codebase is, but we have a 1M LOC Java application that is ~10 years old and runs POS systems, and Claude Code has no issues with it. We have done full analyses with output detailing each module, and also used it to pinpoint specific issues when described. Vibe coding is not used here, just analysis.
    • fy2014 hours ago
      At my dayjob my team uses it on our main dashboard, which is a pretty large CRUD application. The frontend (Vue) is a horrible mess, as it was originally built by people who know just enough to be dangerous. Over time people have introduced new standards without cleaning up the old code - for example, we have three or four different state management technologies.<p>For this the LLM struggles a bit, but so does a human. The main issues are it messes up some state that it didn&#x27;t realise was used elsewhere, and our test coverage is not great. We&#x27;ve seen humans make exactly the same kind of mistakes. We use MCP for Figma so most of the time it can get a UI 95% done, just a few tweaks needed by the operator.<p>On the backend (Typescript + Node, good test coverage) it can pretty much one-shot - from a plan - whatever feature you give it.<p>We use opus-4.5 mostly, and sometimes gpt-5.2-codex, through Cursor. You aren&#x27;t going to get ChatGPT (the web interface) to do anything useful, switch to Cursor, Codex or Claude Code. And right now it is worth paying for the subscription, you don&#x27;t get the same quality from cheaper or free models (although they are starting to catch up, I&#x27;ve had promising results from GLM-4.7).
    • yasoob15 hours ago
      Another personal example. I spent around a month last year in January on this application: <a href="https:&#x2F;&#x2F;apps.apple.com&#x2F;us&#x2F;app&#x2F;salam-prayer-qibla-quran&#x2F;id6742677863">https:&#x2F;&#x2F;apps.apple.com&#x2F;us&#x2F;app&#x2F;salam-prayer-qibla-quran&#x2F;id674...</a><p>I had never used Swift before that and was able to use AI to whip up a fairly full-featured and complex application with a decent amount of code. I had to make some cross-cutting changes along the way as well that impacted quite a few files and things mostly worked fine with me guiding the AI. Mind you this was a year ago so I can only imagine how much better I would fare now with even better AI models. That whole month was spent not only on coding but on learning Swift enough to fix problems when AI started running into circles and then learning about Xcode profiler to optimize the application for speed and improving perf.
    • BeetleB20 hours ago
      &gt; They seem mostly useless in a sufficiently large codebase especially when they are messy and interactions aren&#x27;t always obvious.<p>What type of documents do you have explaining the codebase and its messy interactions, and have you provided that to the LLM?<p>Also, have you tried giving someone brand new to the team the exact same task and information you gave to the LLM, and how effective were they compared to the LLM?<p>&gt; I don&#x27;t know how much better Claude is than ChatGPT, but I can&#x27;t get ChatGPT to do much useful with an existing large codebase.<p>As others have pointed out, from your comment, it doesn&#x27;t sound like you&#x27;ve used a tool dedicated for AI coding.<p>(But even if you had, it would still fail if you expect LLMs to do stuff without sufficient context).
    • smusamashah21 hours ago
      The code base I work on at $dayjob$ is legacy, has a few files with 20k lines each and a few more with around 10k lines each. It&#x27;s hard to find things and connect dots in the code base. Don&#x27;t think LLMs are able to navigate and understand code bases of that size yet. But have seen lots of seemingly large projects shown here lately that involve thousands of files and millions of lines of code.
      • jumploops20 hours ago
        I’ve found that LLMs seem to work better on LLM-generated codebases.<p>Commercial codebases, especially private internal ones, are often messy. It seems this is mostly due to the iterative nature of development in response to customer demands.<p>As a product gets larger, and addresses a wider audience, there’s an ever increasing chance of divergence from the initial assumptions and the new requirements.<p>We call this tech debt.<p>Combine this with a revolving door of developers, and you start to see Conway’s law in action, where the system resembles the organization of the developers rather than the “pure” product spec.<p>With this in mind, I’ve found success in using LLMs to refactor existing codebases to better match the current requirements (i.e. splitting out helpers, modularizing, renaming, etc.).<p>Once the legacy codebase is “LLMified”, the coding agents seem to perform more predictably.<p>YMMV here, as it’s hard to do large refactors without tests for correctness.<p>(Note: I’ve dabbled with a test first refactor approach, but haven’t gone to the lengths to suggest it works, but I believe it could)
        • mh226616 hours ago
          are LLM codebases <i>not</i> messy?<p>Claude by default, unless I tell it not to, will write stuff like:<p><pre><code>
            &#x2F;&#x2F; we need something to be true
            somethingPasses = something()
            if (!somethingPasses) {
              return false
            }
            &#x2F;&#x2F; we need somethingElse to be true
            somethingElsePasses = somethingElse()
            if (!somethingElsePasses) {
              return false
            }
            return true
          </code></pre> instead of the very simple boolean logic that could express this in one line, with the &quot;this code does what it obviously does&quot; comments added all over the place.<p>generally unless you tell it not to, it does things in very verbose ways that most humans would never do, and since there&#x27;s an infinite number of ways that it can invent absurd verbosity, it is hard to preemptively prompt against all of them.<p>to be clear, I am getting a huge amount of value out of it for executing a bunch of large refactors and &quot;modernization&quot; of a (really) big legacy codebase at scale and in parallel. but it&#x27;s not outputting the sort of code that I see when someone prompts it &quot;build a new feature ...&quot;, and a big part of my prompts is screaming at it <i>not</i> to do certain things or to refuse the task if it at any point becomes unsure.
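          For contrast, a minimal sketch of the one-liner version, assuming the same hypothetical `something()` and `somethingElse()` helpers as above:<p><pre><code>
            &#x2F;&#x2F; the same checks as a single boolean expression
            return something() &amp;&amp; somethingElse()
          </code></pre>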
          • jumploops15 hours ago
            Yeah to be clear it will have the same issues as a flyby contributor if prompted to.<p>Meaning if you ask it “handle this new condition” it will happily throw in a hacky conditional and get the job done.<p>I’ve found the most success in having it reason about the current architecture (explicitly), and then to propose a set of changes to accomplish the task (2-5 ways), review, and then implement the changes that best suit the scope of the larger system.
            • dexdal15 hours ago
              The failure mode is missing constraints, not “coding skill”. Treat the model as a generator that must operate inside an explicit workflow: define the invariant boundaries, require a plan&#x2F;diff before edits, run tests and static checks, and stop when uncertainty appears. That turns “hacky conditional” behaviour into controlled change.
              • jumploops14 hours ago
                Yes, exactly.<p>The LLM is onboarding to your codebase with each context window, all it knows is what it’s seen already.
        • olig1519 hours ago
          Surely because LLM generated code is part of the training data for the model, so code&#x2F;patterns it can work with is closer to its training data.
    • tunesmith1 day ago
      If you have a ChatGPT account, there&#x27;s nothing stopping you from installing codex cli and using your chatgpt account with it. I haven&#x27;t coded with ChatGPT for weeks. Maybe a month ago I got utility out of coding with codex and then having ChatGPT look at my open IDE page to give comments, but since 5.2 came out, it&#x27;s been 100% codex.
    • Okkef1 day ago
      Try Claude code. It’s different.<p>After you tried it, come back.
      • I think it&#x27;s not Claude Code per se but rather the Opus 4.5 model, or something in the agentic workflow.<p>I tried a website which offered the Opus model in their agentic workflow &amp; I felt something <i>different</i> too I guess.<p>Currently trying out Kimi code (using their recent Kimi 2.5), the first time I&#x27;ve bought any AI product, because I got it for like $1.49 per month. It does feel a bit less powerful than Claude Code but I feel like monetarily it&#x27;s worth it.<p>Y&#x27;know you have to like bargain with an AI model to reduce its pricing, which I just felt really curious about. The psychology behind it feels fascinating because I think even as a frugal person, I already felt invested enough in the model and that became my sunk cost fallacy.<p>Shame for me personally because they use it as a hook to get people using their tool and then charge $19 the next month (I mean, really, cheaper than Claude Code for the most part but still steep compared to $1.49).
    • jwr16 hours ago
      I successfully use Claude Code in a large complex codebase. It&#x27;s Clojure, perhaps that helps (Clojure is very concise, expressive and hence token-dense).
      • culi16 hours ago
        Perhaps it&#x27;s harder to &quot;do Clojure wrong&quot; than it is to do JavaScript or Python or whatever other extremely flexible multi-paradigm high-level language
        • wcedmisten15 hours ago
          Having spent 3 years of my career working with Clojure, I think it actually gives you even more rope to shoot yourself with than Python&#x2F;JS.<p>E.g. macros exist in Clojure but not Python&#x2F;JS, and I&#x27;ve definitely been plenty stumped by seeing them in the codebase. They tend to be used in very &quot;clever&quot; patterns.<p>On the other hand, I&#x27;m a bit surprised Claude can tackle a complex Clojure codebase. It&#x27;s been a while since I attempted using an LLM for Clojure, but at the time it failed completely (I think because there is relatively little training data compared to other mainstream languages). I&#x27;ll have to check that out myself
    • epolanski20 hours ago
      1. Write good documentation, architecture, how things work, code styling, etc.<p>2. Put your important dependencies source code in the same directory. E.g. put a `_vendor` directory in the project, in it put the codebase at the same tag you&#x27;re using or whatever: postgres, redis, vue, whatever.<p>3. Write good plans and requirements. Acceptance criteria, context, user stories, etc. Save them in markdown files. Review those multiple times with LLMs trying to find weaknesses. Then move to implementation files: make it write a detailed plan of what it&#x27;s gonna change and why, and what it will produce.<p>4. Write very good prompts. LLMs follow instructions well if they are clear &quot;you should proactively do X&quot;, is a weak instruction if you mean &quot;you must do X&quot;.<p>5. LLMs are far from perfect, and full of limits. Karpathy sums their cons very well in his long list. If you don&#x27;t know their limits you&#x27;ll mismanage the expectations and not use them when they are a huge boost and waste time on things they don&#x27;t cope well with. On top of that: all LLMs are different in their &quot;personality&quot;, how they adhere to instruction, how creative they are, etc.
    • bluGill22 hours ago
      I&#x27;ve been trying Claude on my large code base today. When I give it the requirements I&#x27;d give an engineer and say &quot;do it&quot;, it just writes garbage that doesn&#x27;t make sense and doesn&#x27;t seem to even meet the requirements (if it does I can&#x27;t follow how - though I&#x27;ll admit to giving up before I understood what it did, and I didn&#x27;t try it on a real system). When I forced it to step back and do tiny steps - in TDD write one test of the full feature - it did much better - but then I spent the next 5 hours adjusting the code it wrote to meet our coding standards. At least I understand the code, but I&#x27;m not sure it is any faster (but it is a lot easier to see what&#x27;s wrong than to come up with greenfield code).<p>Which is to say you have to learn to use the tools. I&#x27;ve only just started, and cannot claim to be an expert. I&#x27;ll keep using them - in part because everyone is demanding I do - but to use them you clearly need to know how to do it yourself.
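      For concreteness, the kind of single up-front red test I mean, sketched in TypeScript with made-up names (my real codebase and feature look nothing like this):<p><pre><code>
        import { test, expect } from &quot;vitest&quot;;
        &#x2F;&#x2F; Hypothetical feature module; it doesn&#x27;t exist yet, which is the point
        &#x2F;&#x2F; of writing the red test first.
        import { exportReport } from &quot;.&#x2F;report&quot;;

        test(&quot;exports a CSV header plus one row per order&quot;, async () =&gt; {
          const csv = await exportReport([{ id: 1, total: 9.5 }]);
          expect(csv.split(&quot;\n&quot;)).toHaveLength(2);
        });
      </code></pre> The agent then iterates in small steps until that goes green.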
      • simonw22 hours ago
        Have you tried showing it a copy of your coding standards?<p>I also find pointing it to an existing folder full of code that conforms to certain standards can work really well.
        • bluGill18 hours ago
          It has seen at least some of the standards it violated.
        • bflesch21 hours ago
          Yeah let&#x27;s share all your IP for the vague promise that it will somehow work ;)
          • simonw20 hours ago
            You just gave me a revelation as to why some people report being unable to get decent results out of coding agents!
          • CamperBob219 hours ago
            (Shrug) If you&#x27;re not willing to make that tradeoff, you&#x27;ll be outcompeted by people who are. Your call.
      • rob21 hours ago
        I&#x27;ve been playing around with the &quot;Superpowers&quot; [0] plugin in Claude Code on a new small project and really like it. Simple enough to understand quickly by reading the GitHub repo and seems to improve the output quality of my projects.<p>There&#x27;s basically a &quot;brainstorm&quot; &#x2F;slash command that you go back and forth with, and it places what you came up with in docs&#x2F;plans&#x2F;YYYY-MM-DD-&lt;topic&gt;-design.md.<p>Then you can run a &quot;write-plan&quot; &#x2F;slash command on the docs&#x2F;plans&#x2F;YYYY-MM-DD-&lt;topic&gt;-design.md file, and it&#x27;ll give you a docs&#x2F;plans&#x2F;YYYY-MM-DD-&lt;topic&gt;-implementation.md file that you can then feed to the &quot;execute-plan&quot; &#x2F;slash command, where it breaks everything down into batches, tasks, etc, and actually implements everything (so three &#x2F;slash commands total.)<p>There&#x27;s also &quot;GET SHIT DONE&quot; (GSD) [1] that I want to look at, but at first glance it seems to be a bit more involved than Superpowers with more commands. Maybe it&#x27;d be better for larger projects.<p>[0] <a href="https:&#x2F;&#x2F;github.com&#x2F;obra&#x2F;superpowers" rel="nofollow">https:&#x2F;&#x2F;github.com&#x2F;obra&#x2F;superpowers</a><p>[1] <a href="https:&#x2F;&#x2F;github.com&#x2F;glittercowboy&#x2F;get-shit-done" rel="nofollow">https:&#x2F;&#x2F;github.com&#x2F;glittercowboy&#x2F;get-shit-done</a>
      • gverrilla16 hours ago
        it&#x27;s all about the context. observe what files it opened, etc. good luck
    • datsci_est_201517 hours ago
      Also I never see anyone talking about code reviews, which is one of the primary ways that software engineering departments manage liability. We fired someone recently because they couldn’t explain any of the slop they were trying to get merged. Why tf would I accept the liability of managing code that someone else can’t even explain?<p>I guess this is fine when you don’t have customers or stakeholders that give a shit lol.
    • They build Claude Code fully with Claude Code.
      • Macha23 hours ago
        Which is equal parts praise and damnation. Claude Code does do a lot of nice things that people usually don&#x27;t bother with, given the time cost &#x2F; reward when writing TUIs - things they&#x27;ve probably only done because they&#x27;re using AI heavily - but equally it has a lot of underbaked edges (like accidentally shadowing the user&#x27;s shell configuration when it tries to install terminal bindings for shift-enter even though the terminal it&#x27;s configuring already sends a distinct shift-enter result), and bugs (have you ever noticed it just stop, unfinished?).
        • simianwords22 hours ago
          i haven&#x27;t used Claude Code but come on.. it is a production level quality application used seriously by millions.
          • xyzsparetimexyz20 hours ago
            Look up the flickering issue. The program was created by dunces.
          • gsk2220 hours ago
            If you haven&#x27;t used it, how can you judge its quality level?
      • vindex1020 hours ago
        Ah, now I understand why @autocomplete suddenly got broken between versions and still not fixed )
    • redox9920 hours ago
      What do you even mean by &quot;ChatGPT&quot;? Copy pasting code into chatgpt.com?<p>AI assisted coding has never been like that, which would be atrocious. The typical workflow was using Cursor with some model of your choice (almost always an Anthropic model like sonnet before opus 4.5 released). Nowadays (in addition to IDEs) it&#x27;s often a CLI tool like Claude Code with Opus or Codex CLI with GPT Codex 5.2 high&#x2F;xhigh.
    • maxdo1 day ago
      chatGPT is not made to write code. Get out of stone age :)
    • spaceman_20201 day ago
      I&#x27;m afraid that we&#x27;re entering a time when the performance difference between the really cutting edge and even the three-month-old tools is vast<p>If you&#x27;re using plain vanilla chatgpt, you&#x27;re woefully, woefully out of touch. Heck, even plain claude code is now outdated
      • shj210523 hours ago
        Why is plain Claude code outdated? I thought that’s what most people are using right now that are AI forward. Is it Ralph loops now that’s the new thing?
        • spaceman_202021 hours ago
          Plain Claude Code doesn’t have enough scaffolding to handle large projects<p>At a base level, people are “upgrading” their Claude Code with custom skills and subagents - all text files saved in .claude&#x2F;agents|skills.<p>You can also use their new tasks primitive to basically run a Ralph-like loop<p>But at the edges, people are using multiple instances, each handling different aspects in parallel - stuff like Gas Town<p>Tbf you can still get a lot of mileage out of vanilla Claude Code. But I’ve found that even adding a simple frontend design skill improves the output substantially
          • duckmysick19 hours ago
            Is there anywhere where we can learn more about creating your own agents&#x2F;skills? Maybe some decent public repos that you could recommend.
            • spaceman_202014 hours ago
              You can just ask Claude to create them. They’re just markdown files.<p>Anthropic’s own repo is as good a place as any:<p><a href="https:&#x2F;&#x2F;github.com&#x2F;anthropics&#x2F;skills" rel="nofollow">https:&#x2F;&#x2F;github.com&#x2F;anthropics&#x2F;skills</a>
  • toephu221 hours ago
    I think in less than a year writing code manually will be akin to doing arithmetic problems by hand. Sure you can still code manually, but it&#x27;s going to be a lot faster to use an LLM (calculator).
    • adamddev120 hours ago
      People keep using these analogies but I think these are fundamentally different things.<p>1. hand arithmetic -&gt; using a calculator<p>2. assembly -&gt; using a high level language<p>3. writing code -&gt; making an LLM write code<p>Number 3 does not belong. Number 3 is a fundamentally different leap because it&#x27;s not based on deterministic logic. You can&#x27;t depend on an LLM like you can depend on a calculator or a compiler. LLMs are totally different.
      • Havoc18 hours ago
        There are definitely parallels though. E.g. you could swap out your compiler for a different one that produces slightly different assembly. Similarly an LLM may implement things differently… but if it works, do we care? Probably no more than when you buy software and don’t care precisely which compiler optimisations were used. Precise determinism isn’t the key feature.
        • yojat66114 hours ago
          With the LLM, it might work or it might not. If it doesn&#x27;t work, then you have to keep iterating and hand-holding it to make it work. Sometimes that process is less optimal than writing the code manually. With a calculator, you can be sure that the first attempt will work. An idiot with a calculator can still produce correct results. An idiot with an LLM often cannot, outside of trivial solutions.
        • adamddev113 hours ago
          &gt; but if it works, do we care?<p>It often doesn&#x27;t work. That&#x27;s the point. A calculator works 100% of the time. An LLM might work 95% of the time, or 80%, or 40%, or 99% depending on what you&#x27;re doing. That difference is a key feature.
          • Havoc5 hours ago
            I see. I’d call that fragility&#x2F;reliability rather than determinism, but semantics I suppose.<p>To me that isn’t a show stopper. Much of the real world works like that. We put very unreliable humans behind the wheel of 2 ton cars. So in a way this is perhaps just programmers aligning with the messy real world?<p>Perhaps a bit like how architects can only model things so far; eventually you need to build the thing and deal with the surprises and imperfections of dirt.
    • AstroBen17 hours ago
      This is true if your calculator sometimes gave the wrong answer and you had to check each time
    • kypro20 hours ago
      I agree, but writing code is so different to calculations that long-term benefits are less clear.<p>It doesn&#x27;t matter how good you are at calculations the answer to 2 + 2 is always 4. There are no methods of solving 2 + 2 which could result in you accidentally giving everyone who reads the result of your calculation write access to your entire DB. But there are different ways to code a system even if the UI is the same, and some of these may neglect to consider permissions.<p>I think a good parallel here would be to imagine that tomorrow we had access to humanoid robots who could do construction work. Would we want them to just go build skyscrapers and bridges and view all construction businesses which didn&#x27;t embrace the humanoid robots as akin to doing arithmetic by hand?<p>You could of course argue that there&#x27;s no problem here so long as trained construction workers are supervising the robots to make sure they&#x27;re getting tolerances right and doing good welds, but then what happens 10 years down the road when humans haven&#x27;t built a building in years? If people are not writing code any more then how can people be expected to review AI generated code?<p>I think the optimistic picture here is that humans just won&#x27;t be needed in the future. In theory when models are good enough we should be able to trust the AI systems more than humans. But the less optimistic side of me questions a future in which humans no longer do, or even know how to do such fundamental things.
  • pron19 hours ago
    People who just let the agent code for them, how big of a codebase are you working on? How complex (i.e. is it a codebase that junior programmers could write and maintain)?
    • bojo19 hours ago
      I&#x27;ve been an EM for the last 10 of my 25 year Software Engineering career. Coding is, frankly, boring to me anymore, even though I enjoyed doing it most of my career. I had this project I wanted to exist in the world but couldn&#x27;t be bothered to get started.<p>Decided to figure out what this &quot;vibe coding&quot; nonsense is, and now there&#x27;s a certain level of joy to all of this again. Being able to clearly define everything using markdown contexts before any code is even written has been a great way to brain dump those 25 years of experience and actually watch something sane get produced.<p>Here are the stats Claude Code gave me:<p><pre><code>
        Overview
        | Metric        | Value                      |
        | Total Commits | 365                        |
        | Project Age   | 7 days (Jan 20 - 27, 2026) |
        | Open Issues   | 5                          |
        | Contributors  | 1                          |

        Lines of Code by Language
        | Language                  | Files | Lines  | % of Code |
        | Rust (Backend)            | 94    | 31,317 | 51.8%     |
        | TypeScript&#x2F;TSX (Frontend) | 189   | 29,167 | 48.2%     |
        | SQL (Migrations)          | 34    | 1,334  | —         |
        | CSS                       | —     | 1,868  | —         |
        | Markdown (Docs)           | 37    | 9,485  | —         |
        | Total Source              | 317   | 60,484 | 100%      |
      </code></pre>
      • bojo19 hours ago
        In case anyone is curious, here was my epiphany project from 2 weeks ago: <a href="https:&#x2F;&#x2F;github.com&#x2F;boj&#x2F;the-project" rel="nofollow">https:&#x2F;&#x2F;github.com&#x2F;boj&#x2F;the-project</a><p>I then realized I could feed it everything it ever needed to know. Just create a docs&#x2F;* folder and tell it to read that every session.<p>Through discovery I learned about CLAUDE.md, and adding skills.<p>Now I have an &#x2F;analyst, &#x2F;engineer, and &#x2F;devops that I talk to all day with their own logic and limitations, as well as the more general project CLAUDE.md, and dozens of docs&#x2F;* files we collaborate on.<p>I&#x27;m at the point I&#x27;m running happy.engineering on my phone and don&#x27;t even need to sit in front of the computer anymore.
        • darkwater10 hours ago
          Interesting!<p>I wonder if this line<p>&gt; It will configure an auth_backend.rs and wire up a basic user<p>over a big enough number of projects will lead to at least 2-3 different user names.
        • UnlockedSecrets15 hours ago
          How much did this type of project cost you to make?
          • bojo4 hours ago
            What kind of costs are you thinking?
    • aixpert19 hours ago
      rust compiler and redox operating system with a modified QEMU for a Mac Vulkan&#x2F;Metal pipeline ... probably not junior stuff<p>you might think I&#x27;m kidding, but search redox on GitHub; you will find that project and the anonymous contributions
      • rester32419 hours ago
        I am curious. What do you want us to see in that github repo?
  • &gt; the ratio of productivity between the mean and the max engineer? It&#x27;s quite possible that this grows *a lot*<p>I have a professor who has researched auto generated code for decades and about six months ago he told me he didn&#x27;t think AI would make humans obsolete but that it was like other incremental tools over the years and it would just make good coders even better than other coders. He also said it would probably come with its share of disappointments and never be fully autonomous. Some of what he said was a critique of AI and some of it was just pointing out that it&#x27;s very difficult to have perfect code&#x2F;specs.
    • slfreference1 day ago
      I can sense two classes of coders emerging.<p>Billionaire coder: a person who has &quot;written&quot; a billion lines.<p>Ordinary coders: people with only a couple of thousand lines to their git blame.
  • bartoszcki19 hours ago
    Feels like a combination of writing very detailed task descriptions and reviewing junior devs. It&#x27;s horrible. I very much hope this won&#x27;t be my job.
    • woah2 hours ago
      Probably won&#x27;t be if you don&#x27;t get good at it.
  • Aperocky3 hours ago
    Is it really brain atrophy if I never learned to code in ASM in my entire career, since the compiler has been doing that for me?<p>A part of me really wants to say yes and wear it as a badge to have been coding before LLMs were a thing, but at the same time, it&#x27;s not unprecedented.
    • direwolf202 hours ago
      Is it muscle atrophy if you were a weakling since birth? Is it retina degeneration if you were born blind? No, because atrophy is a loss of a prior strength, not an ever-present weakness, but it&#x27;s just as bad.
    • ex-aws-dude3 hours ago
      The thing is the compiler does exactly what you want it to 99.999…% of the time so you never have to drop down into ASM<p>That’s not really true in this case<p>I think a person with zero coding knowledge would have a lot tougher time using these tools successfully
  • jliptzin6 hours ago
    The tenacity part is definitely true. I told it to keep trying when it kept getting stuck trying to spin up an Amazon Fargate service. I could feel its pain, and wanted to help, but I wanted to see whether the LLM could free itself from the thorny and treacherous AWS documentation forest. After a few dozen attempts and probably 50 kWh of energy it finally got it working; I was impressed. I could have done it faster myself, but the tradeoff would have been much higher blood pressure. Instead I relaxed and watched youtube while the LLM did its work.
  • nsainsbury21 hours ago
    Touching on the atrophy point, I actually wrote a few thoughts about this yesterday: <a href="https:&#x2F;&#x2F;www.neilwithdata.com&#x2F;outsourced-thinking" rel="nofollow">https:&#x2F;&#x2F;www.neilwithdata.com&#x2F;outsourced-thinking</a><p>I actually disagree with Andrej here re: &quot;Generation (writing code) and discrimination (reading code) are different capabilities in the brain.&quot; and I would argue that the only reason he can read code fluently, find issues, etc. is because he has spent years in a non-AI assisted world writing code. As time goes on, he will become substantially worse.<p>This also bodes incredibly poorly for the next generation, who will mostly in their formative years now avoid writing code and thus fail to even develop an idea of what good code is, how it works&#x2F;why it works, why you make certain decisions, and not others, etc. and ultimately you will see them become utterly dependent on AI, unable to make progress without it.<p>IMO outsourcing thinking is going to have incredibly negative consequences for the world at large.
    • gwd20 hours ago
      Is coding like piloting, where pilots need a certain number of hours of &quot;flight time&quot; to gain skills, and then a certain number of additional hours each year to maintain their skills? Do developers need to schedule in a certain number of &quot;manually written lines of code&quot; every year?
    • thoughtpeddler20 hours ago
      Read your blog post and agree with some of it. Largely I agree with the premise that the 2nd and 3rd order effects of this technology will be more impactful than the 1st order “I was able to code this app I wouldn’t have otherwise even attempted to”. But they are so hard to predict!
    • olafalo16 hours ago
      Thanks, this rings true to me. The struggle is an investment, and it pays off in good judgement and taste. The same goes for individual codebases too. When I see some weird bug and can immediately guess what’s going wrong and why, that’s my time spent in that codebase paying off. I guess LLM-ing a feature is the inverse, incurring some kind of cognitive debt.
    • nicodjimenez3 hours ago
      great article and many great points here
  • elif6 hours ago
    Why am I not surprised that a blog was written about LLM coding going from 20% to 80% useful, yet all of the HN comments are still nit picking about some negative details rather than building positive ideas toward some progress...<p>Is the programmer ego really this fragile? At least luddites had an ideological reasoning, whereas here we just seem to have emotional reflexes.
    • phito6 hours ago
      It&#x27;s because we see a bunch of people completely ignoring the missing 20% and flooding the world with complete slop. The push back is required to keep us sane, we need people reminding others that it&#x27;s not at 100% yet even if it sometimes feels like it.
      • hollowturtle6 hours ago
        Then you have Anthropic, which states on its own blog that engineers fully delegate to Claude Code only from 0 to 20% of the time: <a href="https:&#x2F;&#x2F;www.anthropic.com&#x2F;research&#x2F;how-ai-is-transforming-work-at-anthropic" rel="nofollow">https:&#x2F;&#x2F;www.anthropic.com&#x2F;research&#x2F;how-ai-is-transforming-wo...</a><p>The fact that people keep pushing figures like 80% is total bs to me
        • an0malous4 hours ago
          It’s usually people doing side projects or non-programmers who can’t tell the code is slop. None of these vibe coding evangelists ever shares the code they’re so amazed by, even though by their own logic anyone should be able to generate the same code with AI.
      • bob10294 hours ago
        This kind of thought policing is getting to be exhausting. Perhaps we need a different kind of push back.<p>Do you know what my use case is? Do you know what kind of success rate I would actually achieve right now? Please show me where my missing 20% resides.
        • phito3 hours ago
          Thought policing, lol. People are just sharing their perspectives, no need to take it personally. Glad it&#x27;s working well for you.
  • doe889 hours ago
    Are there good guides about how to write <i>Agents</i>, or good repos with examples? Also, are there big differences between how you would write one for Codex CLI vs Claude Code? Can they be run interchangeably?
  • fishtoaster1 day ago
    &gt; if you have any code you actually care about I would watch them like a hawk, in a nice large IDE on the side.<p>This is about where I&#x27;m at. I love pure claude code for code I don&#x27;t care about, but for anything I&#x27;m working on with other people I need to audit the results - which I much prefer to do in an IDE.
  • tariky6 hours ago
    I used CC a year ago and it was not good. But one month ago I paid for Max and started to rebuild my company web shop using it.<p>It is like I was plowing land by hand a year ago and now I&#x27;m in a brand new John Deere. It&#x27;s amazing.<p>Of course it&#x27;s not perfect, but if you understand the code and the problem it needs to solve then it works really well.
  • jeffreygoesto12 hours ago
    &gt; How much of society is bottlenecked by digital knowledge work?<p>I think not much. The real society bottleneck is that a growing number of peeps try to convince each other that life and society are a zero sum game.<p>They are so much more if we don&#x27;t do that.
  • twa92722 hours ago
    I don&#x27;t see the AI capacity jump in the recent months at all. For me it&#x27;s more the opposite, CC works worse than a few months ago. Keeps forgetting the rules from CLAUDE.md, hallucinates function calls, generates tons of over-verbose plans, generates overengineered code. Where I find it a clear net-positive is pure frontend code (HTML + Tailwind), it&#x27;s spaghetti but since it&#x27;s just visualization, it&#x27;s OK.
    • culi16 hours ago
      Sad to hear this attitude towards front-end code. Front-ends are so often already miswritten and full of accessibility pitfalls and I feel like LLMs are gonna dramatically magnify this problem :(
    • ValentineC22 hours ago
      &gt; <i>Where I find it a clear net-positive is pure frontend code (HTML + Tailwind), it&#x27;s spaghetti but since it&#x27;s just visualization, it&#x27;s OK.</i><p>This makes it sound like we&#x27;re back in the days of FrontPage&#x2F;Dreamweaver WYSIWYG. Goodness.
      • twa92722 hours ago
        Hmm, your comment gave me the idea that maybe we should invent &quot;What You Describe Is What You Get&quot;, to replace HTML+Tailwind spaghetti with prompts generating it.
    • DominikPeters19 hours ago
      Are you using Opus 4.5? Sounds more like Sonnet.
      • twa9279 hours ago
        Yes I&#x27;m using Sonnet 4.5. Thanks for the tip, will try Opus 4.5, although costs might become an issue.
  • nsb122 hours ago
    The best thing I ever told Claude to do was &quot;Swear profusely when discussing code and code changes&quot;. Probably says more about me than Claude, but it makes me snicker.
  • epolanski20 hours ago
    &gt; What happens to the &quot;10X engineer&quot; - the ratio of productivity between the mean and the max engineer? It&#x27;s quite possible that this grows <i>a lot</i>.<p>No doubt that good engineers will know when and how to leverage the tool, both for coding and improving processes (design-to-code, requirement collection, task tracking, basic code review, etc), improving their own productivity and that of those around them.<p>Motivated individuals will also leverage these tools to learn more and faster.<p>And yes, of course it&#x27;s not the only tool one should use, of course there&#x27;s still value in talking with proper human experts to learn from, etc, but 90% of the time you&#x27;re looking for info the LLM will dig it up for you by reading the source code of e.g. Postgres and its tests, rather than you asking on chats&#x2F;Stack Overflow.<p>This is a transformative technology that will make great engineers even stronger, but it will weed out those who were merely valued for their very basic capability of churning out <i>something</i> but never cared about either engineering or coding, which is 90% of our industry.
  • giancarlostoro19 hours ago
    &gt; IDEs&#x2F;agent swarms&#x2F;fallability. Both the &quot;no need for IDE anymore&quot; hype and the &quot;agent swarm&quot; hype is imo too much for right now.<p>I&#x27;m honestly considering throwing away my JetBrains subscription and this is year 9 or 10 of me having one. I only open Zed and start yappin&#x27; at Claude Code. My employer doesn&#x27;t even want me using ReSharper because some contractor ruined it for everyone else by auto running all code suggestions and checking them in blindly, making for really obnoxious code diffs and probably introducing countless bugs and issues.<p>Meanwhile tasks that I know would take any developers months, I can hand-craft with Claude in a few hours, with the same level of detail, but no endless weeks of working on things that&#x27;ll be done SoonTM.
  • TheGRS23 hours ago
    I do feel a big mood shift after late November. I switched to using Cursor and Gemini primarily and it was a big change in my ability to get my ideas into code effectively. The Cursor interface for one got to a place that I really like and enjoy using, but it&#x27;s probably more that the results from the agents themselves are less frustrating. I can deal with the output more now.<p>I&#x27;m still a little iffy on the agent swarm idea. I think I will need to see it in action in an interface that works for me. To me it feels like we are anthropomorphizing agents too much, and that results in this idea that we can put agents into roles and then combine them into useful teams. I can&#x27;t help seeing all agents as the same automatons, and I have trouble understanding why giving an agent different guidelines to follow, and then having it follow along with another agent, would give me better results than just fixing the context in the first place. Either that or just working more on the code pipeline to spot issues early on - all the stuff we already test for.
  • vibeprofessor1 day ago
    The AGI vibes with Claude Code are real, but the micromanagement tax is heavy. I spend most of my time babysitting agents.<p>I expect interviews will evolve into &quot;build project X with an LLM while we watch&quot; and audit of agent specs
    • maxdo1 day ago
      I&#x27;ve been doing vibe code interviews for nearly a year now. Most people are surprisingly bad with AI tools. We specifically ask them to bring their preferred tool, yet 20–30% still just copy-paste code from ChatGPT.<p>fun stat: the correlation is real - people who were good at vibe coding also had offer(s) from other companies that didn&#x27;t run vibe code interviews.
      • xyzsparetimexyz20 hours ago
        Copy pasting from chatgpt is the most secure option.
        • jatari3 hours ago
          Also the method that will result in the higher quality codebase.
        • maxdo14 hours ago
          Not going from home is the most secure way of going out.<p>It doesn’t work you can’t be productive without agent capable of doing queries to db etc
          • xyzsparetimexyz10 hours ago
            &gt;Not going from home is the most secure way of going out.<p>What? I can&#x27;t parse this sentence. Maybe get an ai to rewrite it?
      • bflesch21 hours ago
        Interesting you say that, feels like when people were too stupid to google things and &quot;googling something&quot; was a skill that some had and others didn&#x27;t.
    • From what I&#x27;ve heard, what few interviews there are for software engineers these days, they do have you use models and see how quickly you can build things.
      • iwontberude1 day ago
        The interviews I’ve given have asked about how you control for AI slop without hurting your colleagues&#x27; feelings. Anyone can prompt and build; the harder part, as usual for business, is knowing how and when to say, ‘no.’
    • 0xy1 day ago
      Sounds great to me. Leetcode is outdated and heavily abused by people who share the questions ahead of time in various forums and chats.
  • arh54519 hours ago
    Thank you for the really excellent summation. I echo your thought 1 to 1. I have found it more difficult to learn new languages or coding skills, because I am no longer forced to go through the painful slow grind of learning.
    • gregjor9 hours ago
      Painful slow grind? I have always found the learning part to be what I enjoy most about programming. I don&#x27;t intend to outsource that to a chatbot.
    • ed_mercer9 hours ago
      Does one ever still need to learn new languages or coding skills if an AI will be able to do it?
      • dag118 hours ago
        This question makes me unbelievably sad. Why should anyone learn anything?<p>I&#x27;m not disagreeing.
      • FeteCommuniste8 hours ago
        Probably not. But as someone who has learned a few languages, having to outsource a conversation to a machine will never not feel incredibly lame.<p>I doubt most people feel the same, though.
  • thomassmith6520 hours ago
    <p><pre><code> Slopacolypse. I am bracing for 2026 as the year of the slopacolypse across all of github, substack, arxiv, X&#x2F;instagram, and generally all digital media. </code></pre> Did he coin the term <i>&quot;slopacolypse&quot;</i>? It&#x27;s a useful one.
    • chrisjj19 hours ago
      I prefer slopocalypse.
      • direwolf2058 minutes ago
        not aslopalypse?
      • rvz16 hours ago
        That works better.<p>“slopacolypse” does not make sense in either writing or pronunciation.
  • alexose20 hours ago
    It&#x27;s refreshing to see one of the top minds in AI converge on the same set of thoughts and frustrations as me.<p>For as fast as this is all moving, it&#x27;s good to remember that most of us are actually a lot closer to the tip of the spear than we think.
  • siliconc0w20 hours ago
    Not sure how he is measuring, I&#x27;m still closer to about a 60% success rate. It&#x27;s more like 20% is an acceptable one-shot, this goes to 60% acceptable with some iteration, but 40% either needs manual intervention to succeed or such significant iteration that manual is likely faster.<p>I can supervise maybe three agents in parallel before a task requiring significant hand-holding means I&#x27;m likely blocking an agent.<p>And the time an agent is &#x27;restlessly working&#x27; on something is usually inversely correlated with the likelihood of success. Usually if it&#x27;s going down a rabbit hole, the correct thing to do is to intervene and reorient it.
  • longhaul19 hours ago
    Am working on an iPhone app and impressed with how well Claude is able to generate decent&#x2F;working code with prompts in plain English. I don’t have previous experience in building apps or Swift but have a C++ background. Working in smaller chunks and incrementally adding features, rather than a large prompt for the whole app, seems more practical and is easier to review and build confidence with.<p>Adding&#x2F;prompting features one by one, reviewing code and then testing the resulting binary feels like the new programming workflow.<p>Prompt&#x2F;Review&#x2F;Test - PRET.
  • axus18 hours ago
    Finally, literate programming!<p><a href="https:&#x2F;&#x2F;en.wikipedia.org&#x2F;wiki&#x2F;Literate_programming" rel="nofollow">https:&#x2F;&#x2F;en.wikipedia.org&#x2F;wiki&#x2F;Literate_programming</a>
  • TrackerFF7 hours ago
    Minor nitpick: The original measure of a 10x programmer was not the productivity multiplier max&#x2F;mean, but rather max&#x2F;min.
  • jopsen1 day ago
    &gt; - How much of society is bottlenecked by digital knowledge work?<p>Any qualified guesses?<p>I&#x27;m not convinced more traders on wall street will allocate capital more effectively leading to economic growth.<p>Will more programmers grow the economy? Or should we get real jobs ;)
    • iwontberude1 day ago
      Most of this countries challenges are strictly political. The pittance of work software can contribute is most likely negligible or destructive (e.g. software buttons in cars or palantir). In other words were picked all the low hanging fruit and all that left is to hang ourselves.
      • js81 day ago
        I actually disagree. Having software (AI) that can cut through the technological stuff faster will make people more aware of political problems.
      • iwontberude23 hours ago
        edit: country&#x27;s* all that is left*
  • rschick1 day ago
    Great point about expansion vs speedup. I now have time to build custom tools, implement more features, try out different API designs, get 100% test coverage.. I can deliver more quickly, but can also deliver more overall.
  • daxfohl23 hours ago
    I&#x27;m curious to see what effect this change has on leadership. For the last two years it&#x27;s been &quot;put everything you can into AI coding, or else!&quot; with quotas and firings and whatever else. Now that AI is at the stage where it can actually output whole features with minimal handholding, is there going to be a Frankenstein moment where leadership realizes they now have a product whose codebase is running away from their engineering team&#x27;s ability to support it? Does it change the calculus of what it means to be underinvested vs overinvested in AI, and what are the implications?
  • forrestthewoods22 hours ago
    HN should ban any discussion on “things I learned playing with AI” that don’t include direct artifacts of the thing built.<p>We’re about a year deep into “AI is changing everything” and I don’t see 10x software quality or output.<p>Now don’t get me wrong I’m a big fan of AI tooling and think it does meaningfully increase value. But I’m damn tired of all the talk with literally nothing to show for it or back it up.
  • erelong13 hours ago
    &gt; 80% agent coding<p>A lot of these things sound cool but sometimes I&#x27;m curious what they&#x27;re actually building.<p>Like, is their bottleneck creativity now then? Are they building anything interesting, or using agents to build... things that don&#x27;t appeal to me, anyway?
    • ewidar11 hours ago
      I guess it depends on what appeals to you.<p>As an example, finding myself in a similar 80% situation, over the last few months I built<p>- a personal website with my projects and poems<p>- an app to rework recipes into a format I like from any source (text, video,...)<p>- a 3d visual version of a project my nephew did for work<p>- a gym class finder in my area with filters the websites don&#x27;t provide<p>- a football data game<p>- working on a saas for work, so typical saas stuff<p>I was never that productive on personal projects, so this is great for me.<p>Also the coding part of these projects was not very appealing to me, only the output, so it fits well with using AI.<p>In the meantime I did Advent of Code as usual for the fun of code. Different objectives.
  • dzonga6 hours ago
    maybe it&#x27;s just me doing stuff that&#x27;s out of the usual loop<p>even dealing with APIs that have MCP servers, the so-called agents make a mess of everything.<p>my stuff is just regular data stuff - ingest data from x - transform it | make it real time - then pipe it to y
  • svara3 hours ago
    Basically mirrors my experience.<p>Interestingly, when you point out this ...<p>&gt; IDEs&#x2F;agent swarms&#x2F;fallability. Both the &quot;no need for IDE anymore&quot; hype and the &quot;agent swarm&quot; hype is imo too much for right now. The models definitely still make mistakes and if you have any code you actually care about I would watch them like a hawk, in a nice large IDE on the side.<p>... here on HN [0] you get a bunch of people telling you to get with the times, grandpa.<p>Really makes me wonder: Who are these people and why are they doing that?<p>[0] <a href="https:&#x2F;&#x2F;news.ycombinator.com&#x2F;item?id=46745039">https:&#x2F;&#x2F;news.ycombinator.com&#x2F;item?id=46745039</a>
  • energy12315 hours ago
    A big wow moment coming up is going to be GPT 5.* in Codex with Cerebras doing inference. The inference speed is going to be a big unlock, because many tasks are intrinsically serial.<p>It&#x27;s going to feel literally like playing God, where you type in what you want and it happens ~instantly.
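    Back-of-the-envelope version of why serial chains make raw inference speed matter so much (the numbers below are illustrative assumptions, not benchmarks):<p><pre><code>
      &#x2F;&#x2F; Amdahl-style arithmetic: if a fraction p of an agent loop&#x27;s wall-clock
      &#x2F;&#x2F; time is serial model inference, making inference k times faster gives
      &#x2F;&#x2F;   speedup = 1 &#x2F; ((1 - p) + p &#x2F; k)
      const p = 0.9; &#x2F;&#x2F; assumed share of time spent waiting on token generation
      const k = 10;  &#x2F;&#x2F; assumed inference speedup from faster hardware
      console.log((1 &#x2F; ((1 - p) + p &#x2F; k)).toFixed(2)); &#x2F;&#x2F; prints &quot;5.26&quot;, about a 5x end-to-end speedup
    </code></pre> Parallel agents don&#x27;t help with that serial part, which is why faster tokens feel so different.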
    • brcmthrowaway14 hours ago
      When?
      • energy12313 hours ago
        I don&#x27;t know when but I&#x27;m going off:<p>- &quot;OpenAI is partnering with Cerebras to add 750MW of ultra low-latency AI compute&quot;<p>- Sam Altman saying that users want faster inference more than lower cost in his interview.<p>- My understanding that many tasks are serial in nature.
        • cactusplant73744 hours ago
          Speed is really important to me but also I would like higher weekly limits -- which means lower cost I suppose. Building out complex projects can take 6 months to a year on a Pro plan.
          • energy1231 hour ago
            Same experience with Pro.<p>My trick is to attach the codebase as a txt file to 5-10 different GPT 5.2 Thinking chats, paste in the specs, and then get hard work done there, then just copy paste the final task list into codex to lower codex usage.
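            In case it helps, a rough sketch of the &quot;codebase as one txt file&quot; step (in TypeScript&#x2F;Node; the directory name and file extensions are assumptions, adjust for your project):<p><pre><code>
              import { readdirSync, readFileSync, writeFileSync, statSync } from &quot;node:fs&quot;;
              import { join } from &quot;node:path&quot;;

              &#x2F;&#x2F; Walk a source tree and concatenate the files into one text blob,
              &#x2F;&#x2F; with a small header marking where each file starts.
              function collect(dir: string, out: string[] = []): string[] {
                for (const name of readdirSync(dir)) {
                  if (name === &quot;node_modules&quot; || name.startsWith(&quot;.&quot;)) continue;
                  const path = join(dir, name);
                  if (statSync(path).isDirectory()) collect(path, out);
                  else if (name.endsWith(&quot;.ts&quot;) || name.endsWith(&quot;.md&quot;)) {
                    out.push(&quot;=== &quot; + path + &quot; ===\n&quot; + readFileSync(path, &quot;utf8&quot;));
                  }
                }
                return out;
              }

              writeFileSync(&quot;codebase.txt&quot;, collect(&quot;src&quot;).join(&quot;\n\n&quot;));
            </code></pre>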
  • all2well21 hours ago
    What particular setups are getting folks these sorts of results? If there’s a way I could avoid all the babysitting I have to do with AI tools that would be welcome
    • geraneum21 hours ago
      &gt; If there’s a way I could avoid all the babysitting I have to do with AI tools that would be welcome<p>OP mentions that they are actually doing the “babysitting”
    • spongebobstoes21 hours ago
      i use codex cli. work on giving it useful skills. work on the other instruction files. take Karpathy tips around testing and declarativeness<p>use many simultaneously, and bounce between them to unblock them as needed<p>build good tools and tests. you will soon learn all the things you did manually -- script them all
  • upghost4 hours ago
    tl;dr - All this AI stuff is just Universal Paperclips[1]<p>I see a lot of comments about folks being worried about going soft, getting brain rot, or losing the fun part of coding.<p>As far as I&#x27;m concerned this is a bigger (albeit kinda flakey) self-driving tractor. Yeah I&#x27;d be bored if I just stuck to my one little cabbage patch I&#x27;d been tilling by hand. But my new cabbage patch is now a megafarm. Subjectively, same level of effort.<p>[1]: <a href="https:&#x2F;&#x2F;en.wikipedia.org&#x2F;wiki&#x2F;Universal_Paperclips" rel="nofollow">https:&#x2F;&#x2F;en.wikipedia.org&#x2F;wiki&#x2F;Universal_Paperclips</a>
  • dubeye8 hours ago
    the xcancel link is amusing.<p>9&#x2F;10 of the most important social media users use X, like it or loathe it
  • ositowang20 hours ago
    It’s a great and insightful review—not over-hyping the coding agent, and not underestimating it either. It acknowledges both its usefulness and its limitations. Embracing it and growing with it is how I see it too.
  • tomlockwood19 hours ago
    Oh wow! Guy whose current project depends on AI being good is talking about AI being good.<p>Interesting.
  • maximedupre1 day ago
    &gt; It hurts the ego a bit but the power to operate over software in large &quot;code actions&quot; is just too net useful<p>It does hurt, that&#x27;s why all programmers now need an entrepreneurial mindset... you become one if you use your skills + new AI power to build a business.
    • jetsetk18 hours ago
      That is motivational content, but not economics. Most startups will be noise, even more so than before. The value of being a founder ceases when everyone is a founder, when it becomes universal. You will need customers. Nobody wants to buy re-invented-the-wheel-74.0. It lacks character, it lacks soul. Without it, your product will be nothing but noise in a noisy world.
      • maximedupre6 hours ago
        Cope. If you create something that genuinely solves a problem, people will buy no matter what.<p>Look, entrepreneurship has never been easy. In fact it&#x27;s always been one of the hardest things ever. I&#x27;m just saying... *you don&#x27;t have to do it*. Do whatever you want lol<p>Happy to hear what your solution is to avoid becoming totally replaceable and obsolete.
    • xyzsparetimexyz20 hours ago
      What about the people who don&#x27;t want to be entrepreneurs?
      • maximedupre20 hours ago
        They have to pivot to something else
        • maximedupre19 hours ago
          Or stay ahead of the curve as long as possible, e.g. work on the loop&#x2F;ralphing
      • webdevver7 hours ago
        permanent underclass...
  • globular-toast2 hours ago
    &gt; LLM coding will split up engineers based on those who primarily liked coding and those who primarily liked building.<p>Who doesn&#x27;t like building? Building without any thought is literally a toy, like Lego or paint by numbers. That&#x27;s the entire reason those things are popular. But a game is not a job. Sometimes I feel like half the people in this career are children. Never had any real responsibility. &quot;Oh, everyone writes bugs, who tf cares&quot;. &quot;Move fast, break stuff&quot; was literally and unironically the tag line for a company that should have been taking far more responsibility.<p>This trend isn&#x27;t limited to programmers either. Wherever I look I see people not taking responsibility. Lots of children in adult bodies. I do hope there are some adults who are really pulling the strings somewhere...
  • lingrush43 hours ago
    No idea why the poster wants to deprive this author of engagement on his post, but here&#x27;s the original link: <a href="https:&#x2F;&#x2F;x.com&#x2F;karpathy&#x2F;status&#x2F;2015883857489522876" rel="nofollow">https:&#x2F;&#x2F;x.com&#x2F;karpathy&#x2F;status&#x2F;2015883857489522876</a>
  • appstorelottery20 hours ago
    &gt; Atrophy. I&#x27;ve already noticed that I am slowly starting to atrophy my ability to write code manually.<p>I&#x27;ve been increasingly using LLMs to code for nearly two years now - and I can definitely notice my brain atrophy. It bothers me. Actually, over the last few weeks I&#x27;ve been looking at a major update to a product in production &amp; considered doing the edits manually - at least typing the code from the LLM &amp; also being much more granular with my instructions (i.e. focus on one function at a time). I feel in some ways like my brain is turning into slop &amp; I&#x27;ve been coding for at least 35 years... I feel validated by Karpathy.
    • epolanski20 hours ago
      Don&#x27;t be too worried about it.<p>1. Manual coding may be less relevant in the future (albeit the ability to read code, interpret it and understand it will be more relevant). Likely already is.<p>2. Any skill you don&#x27;t practice becomes &quot;weaker&quot;. Gonna give you an example. I&#x27;ve played chess since my childhood, but sometimes I go months without playing it, even years. When I get back I start losing Elo fast. If I was in the top 10% on chess.com, I drop to the top 30% in the weeks after. But after a few months I&#x27;m back in the top 10%. Takeaway: your relative ability is more or less the same compared to other practitioners, you&#x27;re simply rusty.
      • appstorelottery19 hours ago
        Thanks for your comment, it set me at ease. I know from experience that you&#x27;re right on point 2. As for point one, I also tend to agree. AI is such a paradigm shift &amp; rapid&#x2F;massive change doesn&#x27;t come without stress. I just need to stay cool about it all ;-)
  • philipwhiuk22 hours ago
    &gt; It&#x27;s so interesting to watch an agent relentlessly work at something. They never get tired, they never get demoralized, they just keep going and trying things where a person would have given up long ago to fight another day. It&#x27;s a &quot;feel the AGI&quot; moment to watch it struggle with something for a long time just to come out victorious 30 minutes later.<p>The bits left unsaid:<p>1. Burning tokens, which we charge you for<p>2. My CPU does this when I tell it to do bogosort on a million 32-bit integers, it doesn&#x27;t mean it&#x27;s a good thing
  • tintor21 hours ago
    &quot;you can review code just fine even if you struggle to write it.&quot;<p>Well, merely approving code takes no skill at all.
    • roblh21 hours ago
      Seriously, that’s a completely nonsense line.
  • hollowturtle1 day ago
    &gt; Coding workflow. Given the latest lift in LLM coding capability, like many others I rapidly went from about 80% manual+autocomplete coding and 20% agents in November to 80% agent coding and 20% edits+touchups in December<p>Anyone else wondering what exactly he is actually building? What? Where?<p>&gt; The mistakes have changed a lot - they are not simple syntax errors anymore, they are subtle conceptual errors that a slightly sloppy, hasty junior dev might do.<p>I would LOVE to have just syntax errors produced by LLMs. &quot;Subtle conceptual errors that a slightly sloppy, hasty junior dev might do&quot; are neither subtle nor slightly sloppy, they actually are serious and harmful, and junior devs don&#x27;t have the experience to fix those.<p>&gt; They will implement an inefficient, bloated, brittle construction over 1000 lines of code and it&#x27;s up to you to be like &quot;umm couldn&#x27;t you just do this instead?&quot;<p>Why not just hand-write 100 loc with the help of an LLM for tests, documentation and some autocomplete instead of making it write 1000 loc and then cleaning it up? Also very difficult to do, 1000 lines is a lot.<p>&gt; Tenacity. It&#x27;s so interesting to watch an agent relentlessly work at something. They never get tired, they never get demoralized, they just keep going and trying things where a person would have given up long ago to fight another day.<p>It&#x27;s a computer program running in the cloud, what exactly did he expect?<p>&gt; Speedups. It&#x27;s not clear how to measure the &quot;speedup&quot; of LLM assistance.<p>See above<p>&gt; 2) I can approach code that I couldn&#x27;t work on before because of knowledge&#x2F;skill issue. So certainly it&#x27;s speedup, but it&#x27;s possibly a lot more an expansion.<p>mmm not sure, if you don&#x27;t have domain knowledge you could have an initial stab at the problem, but what about when you need to iterate on it? You can&#x27;t if you don&#x27;t have domain knowledge of your own<p>&gt; Fun. I didn&#x27;t anticipate that with agents programming feels <i>more</i> fun because a lot of the fill in the blanks drudgery is removed and what remains is the creative part.<p>No, it&#x27;s not fun, e.g. LLMs produce uninteresting UIs, mostly bloated with react&#x2F;html<p>&gt; Atrophy. I&#x27;ve already noticed that I am slowly starting to atrophy my ability to write code manually.<p>My bet is that sooner or later he will get back to coding by hand for periods of time to avoid that, like many others; the damage overreliance on these tools brings is serious.<p>&gt; Largely due to all the little mostly syntactic details involved in programming, you can review code just fine even if you struggle to write it.<p>No, programming is not &quot;syntactic details&quot;; the practice of programming is everything but &quot;syntactic details&quot;. One should learn how to program, not the language X or Y<p>&gt; What happens to the &quot;10X engineer&quot; - the ratio of productivity between the mean and the max engineer? It&#x27;s quite possible that this grows <i>a lot</i>.<p>Yet no measurable economic effects so far<p>&gt; Armed with LLMs, do generalists increasingly outperform specialists? LLMs are a lot better at fill in the blanks (the micro) than grand strategy (the macro).<p>Did people with a smartphone outperform photographers?
    • TaupeRanger1 day ago
      Lots of very scared, angry developers in these comment sections recently...
      • hollowturtle1 day ago
        Neither angry nor scared, I value my hard skills a lot, I&#x27;m just wondering why people religiously believe everything AI related. Maybe I&#x27;m a bit sick of the excessive hype
      • hollowturtle1 day ago
        Also note that I&#x27;m a heavy LLM user, not anti ai for sure
      • crystal_revenge19 hours ago
        There&#x27;s no fear (a bit of anger I must admit). I suspect nearly all of the reaction against this comes from a similar place to where mine does:<p>All of the real world code I have had to review created by AI is buggy slop (often with subtle, but weird bugs that don&#x27;t show up for a while). But on HN I&#x27;m told &quot;this is because your co-workers don&#x27;t know how to <i>AI</i> right!!!!&quot; Then when someone who supposedly <i>must</i> be an expert in getting things done with AI posts, it&#x27;s always big claims with hand-wavy explanations&#x2F;evidence.<p>Then the comments section is littered with no-effort comments like this.<p>Yet oddly, whenever anyone asks &quot;show me the thing you built?&quot;, either it looks like every other half-working vibe coded CRUD app... or it doesn&#x27;t exist&#x2F;can&#x27;t be shown.<p>If you tell me you have discovered a miracle tool, just show me the results. Not taking increasingly ridiculous claims at face value is not &quot;fear&quot;. What I don&#x27;t understand is where comments like yours come from? What makes you <i>need</i> this to be more than it is?
      • Banditoz19 hours ago
        This is extremely reductive and incredibly dismissive of everything they wrote above.
        • crystal_revenge19 hours ago
          It&#x27;s because they don&#x27;t have a substantive response to it, so they resort to ad hominems.<p>I&#x27;ve worked extensively in the AI space, and believe that it is extremely useful, but these weird claims (even from people I respect a lot) that &quot;something big and mysterious is happening, I just can&#x27;t show you yet!&quot; set off my alarms.<p>When sensible questions are met with ad hominems by supporters it further sets off alarm bells.
      • thr5918261722 hours ago
        I see way more hype that is boosted by the moderators. The scared ones are the nepo babies who founded a vaporware AI company that will be bought by daddy or friends through a VC.<p>They have to maintain the hype until a somewhat credible exit appears and therefore lash out with boomer memes, FOMO, and the usual insane talking points like &quot;there are builders and coders&quot;.
        • simianwords22 hours ago
          i&#x27;m not sure what kind of conspiracy you are hallucinating. do you think people have to &quot;maintain the hype&quot;? it is doing quite well organically.
          • hollowturtle21 hours ago
            So well that they&#x27;re losing billions and OpenAI may go bankrupt this year
    • simianwords22 hours ago
      This is a low quality curmudgeonly comment
      • hollowturtle21 hours ago
        Now that you contributed zero net to the discussion and learned a new word you can go out and play with toys! Good job
      • potatogun22 hours ago
        You learned a new adjective? If people move beyond &quot;nice&quot;, &quot;mean&quot; and &quot;curmudgeonly&quot; they might even read Shakespeare instead of having an LLM producing a summary.
        • simianwords22 hours ago
          cool.<p>&gt;Anyone wondering what exactly is he actually building? What? Where?<p>this is trivially answerable. it seems like they did not do even the slightest bit of research before asking question after question to seem smart and detailed.
          • hollowturtle21 hours ago
            I asked many questions and you focused on only one. Btw yes, I did my research, and I know him because I followed almost every tutorial he has on YouTube, and he never clearly mentions what weekend project he worked on to make him conclude with such claims. I had very high respect for him, if not for the fact that at some point he started acting like the Jesus Christ of LLMs
            • simianwords21 hours ago
              it&#x27;s not clear why you asked that question if you knew the answer to it?
  • shawabawa31 day ago
    It&#x27;s been a bit like the boiling frog analogy for me<p>I started by copy-pasting more and more stuff into chatgpt. Then using more and more in-IDE prompting, then more and more agent tools (Claude etc). And suddenly I realise I barely hand code anymore<p>For sure there&#x27;s still a place for manual coding, especially schemas&#x2F;queries or other fiddly things where a tiny mistake gets amplified, but the vast majority of &quot;basic work&quot; is now just prompting, and honestly the code quality is _better_ than it was before; all kinds of refactors I didn&#x27;t think about or couldn&#x27;t be bothered with have happened almost automatically<p>And people still call them stochastic parrots
    • Macha23 hours ago
      I&#x27;ve had the opposite experience, it&#x27;s been a long time listening to people going &quot;It&#x27;s really good now&quot; before it developed to a permutation that was actually worth the time to use it.<p>ChatGPT 3.5&#x2F;4 (2023-2024): The chat interface was verbose and clunky and it was just... wrong... like 70+% of the time. Not worth using.<p>CoPilot autocomplete and Gitlab Duo and Junie (late 2024-early 2025): Wayyy too aggressive at guessing exactly what I wasn&#x27;t doing and hijacked my tab complete when pre-LLM type-tetris autocomplete was just more reliable.<p>Copilot Edit&#x2F;early Cursor (early 2025): Ok, I can sort of see uses here but god is picking the right files all the time such a pain as it really means I need to have figured out what I wanted to do in such detail already that what was even the point? Also the models at that time just quickly descended into incoherency after like three prompts, if it went off track good luck ever correcting it.<p>Copilot Agent mode &#x2F; Cursor (late 2025): Ok, great, if the scope is narrowly scoped, and I&#x27;m either going to write the tests for it or it&#x27;s refactoring existing code it could do something. Like something mechanical like the library has a migration where we need to replace the use of methods A&#x2F;B&#x2F;C and replace them with a different combination of X&#x2F;Y&#x2F;Z. great, it can do that. Or like CRUD controller #341. I mean, sure, if my boss is going to pay for it, but not life changing.<p>Zed Agent mode &#x2F; Cursor agent mode &#x2F; Claude code (early 2026): Finally something where I can like describe the architecture and requirements of a feature, let it code, review that code, give it written instructions on how to clean it up &#x2F; refactor &#x2F; missing tests, and iterate.<p>But that was like 2 years of &quot;really it&#x27;s better and revolutionary now&quot; before it actually got there. Now maybe in some languages or problem domains, it was useful for people earlier but I can understand people who don&#x27;t care about &quot;but it works now&quot; when they&#x27;re hearing it for the sixth time.<p>And I mean, what one hand gives the other takes away. I have a decent amount of new work dealing with MRs from my coworkers where they just grabbed the requirements from a stakeholder, shoved it into Claude or Cursor and it passed the existing tests and it&#x27;s shipped without much understanding. When they wrote them themselves, they tested it more and were more prepared to support it in production...
    • ed_mercer1 day ago
      I find that even for small work, telling CC to fix it for me is better, as it usually belongs to a thread of work and then it understands the big picture better.
    • phailhaus1 day ago
      &gt; And people still call them stochastic parrots<p>Both can be true. You&#x27;re tapping into every line of code publicly available, and your day-to-day really isn&#x27;t that unique. They&#x27;re really good at this kind of work.
  • felineflock16 hours ago
    xcancel? What is the purpose or benefit of providing a free mirror to x? Doesn&#x27;t it end up sparing the x servers and causing their costs to decrease?
    • moss_dog15 hours ago
      I prefer xcancel in part because Twitter doesn&#x27;t let you view replies etc when not logged in.
    • yojat66114 hours ago
      Guessing x loses ad revenue when traffic goes to xcancel.
    • tryauuum13 hours ago
      my screen is 60 percent banners about cookies and account creation when I use x
  • solarized11 hours ago
    Next milestone: solving authoritarian LLM dependencies. We can’t always get trapped in local minima. Or is that actually okay?
  • jbjbjbjb7 hours ago
    &gt; do generalists outperform specialists?<p>Depends what we mean by specialist. If it&#x27;s frontend vs backend then maybe. If it&#x27;s general dev vs some specialist scientific programmer or another field where a generalist won’t have a clue, then this seems like a recipe for disaster (literal disasters included).
  • cmrdporcupine5 hours ago
    Right on, especially on two things -- 1) the tools doing a disservice by not interviewing and seeking input and 2) the 2026 &quot;Slopocalypse&quot;<p>I&#x27;m hopeful that 2026 will be the year that the biggest adopters are forced to deal with the mass of product they&#x27;ve created that they don&#x27;t fully understand, and that a push for better tooling is the result.<p>Today&#x27;s agentic tools are crude from a UX POV compared to where I am <i>hoping</i> they will end up.
  • sota_pop17 hours ago
    &gt; Slopocalypse<p>Really… <i>REALLY</i> not looking forward to getting this word spammed at me over the next 6-12 months… even less so seeing the actual manifestation.<p>&gt; TLDR<p>This should be at the start?<p>I actually have been thinking of trying out Claude Code&#x2F;OpenCode over this past week… can anyone provide experience, tips, tricks, ref docs?<p>My normal workflow is using free-tier ChatGPT to help me interrogate or plan my solution&#x2F;approach or to understand some docs&#x2F;syntax&#x2F;best practice with which I’m not familiar, then doing the implementation myself.
    • gverrilla15 hours ago
      Claude code official docs are quite nice - that&#x27;s where I started.
  • nadis1 day ago
    The section on IDEs&#x2F;agent swarms&#x2F;fallibility resonated a lot for me; I haven&#x27;t gone quite as far as Karpathy in terms of power usage of Claude Code, but some of the shifts in mistakes (and reality vs. hype) analysis he shared seems spot on in my (caveat: more limited) experience.<p>&gt; &quot;IDEs&#x2F;agent swarms&#x2F;fallability. Both the &quot;no need for IDE anymore&quot; hype and the &quot;agent swarm&quot; hype is imo too much for right now. The models definitely still make mistakes and if you have any code you actually care about I would watch them like a hawk, in a nice large IDE on the side. The mistakes have changed a lot - they are not simple syntax errors anymore, they are subtle conceptual errors that a slightly sloppy, hasty junior dev might do. The most common category is that the models make wrong assumptions on your behalf and just run along with them without checking. They also don&#x27;t manage their confusion, they don&#x27;t seek clarifications, they don&#x27;t surface inconsistencies, they don&#x27;t present tradeoffs, they don&#x27;t push back when they should, and they are still a little too sycophantic. Things get better in plan mode, but there is some need for a lightweight inline plan mode. They also really like to overcomplicate code and APIs, they bloat abstractions, they don&#x27;t clean up dead code after themselves, etc. They will implement an inefficient, bloated, brittle construction over 1000 lines of code and it&#x27;s up to you to be like &quot;umm couldn&#x27;t you just do this instead?&quot; and they will be like &quot;of course!&quot; and immediately cut it down to 100 lines. They still sometimes change&#x2F;remove comments and code they don&#x27;t like or don&#x27;t sufficiently understand as side effects, even if it is orthogonal to the task at hand. All of this happens despite a few simple attempts to fix it via instructions in CLAUDE . md. Despite all these issues, it is still a net huge improvement and it&#x27;s very difficult to imagine going back to manual coding. TLDR everyone has their developing flow, my current is a small few CC sessions on the left in ghostty windows&#x2F;tabs and an IDE on the right for viewing the code + manual edits.&quot;
  • rileymichael1 day ago
    &gt; LLM coding will split up engineers based on those who primarily liked coding and those who primarily liked building<p>as the former, i&#x27;ve never felt _more ahead_ than now due to all of the latter succumbing to the llm hype
  • neuralkoi22 hours ago
    &gt; The most common category is that the models make wrong assumptions on your behalf and just run along with them without checking.<p>If current LLMs are ever deployed in systems harboring the big red button, they WILL most definitely somehow press that button.
    • arthurcolle22 hours ago
      US MIC are already planning on integrating fucking Grok into military systems. No comment.
      • Havoc18 hours ago
        Including classified systems. What could possibly go wrong
      • blibble19 hours ago
        the US is going to stop the chinese by mass production of illegal pornography?
    • groby_b22 hours ago
      fwiw, the same is true for humans. Which is why there&#x27;s a whole lot of process and red tape around that button. We <i>know</i> how to manage risk. We can choose to do that for LLM usage, too.<p>If instead we believe in fantasies of a single all-knowing machine god that is 100% correct at all times, then... we really just have ourselves to blame. Might as well just have spammed that button by hand.
  • poszlem9 hours ago
    I keep thinking about the TechnoCore from Dan Simmons&#x27; Hyperion, where the AIs were serving humans but that was secretly a parasitic relationship: they had been using human brains as distributed processing nodes, essentially harvesting humanity&#x27;s neural activity for their own computational needs without anyone&#x27;s knowledge.<p>I know this is SF, but to me working with those LLMs feels more and more like that, and the atrophy part is real. Not that the model is literally using our brains as compute, but the relationship can become lopsided.
  • wellpast18 hours ago
    I coded up a crossword puzzle game using agentic dev this weekend. Claude and Codex&#x2F;GPT. Had to seriously babysit and rewrite much of it, though, sure, I found it “cool” what it <i>could</i> do.<p>Writing code in many cases is faster to me than writing English (that <i>is</i> how PLs are designed, btw!) LLM&#x2F;agentic is very “neat” but still a toy to the professional, I would say. I doubt reports like this one. For those of us building real world products with shelf-lives (Is Andrej representative of this archetype?), I just don’t see the value-add touted out there. I’d love to be proven wrong. But writing code (in code, not English), to me and many others, is still faster than reading&#x2F;proving it.<p>I think there’s a combination of fetishizing and Stockholm syndroming going on in these enthusiastic self-reports. PMW.
    • jofla_net3 hours ago
      &gt;Writing code in many cases is faster to me than writing English<p>True, I feel as though I&#x27;d have to become Steinbeck to get it to do what I &quot;really&quot; wanted, with all the true nuance.
  • jedisct14 hours ago
    Claude is good at writing code, not so good at reasoning, and I would never trust or deploy to production something solely written by Claude.<p>GPT-5.2 is not as good for coding, but much better at thinking and finding bugs, inconsistencies and edge cases.<p>The only decent way I found to use AI agents is by doing multiple steps between Claude and GPT, asking GPT to review every step of every plan and every single code change from Claude, and manually reviewing and tweaking questions and responses both ways, until all the parties, including myself, agree. I also sometimes introduce other models like Qwen and K2 into the mix, for a different perspective.<p>And gosh, by doing so you immediately realize how dumb, unreliable and dangerous code generated by Claude alone is.<p>It&#x27;s a slow and expensive process and at the end of the day, it doesn&#x27;t save me time at all. But, perhaps counterintuitively, it gives me more confidence in the end result. The code is guaranteed to have tons of tests and assurance for edge cases that I may not have thought about.
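    If you want to automate part of that ping-pong, the loop looks roughly like the sketch below. Purely illustrative: it assumes the Claude Code and Codex CLIs are installed and that claude -p and codex exec each run a single non-interactive prompt; check the flags on your versions before relying on it, and keep a human review at the end:

        # review_loop.py - illustrative sketch of a generate-then-cross-review loop.
        # Assumes non-interactive CLI modes (claude -p, codex exec) exist on your
        # install; the task string and prompts are placeholders.
        import subprocess

        def run(cmd: list[str]) -> str:
            out = subprocess.run(cmd, capture_output=True, text=True, check=True)
            return out.stdout

        task = "Add input validation to the /users endpoint; write tests first."

        # Step 1: ask Claude for a plan plus a proposed diff.
        plan = run(["claude", "-p", "Plan and propose a diff for this task:\n" + task])

        # Step 2: ask a second model to critique the plan before anything lands.
        critique = run(["codex", "exec",
                        "Review this plan for bugs, edge cases and inconsistencies. "
                        "Be specific.\n\n" + plan])

        # Step 3: a human reads both before letting either agent touch the tree.
        print("=== PLAN ===\n" + plan)
        print("=== CRITIQUE ===\n" + critique)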
  • superze21 hours ago
    I don&#x27;t know about you guys but most of the time it&#x27;s spitting out nonsense SQLAlchemy models and I have to constantly correct it, to the point where I am back to writing the code myself. The bugs are just astonishing and I lose control of the codebase after some time, to the point where reviewing the whole thing just takes a lot of time.<p>On the contrary, if it was for a job in the public sector I would just let the LLM spit out some output and play stupid, since the salary is very low.
  • randoglando20 hours ago
    Senpai has taken the words out of my mouth and put them on the page.
  • uejfiweun1 day ago
    Honestly, how long do you guys think we have left as SWEs with high pay? Like the SWE job will still exist, but with a much lower technical barrier of entry, it strikes me that the pay is going to decrease a lot. Obviously BigCo codebases are extremely complex, more than Claude Code can handle right now, but I&#x27;d say there&#x27;s definitely a timer running here. The big question for my life personally is whether I can reach certain financial milestones before my earnings potential permanently decreases.
    • jerf1 day ago
      It&#x27;s counterintuitive but something becoming easier doesn&#x27;t necessarily mean it becomes cheap. Programming has arguably been the easiest engineering discipline to break into by sheer force of will for the past 20+ years, and the pay scales you see are adapted to that reality already.<p>Empowering people to do 10 times as much as they could before means they hit 100 times the roadblocks. Again, in a lot of ways we&#x27;ve already lived in that reality for the past many years. On a task-by-task basis programming today is already a lot easier than it was 20 years ago, and we just grew our desires and the amount of controls and process we apply. Problems arise faster than solutions. Growing our velocity means we&#x27;re going to hit a lot more problems.<p>I&#x27;m not saying you&#x27;re wrong, so much as saying, it&#x27;s not the whole story and the only possibility. A lot of people today are kept out of programming just because they don&#x27;t want to do that much on a computer all day, for instance. That isn&#x27;t going to change. There&#x27;s still going to be skills involved in being better than other people at getting the computers to do what you want.<p>Also on a long term basis we may find that while we can produce entry-level coders that are basically just proxies to the AI by the bucketful that it may become very difficult to advance in skills beyond that, and those who are already over the hurdle of having been forced to learn the hard way may end up with a very difficult to overcome moat around their skills, especially if the AIs plateau for any period of time. I am concerned that we are pulling up the ladder in a way the ladder has never been pulled up before.
    • spaceman_20201 day ago
      I think the senior devs will be fine. They&#x27;re like lawyers at this point - everyone is too scared they&#x27;ll screw up and will keep them around<p>The juniors though will radically have to upskill. The standard junior dev portfolio can be replicated by claude code in like three prompts<p>The game has changed and I don&#x27;t think all the players are ready to handle it
    • daxfohl1 day ago
      Supply and demand. There will continue to be a need for engineers to manage these systems and get them to do the thing you actually want, to understand implications of design tradeoffs and help stakeholders weigh the pros and cons. Some people will be better at it than others. Companies will continue to pay high premiums for such people if their business depends on quality software.
    • tietjens1 day ago
      I think to give yourself more context you should ask about the patterns that led to SWEs having such high pay in the last 10-15 years and why it is you expected it to stay that way.<p>I personally think the barrier is going to get higher, not lower. And we will be back expected to do more.
    • q3k20 hours ago
      I think the pay is going to skyrocket for senior devs within a few years, as training juniors that can graduate past pure LLM usage becomes more and more difficult.<p>Day after day the global quality of software and learning resources will degrade as LLM grey goo consumes every single nook and cranny of the Internet. We will soon see the first signs of pure cargo cult design patterns, conventions and schemes that LLMs made up and then regurgitated. Only people who learned before LLMs became popular will know that they are not to be followed.<p>People who aren&#x27;t learning to program without LLMs today are getting left behind.
      • strange_quark16 hours ago
        Yeah, all of this. Plus companies have avoided hiring and training juniors for 3 or 4 years now (which is more related to interest rates than AI). Plus existing seniors who deskill themselves by outsourcing their brain to AI. Seniors who know actually what they&#x27;re doing are going to be in greater demand.<p>That is assuming that LLMs plateau in capability, if they haven&#x27;t already, which I think is highly likely.
    • riku_iki22 hours ago
      &gt; like the SWE job will still exist, but with a much lower technical barrier of entry<p>It&#x27;s the opposite: now, in addition to all other skills, you need the skill to handle giant codebases of vibe-coded mess using AI.
  • lofaszvanitt18 hours ago
    The whole thing is about getting rid of experts and letting the entry level idiots do all the work. The coders become expendable. And people do not see the chasm staring back at them :D. LLMs in their current form redistribute &quot;intelligence&quot; and expertise to the average joes for mere pennies. It should be much, much more expensive, or it will disrupt the whole ecosystem. If it becomes even more intelligent it must be bludgeoned to death a.k.a. regulated like hell, otherwise the ensuing disruption will kill the job market and, in the long term, human values.<p>As an added plus: those who already have wealth will benefit the most, instead of the masses. Since the distribution and dissemination of new projects is at the same level as before, meaning you would need a lot of money. So no matter how clever you are with an llm, if you don&#x27;t have the means to distribute it you will be left in the dirt.
  • ares62319 hours ago
    Imagine taking career advice from people who will never need to be employed again in order to survive.
    • fragmede18 hours ago
      Yes, typically you take advice from people who&#x27;ve been successful at their career. Are you suggesting we should be taking career advice from high school freshmen instead?
      • ares62318 hours ago
        I&#x27;m nitpicking on the atrophy bit. He can afford to have his skills or his brain atrophied. His followers though?<p>Nevermind the fact he became successful _because_ of his skills and his brain.
  • DeathArrow1 day ago
    &gt;LLM coding will split up engineers based on those who primarily liked coding and those who primarily liked building.<p>Quite insightful.
  • Madmallard1 day ago
    Are game developers vibe coding with agents?<p>It&#x27;s such a visual and experiential thing that writing true success criteria it can iterate on ahead of time seems borderline impossible.
    • dysoco6 hours ago
      It might be biased toward Reddit&#x2F;Twitter users but from what I&#x27;ve seen, game developers seem to be much more averse to using AI (even for coding) than other fields.<p>Which is curious since prototyping helps a lot in gamedev.
    • I don&#x27;t &quot;vibe code&quot; but when I use an LLM with a game I usually branch out into several experiments which I don&#x27;t have to commit to. Thus, it just makes that iteration process go faster.<p>Or slower, when the LLM doesn&#x27;t understand what I want, which is a bigger issue when you spawn experiments from scratch (and have given limited context around what you are about to do).
    • TheGRS23 hours ago
      I&#x27;m trying it out with Godot for my little side projects. It can handle writing the GUI files for nodes and settings. The workflow is asking cursor to change something, I review the code changes, then load up the game in Godot to check out the changes. Works pretty well. I&#x27;m curious if any Unity or Unreal devs are using it since I&#x27;m sure its a similar experience.
    • ex-aws-dude12 hours ago
      A big problem is that a lot of game logic is done in visual scripting (e.g unreal blueprints) which AI tools have no idea about
    • redox991 day ago
      Vibe coding in Unreal Engine is of limited use. It obviously helps with C++, but so much of your time is doing things that are not C++. It hurts a lot that UE relies heavily on blueprints, if they were code you could just vibecode a lot of that.
  • cyanydeez1 day ago
    So I&#x27;m curious, what&#x27;s the actual quality control?<p>Like, do these guys actually dogfood the real user experience, or are they all admins with the fast lane to the real model while everyone outside the org has to go through the 10 layers of model shedding, caching and other means and methods of saving money.<p>We all know these models are expensive as fuck to run and these companies are degrading service, A&#x2F;B testing, and the rest. Do they actually ponder these things directly?<p>Just always seems like people are on drugs when they talk about the capabilities, and like, the drugs could be pure shit (good) or ditch weed, and we all just act like the pipeline for drugs is a consistent thing but it&#x27;s really not, not at this stage where they&#x27;re all burning cash through infrastructure. Definitely, like drug dealers, you know they&#x27;re cutting the good stuff with low cost cached gibberish.
    • quinnjh1 day ago
      &gt; Definitely, like drug dealers, you know they&#x27;re cutting the good stuff with low cost cached gibberish.<p>Can confirm. My partner&#x27;s ChatGPT wouldn&#x27;t return anything useful for her given a specific query involving web use, while I got the desired result sitting side by side. She contacted support and they said there was nothing they could do about it; her account is in an A&#x2F;B test group with some features removed. I imagine this saves them considerable resources despite still billing customers for them.<p>How much this is occurring is anyone&#x27;s guess
    • bigwheels1 day ago
      If you access a model through an openrouter provider it might be quantized (akin to being &quot;cut with trash&quot;), but when you go directly to Anthropic or OpenAI you are getting access to the same APIs as everyone else. Even top-brass folks within Microsoft use Anthropic and OpenAI proper (not worth the red-tape trouble to go directly through Azure). Also, the creator and maintainer of Claude Code, Boris Cherny, was a bit of an oddball but one of the comparatively nicer people at Anthropic, and he indicated he primarily uses the same Anthropic APIs as everyone else (which makes sense from a product development perspective).<p>The underlying models are all actually really undifferentiated under the covers except for the post-training and base prompts. If you eliminate the base prompts the models behave near identically.<p>A conspiracy would be a helluva lot more interesting and fun, but I&#x27;ve spoken to these folks firsthand and it seems they already have enough challenges keeping the beast running.
  • MORPHOICES8 hours ago
    [dead]
  • huflungdung11 hours ago
    [dead]
  • MarginalGainz6 hours ago
    [dead]
  • wkh1298571 day ago
    [flagged]
    • yojat66114 hours ago
      I don&#x27;t know if it&#x27;s fair to call him an ai addict or deduce that his ego is bruised. But I do wonder whether karpathy&#x27;s agentic llm experiences are based on actual production code or pet projects. Based on a few videos I have seen of his, I am guessing it&#x27;s the latter. Also, he is a research scientist (probably a great one), not a software developer. I agree with the op that karpathy should not be given much attention in this topic i.e llms for software development.
    • soganess1 day ago
      &quot;addict&quot;<p>Great idea! Let&#x27;s pathologize another thing! I love quickly othering whole concepts and putting them in my brain&#x27;s &quot;bad&quot; box so I can feel superior.
    • reducesuffering23 hours ago
      <a href="https:&#x2F;&#x2F;github.com&#x2F;karpathy&#x2F;nanochat" rel="nofollow">https:&#x2F;&#x2F;github.com&#x2F;karpathy&#x2F;nanochat</a><p><a href="https:&#x2F;&#x2F;github.com&#x2F;karpathy&#x2F;llm.c" rel="nofollow">https:&#x2F;&#x2F;github.com&#x2F;karpathy&#x2F;llm.c</a><p>The proof is in the pudding. Let&#x27;s see your code
      • rvz16 hours ago
        You just proved the parent’s point.<p>He said <i>“…who has never written any production software…”</i> yet you show toy projects instead.<p>Well done.
      • jackling21 hours ago
        I don&#x27;t agree with the parent commenter&#x27;s characterization of Karpathy, but these projects are just simple toy projects. They&#x27;re educational material, not production-level software.
    • lomase21 hours ago
      [dead]
  • spaceman_20201 day ago
    Once again, 80% of the comments here are from boomers.<p>HN used to be a proper place for people actually curious about technology
    • vardalab23 hours ago
      I&#x27;m almost a boomer and I agree. This dichotomy is weird. I am a retired EE and I love the ability to just have AI do whatever I want for me. I have it manage a 10 node proxmox cluster in my basement via ansible and terraform. I can finally do stuff I always wanted but had no time for. I got sick of editing my kids&#x27; sports videos for highlights in DaVinci Resolve so I just asked claude to write a simple app for me and then use all my random video cards in my boxes to render clips in parallel and so on. Tech is finally fun again when I do not have to dedicate days to understanding some new framework. It does feel a little like late 1990&#x27;s computing when everyone was making geocities webpages but those days were more fun. Now with local llms getting strong as well and speaking to my PC instead of typing, it feels like SciFi, so yeah, I do not get this hacker news hand wringing about code craft.
      • kakapo56723 hours ago
        Same demographic, same experience. AI has been incredibly liberating for me. I get all sorts of things done now that were previously impossible for all practical purposes. Among other things, it cuts through the noise of all the layers of detail, and allows me to focus on ideas, design, and just getting stuff built asap.<p>I also don&#x27;t get all the hand-wringing. AI is an amazing tool. Use it and be happy.<p>Even less do I get all the cope about it not being effective, or even useless at some level. When I read posts such as that, it feels like a different planet. Just not my experience at all.
      • kejaed22 hours ago
        So what is your workflow now with this app for kids sports highlights?
        • vardalab48 minutes ago
          Well, it&#x27;s not really a full-blown app yet. Claude wrote a plugin for MPV. So now when I watch video I just push a button to mark in and out of highlights similar to how it works in DaVinci Resolve. Then I have a command line tool that takes those timestamps in a video file and cuts it up into individual clips and then re-renders those clips and creates a highlight reel. Another command line tool takes three or four large MP4 files that the camera generates and downloads them and combines them in the actual game video on my desktop and also uploads it to my archive and transcodes into a bunch of different formats and uploads to YouTube. And for transcoding, again, it divvies it out to the video cards, which works pretty well. I think I have five or six encoders available so it chunks it up and then reassembles. All in all, it&#x27;s nothing fancy, but it reduced quite a bit the friction of coming home after games and getting a video up on YouTube for grandparents.
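          For anyone who wants to try something similar, the cut-and-join step is only a few lines around ffmpeg. A minimal sketch, assuming the timestamps come out as (start, end) pairs in seconds and ffmpeg is on the PATH (the MPV plugin and the GPU-parallel rendering are separate pieces):

              # highlights.py - cut marked (start, end) spans out of a game video and
              # join them into one reel with ffmpeg. Minimal sketch: stream copy only,
              # so cuts snap to keyframes; re-encode if you need frame accuracy.
              import subprocess
              from pathlib import Path

              def make_reel(source: str, spans: list[tuple[float, float]], out: str) -> None:
                  clips = []
                  for i, (start, end) in enumerate(spans):
                      clip = Path(f"clip_{i:03d}.mp4")
                      subprocess.run([
                          "ffmpeg", "-y", "-i", source, "-ss", str(start), "-to", str(end),
                          "-c", "copy", str(clip),
                      ], check=True)
                      clips.append(clip)

                  # the concat demuxer wants a small list file
                  list_file = Path("clips.txt")
                  list_file.write_text("".join(f"file '{c}'\n" for c in clips))
                  subprocess.run([
                      "ffmpeg", "-y", "-f", "concat", "-safe", "0",
                      "-i", str(list_file), "-c", "copy", out,
                  ], check=True)

              if __name__ == "__main__":
                  make_reel("game.mp4", [(12.0, 31.5), (95.0, 110.0)], "highlights.mp4")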
        • zennit20 hours ago
          Also interested
    • weirdmantis691 day ago
      Ya it&#x27;s so weird lol
  • themafia21 hours ago
    Instead of a 17 paragraph twitter post with a baffling TLDR at the end why not just record your screen and _demonstrate_ all of what you&#x27;re describing?<p>Otherwise, I think you&#x27;re incidentally right, your &quot;ego&quot; &#x2F;is&#x2F; bruised, and you&#x27;re looking for a way out by trying to prognosticate on the future of the technology. You&#x27;re failing in two different ways.