There will be many more things like this and it’s an elephant in the room for the supposed mass replacement of people with AI.<p>Some human still has to be accountable. Someone has to get fired / go to jail when something screws up.<p>You can make humans more productive, but for the foreseeable future you can’t take the human out of the loop and have an AI implementation that’s not a disaster/lawsuit waiting to happen. That, probably more than anything else, is why companies just aren’t seeing the much-promised mass step change in productivity from AI and why so many companies are now saying they see zero ROI from AI efforts.<p>The lowest hanging fruit will be low-value rote repetitive tasks like the whole India offshoring industry, which will be the first to vaporize if AI does start replacing humans. But until companies see success on the lowest of the lowest hanging fruit of en masse labor replacement with AI, things higher up the value chain will remain relatively safe.<p>PS: Nearly every mass layoff recently citing “AI productivity” hasn’t withstood scrutiny. They all seem to be just poorly performing companies slashing staff after overhiring, with management looking for any excuse other than just admitting that.
I think this is an even clearer case than usual. With software engineers and office work you don’t have legal limitations on who can perform the work, but they exist for lawyers and doctors, for example.<p>So if this is a tool, the fault lies fully with the user, and if this is treated as “another person’s work”, then the user knowingly passed the work on to someone not authorized to do it. Both end up with the user being guilty.
> With software engineers and office work you don’t have legal limitations on who can perform the work<p>Technically true, but if you want the IP to be covered by copyright you better make sure they're not using AI or you'll find out that there are some serious legal limitations in your future when you aim to either pick up investment or sell your IP.
> So if this is a tool, the fault lies fully with the user, and if this is treated as “another person’s work”, then the user knowingly passed the work on to someone not authorized to do it. Both end up with the user being guilty.<p>I am particularly against this point of view, because we as a community have long touted how computers can do the job better and faster, and that computers don’t make mistakes. When there are bugs, they’re seen as flaws in the system and rectified, by programmers.<p>When there are gaps between user expectations and how the software works, it’s our job to manage and reduce those gaps.<p>In the case of AI we are somehow, probably because we know it’s non-deterministic, turning that social contract we had developed with users on its head.<p>Now, suddenly, that’s just the way it is and it’s up to users to know if the computer is lying to them. We have absolved ourselves of both the technical and the non-technical responsibilities to ensure the computer doesn’t lie to the user, subvert their expectations, or act in a way contrary to human logic.<p>AI may be different in that it’s non-deterministic, but that’s all the more reason that we’re responsible for ensuring AI adoption aligns with the social contract we created with users. If we can’t do that with AI, then it’s up to us to stop chasing endless dollars and be forthright with users that facts are optional when it comes to AI.
>Some human still has to be accountable. Someone has to get fired / go to jail when something screws up.<p>I remember growing up and always hearing "The computer is down" as an excuse for why things were cancelled/offices closed/buses and trains not running/ad infinitum.<p>At some point I read an article that pointed out that the reason the computer was down was because a person made a [coding] error: the computer itself was fine.<p>I've yet to read about how a person who caused the computer to be down was disciplined.
You are running on an outdated model of the world: one where only discipline keeps people working, keeps them productive, keeps them in line.<p>We saw how that worked out in Soviet Russia and the culture it gave birth to in its aftermath. Discipline artificially held up by institutions and hierarchies is worthless. It only encourages subversion, and thus most of the productivity is wasted on hunting for laziness and on updating ever more intricate behavioral programming rules, which make the organization ever less able to react fast and decisively.<p>The only discipline worth a damn is intrinsic. People who want something, who want to get somewhere. They need no shepherds and prison guards; they need only a support harness, resources, and people concerned about them. The culture that produces such people is required for things to succeed. Any culture that does not cannot succeed and is basically a parasite on cultures that do.
And here perhaps was the greatest mistake the software profession made! Not making ourselves into a real profession, with actual accountability. It was terribly convenient for so long not to have consequences when things went wrong. It's less convenient now.
Why does a person need to be disciplined because they made a mistake?
We should have more hygiene when it comes to AI.<p>Text coming out of an LLM should be in a special codeblock of Unicode, so we can see it is generated by AI.<p>Failing to do so (or tampering with it) should be considered bad hygiene, and should be treated like a doctor who doesn't wash their hands before surgery.
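A minimal sketch of what that tagging could look like, assuming zero-width Unicode characters as the marker (purely hypothetical; no such standard exists today):

    # Hypothetical AI-provenance marker built from zero-width characters:
    # ZWSP + WORD JOINER + ZWSP renders as nothing but is machine-detectable.
    AI_MARK = "\u200b\u2060\u200b"

    def tag_llm_output(text: str) -> str:
        """Wrap LLM-generated text in invisible provenance markers."""
        return AI_MARK + text + AI_MARK

    def is_llm_tagged(text: str) -> bool:
        """Check whether a string still carries the markers."""
        return text.startswith(AI_MARK) and text.endswith(AI_MARK)

Of course anyone can strip zero-width characters, which is why the tampering itself, not just the generation, would have to count as the bad hygiene.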
> Text coming out of an LLM should be in a special codeblock of Unicode, so we can see it is generated by AI.<p>That's exactly my proposed solution:<p><a href="https://jacquesmattheij.com/classes-of-originality/" rel="nofollow">https://jacquesmattheij.com/classes-of-originality/</a>
What will that accomplish? Does it give developers license to check in code that they don't fully understand or trust?<p>Ultimately, people should be responsible for the code they commit, no matter how it was written. If AI generates code that is so bad that it warrants putting up a warning sign, it shouldn't be checked in.
It could be useful for downstream/AI processes. E.g. hand-written code only requires 70% code coverage because the cost of higher coverage is significantly higher, while AI-generated code requires 90% coverage because the cost of getting coverage is lower.<p>Especially if the prompt is attached to the metadata. Then reviewers could note how you could have changed the prompt, or potentially point an AI at the bug and ask it to add something to AGENTS.md to prevent that in the future.
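A sketch of how a CI gate could encode that policy, assuming each file records its origin in metadata (the "origin" field and the thresholds are made up for illustration):

    # Hypothetical per-origin coverage bars: stricter for AI-generated code,
    # since generating the extra tests is cheap.
    THRESHOLDS = {"human": 0.70, "ai": 0.90}

    def coverage_gate(origin: str, measured: float) -> bool:
        """Pass only if measured coverage meets the bar for the file's origin."""
        required = THRESHOLDS.get(origin, THRESHOLDS["ai"])  # unknown origin: assume AI
        return measured >= required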
> Text coming out of an LLM should be in a special codeblock of Unicode, so we can see it is generated by AI.<p>Why not start with manual tagging, like "Ad"?
I don't believe most countries hold judges accountable for bad rulings at all, even before the AI era.<p>"Checks and balances, except for the judiciary."
In the UK lower court judges are sometimes removed for misconduct.<p>Only the king (at the petition of parliament) can remove a high court or appeal court judge, and that's only ever happened once, in 1830.
It wasn't just a bad ruling. It was judicial misconduct.
In the US, local/state judges often are elected (probably varies by state). Federal judges can be impeached.
>why so many companies are now saying they see zero ROI from AI efforts.<p>I strongly suspect this is because workers are pocketing the gains for themselves. Report XYZ usually takes a week to write. It now takes a day. The other 4 days are spent looking busy.<p>The MIT report that found all these companies were getting nowhere with AI also found that almost every worker was using AI almost daily - but using their personal account rather than the corporate one.
If that were the case, this site and certain subreddits would have a lot of posts and comments with people crowing about how much time they are getting back. I haven’t seen that, but I haven’t gone looking for it either.
Quite the contrary: companies lay off all roles (frontend, backend, qa, devops, even ui/ux) and hand a project to one competent dev. And ask them to deliver it in 1/3rd the time it would have taken with a proper team. It's happening at places I know. This thread on reddit is 100% the same: <a href="https://www.reddit.com/r/developersIndia/s/EIksvB15tm" rel="nofollow">https://www.reddit.com/r/developersIndia/s/EIksvB15tm</a><p>I can't even imagine the stress from context switching, and since people don't realize this is still work, they do this late into the night as well.
Every downturn you see the same thing - mass layoffs blamed on whatever the latest fad is. In the end it was the economy, not the fad.<p>When it isn't the economy, the gains are used to build more / better, not to get rid of people. (Not all fads have real gains, but when they do.)
SWEs are a minority of the white collar workforce.
While not dispositive of your idea, I think some portion of people using their personal accounts is because we collectively lack good feedback loops on the effectiveness of “AI add-ons” like RAG. The corporate accounts can be legitimately less useful than a “stock” account because the AI team integrates everything under the sun to show value, but the integrations become a net negative.<p>E.g. ones that index entire company wikis. It ends up regurgitating rejected or not-implemented RFCs, or docs from someone’s personal workflow that require setting up a bunch of stuff locally to work, etc.<p>A lot of tasks don't depend on internal documentation, and it just ends up polluting the context with irrelevant, outdated or just plain wrong information.
That’s certainly a … convenient … explanation.
Because the amount of AI slop code from peers and the amount of AI slop emails from management to read have exploded.
> You can make humans more productive<p>If productivity goes up 10x, then unless the amount of work also increases 10x, jobs will be gone.
In about 1930, Keynes wrote "Economic Possibilities for our Grandchildren" [1], wherein he wrote:<p>"I believe that this is a wildly mistaken interpretation of what is happening to us. We are suffering, not from the rheumatics of old age, but from the growing-pains of over-rapid changes, from the painfulness of readjustment between one economic period and another. The increase of technical efficiency has been taking place faster than we can deal with the problem of labour absorption; the improvement in the standard of life has been a little too quick ...<p>We are being afflicted with a new disease of which some readers may not yet have heard the name, but of which they will hear a great deal in the years to come--namely, technological unemployment. This means unemployment due to our discovery of means of economising the use of labour outrunning the pace at which we can find new uses for labour."<p>While there's no guarantee that what Keynes got wrong then is the same as now, it can be a reasonable outcome that "the jobs" won't just disappear.<p>----<p>Keynes also speculated on what to do with the newfound time that would result from investment returns on the back of productivity [1]:<p>"Let us, for the sake of argument, suppose that a hundred years hence we are all of us, on the average, eight times better off in the economic sense than we are to-day. Assuredly there need be nothing here to surprise us ... Thus for the first time since his creation man will be faced with his real, his permanent problem-how to use his freedom from pressing economic cares, how to occupy the leisure, which science and compound interest will have won for him, to live wisely and agreeably and well."<p>The modern FIRE movement shows that living at a dated "standard of living" for 10-15 years can free one from work forever. Yet that's not what most people do today. I would suggest that there are deeper aspects of human drive, psychology, and varying concepts of "morality" that are actually bigger factors in what happens to "jobs".<p>[1] <a href="http://www.econ.yale.edu/smith/econ116a/keynes1.pdf" rel="nofollow">http://www.econ.yale.edu/smith/econ116a/keynes1.pdf</a>
Counterpoint: No one ever gets fired or goes to jail when big tech firms break the law. Companies will put out an apology, pay whatever small fine is imposed, and continue with illegal AI usage at scale.
<i>> Someone has to get fired / go to jail when something screws up.</i><p>In law, someone <i>always</i> hangs. I think a number of American lawyers have been sanctioned for using AI slop.<p>In other vocations ... <i>not so much</i>. I think that one of the reasons that insurance likes AI so much, is that they can say that it was "the computer" that made the decision that killed Little Timmy.
Or, AI is going to be like when cellphones showed up in India and made landlines unnecessary. India may get to skip an entire intellectual generation thanks to the ability of a cheap model to educate (in any language).<p>The narrative that an entire population is “worth” less, paid less, knows less, lives less …<p>Fuck this <i>less</i> shit, embrace the paradigm shift. God is finally providing the remedial support through the miracle of AI.
We've had YouTube for two decades now. Cheap education was already available for those who wanted it.
Or you are proven wrong entirely, again and again.
And it turns out that what was dismissed as a legacy, an unimportant and derelict relic of the past - culture - is all-decisive. It turns out that only some cultures can generate high-trust societies capable of forming institutions. And you prolonged the suffering by declaring that all cultures are created equal. History may write you down as a monster.
I don't know if you've ever been to India, but one of its characteristic features is that it has lots of local languages. LLMs are awful at almost all of them. Plus, there's 20ish% of the population that falls below the literacy threshold. It's hard to imagine how those people would be educated by LLMs even if that was a good idea and they all had reliable Internet access, which they often don't.
Your comment raises the question of whether you have ever been to India. Most of those 20% are old people. K-12 education needs to be improved, but literacy is not a major problem. Also, India has the cheapest internet in the world.
Why’s it hard to imagine? More training data will solve whatever language lapses it has. The next miracle is that TTS is perfect now, so they don’t need to be able to read.<p>You can convey abstract concepts as alternate abstractions, explain like I’m five but on turbosteroids. It’s the ultimate teaching tool and it’s about to be ubiquitous.
What training data? Many of these languages have very little digitized literature. Even if we assume they have sizeable extant corpuses (e.g. Tibetic/Bhoti), that's not enough. LLMs are still pretty garbage at English prose, for example.
Some people are worth more than others.<p>Some cultures are better than others.
> Some human still has to be accountable. Someone has to get fired / go to jail when something screws up.<p>The turning point will be when threatening an AI with being unplugged for screwing up works in motivating it to stop making things up.<p>Some people will rightly point out that is kind of what the training process is already. If we go around this loop enough times it will get there.
You are making a lot of assumptions here. You assume, among other things, that AI has self-preservation drive, can be threatened, can be motivated, and above all that we know how to accomplish that and are already doing so. I would dispute all of that.
For now, maybe not. (Maybe.)<p>But just as with evolution in nature, isn’t it likely that in the future the AIs that have a preservation drive are the ones that survive and proliferate? Seeing as they optimize for their survival and proliferation, and not blindly for what they were trained on.<p>I am not discounting this happening already, not by the LLMs necessarily being sentient but at least by them being intelligent enough to emulate sentience. It’s just that for now, humanity is in control of what AI models are being deployed.
Claude does this if you keep pestering it about something: it will go from friendly to shooing you away.
Is this an expectation you have towards, say, NPCs in games?
Put an LLM inside the NPCs in an open world RPG full of dangerous enemies. The LLMs that are more prone to emulate self-preservation will be more likely to survive over ones that have a lesser drive.<p>We should not act surprised if that generalizes to some degree to for example AI agents. Ones that emulate self-preservation might optimize for behavior that results in those models becoming more successful, more popular. And this feedback loop might embed more such properties into future iterations of the models.
Isn't the issue simply one of not using the right tool? When the stakes are high and you should be checking details, the right tools are grounded AI solutions like nouswise and notebooklm, not the general-purpose chatbots that almost everyone knows might hallucinate. I also believe that this use case is definitely low hanging fruit for automating a lot of manual work, but it comes with new requirements, like the transparency to help with verifying the responses.
> She had no intention to misquote or misrepresent the rulings and that "the mistake occurred solely due to the reliance on an automatic source", the high court wrote<p>I don't think the intention matters here. It's the same deal with every profession using LLMs to "automate" their work. The onus is on the professional, not the LLM. The Ars Technica case could have been justified in the same manner otherwise.<p>Not knowing the law isn't an excuse to break the law, so why is not knowing the tool an excuse to blame the tool?
Using an LLM to automate is simply the newer cheaper outsourcing with much of the same entertainment, but less food poisoning and air travel.<p>Over the last 20 years a lot of engineering (proper eng, not software) work in the west has been outsourced to cheaper places, with the certified engineers simply signing off on the work done elsewhere. This results in a cycle of doing things ever faster/more cheaply and safeguards disappearing under the pressure to go ever cheaper and faster.<p>As someone else pointed out, LLMs have just really exposed what a degraded state we have headed into rather than being a cause of it themselves. It's going to be very tough for people with no standards - they'll enjoy cheap stuff for a while and then it will all go away. Surprised Pikachu faces all round.<p>(I'm pro AI btw, just be responsible.)
LLMs also solve the timezone and language challenges. Sadly one problem that remains is that they too tell you they have understood something even if they haven't.
At least that's the story LLM lab leaders wanna tell everyone, which just happens to be a very good story if you wanna hype your valuation before investment rounds.<p>Working with LLMs on a daily basis, I would say that's not happening, not as they're trying to sell it. You can get rid of 5 vendor headcount that execute a manual process that should have been automated 10 years ago; you're not automating the processes involving high-paying people with a 1% error chance, where an error could cost you +10M in fines or jail time.<p>The day I see Amodei or Sam flying on a vibe-coded airplane is the day I believe what they're talking about.
Intentionality normally has to be taken into account in common law countries.<p>That doesn't mean she hasn't done something wrong, but obviously it's more serious to do something intentionally than it is to do it carelessly or recklessly.
> excuse to blame the tool<p>The issue is that ultimately blaming people doesn't really solve things. Unless it's genuinely a one-of-a-kind case. But if this happened once it's probably going to happen again, and this isn't the first such case of LLM hallucinations in law.<p>It's weird to think this way, because it's easy to just point at a person for a specific instance, but when you see something repeat over and over again you need to consider that, if your ultimate goal is to stop something from happening, you have to adjust the tools even if the people using them were at fault in every case.
They cannot even claim they weren't aware of the danger. LLM hallucinations have been a discussed topic, not some obscure failure mode. Almost every article on problems with AI mentions this.<p>So the judge was lazy, incompetent, or both.
Or she was conniving, like Skyler in Breaking Bad when she convinced the investigator that she got hired because she seduced the owner.
I do think that for this particular situation we need to step outside of our tech bubble a little bit.<p>I am still having regular conversations with people who either don't know about hallucinations or think they are not a big problem. There is a ton of money in these companies pushing the claim that their tools are reliable, and it's working on the average user.<p>I mean, there are people who legitimately think these tools are conscious or that we already have AGI.<p>So I am not fully sure I would jump too quickly to attack the judge, given the marketing we are up against.
I find it hard to believe the people who use AI haven't read a single article about AI. That would also disqualify this judge, if it were true.<p>This exceeds the tech bubble.<p>My local newspaper, completely clueless about tech, runs an article about AI trouble, hallucinations and whatnot every other week. Completely missing most of the nuances, of course, but my point is that this has entered the public discourse.
It may have entered public discourse but it is not being talked about as much outside of tech spaces, and we are up against the companies pushing the complete opposite narrative.<p>All I can say is that I am having conversations with non technical people regularly that are not aware of the issue or think it is a largely solved issue.
Not just discussed, but explicitly mentioned under every chat interface: "This tool can make mistakes".<p>(Sure, more honest would be "this tool makes stuff up in a convincing way".)
It’s well understood that humans do not instinctively grasp statistics, are bad at knowing when they’re being lied to, and are hard wired to take shortcuts.<p>AI companies gave everyone a button that does their job for them 99.9% of the time. And then 0.1% of the time it gets them fired. That’s irresponsible, no matter how many disclaimers you add to the bottom of the screen.
This is why LLMs won't replace humans wholesale in any profession: you can't hold a machine accountable. Most of the chatbot experiences I have with various support channels always end up with human intervention anyway when it involves money.<p>Maybe true general intelligence would solve these issues, but LLMs aren't meeting that threshold anytime soon, imo. Stochastic parrots won't rule the world.
This is exactly why LLMs will replace humans: even if the work is crap, nobody will be accountable for the crap work, and it saves money.
Work where "crap" is an acceptable level of quality is work that probably doesn't need to be done.<p>So I think it's more likely that LLMs unravels the "bullshit jobs" entirely, rather than replacing them with crap. Once people realize it didn't matter if the output sucked, they'll realize the output wasn't needed in the first place.
Even ‘true general intelligence’ (if we count humans as that) screws up frequently, sometimes (often?) intentionally for its own benefit - which is why accountability is such a necessary element.<p>If someone won’t be held liable for the end result at <i>some</i> point, then there is no reason to ensure an even somewhat reasonable end result. It’s fundamental.<p>Which is also why I suspect so many companies are pushing ‘AI’ so hard - to be able to do unreasonable things while having a smokescreen to avoid being penalized for the consequences.
> to be able to do unreasonable things while having a smokescreen<p>Maybe, but I feel like the calculus remains unchanged for professions that already lack accountability (police, military, C-suite, three letter agencies, etc.); LLMs are yet another tool in their toolbox to obfuscate but they were going to do that anyway.<p>Peons will continue to face consequences and sanctions if they screw up by using hallucinated output.
All of those professions definitely have accountability - per the nominal rules of the system. Often extremely severe accountability.<p>The actual systems do everything they can to avoid that accountability, including often violating the rules themselves, or corrupting enforcement, for exactly the reasons corporations are trying to avoid accountability too.<p>Accountability is expensive, and way less convenient than doing whatever you want whenever you want.
> Not knowing the law isn't excuse to break law,<p>Yeah, about that ...<p><a href="https://metro.co.uk/2016/07/03/rapist-struck-again-after-deportation-order-was-overturned-5982886/" rel="nofollow">https://metro.co.uk/2016/07/03/rapist-struck-again-after-dep...</a><p>> A Somalian rapist who had his deportation overturned went on to rape two more women after he was freed.<p>> But he had his deportation overturned after serving his time because he didn’t know it was unacceptable in the UK.
How many of these cases do we have to have before lawyers realise that they need to check that the things an LLM tells them are actually true?
It doesn't matter, because any process that seems right most of the time but occasionally is wrong in subtle, hard to spot ways is basically a machine to lull people into not checking, so stuff will always slip through.<p>It's just like the cars driving themselves but you need to be able to jump in if there is a mistake, humans <i>are not</i> going to react as fast as if they were driving, because they aren't going to be engaged, and no one can stay as engaged as they were when they were doing it themselves.<p>We need to stop pretending we can tell people they "just" need to check things from LLMs for accuracy, it's a process that inevitably leads to people not checking and things slipping through. Pretending it's the people's fault when essentially everyone using it would eventually end up doing that is stupid and won't solve the core problem.
> won't solve the core problem.<p>what's the core problem tho? Because if the core problem is "using ai", then it's an inevitable outcome - ai will be used, and there are always incentive to cut costs maximally.<p>So realistically, the solution is to punish mistakes. We do this for bridges that collapse, for driver mistakes on roads, etc. The "easy" fix is to make punishment harsher for mistakes - whether it's LLM or not, the pedigree of the mistake is irrelevant.
The core problem is that the tool provides output that looks right and is right a lot of the time, but also slips in incorrect stuff in a hard-to-notice way.<p>Punishment isn't the answer because it doesn't work. If you create a system that lulls people into a sense of security, no punishment will stop them, because they aren't acting while thinking "it's worth the risk"; they don't see the risk at all. There are so many examples of this, it's weird people still think this actually works.<p>Furthermore, it becomes a liability-washing tool: companies will tell employees they have to take the time to check things, but then not give them the time required to actually check everything, and then blame employees when they do the only thing they can: let stuff slip.<p>If you want to use LLMs for this kind of thing, you need to create systems around them that make it <i>hard</i> to make the mistakes. As an example (obviously not a complete solution, just one part): if they cite a source, there should be a mandated automatic check that goes to that source, validates it exists, and validates that the cited text is actually there, not using LLMs. Exact solutions will vary based on the specific use case.<p>An example from outside LLMs: we told users they should check the URL bar as a solution to phishing. In theory a user could always make sure they were on the right page and stop attacks. In practice people were always going to slip up. The correct solution was automated tooling that validates the URL (e.g. password managers, passkeys).
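A rough sketch of that mandated citation check, assuming each citation carries a resolvable URL and a verbatim quote (a real legal tool would query a verified case-law API rather than fetching arbitrary URLs):

    import requests

    def verify_citation(url: str, quoted_text: str) -> bool:
        """Deterministically confirm the source exists and contains the quote."""
        try:
            resp = requests.get(url, timeout=10)
            resp.raise_for_status()
        except requests.RequestException:
            return False  # unreachable or missing source: fail closed
        return quoted_text in resp.text

The point is that the check is dumb string matching against the real source, not another LLM vouching for the first one.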
> The correct solution was automated tooling that validates the URL<p>That's because this particular problem has a solution.<p>The issue here is that there's no such tool to automatically validate the output of the LLM - at least, not yet, and I don't see a theoretical way to do it either.<p>And you're framing the punishment as getting fired from the job - which is true, but the company making the mistake also gets punished (or should be, if regulatory capture hasn't happened...). This results in direct losses for the company and shareholders (in the form of a fine, recalls and/or replacements etc).
> The issue here is that there's no such tool to automatically validate the output of the LLM - at least, not yet, and I don't see a theoretical way to do it either.<p>Yeah, it's never going to be possible to validate everything automatically, but you may be able to make the tool valuable enough to justify using it if you can make errors easier to spot. In all cases you need to ask if there is actually any gain from using the LLM and checking it, or if doing so well enough actually takes enough time that it loses its value. My point is that just blaming the user isn't a good solution.<p>> And you're framing the punishment as getting fired from the job - which is true, but the company making the mistake also gets punished (or should be, if regulatory capture hasn't happened...). This results in direct losses for the company and shareholders (in the form of a fine, recalls and/or replacements etc).<p>Yes, regulation needs to be strong, because companies can and will accept these things as a cost of doing business, but people losing their jobs can be life-destroying. If companies are not going to give people the time and tools to check this stuff, then the buck should stop with them, not the employees they are forcing to take risks.
<i>The human is responsible.</i> That's the fix. I don't care if you got the results from an LLM or from reading cracks in the sidewalk; you are responsible for what you say, and especially for what you say professionally. I mean, that's almost the definition of a professional.<p>And if you can't play by those rules, then maybe you aren't a professional, even if you happened to sneak your way into a job where professionalism is expected.
This doesn't solve the problem, because companies <i>will</i> force people to use these tools and demand they work faster, eventually resulting in people slipping.<p>People will have to choose between being fired for being "too slow", or taking the risk they end up liable. Most people can't afford to just lose their job, and will end up being pressured into taking the risk, then the companies will liability-wash by giving them the responsibility.<p>You <i>need</i> regulation that ensures companies can't just push the risk onto employees who can be rotated out to take the blame for mistakes.
As someone who has done QA on white collar work it's tiring looking for little errors in work reports. Most people are not cut out for it.
Probably worth including a "bibliography" section of citations that can be automatically checked for existence, then.
Even disregarding self driving features, it seems like the smarter we make cars the dumber the drivers are. DRLs are great, until they allow you to drive around all night long with no tail lights and dim front lighting because you’re not paying enough attention to what’s actually turned on.
I'm continually amazed at how much faith people have in them. I guess since they can sound like people and output really authoritative and confident text it just overrides any skepticism subconsciously?
Much as I like them, I do frequently remind myself of two things:<p>1) <a href="https://en.wikipedia.org/wiki/Clever_Hans" rel="nofollow">https://en.wikipedia.org/wiki/Clever_Hans</a><p>2) <a href="https://archive.org/details/nextgen-issue-26" rel="nofollow">https://archive.org/details/nextgen-issue-26</a> as an example of how in the 90s we had rapid cycles of a new tech (3d graphics) astounding us with how realistic each new generation was compared to the previous one, and forgetting with each new (game engine) how we'd said the same and felt the same about (graphics) we now regarded as pathetic.<p>So yes, they do sound authoritative and confident in a way that "overrides any skepticism subconsciously", but you shouldn't be amazed; we've always been like this.
The advertising campaign is incredible.
Yes, just as with politicians. And LLMs have been thoroughly tuned to appear that way.
<a href="https://en.wikipedia.org/wiki/ELIZA_effect" rel="nofollow">https://en.wikipedia.org/wiki/ELIZA_effect</a>
It's mind boggling how much people claim to like LLMs, when you would never design any other piece of software to operate the way LLMs do. Designing a system that interacts with the user through natural text creates an awful experience. It slows down every interaction as you dig through all the prose to get to the key information. It turns every computer interaction into a school math word problem.
It doesn't matter anymore.<p>LLMs just revealed what a decadent society we have setup for ourselves worldwide.
It’s worse than that. We’re hearing about the lawyers and Ars Technica because the consequences are public and the errors are egregious.<p>It’s likely happening to everyone.
Just this week I tracked down the citations of a scientific paper (whose authors could very well be here) where 25% of the citations were made up and 50% of the remaining ones were wrong, taking ArXiv papers and citing them as belonging to (say) IJCLR.<p>It's not just lawyers.
This whole thing is silly; LLMs can automate reference validation.<p>If someone is a lawyer, accountant, doctor, teacher, surgeon, engineer etc, and is regurgitating answers that were pumped out with GPT-5-extra-low or whatever mediocre throttled model they are using, they should just be fired and de-credentialed. Right now this is easy.<p>The real problem is ahead: 99.999% of future content that exists will be made using generative AI. For many people using Facebook, Instagram, TikTok, or some other non-sequential, engagement-weighted feed, 50%+ of the content they consume today is fake. As that stuff spreads into modern culture it's going to be an endless battle to keep it out of stuff that should not be publishing fake content (e.g. the New York Times or Wall Street Journal; excluding scientific journals, who seem to have abandoned validation and basic statistics a long time ago.)<p>Much of the future value and profit margins might just be in valid data?
> Right now this is easy.<p>Easy? In the US you need house impeachment to fire a judge. In some countries judges are completely immune unless they are sentenced for crimes.
To fire a federal judge. Local judges, which are the vast majority, can be fired by their colleagues or replaced in elections.
Do you need impeachment to fire a lawyer, accountant, doctor, teacher, surgeon or engineer?
> This whole thing is silly, LLMs can automate reference validation.<p>Can they though with 100% accuracy and no hallucinations? Wouldn't you still need to validate that they validated correctly?
Do we see this a lot in the US? This seems to be more unique to India.
It’s happening A LOT in the US too. Mainstream media just doesn’t seem to find it that newsworthy.<p><a href="https://arstechnica.com/tech-policy/2026/02/randomly-quoting-ray-bradbury-did-not-save-lawyer-from-losing-case-over-ai-errors/" rel="nofollow">https://arstechnica.com/tech-policy/2026/02/randomly-quoting...</a>
From the article:<p>> In October, two federal judges in the US were called out for the use of AI tools which led to errors in their rulings. In June 2025, the High Court of England and Wales warned lawyers not to use AI-generated case material after a series of cases cited fictitious or partially made up rulings.
today: <a href="https://news.ycombinator.com/item?id=47231189">https://news.ycombinator.com/item?id=47231189</a>
What kind of AI is this that you constantly need a human to check its work? Do you think Jean-Luc Picard had to constantly check the output of the Enterprise computer? No he didn't. If AI is not better than humans, then what the heck is the point? You might as well just use humans.
The high court also advocated for the "exercise of actual intelligence over artificial intelligence". Hehe.
This is going to be a huge problem in every sector. I have been exploring solution in this space for fintech and so far what Resemble AI is doing [1] is probably the best way to defend.<p>The surface level for us is not just LLM generated text, it is also the combination of AI augmented audio (for incoming calls) and then for our own voice agents being able to protect and identify services cloning our own agent voices with watermarking.<p>It's not fun, as we are constantly catching up.<p>[1] <a href="https://www.resemble.ai/detect/" rel="nofollow">https://www.resemble.ai/detect/</a>
It's sad that it took the highest court in the country to point out the lack of professionalism and misconduct.<p>The judge took no personal responsibility.<p>> She told the court that this was her first time using an AI tool and she had believed the citations to be "genuine". She had no intention to misquote or misrepresent the rulings and that "the mistake occurred solely due to the reliance on an automatic source", the high court wrote.<p>She had one job. And that was to read the citations. Instead of owning up to the mistake of being lazy, all she wanted to talk about was "intentions".<p>The high court also took no responsibility.<p>> In its order, the high court said that "the citations may be non-existent, but if the learned trial court has considered the correct principles of law and its application to the facts of the case is also correct, mere mentioning of incorrect or non-existent rulings/citations in the order cannot be a ground to set aside the order".<p>This line of reasoning is questionable and an attempt to gaslight everyone. Judges cite other cases in their judgements. But if the junior judge had no clue that the references were fake, what correct principles was she applying?<p>At the end of the day maybe the judgement is correct, but this is overall bullshit.<p>Given that this is happening all over the world, people seem to have a convenient excuse - The AI made me do it.
There will be loads of papers and publications with fake citations. AI will be trained on these. In the end, we'll have more and more hallucinated information than true content on the internet.
This is a big problem in the US and UK too. Lawyers are not technical at all and they need a robust system of governance, since currently they're (directly editing, not even diffing) documents with a chatbot which makes these mistakes inevitable. See <a href="https://insights.doughtystreet.co.uk/post/102mi96/38-uk-cases-involving-hallucinations-ai-or-otherwise-judicial-caution-in-the-f" rel="nofollow">https://insights.doughtystreet.co.uk/post/102mi96/38-uk-case...</a>
> She had no intention to misquote or misrepresent the rulings and that "the mistake occurred solely due to the reliance on an automatic source"<p>Next: gunman pleads the death occurred solely due to reliance on an automatic weapon.
I feel like this points out a very general problem with the law: it generates a lot of boilerplate text. Lawyers don't really read it; they skim it for the relevant bits.<p>Obviously lawyers should not be cheating with AI, especially when they don't even check it. But it does sound to me as if this is an opportunity to re-factor the process. We're carrying forward some ideas originally implemented in Latin, and which can be dramatically simplified.<p>I'm not a lawyer; I know this only in passing. And I am aware that there are big differences between law and code. But every time I encounter the law, and hear about cases like this, what I see are vast oceans of text that can surely be made more rigorous. AI is not the problem; it's pointing out the opportunity.
> problem with the law: it generates a lot of boilerplate text<p>I think the problem fundamentally is that matters of law require thorough, precise language, and unambiguous context. If you remove "the boilerplate" then you introduce a vast gray area left to interpretation.<p>Usually attempts (by humans or computers) to "summarize" or frame things in "plain language" will apply a bias since it intentionally omits all the myriad context and legal/societal "gray areas" that will inform one perspective or another.<p><i>Legalese</i> exists the way it is because it is an attempt to remove <i>doubt</i>. And even then, doubt still creeps in.
This is only the case when you care more about the letter of the law than the spirit of the law, which is, I suppose, most of the world. It doesn't have to be this way, it's a choice that society has made.<p>When I bought my house, in an alternate universe the paperwork <i>could</i> have been one sheet of paper that said "[My name] purchases home at [address] from [Seller's name] for [price]." and we'd all rely on our shared understanding of what it means to buy something and shared cultural expectations around home ownership and commerce. But our society did not make that choice, we don't live in that universe, so I had to sign a 300 page stack of papers 30 times.
law texts feel like a layering problem, like just decoration around decoration to avoid breaking existing 'code' without ever simplifying it
Okay so let’s try simplifying it.<p>We’ll change the existing murder legislation to “Killing someone is a crime”. It’ll save us thousands of pages.<p>But does that mean a soldier shooting an enemy is a crime? What about shooting someone who is raping you? What if you shoot someone by mistake, thinking they’re going to kill you? What if you hit them with a car? What if you fail to provide safety equipment which eventually results in their accidental death?<p>Oopsie woopsie, I guess we need to add another thousand pages of exceptions back to our simplistic laws. It turns out people didn’t just write them for the fun of it.
Next-token prediction, and hallucination as a bug. This should be of deep concern to all frontier labs, who seem to think integrity and trust are optional when LLMs are used this way in the places where they matter most.
I wonder how many similar cases are happening unnoticed in the engineering or software development sector. It seems no one cares enough; we're just waiting for a disaster to happen before we start seeing some regulation preventing the use of AI in the engineering/coding industry.
In Australia, our universities are finding that a large proportion of Indian students have been using GenAI for cheating. Often they get away with it. I'm not saying that people other than Indian overseas students cheat, but it does seem more entrenched. I'd love to know why. It doesn't actually help in the long term!
I'm sure there's GenAI cheating in most communities. But the amount may vary based on the culture of learning.<p>Some people have the perspective that you're attending school in order to learn stuff, and the degree demonstrates you learned the stuff; some people have the perspective that you're attending school in order to get the degree and it doesn't matter so much how you check the boxes to get it.<p>This difference in perspective certainly didn't start with AI; it's been around for a long time. Some education cultures push more rote learning and some push more mastery of the subject. There are pros and cons, and pursuing rote learning doesn't preclude mastery, while mastery often requires some amount of rote learning.<p>When you transplant a contingent of students between philosophies, you get conflict where there are differences.
How unserious/serious are the universities? I've heard of diploma mills in Canada taking international students, letting them spend most of their time waiting tables at coffee shops, and awarding them MBAs so they can be full-time waiters and citizens.
In the United States, cheating via AI is now rampant regardless of ethnicity. I know little of Australian Universities but I would assume it’s similar over there.
>The number of international students studying in Australia totalled 833,041 for the January-October 2025 period<p>>The United States hosts the highest number of international students on record, with approximately 1.1 to 1.2 million<p>The US has 32% more students than Australia and 1121% more people. Imagine if the US took on 13 million foreign college students per year lol<p>It does help them in the long run, because it ensures they get to reside in Australia. After 4 years they get permanent residence rights and benefits, etc.
Indian students have embraced GenAI at a rate significantly higher than the global average, with nearly 90% of students in some surveys actively using these tools.<p>Government Policy and National Initiatives: The National Education Policy (NEP 2020) has shifted the focus toward digital literacy. The government has introduced AI as a skill subject for younger grades and launched programs like AI for All to promote nationwide awareness.
I'd imagine they are just worse at hiding it. GenAI use is rampant pretty much everywhere in the school systems of most countries.
I imagine even a slight impediment in terms of being able to parse and express yourself in a language that you don't know as well as your mother tongue makes LLM usage much more tantalizing.<p>And not knowing the language quite as well as native speakers would also make you more likely to be discovered as having used an LLM to do coursework.
Citation needed. I have seen these kinds of assertion all my life without any evidence to back them up. For example, when I moved to the US, I was told, again without any evidence, Chinese students cheat a lot. It's always a couple of faculty who extrapolate their experiences with a few students and then slap racial labels on the entire student body.
They are not there for the knowledge - knowledge is cheap and abundant. They are there for the credentials and subsequent potential access to offshore jobs.
Hate to break it to you but <i>everyone</i> is cheating using AI.
Andhra is like the Silicon Valley of India. Wouldn't blame the poor judge.
The scary thing is that the Indian judiciary is infamous for being incapable of tolerating any kind of criticism and for not hesitating to put people in jail for "contempt" just for calling out corruption. Imagine the official courts of 1.4B+ people being run by such braindead narcissists, now freed from having to even pretend to do their jobs as they offload everything to AI tools.
> The scary thing is that Indian juduciary is infamous for being incapable of tolerating any kind of criticism against it and not hesitating to put people in jail for "contempt" for just calling out corruption.<p>From [0]:<p>"India's Supreme Court has banned a school textbook after a chapter in it made a reference to corruption in the judiciary.<p>The revised social science book was published by the National Council of Educational Research and Training (NCERT), which designs the syllabus and textbooks for millions of schoolchildren in the country.<p>On Wednesday, after Chief Justice Surya Kant criticised the book, saying it could damage the reputation of the judiciary, NCERT apologised and withdrew it from distribution.<p>Now the court has ordered a complete halt on the book's publication, saying its contents were "extremely contemptuous" and "reckless".<p>"A complete blanket ban is hereby imposed on any further publication, reprinting or digital dissemination of the book," the court said on Thursday, according to legal news website LiveLaw.<p>The judges also issued notices to the top bureaucrat in the school education department and the NCERT director, asking them to explain why they should not be held in contempt of court for including the "offending chapter".<p>[0] <a href="https://www.bbc.com/news/articles/c627l7zexr8o" rel="nofollow">https://www.bbc.com/news/articles/c627l7zexr8o</a>
You are way off-base here; see - <a href="https://news.ycombinator.com/item?id=47235274">https://news.ycombinator.com/item?id=47235274</a>
You had two choices: 1) read the article or 2) be racist about Indians. You chose 2.<p>From the article:<p>> In October, two federal judges in the US were called out for the use of AI tools which led to errors in their rulings. In June 2025, the High Court of England and Wales warned lawyers not to use AI-generated case material after a series of cases cited fictitious or partially made up rulings.
You are right this would never happen in an advanced country like the USA, and certainly not in a top Federal court<p><a href="https://www.reuters.com/sustainability/society-equity/two-federal-judges-say-use-ai-led-errors-us-court-rulings-2025-10-23/" rel="nofollow">https://www.reuters.com/sustainability/society-equity/two-fe...</a>
One should also consider that even with the fake hallucinated AI citations, the productivity and correctness of the work produced by the culprit may (in general) still have been of higher quality than before AI, regardless of the failures.
Hard to believe, when this judge apparently thought that outsourcing their extremely confidential, sensitive, and important work to a known unreliable tool was a good idea. And then further thought that they apparently did not even need to check the results.<p>Sounds like extreme incompetence or laziness.
The vast majority of court cases lead to dismissal.<p>Why not use AI to adjudicate cases? If the result is dismissal, dismissal it is.<p>If not, then move to a proper court.<p>This way the backlog of cases will significantly drop, and we will work only on cases where there is enough meat to lead to a conviction.
> Senior judges at the Supreme Court in Delhi have threatened consequences over the use of AI<p>Setting AI aside for a moment, this reflects a broader issue in India and elsewhere. When institutions respond to new technologies with anger or threats rather than systemic thinking, it signals a deeper problem.<p>The real challenge is not AI itself, but how complex systems adapt to change. Instead of reacting defensively, institutions should anticipate second-order effects, build regulatory capacity, and treat this as a governance and systems problem.<p>Mature institutions approach disruption with foresight, incentives, and feedback loops, not emotions. Without that shift, they risk reinforcing outdated hierarchies rather than serving the public effectively.
> Setting AI aside for a moment, this reflects a broader issue in India and elsewhere. When institutions respond to new technologies with anger or threats rather than systemic thinking, it signals a deeper problem.<p>No, especially in this case, when the first appeal to the high court resulted in the high court brushing it off as if nothing had happened.<p>It was a reprimand to two institutions (the trial court and the high court): they have a job to do and they can't shirk that responsibility.
This comment sounds AI-generated.
No, you are reading it wrong; The Indian Supreme Court is doing the right thing.<p>The lower courts in India are all overloaded with pending cases (i.e. not enough judges) and so the incentives for both judges and lawyers to "outsource" to AI is very high. This needs to be done with caution and that is what the supreme court said, viz;<p><i>The Supreme Court called the case a matter of "institutional concern" and said fake AI-generated judgements had "a direct bearing on integrity of adjudicatory process".<p>...<p>The defendants challenged the order in the state's high court, pointing out that the cited orders were fake. The high court acknowledged this, but accepted that the junior civil judge had made the error in "good faith" and went on to agree with the trial court's decision anyway.<p>In its order, the high court said that "the citations may be non-existent, but if the learned trial court has considered the correct principles of law and its application to the facts of the case is also correct, mere mentioning of incorrect or non-existent rulings/citations in the order cannot be a ground to set aside the order".<p>The high court had also sought a report from the junior judge who had used the AI-generated rulings. She told the court that this was her first time using an AI tool and she had believed the citations to be "genuine". She had no intention to misquote or misrepresent the rulings and that "the mistake occurred solely due to the reliance on an automatic source", the high court wrote.<p>The high court also advocated for the "exercise of actual intelligence over artificial intelligence".<p>Following this, the defendants appealed again, taking the matter to the Supreme Court, which was less forgiving about the impact of AI.<p>Coming down sternly against the fake judgements, the top court last Friday stayed the lower court's order on the property dispute. It said the use of AI while making judgements was not simply "an error in decision making" but an act of "misconduct".<p>"This case assumes considerable institutional concern, not because of the decision that was taken on the merits of the case, but about the process of adjudication and determination," the top court said.</i><p>PS: To get an idea of how overloaded the Indian Judicial System is; this happened in a recent case in Allahabad High Court - <i>The order then took an unusual turn. “Since I am feeling hungry, tired and physically incapacitated to dictate the judgment, the judgment is reserved,” the judge recorded.</i> - He had been hearing more than 30 cases on that day - <a href="https://www.hindustantimes.com/india-news/hungry-tired-allahabad-hc-judge-reserves-order-after-3-hour-overtime-101772217165006.html" rel="nofollow">https://www.hindustantimes.com/india-news/hungry-tired-allah...</a>
The pattern here isn't really about individual negligence — it's a systems design problem. We keep deploying LLMs into workflows where the failure mode is "plausible-sounding fabrication" and the downstream consequence is legal or institutional harm, then blaming the end user for not catching it.<p>The better question is why these tools are being integrated into judicial workflows without mandatory citation verification layers. The EU AI Act classifies judicial AI as high-risk and requires human oversight mechanisms specifically for this reason. India's Digital Personal Data Protection Act (2023) doesn't yet have equivalent provisions for AI in courts, which is the actual gap.<p>From an engineering standpoint, the fix is straightforward: any LLM-assisted legal research tool should require grounded retrieval (RAG against verified case law databases) with mandatory source links that the user must click through before citing. The fact that most legal AI tools still don't enforce this is a product design failure, not a user education problem.
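As a sketch of that enforcement layer, assuming grounded retrieval returns a set of verified case IDs (all names here are illustrative, not any real product's API):

    def filter_citations(model_citations: list[str],
                         retrieved_case_ids: set[str]) -> tuple[list[str], list[str]]:
        """Keep only citations grounded in the verified retrieval set."""
        verified = [c for c in model_citations if c in retrieved_case_ids]
        rejected = [c for c in model_citations if c not in retrieved_case_ids]
        return verified, rejected

Anything in the rejected list never reaches the user as citable authority; anything verified links back to a source the user must open before citing it.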