See the public phab ticket: <a href="https://phabricator.wikimedia.org/T419143" rel="nofollow">https://phabricator.wikimedia.org/T419143</a><p>In short, a Wikimedia Foundation account was doing some sort of test which involved loading a large number of user scripts. They decided to just start loading random user scripts, instead of creating some just for this test.<p>The user who ran this test is a Staff Security Engineer at WMF, and naturally they decided to do this test under their highly-privileged Wikimedia Foundation staff account, which has permissions to edit the global CSS and JS that runs on every page.<p>One of those random scripts was a 2 year old malicious script from ruwiki. This script injects itself in the global Javascript on every page, and then in the userscripts of any user that runs into it, so it started spreading and doing damage really fast. This triggered tons of alerts, until the decision was made to turn the Wiki read-only.
This is a pretty egregious failure for a staff security engineer
It's a pretty egregious failure for the org because it controlled the conditions for it to happen.<p>The security guy is just the patsy because he actioned it.<p>They have obviously done this a million times before and now they got burned.
Pretty much the definition of a “career limiting event”
It's either a Career Limiting Event, or a Career Learning Event.<p>In the case of a Learning event, you keep your job, and take the time to make the environment more resilient to this kind of issue.<p>In the case of a Limiting event, you lose your job, get hired somewhere else for significantly better pay, and make the new environment more resilient to this kind of issue.<p>Hopefully the Wikimedia Foundation treats this as a Learning event.
In the average real world, the staff engineer learns nothing, regardless of whether they lose or keep their job. Some time down the line, they make other careless mistakes. Eventually they retire, having learned nothing.<p>This is more common than you'd think.
I was able to run some stats at scale on this, and people who make mistakes are more likely to make more mistakes, not less. Essentially, we were sampling from a distribution of propensity for mistakes, and that effect dominated any sign of learning from mistakes. Someone who repeatedly makes mistakes is not repeatedly learning; they are accident prone.
Can you elaborate? What scale? What kind of mistakes? This sounds quite interesting.
What if you derive a hard rule from these statistics that « you must fire anyone on their first error »? Won’t your company be empty in a rather short timeframe?
[or will be composed only of doingNothing people?]
Nobody is going to know who did this, so probably not career limiting in any major way.
They'll be fine; recruiters don't look this stuff up, and background checks generally only care about illegal shit.
Didn't realise this was some historic evil script and not some active attacker who could change tack at any moment.<p>That makes the fix pretty easy. Write a regex to detect the evil script, and revert every page to a historic version without the script.
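That detect-and-revert pass could look something like this. To be clear, the signature regex below is a made-up placeholder based on the basemetrika.ru detail mentioned elsewhere in this thread, not the actual worm's code, and the revision shape is invented for illustration:

```javascript
// Hypothetical cleanup pass: scan a page's revision history for a known
// worm signature and pick the newest revision that predates the infection.
// The signature regex is a placeholder, not the real worm's code.
const WORM_SIGNATURE = /mw\.loader\.load\([^)]*basemetrika\.ru/;

function isInfected(wikitext) {
  return WORM_SIGNATURE.test(wikitext);
}

// revisions: newest first, each { id, content }
function lastCleanRevision(revisions) {
  return revisions.find(rev => !isInfected(rev.content)) || null;
}
```

If every stored revision of a page is infected (i.e. the page was created by the worm), `lastCleanRevision` returns `null`, which is the "just delete it" case.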
Letting ancient evil code run? Have we learned nothing from <i>A Fire Upon the Deep</i>?!
"It was really just humans playing with an old library. It should be safe, using their own automation, clean and benign.<p>This library wasn't a living creature, or even possessed of automation (which here might mean something more, far more, than human)."
Link to the Prologue of <i>Fire Upon the Deep</i>: <a href="https://www.baen.com/Chapters/-0812515285/A_Fire_Upon_the_Deep.htm" rel="nofollow">https://www.baen.com/Chapters/-0812515285/A_Fire_Upon_the_De...</a><p>It's very short and from one of my favorite books. Increasingly relevant.
Legitimately listening to this book for the first time after a coworker recommended it. It's rapidly becoming one of my favorite books that balances the truly alien with the familiar just right.<p>Not so ironically, it came up when we were discussing "software archeology".
\(^O^)/ zones of thought mentioned \(^O^)/
I've only just heard of it. But, I already knew to not run random scripts under a privileged account. And thank you for the book suggestion - I'm into those kinds of tales.
I love that book
Are you sure?
Are you $150 million ARR sure?
Are you $150 million ARR, you'd really like to keep your job, you're not going to accidentally leave a hole or blow up something else, sure?<p>I agree, mostly, but I'm also really glad I don't have to put out this fire. Cheering them on from the sidelines, though!
Or just restore from backup across the board. Assuming they do their backups well, this shouldn't be too hard (especially since it's currently in read-only mode, which means no new updates).
True but it does say something that such a script was able to lie dormant for so long.
300 million dollar organization btw
> One of those random scripts was a 2 year old malicious script from ruwiki. This script injects itself in the global Javascript on every page, and then in the userscripts of any user that runs into it, so it started spreading and doing damage really fast.<p>So, like the Samy worm? (<a href="https://en.wikipedia.org/wiki/Samy_%28computer_worm%29" rel="nofollow">https://en.wikipedia.org/wiki/Samy_%28computer_worm%29</a>)
I'm guessing,
"1> Hey Claude, your script ran this malicious script!"<p>"Claude> Yes, you're absolutely right! I'm sorry!"
wait, as a Wikipedia user you can just put random JS into some settings and it will just... run? privileged?<p>this is both really cool and really really insane
It's a mediawiki feature: there's a set of pages that get treated as JS/CSS and shown for either all users or specifically you. You <i>do</i> need to be an admin to edit the ones that get shown to all users.<p><a href="https://www.mediawiki.org/wiki/Manual:Interface/JavaScript" rel="nofollow">https://www.mediawiki.org/wiki/Manual:Interface/JavaScript</a>
Yes, you can have your own JS/CSS that’s injected into every page. This is pretty useful for widgets, editing tools, or to customize the website’s appearance.
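For a sense of what these look like, here's a sketch of a harmless per-user script of the kind MediaWiki serves from your common.js page. `mw.util.addPortletLink` is the real MediaWiki API; the `typeof mw` guard and the `purgeUrl` helper are my own additions so the logic can run outside a wiki too:

```javascript
// Minimal per-user script sketch, of the kind MediaWiki loads from
// User:YourName/common.js and runs on every page you view.
function purgeUrl(pageName) {
  // action=purge asks MediaWiki to rebuild the cached version of a page
  return '/w/index.php?title=' + encodeURIComponent(pageName) + '&action=purge';
}

// Only touch the UI when actually running inside MediaWiki
// (mw is MediaWiki's client-side global object)
if (typeof mw !== 'undefined') {
  // Adds a "Purge" link to the toolbox sidebar portlet
  mw.util.addPortletLink('p-tb', purgeUrl(mw.config.get('wgPageName')), 'Purge');
}
```

The point of the incident is that this same mechanism runs with your full session, so a script can do anything you can do — including editing other scripts.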
On one hand, I was about to get irrationally angry someone was attacking Wikipedia, so I'm a bit relieved<p>On the other hand,<p>>a Staff Security Engineer at WMF, and naturally they decided to do this test under their highly-privileged Wikimedia Foundation staff account<p><i>seriously?</i>
To paraphrase Bush,<p>> our enemies are innovative and resourceful, and so are we. They never stop thinking about new ways to harm our site and our users, and neither do we.
Wow. This worm is fascinating. It seems to do the following:<p>- Injects itself into the MediaWiki:Common.js page to persist globally, and into the user's common.js page to do the same as a fallback<p>- Uses jQuery to hide UI elements that would reveal the infection<p>- Vandalizes 20 random articles with a 5000px-wide image and another XSS script from basemetrika.ru<p>- If an admin is infected, it will use the Special:Nuke page to delete 3 random articles from the global namespace, AND use Special:Random with action=delete to delete another 20 random articles<p>EDIT! The Special:Nuke behavior is really weird. It gets a default list of articles to nuke from the search field, which could be any group of articles, and rubber-stamps nuking them. It does this three times in a row.
There doesn’t seem to be an ulterior motive beyond “Muahaha, see the trouble I can cause!”
A classical virus, from the good old days. None of this botnet/bitcoin mining in the background nonsense.
No one actually knows what the payload from basemetrika.ru contains, though. So it's possible it was originally intended to be more damaging. But no matter what it would have caught attention super fast, so there's probably an upper limit to how sophisticated it could have been.
As someone on the Wikipediocracy forums pointed out, basemetrika.ru does not exist. I get an NXDomain response trying to resolve it. The plot thickens.
> Vandalizes 20 random articles with a 5000px wide image and another XSS script from basemetrika.ru<p>Note that while this looks like it's trying to trigger an XSS, what it's doing is ineffective, so basemetrika.ru would never get loaded (even ignoring that the domain doesn't exist).
Wouldn't be surprised if elaborate worms like this are AI-designed
I wouldn't be surprised either. But the original formatting of the worm makes me think it was human written, or maybe AI assisted, but not 100% AI. It has a lot of unusual stylistic choices that I don't believe an AI would intentionally output.
> It has a lot of unusual stylistic choices that I don't believe an AI would intentionally output.<p>Indeed. One of those unusual choices is that it uses jQuery. Gotta have IE6 compatibility in your worm!<p>I'm not sure what to make of `Number("20")` in the source code. I would think it's some way to get around some filter intended to discourage CPU-intensive looping, but I don't think user scripts have any form of automated moderation, and if that were the case it doesn't make sense that they would allow a `for` loop in the first place.
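For what it's worth, `Number("20")` is semantically identical to the literal `20`, so if it was meant to dodge some filter it wasn't buying anything — it reads more like obfuscation for its own sake or a copy-paste artifact:

```javascript
// Number("20") just coerces the string "20" into the number 20 —
// a roundabout way of writing a plain numeric literal.
const limit = Number("20");
const literal = 20;
```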
jQuery is still sooo much easier to use than React and whatever other messes modern frameworks have created. As a bonus, you don't have to npm build your JS project, you just double click and it opens and works without any build step, which is how interpreted languages were intended to be.
I would. AI designed software in general does not include novel ideas. And this is the kind of novel software AI is not great at, because there's not much training data.<p>Of course it's very possible someone wrote it with AI help. But almost no chance it was designed by AI.
Turns out it's a pretty rudimentary XSS worm from 2023. If all you have is a hammer, everything looks like a nail; if all you have is an LLM, everything looks like slop?
I mean....elaborate is a stretch.
> Cleaning this up is going to be an absolute forensic nightmare for the Wikimedia team since the database history itself is the active distribution vector.<p>Well, the worm didn't get root -- so if Wikimedia takes snapshots or made a recent backup, probably not so much of a nightmare? Then the diffs can tell a fairly detailed forensic story, including indicators of motive.<p>Snapshotting is a very low-overhead operation, so you can make snapshots very frequently and then expire them after some time.
Even if they reset to several days ago and lose, say, thousands of edits, even tens of thousands of minor edits, they're still in a pretty good place. Losing a few days of edits is less-than-ideal but very tolerable for Wikipedia as a whole
At $work we're hosting business knowledge databases. Interestingly enough, if you need to revert a day or two of edits, you're better off doing it ASAP than postponing and mulling it over. Especially if you can keep a dump or an export around.<p>People usually remember what they changed yesterday and have uploaded files and such still around. It's not great, but quite possible. Maybe you need to pull a few content articles out of the broken state if they ask. No huge deal.<p>If you decide to roll back after a week or so, editors get really annoyed, because now they are usually forced to backtrack and reconcile the state of the knowledge base. Maybe you need both a current and a rolled-back system, it may have regulatory implications, and it's a huge pain in the neck.
Nah, you can snapshot every 15 minutes. The snapshot interval depends on the frequency of changes and on storage capacity, but it's up to them how to allocate that capacity... it's definitely doable, and there are real reasons for doing so. You can collapse the deltas between snapshots after some time to make them last longer. I'd be surprised if they don't do that.<p>As an aside, snapshotting would have prevented a good deal of the horror stories shared by people who give AI access to the FS. Well, as long as you don't give it root.......
><i>Nah, you can snapshot every 15 minutes.</i><p>obviously you <i>can</i>. but, what is the <i>actual</i> snapshot frequency? like, what is the timestamp of the last known good snapshot? that is what matters.<p>in any case, the comment you are replying to is a hypothetical, which correctly points out that even a day or two of lost edits is fine (not ideal, but fine). your reply doesn't engage with their comment at all.
> the comment you are replying to is a hypothetical, which correctly points out that even a day or two of lost edits is fine (not ideal, but fine). your reply doesnt engage with their comment at all.<p>I did engage, by pointing out that it wasn't relevant nor a realistic scenario for a competent sysadmin. (Did you read the OP?) That's a /you/ problem if you rely on infrequent backups, especially for a service with so much flux.<p>> what is the actual snapshot frequency? like, what is the timestamp of the last known good snapshot?<p>? Why would I know what their internal operations are?
><i>I did engage, by pointing out that it wasn't relevant nor a realistic scenario for a competent sysadmin.</i><p>><i>Why would I know what their internal operations are?</i><p>i mean... you must, right? you know that once-a-day snapshots is not relevant to this specific incident. you know that their sysadmins are apparently competent. i just assumed you must have some sort of insider information to be so confident.
Nowadays I refuse to do any serious work that isn't in source control anywhere besides my NAS that takes copy-on-write snapshots every 15 minutes. It has saved my butt more times than I can count.
Yeah, same here. Earlier I had a sync error that corrupted my .git, somehow. No problem; I go back 15 minutes and copy the working version.<p>Feels good to pat oneself on the back. Mine is sore, though. My E&O/cyber insurance likes me.
The problem isn't the granularity of the backup. Since the worm silently nukes pages, it's virtually impossible to reconcile the state before the attack with the current state, so you have to just forfeit any changes made since then and ask the contributors to do the legwork of reapplying the correct changes.
Nothing was rolled back in the db sense; I think people just used normal wiki revert tools.<p>It also never affected Wikipedia, just the smaller meta site (used for interproject coordination).
A theory on phab: "Some investigation was made in the Russian Wikipedia Discord chat, maybe it will be useful.<p>1. In 2023, vandal attacks were made against two Russian-language alternative wiki projects, Wikireality and Cyclopedia. Here <a href="https://wikireality.ru/wiki/РАОрг" rel="nofollow">https://wikireality.ru/wiki/РАОрг</a> is an article about the organizers of these attacks.<p>2. In 2024, ruwiki user Ololoshka562 created a page <a href="https://ru.wikipedia.org/wiki/user:Ololoshka562/test.js" rel="nofollow">https://ru.wikipedia.org/wiki/user:Ololoshka562/test.js</a> containing the script used in these attacks. It was inactive for the next 1.5 years.<p>3. Today, sbassett massively loaded other users' scripts into his global.js on meta, maybe for testing global API limits: <a href="https://meta.wikimedia.org/wiki/Special:Contributions/SBassett_(WMF)" rel="nofollow">https://meta.wikimedia.org/wiki/Special:Contributions/SBasse...</a> . In one edit, he loaded Ololoshka's script: <a href="https://meta.wikimedia.org/w/index.php?diff=prev&oldid=30167202" rel="nofollow">https://meta.wikimedia.org/w/index.php?diff=prev&oldid=30167...</a> and ran it."
Woah this looks like an old school XSS worm <a href="https://meta.wikimedia.org/wiki/Special:RecentChanges?hidebots=1&translations=filter&hidecategorization=1&hideWikibase=1&limit=100&days=30&safemode=1&urlversion=2" rel="nofollow">https://meta.wikimedia.org/wiki/Special:RecentChanges?hidebo...</a><p>I’ve always thought the fact that MediaWiki sometimes lets editors embed JavaScript could be dangerous.
Also, I’m surprised an XSS attack like this hasn’t yet been used to harvest credentials like passwords through browser autofill[0].<p>It seems like the worm code/the replicated code only really attacks stuff on-site. But leaking credentials (and obviously people reuse passwords across sites) could be sooo much worse.<p>[0] <a href="https://varun.ch/posts/autofill/" rel="nofollow">https://varun.ch/posts/autofill/</a>
I think autofill-based credential harvesting is harder than it sounds because browsers and password managers treat saved credentials as a separate trust boundary, and every vendor implements different heuristics. The tricky part is getting autofill to fire without a real user gesture and then exfiltrating values, since many browsers require exact form attributes or a user activation and several managers ignore synthetic events.<p>If an attacker wanted passwords en masse they could inject fake login forms and try to simulate focus and typing, but that chain is brittle across browsers, easy to detect and far lower yield than stealing session tokens or planting persistent XSS. Defenders should assume autofill will be targeted and raise the bar with HttpOnly cookies, SameSite=strict where practical, multifactor auth, strict Content Security Policy plus Subresource Integrity, and client side detection that reports unexpected DOM mutations.
Chrome doesn't actually autofill before you interact. It only displays what it <i>would</i> fill in at the same location visually.
Time to add 2FA...
A comment from my wiki-editor friend:<p><pre><code> "The incident appears to have been a cross-site scripting hack. The origin of the malicious scripts was a userpage on the Russian Wikipedia. The script contained Russian-language text.
During the shutdown, users monitoring [https://meta.wikimedia.org/wiki/special:RecentChanges Recent changes page on Meta] could view WMF operators manually reverting what appeared to be a worm propagated in common.js
Hopefully this means they won't have to do a database rollback, i.e. no lost edits. "
</code></pre>
Interesting to note how trivial it is today to fake something as coming "from the Russians".
Why do you think it was faked? It is well-known Russian tech (woodpecker); the earliest version I can find now was created in 2013 (but I personally saw it in 2007). It is a well-known Russian sword of Damocles hanging over misconfigured MediaWiki websites.
Additional context:<p><a href="https://wikipediocracy.com/forum/viewtopic.php?f=8&t=14555" rel="nofollow">https://wikipediocracy.com/forum/viewtopic.php?f=8&t=14555</a><p><a href="https://en.wikipedia.org/wiki/Wikipedia:Village_pump_(technical)#Meta-Wiki_compromised" rel="nofollow">https://en.wikipedia.org/wiki/Wikipedia:Village_pump_(techni...</a><p><a href="https://old.reddit.com/r/wikipedia/comments/1rllcdg/megathread_wikimedia_wikis_locked_accounts/" rel="nofollow">https://old.reddit.com/r/wikipedia/comments/1rllcdg/megathre...</a><p>Apparent JS worm payload: <a href="https://ru.wikipedia.org/w/index.php?title=%D0%A3%D1%87%D0%B0%D1%81%D1%82%D0%BD%D0%B8%D0%BA:Ololoshka562/test.js&oldid=136952558" rel="nofollow">https://ru.wikipedia.org/w/index.php?title=%D0%A3%D1%87%D0%B...</a>
Check <a href="https://web.archive.org/web/20260305155250/https://ru.wikipedia.org/wiki/%D0%A3%D1%87%D0%B0%D1%81%D1%82%D0%BD%D0%B8%D0%BA:Ololoshka562/test.js" rel="nofollow">https://web.archive.org/web/20260305155250/https://ru.wikipe...</a> for the payload (safe to view)
Thanks - we've added the first 3 links to the toptext. Not sure about the 4th.
Wikipediocracy link gives "not authorized".
This was only a matter of time.<p>The Wikipedia community takes a cavalier attitude towards security. Any user with "interface administrator" status can change global JavaScript or CSS for all users on a given Wiki with no review. They added mandatory 2FA only a few years ago...<p>Prior to this, <i>any</i> admin had that ability until it was taken away due to English Wikipedia admins reverting Wikimedia changes to site presentation (Mediaviewer).<p>But that's not all. Most "power users" and admins install "user scripts", which are unsandboxed JavaScript/CSS gadgets that can completely change the operation of the site. Those user scripts are often maintained by long abandoned user accounts with no 2 factor authentication.<p>Based on the fact user scripts are globally disabled now I'm guessing this was a vector.<p>The Wikimedia foundation knows this is a security nightmare. I've certainly complained about this when I was an editor.<p>But most editors that use the website are not professional developers and view attempts to lock down scripting as a power grab by the Wikimedia Foundation.
Maybe somewhat unrelated, but I'm reminded of the fact that people have deleted the main page on a few occasions: <a href="https://en.wikipedia.org/wiki/Wikipedia:Don%27t_delete_the_main_page" rel="nofollow">https://en.wikipedia.org/wiki/Wikipedia:Don%27t_delete_the_m...</a>
> Any user with "interface administrator" status can change global JavaScript or CSS for all users on a given Wiki with no review.<p>True, but there aren't very many interface administrators. It looks like there are only 137 right now [0], which I agree is probably more than there should be, but that's still a relatively small number compared to the total number of active users. But there are lots of bots/duplicates in that list too, so the real number is likely quite a bit smaller. Plus, most of the users in that list are employed by Wikimedia, which presumably means that they're fairly well vetted.<p>[0]: <a href="https://en.wikipedia.org/w/api.php?action=query&format=json&list=allusers%7Cglobalallusers&formatversion=2&aurights=editsitejs&aulimit=max&agugroup=global-interface-editor%7Cstaff%7Csteward%7Csysadmin&agulimit=max" rel="nofollow">https://en.wikipedia.org/w/api.php?action=query&format=json&...</a>
There shouldn't be any interface admins as such. There should be an enforced review process for changes to global JavaScript so stuff like this can't happen.<p>I'm sure there are Google engineers who can push changes to prod and bypass CI but that isn't a normal way to handle infra.
There are 15 interface admins as per these links<p><a href="https://en.wikipedia.org/wiki/Wikipedia:Interface_administrators" rel="nofollow">https://en.wikipedia.org/wiki/Wikipedia:Interface_administra...</a><p><a href="https://en.wikipedia.org/wiki/Special:ListUsers/interface-admin" rel="nofollow">https://en.wikipedia.org/wiki/Special:ListUsers/interface-ad...</a>
Those are the English Wikipedia-only users, but you also need to include the "global" users (which I think were the source of this specific compromise?). Search this page [0] for "editsitejs" to see the lists of global users with this permission.<p>[0]: <a href="https://en.wikipedia.org/wiki/Special:GlobalGroupPermissions" rel="nofollow">https://en.wikipedia.org/wiki/Special:GlobalGroupPermissions</a>
Seems like a good time to donate one's resources to fix it. The internet is super hostile these days. If Wikipedia falls... well...
It's a political issue. Editors are unwilling or unable to contribute to the development of the features they need to edit.<p>Unfortunately, Wikipedia runs on insecure user scripts created by volunteers who tend to be under the age of 18.<p>There might be more editors editing to boost their résumés if editing Wikipedia under your real name didn't invite endless harassment.
They have 100s of millions USD, they will be fine: <a href="https://upload.wikimedia.org/wikipedia/foundation/3/3f/Wikimedia_Foundation_FY_24-25_Audit_Report.pdf" rel="nofollow">https://upload.wikimedia.org/wikipedia/foundation/3/3f/Wikim...</a> (page 5-7).
Wikipedia doesn't even spend its donations on Wikipedia anymore.
This sounds more like a political issue. Can't buy your way out of that.
My understanding is that Wikipedia receives more donations than they need, surely they have the resources to fix it themselves?
<p><pre><code> > Based on the fact user scripts are globally disabled now I'm guessing this was a vector.
</code></pre>
Disabled at which level?<p>Browsers still allow for user scripts via tools like TamperMonkey and GreaseMonkey, and that's not enforceable (and arguably, not even trivially <i>visible</i>) to sites, including Wikipedia.<p>As I say that out loud, I figure there's a separate ecosystem of Wikipedia-specific user scripts, but arguably the same problem exists.
Yeah, wikipedia has its own user script system, and that was what was disabled.
The sitewide JavaScript/CSS is an editable Wiki page.<p>You can also upload scripts to be shared and executed by other users.
This is apparently not done browser-side but server-side.<p>As in, users can upload whatever they wish and it will be shown to them and run, as JS, fully privileged and all.
For reference<p>>There are currently 15 interface administrators (including two bots).<p><a href="https://en.wikipedia.org/wiki/Wikipedia:Interface_administrators" rel="nofollow">https://en.wikipedia.org/wiki/Wikipedia:Interface_administra...</a>
Most admins on Wikipedia are competent in areas outside of webdev and security.
Wikipedia admins are not IT admins, they're more like forum moderators or admins on a free phpBB 2 hosting service in 2005. They don't have "admin" access to backend systems. Those are the WMF sysadmins.
I've never understood why client-side execution is so heavy in modern web pages. Theoretically, the costs to execute it are marginal, but in practice, if I'm browsing a web page from a battery-powered device, all that compute power draining the battery not only affects how long I can use the device between charges, but also adds wear to the battery, so I'll have to replace it sooner. Also, a lot of web pages are downright slow, because my phone can only perform tens of billions of operations per second, which isn't enough to responsively arrange text and images (which are composited by dedicated hardware acceleration) through all of the client-side bloat on many modern web pages. If there were that much bloat on the server side, the web server would run out of resources with even moderate usage.<p>There's also a lot of client-side authentication, even with financial transactions, e.g. with iOS and Android locally verifying a user's password, or worse yet a PIN or biometric information, then sending approval to the server. Granted, authentication of any kind is optional for credit card transactions in the US, so all the rest is security theater, but if it did matter, it would be the worst way to do it.
Fyi they released an official statement: <a href="https://meta.wikimedia.org/wiki/Wikimedia_Foundation/Product_and_Technology/Product_Safety_and_Integrity/March_2026_User_Script_Incident" rel="nofollow">https://meta.wikimedia.org/wiki/Wikimedia_Foundation/Product...</a>
I completely understand marking the software that controls drinking water as critical infrastructure, but at some point a state-based cyber attack that just wipes Wikipedia off the net is deeply damaging to our modern society’s ability to agree on common facts.<p>Just now I thought, “if Wikipedia vanished, what would it mean?” It’s not on the level of safe drinking water, but it is <i>a</i> level.
> if Wikipedia vanished what would it mean …<p>That someone would need to restore some backups, and in the meantime, use mirrors.<p>Seriously, not that big of a deal. I don't know how many copies of Wikipedia are lying around but considering that archives are free to download, I guess a lot. And if you count text-only versions of the English Wikipedia without history and talk pages, it is literally everywhere as it is a common dataset for natural language processing tasks. It is likely to be the most resilient piece of data of that scale in existence today.<p>The only difficulty in the worst case scenario would be rebuilding a new central location and restarting the machinery with trusted admins, editors, etc... Any of the tech giants could probably make a Wikipedia replacement in days, with all data restored, but it won't be Wikipedia.
You can download the entirety of wikipedia and store it in your own offline immutable backup.
The dump of English Wikipedia is 26 GB compressed and completely usable in that compressed format plus a small index file.<p>That's small enough to live on most people's phones. It's small enough to fit on a single Blu-ray. Maybe Wikipedia should fund some mass pressings.<p>What you do not get, however, is any media. No sounds, images, videos, drawings, examples, 3D artifacts, etc etc etc. This is a huge loss on many many many topics.
What you're suggesting is literally impossible. There are plenty of mirrors and random people that download the thing in its entirety. The entire planet would have to be nuked for that to be possible.
Don't worry, I personally have an offline backup of the English Wikipedia on my phone.
All persistent data should have backup.<p>It's not a high bar.
There are so many mirrors anyway, and it's trivial to get a local copy. What is much more concerning is government censorship and age verification/digital ID laws, where what articles you read becomes part of the government record the police see when they pull you over.
> but at some point a state based cyber attack that just wipes wikipedia off the net is deeply damaging to our modern society’s ability to agree on common facts<p>Haven't we hit that point already with bad faith (and potentially government-run) coordinated editing and voting campaigns, as both Wales and Sanger have been pointing out for a while now?<p>See, for example,<p>* Sanger: <a href="https://en.wikipedia.org/wiki/User:Larry_Sanger/Nine_Theses" rel="nofollow">https://en.wikipedia.org/wiki/User:Larry_Sanger/Nine_Theses</a><p>* Wales: <a href="https://en.wikipedia.org/wiki/Talk:Gaza_genocide/Archive_22#Statement_from_Jimbo_Wales" rel="nofollow">https://en.wikipedia.org/wiki/Talk:Gaza_genocide/Archive_22#...</a><p>* PirateWires: <a href="https://www.piratewires.com/p/how-wikipedia-is-becoming-a-massive-pay-to-play-scheme" rel="nofollow">https://www.piratewires.com/p/how-wikipedia-is-becoming-a-ma...</a>
> <i>Haven't we hit that point already with bad faith (and potentially government-run) coordinated editing […] campaigns,</i><p>Yes, this is a real phenomenon. See, for instance, <a href="https://en.wikipedia.org/wiki/Timeline_of_Wikipedia%E2%80%93U.S._government_conflicts" rel="nofollow">https://en.wikipedia.org/wiki/Timeline_of_Wikipedia%E2%80%93...</a>: the examples from 2006 are funny, and the article's subject matter just gets sadder and sadder as the chronology goes on.<p>> <i>and voting campaigns</i><p>I'm not sure what you mean by this. Wikipedia is not a democracy.<p>> <i>as both Wales and Sanger have been pointing out</i><p>{{fv}}. Neither of those essays make this point. The closest either gets is Sanger's first thesis, which misunderstands the "support / oppose" mechanism. Ironically, his ninth thesis says to <i>introduce</i> voting, which would <i>create</i> the "voting campaign" vulnerability!<p>These are both really bad takes, which I struggle to believe are made in good faith, and I'm glad Wikipedians are mostly ignoring them. (I have not read the third link you provided, because Substack.)
If you're using wikipedia to "agree on common facts" I think you might have bigger problems...
<a href="https://grokipedia.com/" rel="nofollow">https://grokipedia.com/</a>
Nice to see jQuery still getting used :)
I’m not saying that this is related to Wikipedia ditching archive.is but timing in combination with Russian messages is at least…weird.
The script was uploaded in 2024, and triggered today because of an accident<p><a href="https://en.wikipedia.org/wiki/Wikipedia:Village_stocks#Scott_Bassett_for_the_Mutually_Assured_Destruction_award" rel="nofollow">https://en.wikipedia.org/wiki/Wikipedia:Village_stocks#Scott...</a>
And they probably used mind control to make the admin run random userscripts on his privileged account as well; the capabilities of Russian hackers are scary.<p>/s<p>It is just another human acting human again.
>Cleaning this up<p>Find the first instance and reset to the backup before then. An hour, a day, a week? Doesn't matter that much in this case.
It is true that they have a particularly robust, distributed backup system that can/has come in handy, but FWIW the timing matters <i>to them</i>. English Wikipedia receives ~2 edits per second, or 172,800 per day. Many of them are surely minor and/or automated, but still: 1,036,800 lost edits is a lot!
Are they really lost, though? I don't think they need to be lost; they could additionally be stored in a separate database.
In fact, as long as the malware is just doing deletes, you can just merge the two "timelines" by restoring the snapshot and then replaying all the edits but ignoring the deletes. Lost deletes really aren't much of a problem!
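A minimal sketch of that merge, assuming a hypothetical edit-log format where each entry records a user, an action, and a page (the real MediaWiki schema is far more involved, and a real replay would also have to handle edit conflicts):

```python
def restore_with_replay(snapshot, edit_log, compromised_users):
    """Rebuild page state from a clean snapshot by replaying the edit
    log, skipping anything done by known-compromised accounts."""
    pages = dict(snapshot)  # page title -> latest content
    for edit in edit_log:
        if edit["user"] in compromised_users:
            continue  # drop the malicious timeline entirely
        if edit["action"] == "delete":
            pages.pop(edit["page"], None)
        else:  # a normal edit just writes new content
            pages[edit["page"]] = edit["content"]
    return pages

# Legitimate edits survive; the worm's deletes are ignored.
log = [
    {"user": "alice", "action": "edit", "page": "Main", "content": "v2"},
    {"user": "worm", "action": "delete", "page": "Main"},
    {"user": "bob", "action": "edit", "page": "Other", "content": "x"},
]
print(restore_with_replay({"Main": "v1"}, log, {"worm"}))
# -> {'Main': 'v2', 'Other': 'x'}
```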
Filesystem & database snapshots are very cheap to make, you can make them every 15 minutes. You can expire old snapshots (or collapse the deltas between them) depending on the storage requirements.
That doesn't really matter though against an attack that takes some time to spread. If the attack was active for, let's say, 6 hours, then ~43,000 legitimate edits happened between the last "clean" snapshot and the discovery of the attack. If you just revert to the last clean snapshot, you lose those legitimate edits.
We should be using federated organizational architectures when appropriate.<p>For Wikipedia, consider a central read-only aggregated mirror that delegates the editorial function to specialized communities. Common, suggested tooling (software and processes) could be maintained centrally, but each community might benefit from more independence. This separation of concerns may be a better fit for knowledge collection and archival.<p>Note: I edited to stress central mirroring of static content with delegation of editorial function to contributing organizations. I'm expressly not endorsing technical "dynamic" federation approaches.
> Hitting MediaWiki:Common.js is the absolute nightmare scenario for MediaWiki deployments because that script gets executed by literally every single visitor<p>...except for us security wonks who have js turned off by default, don't enable it without good reason, disable it ASAP, and take a dim view of websites that require it.<p>Not too many years ago this behavior was the domain of Luddites and schizophrenics. Today it has become a useful tool in the toolbox of reasonable self-defense for anybody with UID 0.<p>Perhaps the WMF should re-evaluate just how specialsnowflake they think their UI is and see if, maybe just maybe, they can get by without js. Just a thought.
> and see if, maybe just maybe, they can get by without js.<p>Unless it changed recently (it's too slow right now for me to check), Wikipedia has always worked perfectly fine without JS; that includes even editing articles (using the classic editor which shows the article markup directly, instead of the newer "visual" editor).<p>Edit: I just checked, and indeed I can still open the classic edit page even with JS blocked.
It warms my heart that there's basically a 0% chance that they ever approach this camp's viewpoint, based on the Herculean effort it took to switch over to a slightly more modern frontend a few years back. I'm glad you don't think of yourself as a Luddite, but I think you're vastly overstating how open people are to a purely-static web.<p>Also, FWIW: Wikipedia <i>is</i> "specialsnowflake". If it isn't, that's merely because it was <i>so</i> specialsnowflake that there's now a healthy ecosystem of sites that copied its features! It's far, far more capable than a simple blog, especially when you get into editing it.
Ok, fair point. I presumed that this crowd would be far more familiar with the capabilities of HTML5 and dynamic pages sans js than most. (Surely more familiar than I, who only dabble in code by comparison.)<p>No, I'm not suggesting we all go back to purely-static web pages, imagemap gifs and server side navigation. But you're going to have a hard time convincing me that I really truly need to execute code of unknown provenance in my this-app-does-everything-for-me process just to display a few pages of text and 5 jpegs.<p>And for the record, I've called myself a Technologist for almost 30 years now. If I were a closet Luddite I'd be one of the greatest hypocrites of human history. :-)
I just checked a wiki, and the "MediaWiki:Common.js" page there was read-only, even for wikisysop users.
In the early 2010s I worked for a company whose primary income was subscriptions to site protection services - one of which included cleaning up malware-infected Wordpress installations. I worked on the team that did this job.<p>This exact type of database-stored executable javascript was one of the most annoying types of infections to clean up.
Ok, so there are tons of mediawiki installations all over the internet. What do these operators do? Set their wikis to read-only mode, hang tight, and wait for a security patch?<p>Also, does this worm have a name?
There is nothing to do, the incident was not caused by a vulnerability in mediawiki.<p>Basically someone who had permissions to alter site js, accidentally added malicious js. The main solution is to be very careful about giving user accounts permission to edit js.<p>[There are of course other hardening things that maybe should be done based on lessons learned]
There are already tools and techniques to validate served JS is as-intended, and these techniques could be beefed up by adding browser checks. I've been surprised these haven't been widely adopted given the spate of recent JS-poisoning attacks.
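One such technique is Subresource Integrity, where the page pins a hash of the script it expects and the browser refuses to run anything that doesn't match. A minimal sketch of computing the value (note that SRI covers static assets referenced by a page; it wouldn't by itself protect wiki-editable pages like Common.js, but it illustrates the approach):

```python
import base64
import hashlib

def sri_digest(script_bytes: bytes) -> str:
    """Compute a Subresource Integrity value (sha384 is the SRI
    spec's recommended default) for a served script."""
    digest = hashlib.sha384(script_bytes).digest()
    return "sha384-" + base64.b64encode(digest).decode("ascii")

# The page embeds the hash, and the browser refuses to execute the
# script if the served bytes no longer match:
#   <script src="https://example.org/app.js"
#           integrity="sha384-..." crossorigin="anonymous"></script>
print(sri_digest(b"console.log('hi');"))
```

Any tampering with the served bytes changes the digest, so a poisoned script simply fails to load.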
Well, admins (or anybody other than the developers / deployment pipeline) having permissions to alter the JS sounds like a significant vulnerability. Maybe it wasn't in the early 2000s, but unencrypted HTTP was also normal then.
That's a fair point, but keep in mind normal admin is not sufficient. For local users (the account in question wasn't local) you need to be an "interface admin", of which there are only 15 on english wikipedia.<p>The account in question had "staff" rights which gave him basically all rights on all wikis.
> For local users (the account in question wasn't local) you need to be an "interface admin", of which there are only 15 on english wikipedia.<p>It used to be all "admin" accounts, of which there were many more. Restricting it to "interface admin" only is a fairly recent change.
> Well, admins (or anybody other than the developers / deployment pipeline) having permissions to alter the JS sounds like a significant vulnerability.<p>It's a common feature of CMSes and "tag management systems." Its presence is a massive PITA for developers even _apart from_ the security, but PMs _love them_, in my experience.
I wonder if any poisoned data made it into LLM training data pipelines?
Interesting angle. Everyone has already pointed out that there are backups basically everywhere, and from an information standpoint, shaving off a day (or whatever) of edits just to get to a known-good point is effectively zero cost. But I wonder what the cost is of the potentially bad data getting baked into those models, and if anyone really cares enough to scrap it.
Nobody uses Wikipedia anymore since all the AIs scraped it.
Too much app logic on the client side (Javascript) has always been an attack vector. The more that can reasonably live server side, the less there is for an attacker to see and tamper with.
Unit 8200 at it again
It is unfortunate that Wikipedia is under attack. It seems as if there are more malicious actors now than, say, 5 years ago.<p>This may be unrelated, but I also noticed more attacks on e.g. libgen, Anna's Archive and what not. I am not at all saying this is similar to Wikipedia as such, mind you, but it really seems as if there are more actors active now who target people's freedom (e.g. freedom of choice of access to any kind of information; age restriction aka age "verification" taps into this too).
Looking forward to the postmortem...
Another reason to make disabling JS the default on all websites; and websites should offer a usable service without JS, especially those implemented in obsolete garbage tech. If it's not an XSS from a famous website, it will be an exploit from a sketchy one.
It's been opened. Although I have issues with Wikipedia, being a creep is not a valid response.
There are thousands of copies of the whole Wikipedia in SQL form though; IIRC it's just like 47GB.
Correct. Not sure about a sql archive, but the kiwix ZIM archive of the top 1M English articles including (downsized but not minimized) images is 43GiB: <a href="https://download.kiwix.org/zim/wikipedia/" rel="nofollow">https://download.kiwix.org/zim/wikipedia/</a><p>And the entire English wikipedia with no images is, interestingly, also 43GiB.
It's Wikipedia's 25th birthday, but their security discipline is still very much circa 2001. No code signing, no SBOM / supply-chain security. Only recently activated 2FA for admins (after another breach). Most admins are anons.<p>Let's hope they allocate more of the $200M+/year to security infra.
Just a thought.<p>Who wins the most from a Wikipedia outage and has questionable moral views?
The same ones who currently struggle to find paying customers for their services: the large AI companies.
I can edit it
there's a very active tech discussion on the Wikipedia discord you can join here <a href="https://en.wikipedia.org/wiki/Wikipedia:Discord" rel="nofollow">https://en.wikipedia.org/wiki/Wikipedia:Discord</a>
"Закрываем проект" is Russian for "Closing the project"
It's reassuring to know Wikipedia has these kinds of security mechanisms in place.
GOD am I thankful to my old self for disabling js by default. And sticking with it.<p>edit: lol downvoted with no counterpoint, is it hitting a nerve?
> edit: lol downvoted with no counterpoint, is it hitting a nerve?<p>I have upvoted ya fwiw and I don't understand either why people would try to downvote ya.<p>I mean, if websites work for you while disabling JS and you are fine with it, then JS is a threat vector somewhat.<p>Many of us are unable to live our lives without JS. I used to use librewolf, and complete and total privacy started feeling a little too <i>uncomfortable</i>.<p>Now I am on zen-browser fwiw, which I do think has some improvements over stock firefox in terms of privacy, but I can't say this for sure. I mainly use zen because it looks really good and I just love zen.
> I mean, if websites work for you while disabling js and you are fine with it. Then I mean JS is an threat vector somewhat<p>It's also been torture, I definitely don't prescribe it. :P Like you say, it's a sanity / utility / security tradeoff. I just happen to be willing to trade off sanity for utility and security.<p>And yes, unfortunately I have to enable JS for some sites -- the default is to leave it disabled. And of course with cloudflare I have to whitelist it specifically for their domains (well, the non analytics domains). But thankfully wikipedia is light and spiffy without the javascript.
What is uncomfortable about Librewolf? I thought it was basically FF without telemetry and UBO already baked in?
I appreciate librewolf, but when I used to use it, IIRC its fingerprinting features were too strict for some websites and you definitely have to tone it down a bit by going into the settings. Canvases don't work, and there were some other features too.<p>That being said, once again, Librewolf is amazing software. I can see myself using it again, but I just find zen easier in the sense of something which I can recommend, plus uBO obviously.<p>Personally these are more aesthetic changes than anything. I just really like how zen looks and feels.<p>The answer is, sort of, just personal preference, that's all.
Time to spend some of this excess money on a bit of security tightening? I hear we're talking about a 9 digit figure.
[flagged]
> [...] is incredibly insidious. It really exposes the foundational danger of [...]<p>My LLM sense is tingling.
I opened his post history and scrolled down a bit and literally the first thing I saw was a comment starting with "You're absolutely right" lol
Yeah, it's like the really high-energy way it's written or something? Can't quite put my finger on it.
Could you point to where you found the details of the exploit? It’s not in the linked page. Really interested. Especially the part about modifying it and the other users propagating it?
[flagged]
Stop posting this AI-generated word salad.<p>This was an XSS attack. A malicious script was executed inside an admin’s already authenticated browser context, allowing said malicious script to place itself into public facing pages. Nothing to do with any browser fingerprinting nonsense you’re going on about.
Here before someone says that it's because MediaWiki is written in PHP.
[dead]
[flagged]
[flagged]
[flagged]
"The Wikimedia Foundation, which operates Wikipedia, reported a total revenue of $185.4 million for the 2023–2024 fiscal year (ending June 2024). The majority of this funding comes from individual donations, with additional income from investments and the Wikimedia Enterprise commercial API service."<p>(Unless this was satire and I missed it)
What's the operating budget for other websites with comparable traffic? Without context $185 million <i>seems like a lot</i>, but compared to what? Reddit's operating budget for the same timeframe was $1.86 billion.
I agree, but it's not a shoestring budget. They also seem to run a surplus every year:<p>The Wikimedia Foundation (WMF) maintains a significant financial surplus and a growing, healthy balance sheet, with net assets reaching approximately $271.5 million in the 2023–2024 fiscal year. This surplus is largely driven by consistent, high-volume, small-dollar donations, with total annual revenue often exceeding $180 million.
Surplus is a good thing, right? Long-term stability, responsible financial management, healthy margins? If they said one year "You know what? We're good on donations this year," the donation stream would never be restarted.
I think the question might be how much money, effort, and <i>expertise</i> is going into the platform itself.
They are rather well funded for a non-profit and the reserves in the endowment fund are very healthy:<p><a href="https://en.wikipedia.org/wiki/Wikipedia:Fundraising_statistics" rel="nofollow">https://en.wikipedia.org/wiki/Wikipedia:Fundraising_statisti...</a><p><a href="https://wikimediafoundation.org/who-we-are/financial-reports/" rel="nofollow">https://wikimediafoundation.org/who-we-are/financial-reports...</a>
Wikipedia probably actively wastes $100m per year
Please stop spreading lies. Wikipedia is swimming in money, and they have enough money for years or even decades if they would stop wasting it on various seminars and other nonsense unrelated to running Wikipedia.
Society and culture were fine before Wikipedia. I could argue that they have degraded substantially since Wikipedia came into being (but correlation is not causation, in either direction).
They have no incentive to improve the site, because they’re a for-profit entity.<p>Despite the constant screeching for donations, the entire site is owned by a company with shareholders. All the “donations” go to them. They already met their funding needs for the next century a long time ago, this is all profit.
Long past time to eliminate JavaScript from existence
You will have a long trek to do that. We have a javascript interpreter deployed at the second Sun-Earth Lagrange point.<p><a href="https://www.theverge.com/2022/8/18/23206110/james-webb-space-telescope-javascript-jwst-instrument-control" rel="nofollow">https://www.theverge.com/2022/8/18/23206110/james-webb-space...</a>
Yep, WASM is so much more secure.
This.<p>Actually fuck the whole dynamic web. Just give us hypertext again and build native apps.<p>Edit: perhaps I shouldn't say this on a VC-driven SaaS wankfest forum...
You may be interested in <a href="https://geminiprotocol.net/" rel="nofollow">https://geminiprotocol.net/</a>
I mean sure, but that's never going to happen, so complaining about it is just shaking your fist at the sky. The only way it will change is if the economics of the web change. Maybe that is the economics of developer time (it being easier/faster/more resilient and thus cheaper to do native dev), or maybe it is that dynamic scripting leads to such extreme vulnerabilities that ease of deployment/development/consumer usage change the macroeconomics of web deployment enough to shift the scales to local.<p>But if there's one thing I've learned over the years as a technologist, it's this: the "best technology" is not often the "technology that wins".<p>Engineering is not done in a vacuum. Indeed, my personal definition of engineering is that it is "constraint-based applied science". Yes, some of those constraints are "VC buxx" wanting to see a return on investment, but even the OSS world has its own set of constraints - often overlapping. Time, labor, existing infrastructure, domain knowledge.
I think it will change.<p>The entire web is built on geopolitical stability and cooperation. That is no longer certain. We already have supply chains failing (RAM/storage), meaning that we will be hardware constrained for the foreseeable future. That puts the onus on efficiency, and web apps are NOT efficient however we deliver them.<p>People are also now very concerned about data sovereignty, whereas they previously were not. If it's not in your hands or on your computer then it is at risk.<p>The VC / SaaS / cloud industry is about to get hit very very hard via this and regulation. At that point, it's back to native, as delivery is not about being tied to a network control point.<p>I've been around long enough to see the centralisation and decentralisation cycles. We're heading the other way now.
Imagine if wikipedia was a native app, what this vuln would have caused. I for one prefer using stuff in the browser where at least it's sandboxed. Also, there's nothing stopping you from disabling JS in your browser.
How do they know? Has this been published in a Reliable Source?