Automatic version detection, revalidation, prewarming... caching seems so complicated these days. Forgive me for starting a sentence with "why don't we just"... but why don't we just use the hash of the object as the cache key and be done with it? You get integrity validation as a bonus.<p><pre><code> <link rel="stylesheet" href="main.css?hash=sha384-5rcfZgbOPW7..." integrity="sha384-5rcfZgbOPW7..."/>
ETag: "sha384-5rcfZgbOPW7..."
Cache-Control: max-age=31536000, immutable</code></pre>
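For what it's worth, wiring that up is only a few lines at build time. A minimal sketch, assuming a Node build script; the file name and output shape are made up:<p><pre><code>import { createHash } from "node:crypto";
import { readFileSync } from "node:fs";

// Hypothetical build step: hash the asset once, then reuse the digest as the
// cache-busting query param, the SRI integrity value, and the ETag.
const css = readFileSync("main.css");
const digest = "sha384-" + createHash("sha384").update(css).digest("base64");

const linkTag = `<link rel="stylesheet" href="main.css?hash=${digest}" integrity="${digest}"/>`;
const headers = {
  ETag: `"${digest}"`,
  "Cache-Control": "max-age=31536000, immutable",
};</code></pre>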
Sure, but where's the fun in that? Then you wouldn't be able to write "we <i>architected</i> a caching layer"! To their credit, at least this isn't the actual title of the article, but it still left me wondering whether an actual architect (you know, the kind who designs buildings) would ever say "I architected this".
Because you want the ability to invalidate the cache for an entire site at the same time. So you would still need some map between domain and hash.
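Something like a per-site manifest you can swap atomically, in other words. A rough sketch (the type and placeholder values are made up):<p><pre><code>// Hypothetical map from domain to manifest: objects stay addressed by hash,
// but swapping a site's manifest invalidates that whole site in one step.
type SiteManifest = {
  version: string;                 // bump to invalidate the entire site at once
  assets: Record<string, string>;  // path -> "sha384-..." content hash
};

const manifests: Record<string, SiteManifest> = {
  "example.com": {
    version: "2024-06-01",
    assets: { "/main.css": "sha384-5rcfZgbOPW7..." },
  },
};</code></pre>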
I just don’t get it. Their last paragraph describes how they changed their dynamic site to be static. So then why do you need workers at all? Just deploy to a CDN.<p>How do you do version updates? Add a content hash to all files except the root index.html.<p>Cache everything forever, except index.html.<p>To deploy a new version, upload all files, making sure index.html goes last.<p>Since all file paths are unique, the old version continues to be served.<p>No cache invalidation required, since all files have unique paths, except index.html, which was never cached.<p>You just have to ensure you have proper content hashes for absolutely everything: images, CSS, JS. Everything.
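The whole scheme boils down to two cache policies plus upload order. A rough sketch, assuming a generic static host (the helper names are made up):<p><pre><code>// Hypothetical helper: pick cache headers per file for the scheme above.
// Hashed assets are immutable; index.html must always be revalidated.
function cacheControlFor(path: string): string {
  return path.endsWith("index.html")
    ? "no-cache"                       // always revalidate the entry point
    : "max-age=31536000, immutable";   // content-hashed, safe to cache forever
}

// Deploy order matters: upload hashed assets first and index.html last,
// so the old index.html keeps pointing at files that still exist.
const uploadOrder = (files: string[]): string[] =>
  [...files].sort(
    (a, b) => Number(a.endsWith("index.html")) - Number(b.endsWith("index.html"))
  );</code></pre>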
The invalidation queue is interesting, but building a custom cache key manually? Even Cloudflare now supports Cache-Tags.
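For reference: the origin tags responses with a Cache-Tag header and you then purge by tag through the API (purge by tag is an Enterprise feature, last I checked). A rough sketch; the zone ID, token, and tag names are placeholders:<p><pre><code>// Origin response carries the tags, e.g.:  Cache-Tag: site-123,post-456
// Purging by tag is then one API call (Enterprise-only, as far as I know).
async function purgeByTag(zoneId: string, apiToken: string, tags: string[]) {
  const res = await fetch(
    `https://api.cloudflare.com/client/v4/zones/${zoneId}/purge_cache`,
    {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiToken}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ tags }),
    }
  );
  if (!res.ok) throw new Error(`purge failed: ${res.status}`);
}</code></pre>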
Sometimes I feel like work and needless infra complexity grow perfectly to match headcount and nominally available resources.
I feel the same. 72 million monthly page views is about 83 pages per second even if squeezed into a single timezone's 8-hour day (72e6 / (30 * 8 * 3600) ≈ 83); even with today's heavyweight pages we are talking well under 1000 req/s. Assuming they are not super image/asset heavy I would expect this to comfortably be served by a couple of reasonable old-school nginx servers[1]. If each page were a full megabyte of uncached content we would be under 10 Gbit/s, probably under 1.<p>The build logic to decide which things to rebuild is probably the interesting bit, but we don't need all these services... </grey-beard-rant><p>[1] <a href="https://openbenchmarking.org/test/pts/nginx&eval=c18b8feaeca6235b318667a0c1159c7eb54ce634#metrics" rel="nofollow">https://openbenchmarking.org/test/pts/nginx&eval=c18b8feaeca...</a><p>edit: to be less ranty, they are more or less building static sites out of their Next.js codebase, but updated on demand etc., which is indeed interesting; none of this needs Cloudflare/hyperscaler tech though.<p>Not sure how many customers/sites they have. Perhaps they don't want to spend CPU regenerating all sites on every deployment? They do describe a content-driven pre-warmer, but I'm still unclear why this couldn't be a content-driven static site generator running on some build machine.
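Spelling the back-of-the-envelope math out, with the same assumptions as above (all traffic in an 8-hour daily window, 1 MB per uncached page):<p><pre><code>// Back-of-the-envelope check of the numbers above.
const monthlyViews = 72e6;
const secondsServed = 30 * 8 * 3600;                 // 30 days, 8-hour window each
const pagesPerSecond = monthlyViews / secondsServed; // ~83

const bytesPerPage = 1e6;                            // assume 1 MB of uncached content
const gbitPerSecond = (pagesPerSecond * bytesPerPage * 8) / 1e9; // ~0.67 Gbit/s

console.log(pagesPerSecond.toFixed(0), gbitPerSecond.toFixed(2));</code></pre>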
The thing is you can still stick a CDN in front of your old-school servers and just use a 'stale-while-revalidate' Cache-Control directive to get exactly the effect described here.
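The origin only has to declare the policy; the CDN then serves the stale copy while it refetches in the background. A minimal sketch with a plain Node origin (the max-age numbers are arbitrary):<p><pre><code>import { createServer } from "node:http";

// Minimal origin behind a CDN: fresh for 60s, then serve stale for up to
// 10 minutes while the CDN revalidates in the background.
createServer((req, res) => {
  res.setHeader(
    "Cache-Control",
    "public, max-age=60, stale-while-revalidate=600"
  );
  res.end("<html>...server-rendered page...</html>");
}).listen(8080);</code></pre>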
Stale-while-revalidate as implemented in the post was easier for us and required fewer resources than migrating from our dynamic site architecture to static. Ideally we would have migrated to fully static sites, but the engineering effort required to make that happen wasn't in scope.
Something I noticed a long time ago is that Vercel makes everything they touch 10 times harder than it needs to be.<p>I have come to conclude it is that way because they focus on optimizing for a demo case that presents well to non-technical stakeholders. Doing one particular thing that looks good at a glance gets the buy-in, and then those who bought in never have to deal with the consequences of the decision once it is time to build something other than the demo.
I'm no fan of Vercel, but it's kind of a symptom of a wider pattern, right? I see crazy architecture-astronaut setups in so many places. It's true that non-technical stakeholders can cause problems, but I often see it pushed from inside the tech org too. I'm thinking it's some combination of resume-driven development, misunderstanding of 'scalability' and when it's needed, and intra-org working-together problems where it's easier to just make a new service and assert your dominion over it.
I blame this more on Next.js than Vercel, but agree in spirit. Their architecture creates a pit of failure: you're encouraged to fall into a fully dynamic pattern, and that's a huge trap.<p>However, it's probably more inexperience than anything. Nobody senior was around to tell our founders that they should go for an SSG architecture when they started /shrug. It's mostly worked out anyways though haha.
2025, the world rediscovers simple static caching. You could do the same with varnish/nginx or wp-cache with 10% of the complexity. Or a CDN.<p>“Incremental Static Regeneration” is also one of the funniest things to come out of this tech cycle.
A lot of people are criticizing this for unnecessary complexity, but it's a little more complicated than that. I actually think it makes sense given where they are right now. The complexity stems from Vercel and Next.js: had they used different tech, say Cloudflare directly, and architected their own system designed to handle rapidly changing static content, none of this would have been necessary. So I guess it depends on your definition of unnecessary complexity. It's definitely unnecessary for the problem space, but probably necessary for their existing stack.