Simplifying Vulkan one subsystem at a time

(khronos.org)

223 points by amazari14 hours ago

15 comments

kvark12 hours ago
The main problem with Vulkan isn't the programming model or the lack of features. These are tackled by Khronos. The problem is with coverage and update distribution. It's all over the place! If you develop general purpose software (like Zed), you can't assume that even the basic things like dynamic rendering are supported uniformly. There are always weird systems with old drivers (looking at Ubuntu 22 LTS), hardware vendors abandoning and forcefully deprecating the working hardware, and of course driver bugs... So, by the time I'm going to be able to rely on the new shiny descriptor heap/buffer features, I'll have more gray hair and other things on the horizon.
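As a rough illustration of the coverage problem described above, an application usually ends up choosing a code path per device at startup. The sketch below is only that: `DeviceCaps`, `RenderPath` and `pick_render_path` are hypothetical names, and the capability data is assumed to have been filled in beforehand from the driver (for example via `vkGetPhysicalDeviceProperties` and `vkEnumerateDeviceExtensionProperties`). A real application would also query the actual feature bit before relying on it.

```rust
use std::collections::HashSet;

/// What a device reports. Hypothetical struct for this sketch; a real
/// application would fill it from the Vulkan driver at startup.
struct DeviceCaps {
    /// (major, minor) of the reported Vulkan API version.
    api_version: (u32, u32),
    /// Names of the device extensions the driver advertises.
    extensions: HashSet<&'static str>,
}

#[derive(Debug, PartialEq)]
enum RenderPath {
    /// Dynamic rendering as part of core Vulkan 1.3.
    DynamicRenderingCore,
    /// Dynamic rendering through the VK_KHR_dynamic_rendering extension.
    DynamicRenderingExt,
    /// Classic VkRenderPass / VkFramebuffer objects.
    LegacyRenderPass,
}

/// Pick the newest path the device supports, falling back step by step.
fn pick_render_path(caps: &DeviceCaps) -> RenderPath {
    if caps.api_version >= (1, 3) {
        RenderPath::DynamicRenderingCore
    } else if caps.extensions.contains("VK_KHR_dynamic_rendering") {
        RenderPath::DynamicRenderingExt
    } else {
        RenderPath::LegacyRenderPass
    }
}

fn main() {
    // An older driver that exposes neither Vulkan 1.3 nor the extension.
    let old_driver = DeviceCaps {
        api_version: (1, 2),
        extensions: HashSet::from(["VK_KHR_swapchain"]),
    };
    println!("{:?}", pick_render_path(&old_driver));
}
```

The cost the comment points at is visible even here: every optional feature adds another branch that has to be written, tested and kept alive for as long as old drivers remain in the field.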
- zamalek10 hours ago
 > Ubuntu LTS
 This is why I try to steer new Linux users away from Ubuntu: it lags on functionality that is often important. It is now an enterprise OS (where durability matters more than functionality) and isn't really suitable for a power user (like someone who would use Zed).
 - 6SixTy9 hours ago
 My understanding of Mesa is that it has very few dependencies and is ABI stable, so freezing Mesa updates is counterproductive. I'm not sure about Snaps, but Flatpak ships as its own system managing Mesa versions.
 - fc417fc8026 hours ago
 > Flatpak ships as its own system managing Mesa versions.
 Mixing and matching the kernel and userspace Mesa components is subject to limitations. However, it will transparently fall back to software rendering, so you might not notice if you aren't doing anything intensive.
 Related: being a container, Flatpak has no choice but to ship the Mesa userspace component. If it didn't, nothing would work.
 - tambre7 hours ago
 > My understanding with Mesa is that it has very few dependencies
 Some of the shader compilers require LLVM, which is a giant dependency to say the least. But with Valve's ACO for RADV I think that could technically be omitted.
 - fsloth9 hours ago
 "It is now an enterprise OS"
 You really want enterprise standards support for your graphics API.
 Bleeding edge ... is not nice in graphics. The more complex the systems get, the more edge cases there are.
 I mean in general. If you are writing a high-end game engine, don't listen to me, you know better. But if you are a mid-tier graphics wonk like myself, 20-year-old concepts are usually quite Pareto-optimal for _lots_ of stuff and should be robustly covered by most APIs.
 If I could give one piece of advice to myself 20 years ago:
 For anything practical, focus on the platform-native graphics API. Windows: DirectX. Mac: OpenGL (20 years ago! Predates Metal! Today, of course, it would be Metal).
 I don't think that advice would be much different today (apart from Metal) IF you don't know what to do and just want to get started doing graphics. Senior peeps who know the field should of course do whatever is right for them.
 Linux: good luck. Find the API that has the best support for your card and driver combo, meaning likely the most stabilized one with the most users.
 - BadBadJellyBean10 hours ago
 You don't have to run LTS. There is a new release every 6 months.
 - fulafel9 hours ago
 Especially a 4 year old LTS. But I guess the point was that you will run into some users that do when you ship to the general audience.
 You run into the same problem on other platforms too, of course (e.g. Android).
 - esseph10 hours ago
 I've been running Linux for a very long time.
 Ubuntu has never ever been the most stable or useful distro. What it did have was apt and more up-to-date stuff than Debian.
 I would never willingly choose Ubuntu if allowed other options (Fedora, Debian, maybe CoreOS, etc).
 - horsawlarway9 hours ago
 I have a lot of respect for Canonical for driving a distro that was very "noob friendly" in an ecosystem where that's genuinely hard.
 But I mostly agree with you. Once you get out of that phase, I don't really see much value in Ubuntu. I'd pick pretty much anything else for everything I do these days. Debian/Fedora/Alpine on the server. Arch on the desktop.
 - bwat498 hours ago
 not to mention the OP mentioned 22 LTS which isn't even the most current LTS
 - superkuh1 hour ago
 And this is a prime example of development-centric thinking prioritizing developer comfort over the capabilities and usability of the actual software. Rather than targeting stable older feature sets it's always targeting the bleeding edge and then being confused that this doesn't work on machines that aren't their own and then blaming everyone else for their decision. 4 years is not a long time (LTS). 4 years is the minimum that software should be able to live.
 - yxhuvud7 hours ago
 Ubuntu's perfectly fine if you avoid LTS versions.
 - adithyassekhar10 hours ago
 Which one would you recommend for regular users and power users?
 - zamalek10 hours ago
 If you want something relatively uninteresting: Fedora or Debian (honestly, stable is fine).
 If you want something extremely reliable, more modern, but may require some learning to tweak: Silverblue or Kinoite.
 - direwolf2010 hours ago
 Debian updates even less frequently than Ubuntu and stays with years old versions of packages. If you're looking for fresh, Debian is not it. Maybe Arch?
 horsawlarway9 hours ago
 Yeah, the folks in here recommending Debian as a solution to this problem are insane.
 I love Debian, it's a great distro. It's NOT the distro I'd pick to drive things like my laptop or personal development machine. At least not if you have even a passing interest in:
 - Using team communication apps (Slack/Teams/Discord)
 - Using software built for Windows (Wine/Proton)
 - Gaming (of any form)
 - Wayland support (or any other large project delivering new features relatively quickly)
 - Hardware support (modern Linux kernels)
 I'd recommend it immediately as a replacement for Ubuntu as a server, but I won't run it for daily drivers.
 Again - Arch (or its derivatives) is basically the best you can get in that space.
 cosmic_cheese8 hours ago
 I think Debian Stable, Ubuntu LTS, and derivatives thereof are particularly poor fits for general consumers, who are more likely to try to run the OS on a random machine they picked up from Best Buy that’s probably built with hardware that kernels any older than what ships in Fedora are unlikely to support.
 The stable/testing/etc distinction doesn't really help, either, because it's an alien concept to those outside of technical spheres.
 I strongly believe that the Fedora model is the best fit for the broadest spread of users. Arch is nice for those capable of keeping it wrangled, but that's a much smaller group of people.
 horsawlarway7 hours ago
 I find this a very reasonable take.
 I'll add - I think the complexity is somewhat "over-stated" for Arch at this point. There was absolutely a period where just reading the entire install guide (much less actually completing it) was enough to turn a large number of even fairly technical people off the distro. Archinstall removed a lot of that headache.
 And once it's up, it's generally just fine. I moved both my spouse and my children to Arch instead of Windows 11, and they don't seem particularly bothered. They install most of their own software using flatpaks through the store GUI in Gnome, or through Steam, and the browser does most of the heavy lifting these days anyway.
 I basically just grab their machine and run `pacman -Syu` on it once in a while, and help install something more complicated once in a blue moon.
 Still requires someone who doesn't mind dropping into a terminal, but it's definitely not what I'd consider "all that challenging".
 cosmic_cheese7 hours ago
 YMMV, but the issue I usually run into with Arch is that unless you watch patch notes like a hawk, updates will break random things every so often, which I found quite frustrating. The risk of this increases the longer the system goes without updates, due to accumulated missing config file migrations and such.
 Even as someone who uses the terminal daily, it's more involved than I really care for.
 nickjj3 hours ago
 > but the issue I usually run into with Arch is that unless you watch patch notes like a hawk,
 The good news is you can run `yay -Pwwq` to get the latest Arch news headlines straight in your terminal.
 I've wrapped that with running `pacman -Syu` into a little helper script so that I always get to see the news before I run an update.
 This is built into my dotfiles by default at https://github.com/nickjj/dotfiles.
 r_lee9 hours ago
 Debian has multiple editions, if you want Arch, go for sid/testing.
 Stable is stable as in "must not be broken at all costs" kind of stable.
 Basically everything works just fine. There's occasionally a rare crash or Gnome reset where you need to log in again, but other than that not many problems.
 horsawlarway7 hours ago
 Again, I like Debian a lot as a distro (much more than Ubuntu), but it's just not the same as a distro like Arch, even when you're on testing. Sid is close, but between Arch and sid... I've actually found fewer issues on Arch, and since there's an existing expectation that the community maintains and documents much of the software in AUR, there's almost always someone actually paying attention and updating things, rather than only getting around to it later.
 It's not that Debian is a bad release, but it's the difference between a game on Steam being completely unavailable for a few hours (Arch) or for 10 days (Debian testing) due to an upstream issue.
 I swapped a while back, mostly because I kept hitting issues that are accurately described and resolved by steps coming from Arch's community, even on distros like Debian and Fedora.
 ---
 The power in Debian is still that Ubuntu has made it very popular for folks doing commercial/closed source releases to provide a .deb by default. Won't always work... but at least they're targeting your distro (or almost always, Ubuntu, but usually close enough).
 Same for Fedora with the Red Hat enterprise connections.
 But I've generally found that the community in Arch is doing a better job at actually dogfooding, testing, and fixing the commercial software than most of the companies that release it... which is sad, but reality.
 Arch has plenty of its own issues, but "Stale software" isn't the one to challenge it on. Much better giving it a pass due to arch/platform support limitations, security or stability needs, etc... All those are entirely valid critiques, and reasonable drivers for sticking to something like Debian.
 akdev1l8 hours ago
 No, Debian is stable as in “it shall not change”.
 There are times when there are known bugs in Debian which are purposely not fixed but instead documented and worked around. That’s part of the stability promise. The behaviour shall not change, which sometimes includes “bug as a feature”.
 fiddlerwoaroof9 hours ago
 Over time I evolved to Debian testing for the base system and nix for getting precise versions of tools, which worked fairly well. But, I just converted my last Debian box to nixos
 bayindirh8 hours ago
 I've been using Debian testing on my daily-driver desktop(s) for the last, checks notes, 20 years now?
 Servers and headless boxes use stable and all machines are updated regularly. Most importantly, stable to stable (i.e. 12 to 13) upgrades take around 5 minutes incl. final reboot.
 I reinstalled Debian once. I had to migrate my system to 64 bit, and there was no clear way to move from 32 to 64 bit at that time. Well, once in 20 years is not bad, if you ask me.
 fiddlerwoaroof6 hours ago
 I've had a couple of outages due to major version upgrades: the worst was the major version update that introduced systemd, but I don't think I've ever irreparably lost a box. The main reasons I like NixOS now are:
 1) nix means I have to install a lot fewer packages globally, which prevents accidentally using the wrong version of a package in a project.
 2) I like having a version controlled record of what my systems look like (and I actually like the nix language)
 fc417fc8026 hours ago
 You're allowed to throw debian testing or arch in a chroot. The only thing that doesn't work well for is gaming since it's possible for the mesa version to diverge too far.
 - horsawlarway10 hours ago
 Not joking, Arch. Pick Gnome/KDE/Sway as you please.
 Arch is a wonderful daily driver distro for folks who can deal with even a small amount of configuration.
 Excellent software availability through AUR, excellent update times (pretty much immediate).
 The only downside is there's not a ton of direct commercial software packaged for it by default (ex - most companies that care give a .deb or a .rpm) but that's easily made up for by the rest of AUR.
 It's not even particularly hard to install anymore - run `archinstall` (https://wiki.archlinux.org/title/Archinstall), make some choices, get a decent distro.
 Throw in that Steam support is pretty great... and it's generally one of the best distros available right now for general use by even a moderate user.
 Also fine as a daily driver for kids/spouses as long as there's someone in the house to run pacman every now and then, or help install new stuff.
 - stalfosknight10 hours ago
 Arch or Endeavour
 - jauntywundrkind10 hours ago
 Debian/testing, with stable pinned on at low priority.
 It slows down for a couple months around release, but generally provides a pretty reliable & up-to-date experience with a very good OS.
 Dance dance the red spiral.
 - gspr6 hours ago
 A stable-testing mix is quite exotic. What are you trying to achieve here?
 jauntywundrkind6 hours ago
 It's rare, but every now and then testing has an unsatisfiable dependency. It's usually resolved within a day or so. But I keep a lower distro around basically to ensure I have a fallback, so I'm not blocked in the meantime. The next update should likely get me back to testing.
 - r_lee9 hours ago
 You can go for sid too :)
 jauntywundrkind6 hours ago
 I run sid (Debian's unstable branch) on all my systems, it's great! With experimental pinned on at low priority! It's great, I love it!
 I'm not quite bold enough to recommend it to people, but if anyone asks I would definitely say yes to running sid. Apt-pin for testing at low priority is good to have, just because sometimes, when one library updates, there's a lag before everything using it updates too, and you can get unsatisfiable dependencies.
 - plagiarist4 hours ago
 I encourage them away from Ubuntu because of the Snaps. If people want an enterprise distro that lags upstreams by a lot they should go with Debian.
- thegrim0002 hours ago
 Yes, this is the problem. They tout this new latest and greatest extension that fixes and simplifies a lot, yet you go look up the extension on vulkan.gpuinfo.org and see ... currently 0.3% of all devices support it. Which means you can't in any way use it. So you wait 5 years, and now maybe 20% of devices support it. Then you wait another 5 years, and maybe 75% of devices support it. And maybe you can get away with limiting your code to running on 75% of devices. Or, you wait another 5 years to get into the 90s.
- MereInterest9 hours ago
 > There are always weird systems with old drivers (looking at Ubuntu 22 LTS)
 While I agree with your general point, RHEL stands out way, way more to me. Ubuntu 22.04 and RHEL 9 were both released in 2022. Where Ubuntu 22.04 has general support until mid-2027 and security support until mid-2032, RHEL 9 has "production" support through mid-2032 and extended support until mid-2034.
 Wikipedia sources for Ubuntu [0] and RHEL [1]:
 [0] https://en.wikipedia.org/wiki/Ubuntu#Releases
 [1] https://upload.wikimedia.org/wikipedia/en/timeline/fcppf7prx10mvntfzjdz2pa83g48ile.png
- m-schuetz11 hours ago
 Tbh, we should more readily abandon GPU vendors that refuse to go with the times. If we cater to them for too long, they have no reason to adapt.
 - afandian10 hours ago
 I had a relatively recent graphics card (5 years old perhaps?). I don't care about 3D or games, or whatever.
 So I was sad not to be able to run a text editor (let's be honest, Zed is nice but it's just displaying text). And somehow the non-accelerated version is eating 24 cores. Just for text.
 https://github.com/zed-industries/zed/discussions/23623
 I ended up buying a new graphics card in the end.
 I just wish everyone could get along somehow.
 - ronsor9 hours ago
 The fact that we need advanced GPU acceleration for a text editor is concerning.
 - jsheard7 hours ago
 Such is life when built-in laptop displays are now pushing a billion pixels per second; rendering anything on the CPU adds up fast.
 Sublime Text spent over a decade tuning their CPU renderer and it still didn't cut it at high resolutions.
 https://www.sublimetext.com/blog/articles/hardware-accelerated-rendering
 the84726 hours ago
 Most of the pixels don't change every second though. Compositors do have damage tracking APIs, so you only need to render that which changed. Scrolling can be mostly offset transforms (browsers do that, they'd be unbearably slow otherwise).
 simonask3 hours ago
 That’s not the slow part. The slow part is moving any data at all to the GPU - doesn’t super matter if it’s a megabyte or a kilobyte. And you need it there anyway, because that’s what the display is attached to.
 Now, the situation is that your display is directly attached to a humongously overpowered beefcake of a coprocessor (the GPU), which is hyper-optimized for calculating pixel stuff, and it can do it orders of magnitude faster than you can tell it manually how to update even a single pixel.
 Not using it is silly when you look at it that way.
 Dylan168071 hour ago
 Sure, use it. But it very much shouldn't be needed, and if there's a bug keeping you from using it your performance outside video games should still be fine. Your average new frame only changes a couple pixels, and a CPU can copy rectangles at full memory speed.
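For readers unfamiliar with the damage-tracking idea in this subthread, here is a minimal CPU-only sketch of copying just the changed rectangle between two framebuffers. `Rect`, `Framebuffer` and `copy_damage` are hypothetical names invented for the example; it is not how any particular compositor or editor implements it, and it does not settle the argument above about where the real cost lies.

```rust
/// A rectangle in pixel coordinates (hypothetical type for this sketch).
#[derive(Clone, Copy, Debug, PartialEq)]
struct Rect {
    x: usize,
    y: usize,
    w: usize,
    h: usize,
}

/// A CPU-side framebuffer with one u32 per pixel, stored row by row.
struct Framebuffer {
    width: usize,
    height: usize,
    pixels: Vec<u32>,
}

impl Framebuffer {
    fn new(width: usize, height: usize) -> Self {
        Framebuffer { width, height, pixels: vec![0; width * height] }
    }

    /// Copy only the damaged rectangle from `src` into `self`, one row
    /// slice at a time. Returns the number of pixels actually copied.
    fn copy_damage(&mut self, src: &Framebuffer, damage: Rect) -> usize {
        assert_eq!((self.width, self.height), (src.width, src.height));
        // Clamp the rectangle to the framebuffer bounds.
        let x0 = damage.x.min(self.width);
        let y0 = damage.y.min(self.height);
        let x1 = (damage.x + damage.w).min(self.width);
        let y1 = (damage.y + damage.h).min(self.height);
        let mut copied = 0;
        for row in y0..y1 {
            let start = row * self.width + x0;
            let end = row * self.width + x1;
            self.pixels[start..end].copy_from_slice(&src.pixels[start..end]);
            copied += end - start;
        }
        copied
    }
}

fn main() {
    let (w, h) = (1920, 1080);
    let mut front = Framebuffer::new(w, h);
    let mut back = Framebuffer::new(w, h);
    // Simulate a blinking text cursor: only a 2x20 region changes.
    let cursor = Rect { x: 100, y: 200, w: 2, h: 20 };
    for row in cursor.y..cursor.y + cursor.h {
        for col in cursor.x..cursor.x + cursor.w {
            back.pixels[row * w + col] = 0xFFFF_FFFF;
        }
    }
    let copied = front.copy_damage(&back, cursor);
    // prints "copied 40 of 2073600 pixels"
    println!("copied {} of {} pixels", copied, w * h);
}
```

For a cursor blink, the work is proportional to the 40 changed pixels rather than to the roughly two million pixels of the full frame.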
 - ianlevesque9 hours ago
 Text editor developers get bored too!
 - hyperman110 hours ago
 No. I remember a phone app (WhatsApp?) doggedly supporting every godforsaken phone, even the Nokias with the zillion incompatible Java versions. A developer should go where the customers are.
 What does help is an industry-accepted benchmark, easily run by everyone. I remember browser CSS being all over the place, until that whatsitsname benchmark (with the smiley face) demonstrated which emperors had no clothes. Everyone could surf to the test and check how well their favorite browser did. Scores went up quickly, and today CSS is in a lot better shape.
 - aeldidi9 hours ago
 The Acid2 test is the benchmark you’re thinking of, for anyone not aware: acid2.acidtests.org
 - Octoth0rpe10 hours ago
 > we should more readily abandon GPU vendors
 This was so much more practical before the market coalesced to just 3 players. Matrox, it's time for your comeback arc! And maybe a desktop PCIe packaging for Mali?
 - dyingkneepad5 hours ago
 The market is not just 3 players. These days we have these things called smartphones, and they all include a variety of different graphics cards on them. And even more devices than just those include decently powerful GPUs as well. If you look at the Contributors section of the extension in the post, and look at all the companies involved, you'll have a better idea.
 - Animats6 hours ago
 NVidia says no new gamer GPUs in 2026, and increasing prices through 2030. They're too focused on enterprise AI machines.
tonis210 hours ago
I wish they would just allow us to push everything to the GPU as buffer pointers, like the buffer_device_address extension allows you to, and then reconstruct the data into your required format via shaders.
GPU programming seems to be both super low level and high level at once, because textures and descriptors need these ultra-specific data formats, and the way you construct and upload those formats is very complicated and changes all the time.
Is there really no way to simplify this?
Regular vertex data was supposed to be strictly pre-formatted in the pipeline too, until suddenly it was not, and now we can just give the shader a `device_address` extension memory pointer and construct the data from that.
- softfalcon8 hours ago
 I also want what you're describing. It seems like the ideal "data-in-out" pipeline for purely compute-based shaders.
 I've brought it up several times when talking with folks who work down at the chip level on optimizing these operations, and all I can say is, there are a lot of unforeseen complications to what we're suggesting.
 It's not that we can't have a GPU that does these things, it's apparently more of a combination of previous and current architectural decisions that don't want that. For instance, an nVidia GPU is focused on providing the hardware optimizations necessary to do either LLM compute or graphics acceleration, both essentially proprietary technologies.
 The proprietariness isn't why it's obtuse though: you can make a chip go super-duper fast for specific tasks, or more general for all kinds of tasks. Somewhere, folks are making a tradeoff between backwards compatibility and supporting new hardware-accelerated tasks.
 Neither of these are "general purpose compute and data flow" focuses. As such, you get the GPU that only sorta is configurable for what you want to do. Which in my opinion explains your "GPU programming seems to be both super low level, but also high level" comment.
 That's been my experience. I still think what you're suggesting is a great idea and would make GPUs a more open compute platform for a wider variety of tasks, while also simplifying things a lot.
 - cmovq5 hours ago
 This is true, but what the parent comment is getting at is we really just want to be able to address graphics memory the same way it's exposed in CUDA for example. Where you can just have pointers to GPU memory in structures visible to the CPU, without this song and dance with descriptor set bindings.
- fc417fc8026 hours ago
 If you got what you're asking for you'd presumably lose access to any fixed function hardware. Re your example, knowing the data format permits automagic hardware-accelerated translations between image formats.
 You're free to do what you're asking for by simply performing all operations manually in a compute shader. You can manually clip, transform, rasterize, and even sample textures. But you'll lose the implicit use of various fixed function hardware that you currently benefit from.
 - bsder17 minutes ago
 > If you got what you're asking for you'd presumably lose access to any fixed function hardware.
 Are there any fixed functions left that aren't just being implemented by the general compute shader hardware?
 I guess the ray tracing stuff would qualify, but that isn't what people are complaining about here.
- jsheard10 hours ago
 Relevant: Descriptors are Hard from XDC 2025 - https://www.youtube.com/watch?v=TpwjJdkg2RE
 Even on modern hardware there are still a lot of architectural differences to reconcile at the API level.
- hinkley7 hours ago
 I’m not watching Rust as closely as I once did, but it seems like buffer ownership is something it should be leaning on more fully.
 There’s an old concurrency pattern where a producer and consumer tag team on two sets of buffers to speed up throughput. Producer fills a buffer, transfers ownership to the consumer, and is given the previous buffer in return.
 It is structurally similar to double buffered video, but for any sort of data.
 It seems like Rust would be good for proving the soundness. And it should be a library now rather than a roll-your-own.
 - LoganDark3 hours ago
 > There’s an old concurrency pattern where a producer and consumer tag team on two sets of buffers to speed up throughput. Producer fills a buffer, transfers ownership to the consumer, and is given the previous buffer in return.
 Isn't this just called a swapchain?
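The two-buffer handoff described in this subthread can be sketched with nothing but the Rust standard library. This is an illustrative assumption about how one might write it, not an existing library: `run_ping_pong` is a made-up name, and the two `mpsc` channels simply stand in for "full buffers go one way, drained buffers come back". Ownership moves with each `send`, so the compiler rejects any attempt by the producer to touch a buffer it has already handed over.

```rust
use std::sync::mpsc;
use std::thread;

/// Run a producer and a consumer that hand two buffers back and forth.
/// Returns the sum of every value the consumer saw.
fn run_ping_pong(batches: u32, batch_len: usize) -> u64 {
    // Full buffers travel producer -> consumer, drained ones travel back.
    let (full_tx, full_rx) = mpsc::channel::<Vec<u32>>();
    let (empty_tx, empty_rx) = mpsc::channel::<Vec<u32>>();

    // Exactly two buffers exist; seed the return channel with both.
    for _ in 0..2 {
        empty_tx.send(Vec::with_capacity(batch_len)).unwrap();
    }

    let producer = thread::spawn(move || {
        for batch in 0..batches {
            // Blocks until a buffer is available on the return channel.
            let mut buf = empty_rx.recv().unwrap();
            buf.clear();
            buf.extend((0..batch_len as u32).map(|i| batch * batch_len as u32 + i));
            // Sending moves `buf`; the producer cannot use it afterwards.
            full_tx.send(buf).unwrap();
        }
        // `full_tx` is dropped here, which ends the consumer loop below.
    });

    let mut total = 0u64;
    for buf in full_rx {
        total += buf.iter().map(|&v| v as u64).sum::<u64>();
        // Hand the drained buffer back; the producer may already be done,
        // in which case the send fails and the buffer is simply dropped.
        let _ = empty_tx.send(buf);
    }
    producer.join().unwrap();
    total
}

fn main() {
    // 4 batches of 3 values each, i.e. the numbers 0 through 11.
    let total = run_ping_pong(4, 3);
    // prints "sum = 66"
    println!("sum = {}", total);
}
```

Only two allocations are ever made, however many batches flow through, which is the throughput benefit the pattern is after.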
pjmlp12 hours ago
At least they are making an effort to correct the extension spaghetti, already worse than OpenGL.
Additionally, most of these fixes aren't coming to Android, which is now getting WebGPU for Java/Kotlin [0] after so many refused to move away from OpenGL ES, nor, naturally, to any card not lucky enough to get new driver releases.
Still, better now than never.
[0] - https://developer.android.com/jetpack/androidx/releases/webgpu
- viktorcode8 hours ago
 As someone from game development, not supporting Vulkan on Android and sticking with OpenGL ES instead is a safer bet. There is always some device(s) that bug out on Vulkan badly. Nobody wants to sit and find workarounds for that obscure vendor.
- tadfisher10 hours ago
 Bizarre take. Notice how that WebGPU is an AndroidX library? That means WebGPU API support is built into apps via that library and runs on top of the system's Vulkan or OpenGL ES API.
 Do you work for Google or an Android OEM? If not, you have no basis to make the claim that Android will cease updating Vulkan API support.
 - pjmlp9 hours ago
 I made no such claim.
 WebGPU on Android runs on top of Vulkan.
 If you knew about 3D programming on Android, you would know that there are ongoing efforts to have only Vulkan, with OpenGL ES on top.
 However, Java and Kotlin devs refuse to bother with the NDK for Vulkan, and keep reaching for OpenGL ES instead.
 Please refer to Google talks at Vulkanised conferences.
 - flohofwoe9 hours ago
 > ...efforts to have only Vulkan, with OpenGL ES on top...
 Ok, this made me laugh, given that Vulkan support on Android is so bad that WebGPU needs a fallback mode to GLES ;)
 https://github.com/gpuweb/gpuweb/issues/4266
 - pjmlp9 hours ago
 Agreed, which is Google's motivation for doing that.
 The argument being that if Android only does Vulkan, OEMs will be forced to care about their drivers.
 There are talks done by Google on this, at either Vulkanised, Google IO, or GDC, can't remember the exact one now.
 - torginus7 hours ago
 Is it possible to support OpenGL on top of Vulkan well? It has been pointed out that Vulkan requires you to completely freeze and compile a graphics pipeline before using it, while OpenGL's state machine is more flexible, and the underlying hardware is somewhat more amenable to these state transitions at runtime than the Vulkan API would suggest.
 Don't these compatibility layers run into performance issues from constant pipeline recompilation when emulating OpenGL?
 - pjmlp5 hours ago
 It is no different from running DirectX on Vulkan, DirectX or Vulkan on Metal.
 It works, kind of.
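One common way to soften the recompilation cost raised in the question above is to key compiled pipelines by a snapshot of the emulated state, so that each distinct combination is compiled only once. The sketch below is a toy model under that assumption: `PipelineState`, `Pipeline` and `PipelineCache` are hypothetical names, the state is reduced to four fields, and the "compile" step is a counter rather than a driver call. It does not claim to describe how any specific translation layer is implemented.

```rust
use std::collections::HashMap;

/// The subset of GL-style mutable state that a Vulkan pipeline bakes in.
/// Hypothetical and heavily reduced for illustration.
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
struct PipelineState {
    blend_enabled: bool,
    depth_test: bool,
    cull_back_faces: bool,
    shader_id: u32,
}

/// Stand-in for a compiled pipeline handle.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
struct Pipeline(u64);

struct PipelineCache {
    pipelines: HashMap<PipelineState, Pipeline>,
    /// How many times the expensive compile step has run.
    compiles: u64,
}

impl PipelineCache {
    fn new() -> Self {
        PipelineCache { pipelines: HashMap::new(), compiles: 0 }
    }

    /// Called at draw time: look up the pipeline for the current state and
    /// compile only the first time this exact combination is seen.
    fn get_or_compile(&mut self, state: PipelineState) -> Pipeline {
        if let Some(&pipeline) = self.pipelines.get(&state) {
            return pipeline;
        }
        // A real layer would call into the driver here (the slow part).
        self.compiles += 1;
        let pipeline = Pipeline(self.compiles);
        self.pipelines.insert(state, pipeline);
        pipeline
    }
}

fn main() {
    let mut cache = PipelineCache::new();
    let mut state = PipelineState {
        blend_enabled: false,
        depth_test: true,
        cull_back_faces: true,
        shader_id: 1,
    };
    // Emulated GL calls only mutate `state`; compilation happens at draw.
    let a = cache.get_or_compile(state); // first draw: compiles
    state.blend_enabled = true; // think glEnable(GL_BLEND)
    let b = cache.get_or_compile(state); // new combination: compiles
    state.blend_enabled = false; // think glDisable(GL_BLEND)
    let c = cache.get_or_compile(state); // seen before: cache hit
    // prints "compiles = 2, a == c: true, a == b: false"
    println!("compiles = {}, a == c: {}, a == b: {}", cache.compiles, a == c, a == b);
}
```

A cache like this removes repeated compiles but not the first one, so a stutter the first time a new state combination is drawn remains possible, which is consistent with the "it works, kind of" verdict.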
 - mirsadm5 hours ago
 Vulkan is awful to work with and the drivers are buggy. Google's own phones are the worst for it. I have an app with a compute only vulkan pipeline and on the Google Pixel 10 the whole screen becomes corrupted with some fairly basic shaders.
- kllrnohj10 hours ago
 > Additionally most of these fixes aren't coming into Android
 The fuck are you talking about? Of course they'll come to Android.
 - pjmlp9 hours ago
 Thanks for showing the audience the lack of experience with Vulkan drivers on Android.
hmry14 hours ago
I'm really enjoying these changes. Going from render passes to dynamic rendering really simplified my code. I wonder how this new feature compares to existing bindless rendering.
From the linked video, "Feature parity with OpenCL" is the thing I'm most looking forward to.
- exDM6914 hours ago
 You can use descriptor heaps with existing bindless shaders if you configure the optional "root signature".
 However, it looks like it's simpler to change your shaders (if you can) to use the new GLSL/SPIR-V functionality (or Slang) and not specify the root signature at all (it's complex and verbose).
 Descriptor heaps really reduce the amount of setup code needed; with pipeline layouts gone you can drop like a third of the code needed to get started.
 Similar in magnitude to dynamic rendering.
 - flohofwoe13 hours ago
 Having quite recently written a (still experimental) Vulkan backend for sokol_gfx.h, my impression is that starting with `VK_EXT_descriptor_buffer` (soon-ish to be replaced with `VK_EXT_descriptor_heap`), the "core API" is in pretty good shape now. The remaining problem is that all the outdated and deprecated sediment layers are still part of the core API, and this should really be kicked out - e.g. when I explicitly request a specific API version like 1.4, I don't care about any features that have been deprecated in versions up to 1.4 and I don't care about any extensions that have been incorporated into the core API up until 1.4, so I'd really like to have them at least not show up in the Vulkan header so that code completion cannot sneak in outdated code (like EXT/KHR postfixes for things that have been moved into core).
 The current OpenGL-like sediment-layer model (i.e. never remove old stuff) is extremely confusing when you haven't followed Vulkan development very closely since 2016, since there are often 5 ways to do the same thing, 3 of which are deprecated - but finding out whether a feature is deprecated is its own sidequest.
 What I actually wrestled with most was getting the outer frame loop right without validation layer errors. I feel like this should be the next thing the "Eye of Khronos" focuses on.
 All official tutorial/example code I've tried doesn't run without swapchain-sync-related validation errors on one or another configuration. Even this 'best practices' example code, which demonstrates how to do the frame-loop scaffolding correctly, produces validation layer errors, so it's also quite useless:
 https://docs.vulkan.org/guide/latest/swapchain_semaphore_reuse.html
 What's worse: different hardware/driver combos produce different validation layer errors (even in the swapchain code, which really shouldn't have different implementations across GPU vendors - e.g. shouldn't Khronos provide common reference code for those GPU-independent parts of drivers?). I wonder if there is actually any Vulkan code out there which is completely validation-layer-clean across all possible configs (I seriously doubt it).
 Also, the VK_[EXT/KHR]_swapchain_maintenance1 extension, which is supposed to fix all those little warts, has such low coverage that it's not worth supporting (but it should really be part of the core API by now - the extension is from 2019).
 Anyway... baby steps in the right direction, only a shame that it took a decade ;)
 - reactordev12 hours ago
 Vulkan is by far the most powerful and the most pain in the ass API I've ever worked with. I agree on every point you just made.
 - jorvi10 hours ago
 Isn't the idea that 99% of people use a toolkit atop of Vulkan?
 Like, these days game devs just use Unreal Engine, which abstracts away having to work with the PS5 / PS4, DirectX 12, and Vulkan APIs.
 I imagine unless it's either for A. edification or B. very bespoke purpose code, you're not touching Vulkan.
 flohofwoe10 hours ago
 > Isn't the idea that 99% of people use a toolkit atop of Vulkan?
 This idea creates a serious chicken-egg problem.
 Two or three popular engine code bases sitting on top of Vulkan isn't enough 'critical mass' to get robust and high performance Vulkan drivers. When there's so little diversity in the code hammering on the Vulkan API, it's unlikely that all the little bugs and performance problems lurking in the drivers will be triggered and fixed, especially when most Unity or Unreal game projects will simply select the D3D11 or D3D12 backend since their main target platform on PC is Windows.
 Similar problem to when GLQuake was the only popular OpenGL game: as soon as your own code used the GL API in a slightly different way than Quake did, all kinds of things broke, since those GL drivers only properly implemented and tested the GL subset used by GLQuake, and with the specific function call patterns of GLQuake.
 From what I've seen so far, the Mesa Vulkan drivers on Linux seem to be in much better shape than the average Windows Vulkan driver. The only explanation I have for this is that there are hardly any Windows games running on top of Vulkan (instead they use D3D11 or D3D12), while running those same D3D11/D3D12 games on Linux via Proton always goes through the Vulkan driver. So on Linux there may be more 'evolutionary pressure' to get high quality Vulkan drivers indirectly via D3D11/D3D12 games that run via Proton.
 jorvi8 hours ago
 You might be unaware of this, but Vulkan Video Decode is slowly but surely replacing the disparate bespoke video decode acceleration on almost all platforms. Vulkan is mature. It has been used in production since 2013 (!) in the form of Mantle. I have no idea why all the Vulkan doomsayers here think it still needs a half-to-whole decade to be 'useful'.
 reactordev8 hours ago
 > "hardly any Windows games running on top of Vulkan" I run all my Windows games on Vulkan. https://www.pcgamingwiki.com/wiki/List_of_Vulkan_games
 flohofwoe6 hours ago
 280 games over 10 years really isn't impressive (2.5x less than even D3D8 which was an unpopular 'inbetween' D3D version and only relevant for about 2 years). D3D12 (890 games) isn't great either when compared to D3D11 (4.6k) or D3D9 (3.3k), it really demonstrates what a massive failure the modern 3D APIs are for real-world usage :/ I don't think those lists are complete, but they seem to show the right relative amount of 3D API usage across PC games.
 reactordev6 hours ago
 I’m just pointing out that Vulkan is supported on all major modern engines, internal and public. Some also go so far as to do DX12 (fine, it’s a similar feeling API) but what’s really amazing is taking all of those games that run on OpenGL, DirectX, etc and forcing them to run on Vulkan… Proton is amazing and the Wine project deserves your support.
 m-schuetz10 hours ago
 Many people need something in between Vulkan itself and the heavy frameworks, engines, or opinionated wrappers with questionable support that sit on top of it. OpenGL served that purpose perfectly, but it's unfortunately abandoned.
 quantummagic8 hours ago
 Isn't that what the Zink, ANGLE, or GLOVE projects are meant to provide? They allow you to program in OpenGL, which is then automatically translated to Vulkan for you.
 m-schuetz8 hours ago
 I don't see the point of those when I can just directly use OpenGL. Any translation layer typically comes with limitations or issues. Also, I'm not that glued to OpenGL, I do think it's a terrible API, but there just isn't anything better yet. I wanted Vulkan to be something better, but I'm not going to use an API with entirely pointless complexity with zero performance benefits for my use cases.
 reactordev8 hours ago
 Those are mostly designed for back porting and not new projects. OpenGL is dead for new projects.
 m-schuetz5 hours ago
 I do all my new projects in OpenGL and Cuda since I'm not smart enough for Vulkan.
 fc417fc8025 hours ago
 > OpenGL is dead for new projects. Says who? Why? It looks long term stable to me so I don't see the issue.
 reactordev5 hours ago
 DirectX 9 is long term stable so I don't see the issue... No current gen console supports it. Mac is stuck on OpenGL 4.1 (you can't even compile anything OpenGL on a Mac without hacks). Devices like Android run Vulkan more and more and are sunsetting OpenGL ES. No, OpenGL is dead. Vulkan/Metal/NVN/DX12/WebGPU are the current.
 fc417fc8025 hours ago
 The aforementioned abstraction layers exist. You had dismissed those as only suitable for backporting. Can you justify that? What exactly is wrong with using a long term stable API whether via the native driver or an abstraction layer? Edit: By the same logic you could argue that C89 is dead for new projects but that's obviously not true. C89 is eternal and so is OpenGL now that we've got decent hardware independent implementations.
 jplusequalt8 hours ago
 Wasn't it announced last year that it was getting a new mesh shader extension?
 reactordev8 hours ago
 No. There are literally dozens of in-house engines that run on Vulkan. Not everything is Unreal or Unity.
 jplusequalt8 hours ago
 > Like, these days game devs just use Unreal Engine. This is not true in the slightest. There are loads of custom 3D engines across many many companies/hobbyists. Vulkan has been out for a decade now, there are likely Vulkan backends in many (if not most) of them.
 - sho_hn12 hours ago
 Are there any good Vulkan tutorials that are continuously updated to reflect these advancements and ease of use improvements? It's a similar challenge to the many different historical strata of C++ resources.
 - jsheard12 hours ago
 https://howtovulkan.com is a recent one which targets the modern flavour of Vulkan that everything supports today. Well, all desktop hardware and drivers at least. God help you if you want to ship on Android.
 - dismalaf12 hours ago
 The one on Vulkan.org recently got updated to use dynamic rendering and a bunch of the newest features (plus modern C++, Slang instead of GLSL, etc...). https://docs.vulkan.org/tutorial/latest/00_Introduction.html
 - positron2612 hours ago
 Finding the optimal sub-language is about API coupling with client code, making a moving sweet spot for where bread & butter techniques live.
m-schuetz12 hours ago
I suspect we are only 5-10 years away until Vulkan is finally usable. There are so many completely needlessly complex things, or things that should have an easy-path for the common case. BDA, dynamic rendering and shader objects almost make Vulkan bearable. What's still sorely missing is a single-line device malloc, a default queue that can be used without ever touching the queue family API, and an entirely descriptor-free code path. The latter would involve making the NV bindless extension the standard which simply gives you handles to textures, without making you manage descriptor buffers/sets/heaps. Maybe also put an easy-path for synchronization on that list and make the explicit API optional. Until then I'll keep enjoying OpenGL 4.6, which already had BDA with C-style pointer syntax in GLSL shaders since 2010 (NV_shader_buffer_load), and which allows hassle-free buffer allocation and descriptor-set-free bindless textures.
- bvjgkbl8 hours ago
 I use Vulkan on a daily basis. Some examples: with DXVK to play games, and with llama.cpp to run local LLMs. Vulkan is already everywhere, from games to AI.
pixelpoet12 hours ago
I would like to / am "supposed to" use Vulkan but it's a massive pain coming from OpenCL, with all kinds of issues that need safe handling which simply don't come up in OpenCL workloads. Everyone keeps telling me OpenCL is deprecated (which is true, although it's also true that it continues to work superbly in 2026) but there isn't a good / official OpenCL to Vulkan wrapper out there to justify it for what I do.
Animats7 hours ago
Not sure if this is an "oh, no" event. So this goes into Vulkan. Then it has to ship with the OS. Then it has to go into intermediate layers such as WGPU. Which will probably have to support both old and new mode. Then it has to go into renderers. Which will probably have to support both old and new mode. Maybe at the top of the renderer you can't tell if you're in old or new mode, but it will probably leak through. In that case game engines have to know about this. Which will cause churn in game code. And Apple will do something different, in Metal. Unreal Engine and Unity have the staffs to handle this, but few others do. The Vulkan-based renderers which use Vulkan concurrency to get performance OpenGL can't deliver are few. Probably only Unreal Engine and Unity really exploit Vulkan properly. Here's the top level of the Vulkan changes.[1] It doesn't look simple. (I'm mostly grumbling because the difficulty and churn in Vulkan/WGPU has resulted in three abandoned renderers in Rust land through developer burnout. I'm a user of renderers, and would like them to Just Work.) [1] https://docs.vulkan.org/refpages/latest/refpages/source/VK_EXT_descriptor_heap.html
- nicebyte6 hours ago
 > Not sure if this is an "oh, no" event. it's not. descriptor sets are realistically never getting deprecated. old code doesn't have to be rewritten if it works. there's no point. if you're doing bindless (which you most certainly aren't if you're still stuck with descriptor sets) this offers a better way of handling that. if you care to upgrade your descriptor set based path to use heaps, this extension offers a very nice pathway to doing so _without having to even recompile shaders_. for new/future code, this is a solid improvement. if you're happy where you are with your renderer, there isn't a need to do anything.
 - p_l5 hours ago
 And apparently if you do mobile you stay away from a big chunk of dynamic rendering and use Vulkan 1.0 style renderpasses... or you leave performance on the floor (based on guidelines from various mobile GPU vendors)
 - Animats13 minutes ago
 Vulkan on mobile and web runs several years behind Vulkan on desktop. This is a problem for portable toolkits such as WGPU.
jabl10 hours ago
Does this evolution of the Vulkan API get closer to the model explained in https://www.sebastianaaltonen.com/blog/no-graphics-api which we discussed in https://news.ycombinator.com/item?id=46293062 ?
- rkevingibson10 hours ago
 Yes, you can get very close to that API with this extension + existing Vulkan extensions. The main difference is that you still kind of need opaque buffer and texture objects instead of raw pointers, but you can get GPU pointers for them and still work with those. In theory I think you could do the malloc API design there but it's fairly unintuitive in Vulkan and you'd still need VkBuffers internally even if you didn't expose them in a wrapper layer. I've got a (not yet ready for public) wrapper on Vulkan that mostly matches this blog post, and so far it's been a really lovely way to do graphics programming. The main thing that's not possible at all on top of Vulkan is his signals API, which I would enjoy seeing - it could be done if timeline semaphores could be waited on/signalled inside a command buffer, rather than just on submission boundaries. Not sure how feasible that is with existing hardware though.
- flohofwoe10 hours ago
 It's a baby-step in this direction, e.g. from Seb's article: > Vulkan’s VK_EXT_descriptor_buffer (https://www.khronos.org/blog/vk-ext-descriptor-buffer) extension (2022) is similar to my proposal, allowing direct CPU and GPU write. It is supported by most vendors, but unfortunately is not part of the Vulkan 1.4 core spec. The new `VK_EXT_descriptor_heap` extension described in the Khronos post is a replacement for `VK_EXT_descriptor_buffer` which fixes some problems but otherwise is the same basic idea (e.g. "descriptors are just memory").
HexDecOctBin13 hours ago
I personally just switched to using push descriptors everywhere. On desktops, the real world limits are high enough that it ends up working out fine and you get a nice immediate mode API like OpenGL.
- exDM6912 hours ago
  That's the right way to go for simple use cases and especially getting started on a new project.
socalgal29 hours ago
Vulkan takes like 600+ lines to do what Metal does in 50. I'm sure the comments will be all excuses and whys but they're all nonsense. It's just a poorly thought out API.
- wasmperson4 hours ago
 My understanding of API standards that need to be implemented by multiple vendors is that there's a tradeoff between having something that's easy for the programmer to use and something that's easy for vendors to implement. A big complaint I hear about OpenGL is that it has inconsistent behavior across drivers, which you could argue is because of the amount of driver code that needs to be written to support its high-level nature. A lower-level API can require less driver code to implement, effectively moving all of that complexity into the open source libraries that eventually get written to wrap it. As a graphics programmer you can then just vendor one of those libraries and win better cross-platform support for free. For example: I've never used Vulkan personally, but I still benefit from it in my OpenGL programs thanks to ANGLE.
- m-schuetz8 hours ago
 Agreed. It has way too much completely unnecessary verbosity. Like, why the hell does it take 30 lines to allocate memory rather than one single malloc.
 - nicebyte7 hours ago
 just use the vma library. the low level memory allocation interface is for those who care to have precise control over allocations. vma has shipped in production software and is a safe choice for those who want to "just allocate memory".
 - m-schuetz6 hours ago
 Nah, I know about VMA and it's a poor bandaid. I want a single-line malloc with zero care about usage flags and which only produces one single pointer value, because that's all that's needed in pretty much all of my use cases. VMA does not provide that. And Vulkan's unnecessary complexity doesn't stop at that issue, there are plenty of follow-up issues that I also have no intention of dealing with. Instead, I'll just use Cuda which doesn't bother me with useless complexity until I actually opt in to it when it's time to optimize. Cuda lets you easily get stuff done first and then look at the more complex stuff to optimize, unlike Vulkan which unloads the entire complexity on you right from the start, before you have any chance to figure out what to do.
 - nicebyte6 hours ago
 > I want a single-line malloc with zero care about usage flags and which only produces one single pointer value. That's not realistic on non-UMA systems. I doubt you want to go over PCIe every time you sample a texture, so the allocator has to know what you're allocating memory _for_. Even with CUDA you have to do that. And even with unified memory, only the implementation knows exactly how much space is needed for a texture with a given format and configuration (e.g. due to different alignment requirements and such). "just" malloc-ing gpu memory sounds nice and would be nice, but given many vendors and many devices the complexity becomes irreducible. If your only use case is compute on nvidia chips, you shouldn't be using vulkan in the first place.
 m-schuetz6 hours ago
 > Even with CUDA you have to do that. No you don't, cuMemAlloc(&ptr, size) will just give you device memory, and cuMemAllocHost will give you pinned host memory. The usage flags are entirely pointless. Why would UMA be necessary for this? There is a clear separation between device and host memory. And of course you'd use device memory for the texture data. Not sure why you're constructing a case where I'd fetch them from host over PCI, that's absurd. > only the implementation knows exactly how much space is needed for a texture with a given format and configuration. OpenGL handles this trivially, and there is also no reason for a device malloc to not also work trivially with that. Let me create a texture handle, and give me a function that queries the size that I can feed to malloc. That's it. No heap types, no usage flags. You're making things more complicated than they need to be.
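The "query the size, then malloc" flow proposed above can be sketched roughly as follows. This is only an illustration under stated assumptions: `TextureDesc`, `DeviceLayoutRules`, `alignUp` and `queryTextureSize` are hypothetical names, and the single row-pitch alignment rule is made up to show why a caller cannot simply compute width * height * bytes per pixel. Vulkan's actual version of this two-step idea is `vkGetImageMemoryRequirements`, which reports a size and alignment for an already created image.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical texture description; real APIs carry far more fields.
struct TextureDesc {
    std::uint32_t width;
    std::uint32_t height;
    std::uint32_t bytesPerPixel;
};

// Hypothetical per-device layout rule (assumed, not taken from any real GPU).
struct DeviceLayoutRules {
    std::size_t rowPitchAlignment;  // each row starts on a multiple of this; must be > 0
};

// Rounds value up to the next multiple of alignment.
constexpr std::size_t alignUp(std::size_t value, std::size_t alignment) {
    return (value + alignment - 1) / alignment * alignment;
}

// The "query" step: the implementation, not the caller, decides the size.
constexpr std::size_t queryTextureSize(const TextureDesc& t,
                                       const DeviceLayoutRules& r) {
    return alignUp(static_cast<std::size_t>(t.width) * t.bytesPerPixel,
                   r.rowPitchAlignment) * t.height;
}
```

With a 100x4 texture at 4 bytes per pixel, a device that pads rows to 256 bytes needs more memory than a tightly packed layout, which is the part of the problem both commenters agree has to be answered by the implementation.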
 nice_byte5 hours ago
 > No you don't, cuMemAlloc(&ptr, size) will just give you device memory, and cuMemAllocHost will give you pinned host memory. That's exactly what I said. You have to explicitly allocate one or the other type of memory. I.e. you have to think about what you need this memory _for_. It's literally just usage flags with extra steps. > Why would UMA be necessary for this? UMA is necessary if you want to be able to "just allocate some memory without caring about usage flags". Which is something you're not doing with CUDA. > OpenGL handles this trivially, OpenGL also doesn't allow you to explicitly manage memory. But you were asking for an explicit malloc. So which one do you want, "just make me a texture" or "just give me a chunk of memory"? > Let me create a texture handle, and give me a function that queries the size that I can feed to malloc. That's it. No heap types, no usage flags. Sure, that's what VMA gives you (modulo usage flags, which as we had established you can't get rid of). Excerpt from some code:
```
VmaAllocationCreateInfo vma_alloc_info = {
    .usage = VMA_MEMORY_USAGE_GPU_ONLY,
    .requiredFlags = VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT
};
VkImage img;
VmaAllocation allocn;
const VkResult create_alloc_vkerr = vmaCreateImage(
    vma_allocator,
    &vk_image_info, // <-- populated earlier with format, dimensions, etc.
    &vma_alloc_info,
    &img,
    &allocn,
    NULL);
```
 Since I don't care about resource aliasing, that's the extent of "memory management" that I do in my RHI. The last time I had to think about different heap types or how to bind memory was approximately never.
 m-schuetz5 hours ago
 No, it's not usage flags with extra steps, it's fewer steps. It's explicitly saying you want device memory without any kind of magical guesswork of what your numerous potential combinations of usage flags may end up giving you. Just one simple device malloc. Likewise, your claim about UMA makes zero sense. Device malloc gets you a pointer or handle to device memory, UMA has zero relation to that. The result can be unified, but there is no need for it to be. Yeah, OpenGL does not do malloc. I'm flexible, I don't necessarily need malloc. What I want is a trivial way to allocate device memory, and Vulkan and VMA don't do that. OpenGL is also not the best example since it also uses usage flags in some cases, it's just a little less terrible than Vulkan when it comes to texture memory. I find it fascinating how you're giving a bad VMA example and passing that off as exemplary. Like, why is there gpu-only and device-local? That VMA alloc info as a whole is completely pointless because a theoretical vkMalloc should always give me device memory. I'm not going to allocate host memory for my 3D models.
 mirsadm5 hours ago
 But what if you want both on a shared memory system?
 m-schuetz5 hours ago
 No problem: Then you provide an optional more complex API that gives you additional control. That's the beautiful thing about Cuda, it has an easy API for the common case that suffices 99% of the time, and additional APIs for the complex case if you really need that. Instead of making you go through the complex API all the time.
 nice_byte5 hours ago
 > It's explicitly saying you want device memory. You are also explicitly saying that you want device memory by specifying DEVICE_LOCAL_BIT. There's no difference. > Likewise, your claim about UMA makes zero sense. Device malloc gets you a pointer or handle to device memory, It makes zero sense to you because we're talking past each other. I am saying that on systems without UMA you _have_ to care where your resources live. You _have_ to be able to allocate both on host and device. > Like, why is there gpu-only and device-local. Because there's such a thing as accessing GPU memory from the host. Hence, you _have_ to specify explicitly that no, only the GPU will try to access this GPU-local memory. And if you request host-visible GPU-local memory, you might not get more than around 256 megs unless your target system has ReBAR. > a theoretical vkMalloc should always give me device memory. No, because if that's the only way to allocate memory, how are you going to allocate staging buffers for the CPU to write to? In general, you can't give the copy engine a random host pointer and have it go to town. So, okay now we're back to vkDeviceMalloc and vkHostMalloc. But wait, there's this whole thing about device-local and host visible, so should we add another function? What about write-combined memory? Cache coherency? This is how you end up with a zillion flags. This is the reason I keep bringing UMA up but you keep brushing it off.
 m-schuetz4 hours ago
 > You are also explicitly saying that you want device memory by specifying DEVICE_LOCAL_BIT. There's no difference. There is. One is a simple malloc call, the other uses arguments with numerous combinations of usage flags which all end up doing exactly the same, so why do they even exist. > You _have_ to be able to allocate both on host and device. cuMemAlloc and cuMemAllocHost, as mentioned before. > Because there's such a thing as accessing GPU memory from the host. Never had the need for that, just cuMemcpyHtoD and DtoH the data. Of course host-mapped device memory can continue to exist as a separate, more cumbersome API. The 256MB limit is cute but apparently not relevant in Cuda where I've been memcpying buffers with GBs in size between host and device for years. > No, because if that's the only way to allocate memory, how are you going to allocate staging buffers for the CPU to write to? With the mallocHost counterpart. cuMemAllocHost, so a theoretical vkMallocHost, gives you pinned host memory where you can prep data before sending it to device with cuMemcpyHtoD. > This is how you end up with a zillion flags. Apparently only if you insist on mapped/host-visible memory. This and usage flags never ever come up in Cuda where you just write to the host buffer and memcpy when done. > This is the reason I keep bringing UMA up but you keep brushing it off. Yes I think I now get why you keep bringing up UMA - because you want to directly access buffers between host or device via pointers. That's great, but I don't have the need for that and I wouldn't trust the performance behaviour of that approach. I'll stick with memcpy which is fast, simple, has fairly clear performance behaviours and requires none of the nonsense you insist on being necessary. But what I want isn't either this or that approach, I want the simple approach in addition to what exists now, so we can both have our cakes.
 MindSpunk3 hours ago
 What exactly is the difference between these? cuMemAlloc -> vmaAllocate + VMA_MEMORY_USAGE_GPU_ONLY; cuMemAllocHost -> vmaAllocate + VMA_MEMORY_USAGE_CPU_ONLY. It seems like the functionality is the same, just the memory usage is implicit in cuMemAlloc instead of being typed out? If it's that big of a deal write a wrapper function and be done with it? Usage flags never come up in CUDA because everything is just a bag-of-bytes buffer. Vulkan needs to deal with render targets and textures too which historically had to be placed in special memory regions, and are still accessed through big blocks of fixed function hardware that are very much still relevant. And each of the ~6 different GPU vendors across 10+ years of generational iterations does this all differently and has different memory architectures and performance cliffs. It's cumbersome, but can also be wrapped (i.e. VMA). Who cares if the "easy mode" comes in vulkan.h or vma.h, someone's got to implement it anyway. At least if it's in vma.h I can fix issues, unlike if we trusted all the vendors to do it right (they won't).
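A minimal sketch of the "write a wrapper function" suggestion above, showing only the shape of such a wrapper. Everything in it is hypothetical: `MemoryUsage`, `Allocation`, `allocate`, `deviceMalloc`, `hostMalloc` and `release` are illustration-only names, and plain `std::malloc` stands in for a real GPU allocator such as VMA, so no actual device memory is involved.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical stand-in for an allocator's usage flag (not the real VMA enum).
enum class MemoryUsage { GpuOnly, CpuOnly };

// Hypothetical allocation record: the pointer plus the usage it was made with.
struct Allocation {
    void*       ptr;
    std::size_t size;
    MemoryUsage usage;
};

// The explicit path: the caller spells out the usage every time.
// Assumption: std::malloc stands in for the real device/host allocation.
inline Allocation allocate(std::size_t size, MemoryUsage usage) {
    return Allocation{std::malloc(size), size, usage};
}

// The easy path: the usage is implied by the function name,
// mirroring the cuMemAlloc / cuMemAllocHost split discussed above.
inline Allocation deviceMalloc(std::size_t size) {
    return allocate(size, MemoryUsage::GpuOnly);
}

inline Allocation hostMalloc(std::size_t size) {
    return allocate(size, MemoryUsage::CpuOnly);
}

// Frees the backing memory and resets the record.
inline void release(Allocation& a) {
    std::free(a.ptr);
    a.ptr = nullptr;
    a.size = 0;
}
```

The design point is that the usage flag does not disappear; it just moves from the call site into the wrapper's name, which is the disagreement the two commenters keep circling.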
 nice_byte4 hours ago
 > I want the simple approach in addition what exists now, so we can both have our cakes. The simple approach can be implemented on top of what Vulkan exposes currently. In fact, it takes only a few lines to wrap that VMA snippet above and you never have to stare at those pesky structs again! But Vulkan the API can't afford to be "like CUDA" because Vulkan is not a compute API for Nvidia GPUs. It has to balance a lot of things, that's the main reason it's so un-ergonomic (that's not to say there were no bad decisions made. Renderpasses were always a bad idea.)
 m-schuetz4 hours ago
 > In fact, it takes only a few lines to wrap that VMA snippet above and you never have to stare at those pesky structs again! If it were just this issue, perhaps. But there are so many more unnecessary issues that I have no desire to deal with, so I just started software-rasterizing everything in Cuda instead. Which is way easier because Cuda always provides the simple API and makes complexity opt-in.
- pjmlp7 hours ago
 Same with DirectX, if only COM actually had better tooling, instead of pick your adventure C++ framework, or first class support for .NET.
 - flohofwoe6 hours ago
 DXGI+D3D11 via C is actually fine and is close to or even lower than Metal v1 when it comes to 'lines of code needed to get a triangle on screen'. D3D12 is more boilerplate-heavy, but still not as bad as Vulkan.
 - pjmlp5 hours ago
 I guess at least that way it is easier to have bindings. I like COM as an idea, but the tooling execution could be so much better.
jauntywundrkind10 hours ago
How are folks feeling about WebGPU these days? Once Vulkan is finally in good order, descriptor_heap and others, I really really hope we can get a WebGPU.next. Where are we at with the "what's next for webgpu" post, from 5 quarters ago? https://developer.chrome.com/blog/next-for-webgpu https://news.ycombinator.com/item?id=42209272
- hutao3 hours ago
 This is my point of view as someone who learned WebGPU as a precursor to learning Vulkan, and who is definitely not a graphics programming expert: My personal experience with WebGPU wasn't the best. One of my dislikes was pipelines, which is something that other people also discuss in this comment thread. Pipeline state objects are awkward to use without an extension like dynamic rendering. You get a combinatorial explosion of pipelines and usually end up storing them in a hash map. In my opinion, pipeline state objects are a leaky abstraction that exposes the way that GPUs work: namely that some state changes may require some GPUs to recompile the shader, so all of the state should be bundled together. In my opinion, an API for the web should be concerned with abstractions from the point of view of the programmer designing the application: which state logically acts as a single unit, and which state may change frequently? It seems that many modern APIs have gone with the pipeline abstraction; for example, SDL_GPU also has pipelines. I'm still not sure what the "best practices" are supposed to be for modern graphics programming regarding how to structure your program around pipelines. I also wish that WebGPU had push constants, so that I do not have to use a bind group for certain data such as transformation matrices. Because WebGPU is design-by-committee and must support the lowest common denominator hardware, I'm worried whether it will evolve too slowly to reflect whatever the best practices are in "modern" Vulkan. I hope that WebGPU could be a cross-platform API similar to Vulkan, but less verbose. However, it seems to me that by using WebGPU instead of Vulkan, you currently lose out on a lot of features. Since I'm still a beginner, I could have misconceptions that I hope other people will correct.
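The "store them in a hash map" pattern mentioned above usually looks something like the sketch below. It is API-agnostic and purely illustrative: `PipelineKey`, `Pipeline` and `PipelineCache` are hypothetical names, the key fields are an arbitrary subset of real pipeline state, and the "compile" step is just a counter where real code would call the graphics API's pipeline creation function.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <initializer_list>
#include <unordered_map>

// Hypothetical subset of the state baked into a pipeline object.
struct PipelineKey {
    std::uint32_t shaderId;
    std::uint32_t blendMode;
    std::uint32_t depthTest;    // 0 = off, 1 = on
    std::uint32_t colorFormat;

    bool operator==(const PipelineKey& o) const {
        return shaderId == o.shaderId && blendMode == o.blendMode &&
               depthTest == o.depthTest && colorFormat == o.colorFormat;
    }
};

struct PipelineKeyHash {
    std::size_t operator()(const PipelineKey& k) const {
        // Simple hash combine over the fields; good enough for a sketch.
        std::size_t h = std::hash<std::uint32_t>{}(k.shaderId);
        for (std::uint32_t v : {k.blendMode, k.depthTest, k.colorFormat}) {
            h ^= std::hash<std::uint32_t>{}(v) + 0x9e3779b9u + (h << 6) + (h >> 2);
        }
        return h;
    }
};

// Stand-in for a compiled pipeline handle.
struct Pipeline { std::uint32_t handle; };

class PipelineCache {
public:
    // Returns the cached pipeline for this state, "compiling" one on a miss.
    const Pipeline& get(const PipelineKey& key) {
        auto it = cache_.find(key);
        if (it == cache_.end()) {
            // Assumption: real code would create the API pipeline object here.
            it = cache_.emplace(key, Pipeline{nextHandle_++}).first;
        }
        return it->second;
    }

    std::size_t size() const { return cache_.size(); }

private:
    std::unordered_map<PipelineKey, Pipeline, PipelineKeyHash> cache_;
    std::uint32_t nextHandle_ = 1;
};
```

Every distinct combination of key fields becomes its own cache entry, which is where the combinatorial growth described in the comment comes from.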
- m-schuetz10 hours ago
 WebGPU is kinda meh, a 2010s graphics programmer's vision of a modern API. It follows Vulkan 1.0, and while Vulkan is finally getting rid of most of the mess like pipelines, WebGPU went all in. It's surprisingly cumbersome to bind stuff to shaders, and everything is static and has to be hashed & cached, which sucks for streaming/LOD systems. Nowadays you can easily pass arbitrary amounts of buffers and entire scene descriptions via GPU memory pointers to OpenGL, Vulkan, CUDA, etc. with BDA and change them dynamically each frame. But not in WebGPU, which does not support BDA and is unlikely to support it anytime soon. It's also disappointing that OpenGL 4.6, released in 2017, is a decade ahead of WebGPU.
 - kllrnohj10 hours ago
 WebGPU has the problem of needing to handle the lowest common denominator (so GLES 3 if not GLES 2 because of low end mobile), and also needing to deal with Apple's refusal to do anything with even a hint of Khronos (hence why no SPIR-V even though literally everything else including DirectX has adopted it). Web graphics have never been and will never be cutting edge, they can't be, as they have to sit on top of browsers that already have to have those features available to them. It can only ever build on top of something lower level. That's not inherently bad, not everything needs cutting edge, but "it's outdated" is also just inherently going to be always true.
 - m-schuetz10 hours ago
 I understand not being cutting-edge. But having a feature-set from 2010 is...not great. Also, some things could have easily been done differently and then been implemented as efficiently as a particular backend allows. Like pipelines. Just don't do pipelines at all. A web graphics API does not need them, WebGL worked perfectly fine without them. The WebGPU backends can use them if necessary, or not use them if more modern systems don't require them anymore. But now we're locked in to a needlessly cumbersome and outdated way of doing things in WebGPU. Similarly, WebGPU could have done without that static binding mess. Just do something like commandBuffer.draw(shader, vertexBuffer, indexBuffer, texture, ...) and automatically connect the call with the shader arguments, like CUDA does. The backend can then create all that binding nonsense if necessary, or not if a newer backend does not need it anymore.
 - flohofwoe10 hours ago
 > WebGL worked perfectly fine without them. Except it didn't. In the GL programming model it's trivial to accidentally leak the wrong granular render state into the next draw call, unless you always reconfigure all states anyway (and in that case PSOs are strictly better, they just include too much state). The basic idea of immutable state group objects is a good one, Vulkan 1.0 and D3D12 just went too far (while the state group granularity of D3D11 and Metal is just about right). > Similarly, WebGPU could have done without that static binding mess. This I agree with, pre-baked BindGroup objects were just a terrible idea right from the start, and AFAIK they are not even strictly necessary when targeting Vulkan 1.0.
 cmovq8 hours ago
 There should be a better abstraction to solve the GL state leakage problem than PSOs. We end up with a combinatory explosion of PSOs when some states they abstract are essentially toggling some bits in a GPU register in no way coupled with the rest of the pipeline state.
 flohofwoe6 hours ago
 That abstraction exists in D3D11 and to a lesser extent in Metal via smaller state-group-objects (for instance D3D11 splits the render state into immutable objects for rasterizer-state, depth-stencil-state, blend-state and (vertex-)input-layout-state, the last of which is not even needed anymore with vertex pulling). Even if those state group objects don't match the underlying hardware directly they still rein in the combinatorial explosion dramatically and are more robust than the GL-style state soup. AFAIK the main problem is state which needs to be compiled into the shader on some GPUs while other GPUs only have fixed-function hardware for the same state (for instance blend state).
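A back-of-the-envelope sketch of why splitting state into groups helps. The four groups match the D3D11 split named above, but the struct, the function names and any counts plugged in are hypothetical. It compares the worst-case number of monolithic objects (one per full combination, so a product) with the number of per-group objects (a sum); real applications rarely use every combination, so the product is an upper bound rather than a typical figure.

```cpp
#include <cstddef>

// Hypothetical counts of distinct settings an application uses per state group.
struct StateVariety {
    std::size_t rasterizer;
    std::size_t depthStencil;
    std::size_t blend;
    std::size_t inputLayout;
};

// Worst case for one monolithic object per full combination: the product.
constexpr std::size_t monolithicObjects(const StateVariety& v) {
    return v.rasterizer * v.depthStencil * v.blend * v.inputLayout;
}

// Separate immutable objects per group: the sum.
constexpr std::size_t splitObjects(const StateVariety& v) {
    return v.rasterizer + v.depthStencil + v.blend + v.inputLayout;
}
```

For example, with 3 rasterizer, 4 depth-stencil, 5 blend and 2 input-layout variants, the monolithic worst case is 3 * 4 * 5 * 2 = 120 objects against 3 + 4 + 5 + 2 = 14 split objects.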
 m-schuetz9 hours ago
 > Except it didn't. In the GL programming model it's trivial to accidentially leak the wrong granular render state into the next draw call. This is where I think Vulkan and WebGPU are chasing the wrong goal: To make draw calls faster. What's even faster, however, is making fewer draw calls and that's something graphics devs can easily do when you provide them with tools like multi-draw. Preferably multi-draw that allows multiple different buffers. Doing so will naturally reduce costly state changes with little effort.
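The "fewer draw calls, fewer state changes" argument above can be illustrated with a CPU-side sketch that groups draws by the state they need, so each state is bound once and followed by one multi-draw. `DrawItem`, `Batch`, `countStateChanges` and `buildBatches` are hypothetical names and no graphics API is called; the sketch also assumes draw order is free to change, which is not true for things like blended transparency.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical draw record: which state it needs and which mesh it draws.
struct DrawItem {
    std::uint32_t stateId;
    std::uint32_t meshId;
};

// One batch = one state bind followed by one multi-draw over its meshes.
struct Batch {
    std::uint32_t stateId;
    std::vector<std::uint32_t> meshIds;
};

// Number of state binds needed if draws are submitted in the given order.
inline std::size_t countStateChanges(const std::vector<DrawItem>& draws) {
    std::size_t changes = 0;
    for (std::size_t i = 0; i < draws.size(); ++i) {
        if (i == 0 || draws[i].stateId != draws[i - 1].stateId) ++changes;
    }
    return changes;
}

// Groups draws by state so each state is bound exactly once.
inline std::vector<Batch> buildBatches(std::vector<DrawItem> draws) {
    // stable_sort keeps the submission order of draws that share a state.
    std::stable_sort(draws.begin(), draws.end(),
                     [](const DrawItem& a, const DrawItem& b) {
                         return a.stateId < b.stateId;
                     });
    std::vector<Batch> batches;
    for (const DrawItem& d : draws) {
        if (batches.empty() || batches.back().stateId != d.stateId) {
            batches.push_back(Batch{d.stateId, {}});
        }
        batches.back().meshIds.push_back(d.meshId);
    }
    return batches;
}
```

Five draws that alternate between two states cost five state binds when submitted as-is, but collapse into two batches after grouping.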
 pjmlp7 hours ago
 Agreed, this is the console approach with command buffers that get DMAed, and having more code on the GPU side.
- flohofwoe10 hours ago
 I think in the end it all depends on Android. Average Vulkan driver quality on Android doesn't seem to be great in the first place. Getting up-to-date Vulkan API support, in high enough quality and with high enough performance for a modernized WebGPU version to build on, might be too much to ask of the Android ecosystem for the next one or two decades.
- pjmlp9 hours ago
 As always, the only two positive things about WebGL and WebGPU are being available on browsers, and having been designed for managed languages. They lag behind modern hardware, and after almost 15 years, there are zero developer tools to debug from browser vendors, other than the oldie SpectorJS that hardly counts.
 - socalgal21 hour ago
 This is kind of a ridiculous take. You can use wgpu or Dawn in a native app and use native tools for GPU debugging if that's what you want. You can then take that and also run it in the browser, and you can debug the browser in the same tools. Google it for instructions. The positive things about WebGPU are that it's actually portable, unlike Vulkan, and that it's easy to use, unlike Vulkan.
- yu3zhou410 hours ago
 I try my best to push ML things into WebGPU and I think it has a future, but performance is not there yet. I have little experience with Vulkan except toy projects, but WebGPU and Vulkan seem very similar
- Cloudef9 hours ago
 WebGPU is kinda meh. It's for when you need to do something in the browser that you can't do with WebGL. GLES is the compatibility king and runs pretty much everywhere, if not natively then through a compatibility layer like ANGLE. I'm sad that WebGPU killed WebGL 3, which was supposed to add compute shaders. Maybe WebGPU would've been more interesting if it hadn't been made to replace WebGL but had instead been a non-compatibility API targeting modern rendering and actually supporting SPIR-V.
sxzygz8 hours ago
Uuugh, graphics. So many smart people expending great energy to look busy while doing nothing particularly profound.
Graphics people, here is what you need to do.
1) Figure out a machine abstraction.
2) Figure out an abstraction for how these machines communicate with each other and the CPU on a shared memory bus.
3) Write a binary spec for code for this abstract machine.
4) Compilers target this abstract machine.
5) Programs submit code to the driver for AoT compilation, and cache the results.
6) The driver has some linker and dynamic module loading/unloading capability.
7) Signal the driver to start that code.
AMD64, ARM, and RISC-V are all basically differing binary specs for a C-machine+MMU+MMIO compute abstraction.
Figure out your machine abstraction and let us normies write code that’s accelerated without having to throw the baby out with the bathwater every few years.
Oh yes, give us timing information so we can adapt workload as necessary to achieve soft real-time scheduling on hardware with differing performance.
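Steps 5 to 7 of that list (submit a binary, compile it ahead of time, cache the result, then start it) can be sketched as a toy in Python. Everything here is hypothetical: `ToyDriver`, its methods and the fake "compilation" are stand-ins invented for illustration and do not correspond to any real driver interface.

```python
import hashlib


class ToyDriver:
    """Hypothetical driver front end that caches AoT-compiled code by content hash."""

    def __init__(self):
        self._cache = {}
        self.compile_count = 0

    def _compile(self, binary: bytes) -> bytes:
        # Stand-in for real code generation: tag the input so the output differs.
        self.compile_count += 1
        return b"native:" + binary

    def load(self, binary: bytes) -> bytes:
        """Return compiled code for `binary`, compiling only on a cache miss."""
        key = hashlib.sha256(binary).hexdigest()
        if key not in self._cache:
            self._cache[key] = self._compile(binary)
        return self._cache[key]

    def start(self, compiled: bytes) -> str:
        # Stand-in for "signal the driver to start that code".
        return "started " + compiled.decode()


driver = ToyDriver()
first = driver.load(b"kernel-v1")
second = driver.load(b"kernel-v1")  # cache hit, no recompilation
third = driver.load(b"kernel-v2")   # different binary, compiled once
status = driver.start(first)

print(driver.compile_count, status)
```

Keying the cache on a hash of the submitted binary means identical code is compiled once no matter how often it is submitted; a real cache would also have to account for the hardware and driver version the output was built for.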
- sxzygz4 hours ago
 I don’t know which of my detractors to respond to, so I’ll respond here. It should be clear that I’m only interested in compute and am not a GPU expert. GPUs, from my understanding, have lost the majority of their fixed-function units as they’ve become more programmable. Furthermore, GPUs clearly have a hidden scheduler, and this is not fully exposed by vendors. In other words, we have no control over what is being run on a GPU at any given instant; we simply queue work for it. Given all these contrivances, why should the interface exposed to the user not be absolutely simple? It should then be up to vendors to produce hardware (and co-designed compilers) to run our software as fast as possible. Graphics developers need to develop a narrow-waist abstraction for wide, latency-hiding, SIMD compute. On top of this, Vulkan, or OpenGL, or ML inference, or whatever can be done. The memory space should also be fully unified. This is what needs to be worked on. If you don’t agree, that’s fine, but don’t pretend that you’re not protecting entrenched interests from the likes of Microsoft, Nvidia, Epic Games, Valve and others. Telling people to just use Unreal Engine, or Unity, or even Godot, is just like telling people to just use Python, or TypeScript, or Go to get their sequential compute done. Expose the compute!
- dyingkneepad5 hours ago
 They have done it. The current modern abstraction is called Vulkan, and the binary spec code for this machine is called SPIR-V.
- flohofwoe6 hours ago
 Wow, you should get NVIDIA, AMD and Intel on the phone ASAP! Really strange that they didn't come up with such a simple and straightforward idea in the last 3 decades ;)
- M95D8 hours ago
 It sounds like webgl + wasm.
- nicebyte7 hours ago
 some of this is what khronos standards are theoretically supposed to achieve. surprise, it's very difficult to do across many hw vendors and classes of devices. it's not a coincidence that metal is much easier to program for. maybe consider joining khronos since you apparently know exactly how to achieve this very simple goal...
 - flohofwoe6 hours ago
  > it's not a coincidence that metal is much easier to program for
  Tbf, Metal also works on non-Apple GPUs, with only minimal additional hints to manage resources in non-unified memory.