6 comments

  • bheadmaster 7 hours ago
    > An understanding of READ_ONCE() and WRITE_ONCE() is important for kernel developers who will be dealing with any sort of concurrent access to data. So, naturally, they are almost entirely absent from the kernel's documentation.

    Made me chuckle.
    • semiquaver 6 hours ago
      More chuckles from the source:

      ```c
      /*
       * Yes, this permits 64-bit accesses on 32-bit architectures. These will
       * actually be atomic in some cases (namely Armv7 + LPAE), but for others we
       * rely on the access being split into 2x32-bit accesses for a 32-bit quantity
       * (e.g. a virtual address) and a strong prevailing wind.
       */
      ```
  • amelius 9 minutes ago
    The problem is that the compiler has too abstract a view of the underlying system.

    If reading twice would, say, launch missiles, then a read can be modeled as:

    ```python
    some_register = do_read()
    counter += 1
    if counter >= 2:
        launch_missiles()
    ```

    With this substitution, a compiler would never replace a single read with two reads.
  • staticassertion 5 hours ago
    > There are a couple of interesting implications from this outcome, should it hold. The first of those is that, as Rust code reaches more deeply into the core kernel, its code for concurrent access to shared data will look significantly different from the equivalent C code, even though the code on both sides may be working with the same data. Understanding lockless data access is challenging enough when dealing with one API; developers may now have to understand two APIs, which will not make the task easier.

    The thing is, it'll be far less challenging for the Rust code, which will actually define the ordering semantics explicitly (see the sketch below). That's the point of rejecting the READ_ONCE/WRITE_ONCE approach: with those, it's unclear what the goal is, what guarantee you actually want.

    I suspect that if Rust continues forward with this approach, it will basically end up as the code that someone goes to read for the actual semantics, to determine what the C code should do.
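    (For illustration only, not from the comment: a minimal Rust sketch of what "defining the ordering semantics explicitly" looks like with the standard atomics. The names DATA and READY are invented.)

    ```rust
    use std::sync::atomic::{AtomicBool, AtomicU64, Ordering};

    static DATA: AtomicU64 = AtomicU64::new(0);
    static READY: AtomicBool = AtomicBool::new(false);

    fn publish(v: u64) {
        DATA.store(v, Ordering::Relaxed);
        // Release pairs with the Acquire load below: a reader that observes
        // `true` is guaranteed to also observe the store to DATA.
        READY.store(true, Ordering::Release);
    }

    fn read() -> Option<u64> {
        if READY.load(Ordering::Acquire) {
            Some(DATA.load(Ordering::Relaxed))
        } else {
            None
        }
    }
    ```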
    • bjackman 2 hours ago
      In my experience, in practice, it usually isn't that hard to figure out what people meant by a READ/WRITE_ONCE().

      The most common cases I see are:

      1. I'm sharing data between concurrent contexts, but they are all on the same CPU (the classic is sharing a percpu variable between IRQ and task context).

      2. I'm reading some isolated piece of data that I know can change at any time, but it doesn't form part of a data structure or anything; it can't be "in an inconsistent state" as long as I can avoid load-tearing (classic case: a performance knob that gets mutated via sysfs). I just wanna READ it ONCE into a local variable, so I can do two things with it and know they both operate on the same value (see the sketch below).

      I actually don't think C++ or Rust have existing semantics that satisfy this kind of thing? So it will be interesting to see what they come up with.
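      (My sketch of case 2, not the commenter's code, using Rust's relaxed atomics; BATCH_SIZE is an invented knob.)

      ```rust
      use std::sync::atomic::{AtomicU32, Ordering};

      // An invented tunable, mutated elsewhere (e.g. from a sysfs-style handler).
      static BATCH_SIZE: AtomicU32 = AtomicU32::new(64);

      fn process(items: &[u8]) {
          // Read the knob exactly once into a local; both uses below then see
          // the same value even if the knob is changed concurrently.
          let batch = BATCH_SIZE.load(Ordering::Relaxed).max(1) as usize;
          for chunk in items.chunks(batch) {
              assert!(chunk.len() <= batch);
              // ... process `chunk` ...
          }
      }
      ```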
    • marcosdumay 4 hours ago
      > I suspect that if Rust continues forward with this approach, it will basically end up as the code that someone goes to read for the actual semantics, to determine what the C code should do.

      That will also put it in the unfortunate position of being the place that breaks every time somebody adds a bug to the C code.

      Anyway, given the cultures involved, it's probably inevitable.
      • mustache_kimono 3 hours ago
        > That will also put it in the unfortunate position of being the place that breaks every time somebody adds a bug to the C code.

        Can someone explain charitably what the poster is getting at? To me, the above makes zero sense. If the Rust code is what is implemented correctly, and has the well-defined semantics, then, when the C code breaks, it's obviously the C code's problem?
        • Sharlin 2 hours ago
          I think a charitable interpretation is that, given that the Rust code will be less forgiving, it will "break" C code and patterns that "used to work", albeit with latent UB or other nonobvious correctness issues. Now, obviously this is ultimately a good thing, and no developer worth their salt would seriously argue that latent bugs should stay latent, but as we've already seen, people have egos and aren't always exceedingly rational.
  • gpderetta 7 hours ago
    Very interesting. AFAIK the kernel explicitly gives consume semantics to READ_ONCE (and in fact it is not just a compiler barrier on Alpha), so technically lowering it to a relaxed operation is wrong.

    Does Rust have or need the equivalent of std::memory_order_consume? Famously this was deemed unimplementable in C++.
    • steveklabnik 7 hours ago
      It wasn’t implemented for the same reason. Rust uses C++20 ordering.
      • gpderetta 7 hours ago
        Right, so I would expect that the equivalent of READ_ONCE is converted to an acquire in Rust, even if slightly pessimal.

        But the article says that the suggestion is to convert them to relaxed loads. Is the expectation to YOLO it and hope that the compiler doesn't break control and data dependencies?
        • bonzini 7 hours ago
          There is a yolo way that actually works, which would be to change it to a relaxed load followed by an acquire signal fence.
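          (My sketch of that idea, not bonzini's code: a relaxed load followed by a compiler-only acquire fence, which constrains the compiler but emits no barrier instruction.)

          ```rust
          use std::sync::atomic::{compiler_fence, AtomicPtr, Ordering};

          // Hypothetical shared pointer, published elsewhere with a Release store.
          static HEAD: AtomicPtr<u64> = AtomicPtr::new(std::ptr::null_mut());

          fn read_once_style() -> *mut u64 {
              let p = HEAD.load(Ordering::Relaxed);
              // compiler_fence(Acquire) stops the compiler from moving later reads
              // before this load; it compiles to no instruction, so any run-time
              // ordering comes from the hardware's dependency rules, not this fence.
              compiler_fence(Ordering::Acquire);
              p
          }
          ```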
          • loeg 7 hours ago
            Is that any better than just using an acquire load?
            • gpderetta 7 hours ago
              It is cheaper on ARM and POWER. But I'm not sure it is always safe. The standard has very complex rules for consume to make sure that the compiler didn't break the dependencies.

              edit: and those rules were so complex that compilers decided they were not implementable or not worth it.
              • bonzini 3 hours ago
                The rules were there to explain what optimizations remained possible. Here no optimization is possible at the compiler level, and only the processor retains freedom because we know it won't use it.

                It is nasty, but it's very similar to how Linux does it (volatile read + __asm__("") compiler barrier).
                • gpderetta 54 minutes ago
                  In principle a compiler could convert the data dependency into a control dependency (for example, after PGO, by checking against the most likely value), and those are fairly fragile.

                  I guess in practice mainstream compilers do not do it, and relaxed + signal fence works for now, but the fact that compilers have been reluctant to use it to implement consume means that they are reluctant to commit to it.

                  In any case I think you work on GCC, so you probably know the details better than me.

                  edit: it seems that ARM specifically does not respect control dependencies. But I might be misreading the MM.
                • comex 3 hours ago
                  This is still unsound (in both C and Rust), because the compiler can break data dependencies by e.g. replacing a value with a different value known to be equal to it. A compiler barrier doesn't prevent this. (Neither would a hardware barrier, but with a hardware barrier it doesn't matter if data dependencies are broken.) The difficulty of ensuring the compiler will never break data dependencies is why compilers never properly implemented consume. Yet at the same time, this kind of optimization is actually very rare in non-pathological code, which is why Linux has been able to get away with assuming it won't happen.
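                  (An invented example of the kind of transformation being described; nothing here is from the thread, and compilers won't necessarily do this, but they are allowed to.)

                  ```rust
                  use std::sync::atomic::{AtomicUsize, Ordering};

                  static IDX: AtomicUsize = AtomicUsize::new(0);
                  static TABLE: [u32; 2] = [10, 20];

                  // As written, the load from TABLE carries an address dependency on IDX.
                  fn reader_as_written() -> u32 {
                      let i = IDX.load(Ordering::Relaxed);
                      TABLE[i & 1]
                  }

                  // A compiler that proves `i & 1` is 0 or 1 may rewrite the access as a
                  // select between two known values/addresses; the address dependency is
                  // gone, and with it the ordering that consume-style code relied on.
                  fn reader_after_hypothetical_optimization() -> u32 {
                      let i = IDX.load(Ordering::Relaxed);
                      if (i & 1) == 0 { TABLE[0] } else { TABLE[1] }
                  }
                  ```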
      • Fulgen 3 hours ago
        C++20 actually [changed the semantics of consume](https://devblogs.microsoft.com/oldnewthing/20230427-00/?p=108107), but Rust doesn't include it. And last I remember, compilers still treat it as acquire, so it's not worth the bytes it's stored in.
        • jcranmer 2 hours ago
          In the current drafts of C++ (I don't know which version it landed in), memory_order::consume is fully dead and listed as deprecated in the standard.
    • loeg 7 hours ago
      Does anything care about Alpha? The platform hasn't been sold in 20 years.
      • jcranmer 7 hours ago
        It's a persistent misunderstanding that release-consume is about Alpha. It's not; in fact, Alpha is one of the few architectures where release-consume *doesn't* help.

        In a TSO architecture like x86 or SPARC, every "regular" memory load/store is effectively a release/acquire by default. Using release/consume or relaxed provides no extra speedup on these architectures. In weak memory models, you need to add acquire barriers to get release/acquire semantics. But also, most weak memory models have a basic rule that a data-dependent load has an implicit ordering dependency on the values that computed it (most notably, loading *p has an implicit dependency on p; see the sketch below).

        The goal of release/consume is to be able to avoid having an acquire fence if you have only those dependencies: to promote a hardware data-dependency rule to a language-level semantic rule. For Alpha's ultra-weak model, you *still* need the acquire fence in this mode; it doesn't help Alpha one whit. Unfortunately, for various reasons, no one has been able to work out a language-level semantics for consume that compilers are willing to implement (preserving data dependencies through optimizations is a lot more difficult than it appears), so all compilers have remapped consume to acquire, making it useless.
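        (A sketch of the dependency-ordered pattern being described, not code from the comment; the RCU-ish names are invented.)

        ```rust
        use std::sync::atomic::{AtomicPtr, Ordering};

        struct Config { threshold: u64 }

        // Writers allocate a new Config and publish it with a Release store.
        static CURRENT: AtomicPtr<Config> = AtomicPtr::new(std::ptr::null_mut());

        fn reader() -> Option<u64> {
            // The dereference below is address-dependent on this load, so ARM and
            // POWER already order the two accesses in hardware; that is the ordering
            // consume was meant to expose. With Relaxed, the language itself promises
            // nothing, which is why portable code uses Acquire here instead.
            let p = CURRENT.load(Ordering::Relaxed);
            if p.is_null() {
                None
            } else {
                Some(unsafe { (*p).threshold })
            }
        }
        ```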
      • gpderetta 7 hours ago
        Consume is trivial on Alpha: it is the same as acquire (always needs a #LoadLoad). It is also the same as acquire (and relaxed) on x86 and SPARC (a plain load; #LoadLoad is always implied).

        The only place where consume matters is on relaxed-but-not-too-relaxed architectures like ARM and POWER, where consume relies on the implicit #LoadLoad of control and data dependencies.
        • bonzini 6 hours ago
          Also, on Alpha there are only store-store and full memory barriers. Acquire is very expensive.
          • gpderetta 36 minutes ago
            Indeed. On the other hand, ARM has recently added explicit load-acquire primitives which are relatively cheap, so converting a consume to an acquire is not a big loss (and Linus considered doing it for the kernel a while ago, just to avoid having to think too hard about compiler optimizations).
  • chrismsimpson 7 hours ago
    > The truth of the matter, though, is that the Rust community seems to want to take a different approach to concurrent data access.

    Not knowing anything about development of the kernel: does this kind of thing create a two-tier Linux development experience?
    • zaphar 7 hours ago
      Not sure if it introduces a tiered experience or not. But reading the article, it appears that the Rust devs advocated for an API that is clearer in its semantics, with the tradeoff that understanding how it interacts with C code now requires understanding two APIs. How this shakes out in practice remains to be seen.
      • thenewwazoo 4 hours ago
        Advocating for an API with clearer semantics has, afaict, been most of the actual work of integrating Rust into the kernel.
        • zaphar 4 hours ago
          That is my understanding from the outside as well. The core question here should, I think, be whether the adoption and spread of clearer semantics via Rust is worth the potential for confusion and misunderstandings at the boundaries between C and Rust. From the article it appears that this specific instance actually resulted in identifying issues in the usage of the C APIs here that are getting scrutiny and fixes as a result. That would indicate the introduction of Rust is causing the trend line to go in the correct direction in at least this instance.
          • thenewwazoo 4 hours ago
            That's been largely my experience of RIIR over years of work in numerous contexts: attempting to encode invariants in the type system results in identifying semantic issues, over and over.

            edit to add: and I'm not talking about compilation failures so much as design problems. When the meaning of a value is overloaded, or when there's a "you must do Y after X and never before" and then you can't write equivalent code in all cases, and so on. "But what does this *mean*?" becomes the question to answer.
  • epolanski 7 hours ago
    What is your take on these names instead of "atomic_read" and "atomic_write"?
    • gpm 6 hours ago
      The problem with atomic_read and atomic_write is that some people will interpret that as "atomic with a sequentially consistent ordering", some as "atomic with a relaxed ordering", and everything in between. It's a fine name for a function that takes an argument that specifies the memory ordering [1] (see the sketch below). It's not great for anything else.

      READ_ONCE and WRITE_ONCE convey that there's more nuance than that, and try to convey what the nuance is.

      [1] E.g. in Rust, anything that takes https://doc.rust-lang.org/std/sync/atomic/enum.Ordering.html
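      (An illustrative sketch of that kind of signature using Rust's std atomics; `atomic_read` here is a made-up free function, not a kernel API.)

      ```rust
      use std::sync::atomic::{AtomicU32, Ordering};

      // A name like "atomic_read" only says what you mean once the caller also
      // spells out the ordering, as Rust's own atomics force you to do.
      fn atomic_read(v: &AtomicU32, order: Ordering) -> u32 {
          v.load(order)
      }

      fn example(v: &AtomicU32) -> (u32, u32) {
          let relaxed = atomic_read(v, Ordering::Relaxed); // "just read it once"
          let seq_cst = atomic_read(v, Ordering::SeqCst);  // fully ordered
          (relaxed, seq_cst)
      }
      ```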
    • kccqzy 6 hours ago
      I think “atomic” implies something more than just “once”, because for atomics we customarily consider the memory order of the access, but “once” just implies reading and writing exactly once. Neither is a good name, because the kernel developers clearly assumed some kind of atomicity with some kind of memory ordering here, but just calling it “atomic” doesn’t convey that.
    • bjackman 2 hours ago
      Those things both exist in the kernel and they refer to CPU atomics similar to std::atomic in C++.