Aliasing

(xania.org)

96 points by ibobev 53 days ago

6 comments

  • turol 47 days ago
    For a real-world example of how this can affect code, check out this commit I made in Mesa: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20877/diffs?commit_id=4b8dfaae89eedd54f7f9881adc8712d99ff30a60
  • Ono-Sendai 46 days ago
    When you have done enough C++ you don't need to fire up Compiler Explorer; you just use local variables to avoid aliasing pessimisations.

    I also wrote about this a while ago: https://forwardscattering.org/post/51
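A minimal sketch of the local-variable trick mentioned above, in C. The function names are made up for illustration. In the first version the loop bound is read through a pointer of the same type as the stores, so the compiler generally has to assume a store to `dst[i]` could change `*count`; the second version copies the bound into a local first.

```c
#include <assert.h>
#include <stddef.h>

/* The bound is read through `count` on every test of the loop condition.
   Because dst and count both point at int, a store to dst[i] might
   modify *count, so the compiler usually cannot hoist the load. */
void fill_reload(int *dst, const int *count, int value) {
    for (int i = 0; i < *count; ++i) {
        dst[i] = value;
    }
}

/* Copying the bound into a local makes it independent of later stores,
   which leaves the optimizer free to keep it in a register. */
void fill_local(int *dst, const int *count, int value) {
    assert(dst != NULL && count != NULL);
    const int n = *count;
    for (int i = 0; i < n; ++i) {
        dst[i] = value;
    }
}
```

The two functions only behave the same when `dst` does not actually overlap `*count`; the local copy is the programmer asserting that by construction.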
    • dataflow 46 days ago
      I think this might not be a shortcoming of MSVC but rather a deliberate design decision. It seems unlikely that MSVC is failing to apply strict aliasing; rather, it's deliberately avoiding it, probably for compatibility with code that wasn't/isn't written to spec. And frankly it can be very onerous to write code that is 100% correct per the standard when dealing with e.g. memory-mapped files; I'm struggling to recall seeing a single case of this.
      • gpderetta 46 days ago
        AFAIK MSVC has never implemented TBAA, by design.
        • gpvos 46 days ago
          TBAA = type-based alias analysis
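For readers who have not met TBAA before, here is a small sketch in C of what it permits. The function names are invented for this example. With strict aliasing in effect (the default at higher optimization levels in GCC and Clang, and switched off with `-fno-strict-aliasing`), the compiler may assume an `int` object and a `float` object never overlap.

```c
#include <assert.h>

/* Different pointee types: under the effective-type rules the store
   through f cannot change *i, so a compiler doing TBAA may return the
   constant 1 without reloading *i. */
int store_int_then_float(int *i, float *f) {
    assert((void *)i != (void *)f); /* overlapping would be undefined */
    *i = 1;
    *f = 2.0f;
    return *i;
}

/* Same pointee type: a and b may legitimately point at the same int,
   so *a has to be reloaded after the store through b. */
int store_int_then_int(int *a, int *b) {
    *a = 1;
    *b = 2;
    return *a;
}
```

A compiler that does not implement TBAA simply treats the first function like the second and reloads the value.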
  • andrepd 46 days ago
    Thank you Rust for having aliasing guarantees on references!
  • adev_ 47 days ago
    Aliasing is no joke, and it is currently the only reason some arithmetic-intensive codebases still prefer Fortran.

    While it is possible to remove most aliasing performance issues in a C or C++ codebase, it is *a pain* to do it properly.
    • bregma 46 days ago
      Aliasing can be a problem in Fortran too.

      Decades ago I was a Fortran developer and encountered a very odd bug in which the wrong values were being calculated. After a lot of investigation I tracked it down to a subroutine call in which a hard-coded zero was being passed as an argument. It turned out that in the body of that subroutine the value 4 was being assigned to that parameter for some reason. The side effect was that the value of zero became 4 for the rest of the program execution, because Fortran aliases all parameters since it passes by descriptor (or at least DEC FORTRAN IV did so on RSX/11). As you can imagine, hilarity ensued.
      • pklausler 46 days ago
        How does this bug concern aliasing?
        • mrspuratic 46 days ago
          In old school FORTRAN (I only recall WATFOR/F77, my uni's computers were quite ancient) subroutine (aka "subprogram") parameters are call-by-reference. If you passed a literal constant it would be treated as a variable in order to be aliased/passed by reference. Due to "constant pooling", modifications to a variable that aliased a constant could then propagate throughout the rest of the program where that constant[sic] was used.

          "Passing constants to a subprogram": https://www.ibiblio.org/pub/languages/fortran/ch1-8.html
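The mechanism described above can be imitated in C. This is only an emulation under stated assumptions, not how any particular FORTRAN compiler was implemented: `pooled_zero` is a hypothetical stand-in for the single storage slot a compiler might share between every use of the literal 0, and `sub` plays the role of the subroutine that assigns to its by-reference parameter.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the compiler's constant pool entry for the literal 0. */
static int pooled_zero = 0;

/* Like the subroutine in the story: receives its argument by reference
   and assigns 4 to it. */
void sub(int *param) {
    assert(param != NULL);
    *param = 4;
}

/* Models "CALL SUB(0)" followed by a later use of the literal 0 that
   reads the same pooled slot. */
int zero_after_call(void) {
    sub(&pooled_zero);
    return pooled_zero;
}
```

After the call, every later "0" that reads the pooled slot sees 4, which matches the symptom in the anecdote.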
        • Etheryte 46 days ago
          It's literally in the description? Because of aliasing, a variable that should've been zero became four.
          • pklausler 46 days ago
            It wasn't a variable.
            • Etheryte 46 days ago
              It wasn't intended to be a variable, but it did become one. Its value varied, it's in the name.
              • pklausler 46 days ago
                But this is just Fortran's call-by-reference in action. It's not aliasing.
    • uecker 46 days ago
      Is it? You just add "restrict" where needed?

      https://godbolt.org/z/jva4shbjs
      • adev_ 46 days ago
        > Is it? You just add "restrict" where needed?

        Yes. That is the main solution and it is not a good one.

        1- `restrict` needs to be used carefully. Putting it everywhere in a large codebase can lead to pretty tricky bugs if aliasing does occur under the hood.

        2- `restrict` is not an official keyword in C++. C++ has always refused to standardize it because it plays terribly with almost any object model.
        • uecker 46 days ago
          Regarding "restrict", I don't think one puts it everywhere; adding it just to certain numerical loops which otherwise are not vectorized should be sufficient. FORTRAN seems even more dangerous to me. IMHO a better solution would be to have explicit notation for vectorized operations. Hopefully we will get this in C. Otherwise, I am very happy with C for numerics, especially with variably modified types.

          For C++, yes, I agree.
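As a concrete picture of "restrict on a numerical loop", here is a minimal C99 sketch. The function name `saxpy` and its signature are chosen for this example and are not taken from the linked Compiler Explorer snippet. The qualifiers promise that `x` and `y` do not overlap, so the compiler may vectorize without emitting a runtime overlap check.

```c
#include <assert.h>
#include <stddef.h>

/* y[i] = a * x[i] + y[i]. The restrict qualifiers are a promise by the
   caller that the two arrays do not overlap; breaking that promise is
   undefined behaviour, which is the risk raised in the thread. */
void saxpy(size_t n, float a, const float *restrict x, float *restrict y) {
    assert(n == 0 || x != y); /* cheap debug check, not a full overlap test */
    for (size_t i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

Keeping the annotation on a few hot kernels like this one, rather than spreading it across a codebase, limits how many call sites have to uphold the no-overlap promise.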
    • kryptiskt 46 days ago
      Support for arrays without having to mess with pointers is pretty attractive for number crunchers too.
  • Bootvis 47 days ago
    The whole series is excellent, and as a non-regular user of assembly I learned a ton.
  • artemonster 47 days ago
    I wonder how much potential optimisation there is if we entirely drop pointer nonsense.
    • aw1621107 47 days ago
      Are you talking about dropping pointers as a programmer-facing programming language concept (in which case you might find Hylo and similar languages interesting), or dropping pointers from *everything* - programming languages, their implementations, compilers, etc. (in which case I'm not sure that's even possible)?
      • artemonster 47 days ago
        Only the first one. Ofc under the hood they will stay, but I think it's time to ditch the random-access model and pull fetching and the concept of time closer to the programmer.
        • uecker 46 days ago
          This is basically what many functional programming languages do. This always came with plausible-sounding claims that it allows so much better optimization that these languages would soon surpass imperative programs in performance, but this never materialized (it still has not; even though Rust fans have now adopted this claim, it still isn't quite true). Also, control over explicit memory layout is still more important.
          • aw1621107 46 days ago
            Gah, can't believe I forgot about functional programming languages here :(

            > even though Rust fans now adopted this claim

            Did they? Rust's references seem pretty pointer-like to me on the scale of "has pointers" to "pointers have been entirely removed from the language".

            (Obviously Rust has *actual* pointers as well, but since usefully using them requires unsafe I assume they're out of scope here)
            • uecker 46 days ago
              What I meant is that Rust has stricter aliasing rules which make some optimizations possible without extra annotations, but this is balanced out by many other issues.
              • aw1621107 46 days ago
                Sure, but I think the presence/absence of aliasing is different from what GP was wondering/asking about, which was the removal of *pointers* from the programmer-facing model.
    • newpavlov 46 days ago
      For a system programming language the right solution is to properly track aliasing information in the type system, as done in Rust.

      Aliasing issues are just yet another instance of C/C++ inferiority holding the industry back. C could've learnt from Fortran, but we ended up with the language we have...
      • cv5005 46 days ago
        For systems programming the correct way is to have explicit annotations so you can tell the compiler things like:

          void foo(void *a, void *b, int n) {
              assume_aligned(a, 16);
              assume_stride(a, 16);
              assume_distinct(a, b);
              ... go and vectorize!
          }
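Part of this can already be approximated today. The sketch below assumes GCC or Clang, whose `__builtin_assume_aligned` builtin covers the alignment hint, and uses C99 `restrict` for the "distinct" promise. The function name `scale16` is made up for this example, and nothing here corresponds to the hypothetical `assume_stride` annotation in the comment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* dst[i] = 2 * src[i], with the caller promising that both buffers are
   16-byte aligned and do not overlap. Assumes GCC or Clang. */
void scale16(float *restrict dst, const float *restrict src, size_t n) {
    /* Debug-time check of the alignment promise made below. */
    assert((uintptr_t)dst % 16 == 0 && (uintptr_t)src % 16 == 0);

    float *d = __builtin_assume_aligned(dst, 16);
    const float *s = __builtin_assume_aligned(src, 16);

    for (size_t i = 0; i < n; ++i) {
        d[i] = 2.0f * s[i];
    }
}
```

As with the annotations proposed in the comment, nothing verifies these promises in a release build; a caller passing a misaligned or overlapping buffer gets undefined behaviour.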
        • newpavlov 46 days ago
          LOL, nope. Those annotations must be part of the type system (e.g. `&mut T` in Rust) and must be checked by the compiler (the borrow checker). The language can provide escape hatches like `unsafe`, but they should be rarely used. Without it you get a fragile footgunny mess.

          Just look at the utter failure of `restrict`. It was so rarely used in C that it took several years of constant nagging from Rust developers to iron out various bugs in compilers caused by it.
          • aw1621107 46 days ago
            Does make me wonder what restrict-related bugs will be (have been?) uncovered in GCC, if any. Or whether the GCC devs saw what LLVM went through and decided to try to address any issues preemptively.
            • newpavlov 45 days ago
              IIRC at least one of the `restrict` bugs found by Rust was reproduced on both LLVM and GCC.
            • gpderetta 46 days ago
              GCC has had restrict for 25 years, I think. I would hope most bugs have been squashed by now.
              • aw1621107 46 days ago
                Possibly? LLVM had been around for a while as well but Rust still ended up running into aliasing-related optimizer bugs.

                Now that I think about it some more, perhaps gfortran might be a differentiating factor? Not familiar enough with Fortran to guess as to how much it would exercise aliasing-related optimizations, though.
                • gpderetta 45 days ago
                  I think Fortran function arguments are assumed not to alias. I'm not sure if it matches C restrict semantics though.
                  • aw1621107 44 days ago
                    Yeah, that's why I was wondering whether GCC might have shaken out its aliasing bugs. Sibling seems to recall otherwise, though.