20 comments

  • WalterBright6 hours ago
    These sorts of things are fun and interesting. Compiler optimizations fall into two categories:<p>1. organized data flow analysis<p>2. recognizing a pattern and replacing it with a faster version<p>The first is very effective over a wide range of programs and styles, and is the bulk of the actual transformations. The second is a never-ending accumulation of patterns, where one reaches diminishing returns fairly quickly.<p>The example in the linked article is very clever and fun, but not really of much value (I've never written a loop like that in 45 years). As mentioned elsewhere "Everyone knows the Gauss Summation formula for sum of n integers i.e. n(n+1)/2" and since everyone knows it why not just write that instead of the loop!<p>Of course one could say that for any pattern, like replacing i*2 with i<<1, but those pattern replacements are very valuable because they are generated by high level generic coding.<p>And you could say I'm just being grumpy about this because my optimizer does not do this particular optimization. Fair enough!
    • gizmo6865 hours ago
      It&#x27;s not clear to me what optimizations the compiler actually did here. Years ago, I worked on a niche compiler, and was routinely surprised by what the optimizer was able to figure out; despite having personally written most of the optimization transformations myself.
      • steveklabnik57 minutes ago
        I can&#x27;t actually speak to the specifics here but usually this is &quot;idiom recognition&quot;, that is, it just notices that the pattern is there and transforms it directly.
    • Validark5 hours ago
      It might have more value than you think. If you look up SCEV in LLVM you&#x27;ll see it&#x27;s primarily used for analysis and it enables other optimizations outside of math loops that, by themselves, probably don&#x27;t show up very often.
  • bumholes10 hours ago
    The code that does this is here, if anyone is curious:<p><a href="https:&#x2F;&#x2F;github.com&#x2F;llvm&#x2F;llvm-project&#x2F;blob&#x2F;release&#x2F;21.x&#x2F;llvm&#x2F;lib&#x2F;Analysis&#x2F;ScalarEvolution.cpp" rel="nofollow">https:&#x2F;&#x2F;github.com&#x2F;llvm&#x2F;llvm-project&#x2F;blob&#x2F;release&#x2F;21.x&#x2F;llvm&#x2F;...</a><p><a href="https:&#x2F;&#x2F;github.com&#x2F;llvm&#x2F;llvm-project&#x2F;blob&#x2F;release&#x2F;21.x&#x2F;llvm&#x2F;lib&#x2F;Transforms&#x2F;Scalar&#x2F;IndVarSimplify.cpp" rel="nofollow">https:&#x2F;&#x2F;github.com&#x2F;llvm&#x2F;llvm-project&#x2F;blob&#x2F;release&#x2F;21.x&#x2F;llvm&#x2F;...</a>
    • vodou9 hours ago
      Almost 16000 lines in a single source code file. I find this both admirable and unsettling.
      • loeg8 hours ago
        Does it really matter where the lines are? 16,000 lines is still 16,000 lines.
        • vodou8 hours ago
          Even though I do find your indifference refreshing I must say: it does matter for quite a few people.
          • neerajsi4 hours ago
            If you want to recognize all the common patterns, the code can get very verbose. But it's all still just one analysis or transformation, so it would be artificial to split it into multiple files. I haven't worked much in LLVM, but I'd guess that the external interface to these packages is pretty reasonable and hides a large amount of the complexity that took 16 kLOC to implement.
          • MobiusHorizons7 hours ago
            If you don’t rely on IDE features or completion plugins in an editor like vim, it can be easier to navigate tightly coupled complexity if it is all in one file. You can’t really scan it or jump to the right spot as easily as smaller files, but in vim searching for the exact symbol under the cursor is a single character shortcut, and that only works if the symbol is in the current buffer. This type of development works best for academic style code with a small number (usually one or two) experts that are familiar with the implementation, but in that context it’s remarkably effective. Not great for merge conflicts in frequently updated code though.
        • jiggawatts3 hours ago
          ... yes.<p>If it was 16K lines of modular &quot;compositional&quot; code, or a DSL that compiles in some provably-correct way, that would make me confident. A single file with 16K lines of -- let&#x27;s be honest -- unsafe procedural spaghetti makes me much less confident.<p>Compiler code tends to work &quot;surprisingly well&quot; because it&#x27;s beaten to death by millions of developers throwing random stuff at it, so bugs tend to be ironed out relatively quickly, unless you go off the beaten path... then it rapidly turns out to be a mess of spiky brambles.<p>The Rust development team for example found a series of LLVM optimiser bugs related to (no)aliasing, because C&#x2F;C++ didn&#x27;t use that attribute much, but Rust can aggressively utilise it.<p>I would be much more impressed by 16K lines of provably correct transformations with associated Lean proofs (or something), and&#x2F;or something based on EGG: <a href="https:&#x2F;&#x2F;egraphs-good.github.io&#x2F;" rel="nofollow">https:&#x2F;&#x2F;egraphs-good.github.io&#x2F;</a>
          • mananaysiempre3 hours ago
            On the other end of the optimizer size spectrum, a surprising place to find a DSL is LuaJIT’s “FOLD” stage: <a href="https:&#x2F;&#x2F;github.com&#x2F;LuaJIT&#x2F;LuaJIT&#x2F;blob&#x2F;v2.1&#x2F;src&#x2F;lj_opt_fold.c" rel="nofollow">https:&#x2F;&#x2F;github.com&#x2F;LuaJIT&#x2F;LuaJIT&#x2F;blob&#x2F;v2.1&#x2F;src&#x2F;lj_opt_fold.c</a> (it’s just pattern matching, more or less, that the DSL compiler distills down to a perfect hash).
        • afiori7 hours ago
          Part of the issue is that it suggests that the code had a spaghettified growth; it is neither sufficient nor necessary but lacking external constraints (like an entire library developed as a single c header) it suggests that code organisation is not great.
          • anon2916 hours ago
            Hardware is often spaghetti anyway. There are a large number of considerations and conditions that can invalidate the ability to use certain ops, which would change the compilation strategy.<p>The idea of good abstractions and such falls apart the moment the target environment itself is not a good abstraction.
      • j-o-m3 hours ago
        I find the real question is: are all 16,000 of those lines required to implement the optimization? How much of that is dealing with LLVM’s internal representation and the varying complexity of LLVM’s other internal structures?
      • zahlman9 hours ago
        I do too, but I&#x27;m pretty sure I&#x27;ve seen worse.
    • bitwizeshift6 hours ago
      Thank you, bumholes
  • JonChesterfield11 hours ago
    That one is called scalar evolution, llvm abbreviates it as SCEV. The implementation is relatively complicated.
  • gslin11 hours ago
    More similar optimizations: <a href="https:&#x2F;&#x2F;matklad.github.io&#x2F;2025&#x2F;12&#x2F;09&#x2F;do-not-optimize-away.html" rel="nofollow">https:&#x2F;&#x2F;matklad.github.io&#x2F;2025&#x2F;12&#x2F;09&#x2F;do-not-optimize-away.ht...</a>
    • wging5 hours ago
      The beginning of that article is slightly wrong: the compiler should compute N(N-1)/2 (and does), because the original code adds up all the numbers from 0 to N <i>excluding N</i>. The usual formulation in math includes the upper bound: the sum of integers from 1 to N, <i>including N</i>, is N(N+1)/2, so you have to replace N by (N-1) if you want a formula for the sum where the last number is N-1.
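A small sketch of the off-by-one point above, written in Python purely for illustration (the function names are made up here, not taken from the article). It compares a loop whose upper bound is excluded with the two closed forms being discussed.

```python
def loop_sum_exclusive(n: int) -> int:
    """Add 0, 1, ..., n-1, mirroring a `for (i = 0; i < n; ++i)` loop."""
    total = 0
    for i in range(n):
        total += i
    return total


def closed_form_exclusive(n: int) -> int:
    """Sum of 0..n-1: the upper bound n is excluded, hence n(n-1)/2."""
    return n * (n - 1) // 2


def closed_form_inclusive(n: int) -> int:
    """Sum of 1..n: the textbook form n(n+1)/2, upper bound included."""
    return n * (n + 1) // 2
```

Substituting N-1 into the inclusive form gives the exclusive one, so `closed_form_inclusive(n - 1)` and `closed_form_exclusive(n)` agree for every n >= 1.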
    • Lvl999Noob6 hours ago
      Couldn&#x27;t the compiler optimise this still? Make two versions of the function, one with constant folding and one without. Then at runtime, check the value of the parameter and call the corresponding version.
      • saagarjha4 hours ago
        Yes, a sufficiently smart compiler can always tell you’re doing a benchmark and delete it. It’s just unlikely.
  • vatsachak10 hours ago
    Compilers can add way more closed forms. Would it be worth it?<p><a href="https:&#x2F;&#x2F;en.wikipedia.org&#x2F;wiki&#x2F;Wilf%E2%80%93Zeilberger_pair" rel="nofollow">https:&#x2F;&#x2F;en.wikipedia.org&#x2F;wiki&#x2F;Wilf%E2%80%93Zeilberger_pair</a>
  • Neywiny8 hours ago
    I&#x27;m once again surprised at GCC being slower than clang. I would have thought that GCC, which had a 20? year head start would&#x27;ve made faster code. And yet, occasionally I look into the assembly and go &quot;what are you doing?&quot; And the same flags + source into clang is better optimized or uses better instructions or whatever. One time it was bit extraction using shifts. Clang did it in 2 steps: shift left, shift right. GCC did it in 3 I think? I think it maybe shifted right first or maybe did a logical instead of arithmetic and then sign extended. Point is, it was just slower.
    • saagarjha4 hours ago
      GCC and Clang are largely similar when it comes to performance as each implements passes the other does not. It’s always possible to find examples where they optimize a piece of code differently and one comes out ahead of the other.
    • stmw7 hours ago
      Compiler know-how and the resources available during compilation made very significant progress between the gcc era and the LLVM/clang era.<p>gcc was and is an incredible achievement, but it is traditionally considered difficult to implement many modern compiler techniques in it. It's at least unpleasant, let's put it this way.
      • uecker6 hours ago
        Not sure whether this is generally true. GCC appears to have similar optimizations and I personally find LLVM&#x27;s code much more intimidating. But it is certainly true that LLVM seems to see more investment. I assume the license may also play a role. For comparison, here is some related code:<p><a href="https:&#x2F;&#x2F;github.com&#x2F;gcc-mirror&#x2F;gcc&#x2F;blob&#x2F;master&#x2F;gcc&#x2F;tree-chrec.cc" rel="nofollow">https:&#x2F;&#x2F;github.com&#x2F;gcc-mirror&#x2F;gcc&#x2F;blob&#x2F;master&#x2F;gcc&#x2F;tree-chrec...</a> <a href="https:&#x2F;&#x2F;github.com&#x2F;llvm&#x2F;llvm-project&#x2F;blob&#x2F;release&#x2F;21.x&#x2F;llvm&#x2F;lib&#x2F;Analysis&#x2F;ScalarEvolution.cpp" rel="nofollow">https:&#x2F;&#x2F;github.com&#x2F;llvm&#x2F;llvm-project&#x2F;blob&#x2F;release&#x2F;21.x&#x2F;llvm&#x2F;...</a>
      • saagarjha4 hours ago
        GCC has almost the same modern compiler techniques implemented.
    • fweimer6 hours ago
      Did it involve bitfields? GCC is notoriously bad at optimizing them. There are some target-specific optimizations, but pretty much nothing in the middle-end.
      • Neywiny3 hours ago
        It did, yes. On an architecture without bit field extracts.
  • cjdell3 hours ago
    This is really blurring the line between implementation and specification. You may think you're writing <i>the</i> implementation but it is really a proxy for the specification. In other words, the compiler is creating an illusion of an imperative machine.
  • Validark5 hours ago
    What&#x27;s actually way cooler about this is that it&#x27;s generic. Anybody could pattern match the &quot;sum of a finite integer sequence&quot; but the fact that it&#x27;s general purpose is really awesome.
  • dejj12 hours ago
    It’s neat. I wonder if someone attempted detecting a graph coloring problem to replace it with a constant.
    • emih10 hours ago
      Graph coloring is NP-hard so it would be very difficult to replace it with an O(1) algorithm.<p>If you mean graph coloring restricted to planar graphs, yes it can always be done with at most 4 colors. But it could still be less, so the answer is not always the same.<p>(I know it was probably not a very serious comment but I just wanted to infodump about graph theory.)
  • MobiusHorizons7 hours ago
    I will admit I was initially surprised Matt was not already familiar with this behavior given his reputation. I remember discovering it while playing with llvm intermediate representation 10 years ago in college. I would never have considered myself very knowledgeable about modern compilers, and have never done any serious performance work. In that case it had solved a recursion to a simple multiplication, which completely surprised me. The fact that Matt did not know this makes me think this pass may only work on relatively trivial problems that he would never have written in the first place, and therefore never have witnessed the optimization.
    • pwdisswordfishy6 hours ago
      He was: he brought up the very same example in a talk in 2017.<p><a href="https:&#x2F;&#x2F;www.youtube.com&#x2F;watch?v=bSkpMdDe4g4&amp;t=2640" rel="nofollow">https:&#x2F;&#x2F;www.youtube.com&#x2F;watch?v=bSkpMdDe4g4&amp;t=2640</a>
      • MobiusHorizons5 hours ago
        Ah that makes much more sense. I guess he means the optimization is surprising when you first discover it, which it certainly was for me!
  • Animats5 hours ago
    That&#x27;s neat.<p>A hard problem in optimization today is trying to fit code into the things complex SSE-type instructions can do. Someone recently posted an example where they&#x27;d coded a loop to count the number of one bits in a word, and the compiler generated a &quot;popcount&quot; instruction. That&#x27;s impressive.
    • mattgodbolt3 hours ago
      It may be a different post, but I covered this earlier this month in the same series of blog posts&#x2F;YouTube videos.
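For readers who have not seen the bit-counting idiom mentioned above, here is one common way to write such a loop, sketched in Python. This is an assumption about the shape of the loop; the post in question may have used a different form. An optimizer that recognizes the idiom in compiled code can replace the whole loop with a single population-count instruction where the target has one.

```python
def popcount_loop(x: int) -> int:
    """Count the set bits of a non-negative integer, one bit per iteration."""
    if x < 0:
        raise ValueError("this sketch only handles non-negative integers")
    count = 0
    while x:
        x &= x - 1  # clears the lowest set bit
        count += 1
    return count
```

The loop runs once per set bit, so its trip count is exactly the value it returns.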
  • tester7568 hours ago
    A lot of it is hardcoding and making expressions consistent, e.g. transforming a+3 into 3+a for easier pattern matching.
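A toy sketch of the canonicalization idea above, in Python. The tuple-based expression format and the `canonicalize` helper are hypothetical, invented for this illustration, and are not how any real compiler represents code. Putting operands of commutative operations into one fixed order means later pattern matchers only need to handle one shape.

```python
COMMUTATIVE = {"add", "mul"}


def canonicalize(expr):
    """Move integer constants to the left operand of commutative operations.

    Expressions are ints (constants), strs (variables), or
    (op, lhs, rhs) tuples; this encoding is an assumption of the sketch.
    """
    if not isinstance(expr, tuple):
        return expr
    op, lhs, rhs = expr
    lhs, rhs = canonicalize(lhs), canonicalize(rhs)
    if op in COMMUTATIVE and isinstance(rhs, int) and not isinstance(lhs, int):
        lhs, rhs = rhs, lhs  # a+3 becomes 3+a
    return (op, lhs, rhs)
```

Non-commutative operations such as subtraction are left alone, since swapping their operands would change the result.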
  • j16sdiz9 hours ago
    The first thing I had in mind was: the final answer needs to be divided by 2, and keeping the intermediate product from overflowing before the division needs some tedious work.
    • trehalose9 hours ago
      It&#x27;s not <i>very</i> tedious. Instead of dividing the product by 2, you can just divide whichever of x or x+1 is even by 2 before multiplying.
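The trick described above can be sketched as follows. Python integers never overflow, so the 32-bit limit is only modelled by a constant here; the names are illustrative assumptions.

```python
UINT32_MAX = 2**32 - 1  # stand-in for the limit of a 32-bit unsigned register


def triangular_no_overflow(x: int) -> int:
    """Compute x(x+1)/2, halving whichever factor is even before multiplying."""
    if x % 2 == 0:
        a, b = x // 2, x + 1
    else:
        a, b = x, (x + 1) // 2
    # One of x and x+1 is always even, so the division above is exact and
    # a * b is already the final result, never twice the final result.
    return a * b
```

For x = 92681 the naive product x*(x+1) exceeds `UINT32_MAX`, while the halved-first product is the final answer and still fits.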
  • vardump8 hours ago
    Only thing that surprised me was that GCC didn&#x27;t manage to optimize it. I expected it to be able to do so.
  • g0wda11 hours ago
    If you now have a function where you call this one with an integer literal, you will end up with a fully inlined integer answer!
    • loeg8 hours ago
      Could do that whether SCEV’d or not with C++20 consteval, lol.
  • mgaunard12 hours ago
    Those are just basic and essential optimizations, nothing too surprising here.<p>The sum of integers is actually a question I ask developers in interviews (works well from juniors to seniors), with the extra problem of what happens if we were to use floating-point instead of integers.
    • zipy12411 hours ago
      To those who don't know about compiler optimisation, the replacement with a closed form is rather surprising I'd say, especially if someone with Matt Godbolt's experience, of all people, says it is surprising.<p>Also, this series is targeted more towards an audience of compiler beginners, so it's likely to be surprising to that audience, even if not to you.
      • mattgrice8 hours ago
        Gauss supposedly did it when he was 7. The hardest part for the compiler is figuring out that you have a loop that computes that sum and does nothing else important.
        • saagarjha3 hours ago
          Unfortunately I don’t have a hiring pipeline filled with Gausses
      • CorrectHorseBat8 hours ago
        It&#x27;s something we saw in highschool, I would expect anyone with a CS degree to recognize this optimization.<p>I barely know anything about compiler optimization, so I have no clue whether a compiler applying this optimization is surprising or something trivial.
        • saagarjha3 hours ago
          Implementing this in a compiler is nontrivial.
          • CorrectHorseBat3 hours ago
            Yes, that was clear to me from the article and the discussion. My point is that someone who knows about Gauss' formula but doesn't know anything about compilers might not understand what the fuss is about.
    • f1shy10 hours ago
      Yeah. Pretty basic. Just 14k LOC<p><a href="https:&#x2F;&#x2F;github.com&#x2F;llvm&#x2F;llvm-project&#x2F;blob&#x2F;release&#x2F;21.x&#x2F;llvm&#x2F;lib&#x2F;Analysis&#x2F;ScalarEvolution.cpp" rel="nofollow">https:&#x2F;&#x2F;github.com&#x2F;llvm&#x2F;llvm-project&#x2F;blob&#x2F;release&#x2F;21.x&#x2F;llvm&#x2F;...</a>
      • jjmarr9 hours ago
        I would&#x27;ve assumed it was hardcoded. Not a generic solution for any loop involving a recurring variable.
    • nebezb10 hours ago
      <a href="https:&#x2F;&#x2F;www.npopov.com&#x2F;2023&#x2F;10&#x2F;03&#x2F;LLVM-Scalar-evolution.html" rel="nofollow">https:&#x2F;&#x2F;www.npopov.com&#x2F;2023&#x2F;10&#x2F;03&#x2F;LLVM-Scalar-evolution.html</a><p>“basic and essential” are interesting ways to describe the field of compiler optimization research.<p>Are you suggesting that the discovery and implementation of SCEV in LLVM is basic and essential? Or that summing integers in a range is basic and essential?
      • mgaunard5 hours ago
        I spoke in the context of coding those optimizations yourself.
    • ramraj0711 hours ago
      I'm curious what exactly you ask here. I consider myself to be a decent engineer (for practical purposes) but without a CS degree, and I likely would not have passed that question.<p>I know compilers can do some crazy optimizations but wouldn't have guessed it would transform something from O(n) to O(1). Having said that, I still don't feel this has much relevance to my actual job for the most part. Such performance knowledge seems to be very abstracted away from actual programming by database systems, or managed offerings like Spark and Snowflake, so unless you intend to work on these systems this knowledge isn't that useful (being aware these optimizations happen can be, though, for sure).
      • scuff3d11 hours ago
        He thinks it makes him look clever, or more likely subtly wants people to think "wow, this guy thinks something is obvious when Matt Godbolt found it surprising".<p>This kind of question is entirely useless in an interview. It's just a random bit of trivia that a potential hire either happens to have come across, or happens to remember from math class.
        • mgaunard5 hours ago
          I guess what&#x27;s surprising here is that compilers are able to perform those optimizations systematically on arbitrary code, not the optimizations themselves, which should be obvious to a human.
        • yeasku10 hours ago
          Trying to look smart by dissing Matt is not a good idea.
          • nickysielicki6 hours ago
            Have you considered that maybe Matt isn’t all that surprised by this optimization, but he is excited about how cool it is, and he wants readers of all backgrounds to also be excited about how cool it is, and is just feigning surprise so that he can share a sense of excitement with his audience?<p>It’s writing for effect.
            • yeasku5 hours ago
              Everybody who has seen any video of Matt knows that.<p>You can be surprised by things you have known for years.<p>For example, I am surprised every time I think about JS coalescing, even though I have known about it for decades.
              • nickysielicki5 hours ago
                The one that always gets me is what Truffle&#x2F;JRuby was capable of, ten years ago:<p><a href="https:&#x2F;&#x2F;x.com&#x2F;chrisgseaton&#x2F;status&#x2F;619885182104043520" rel="nofollow">https:&#x2F;&#x2F;x.com&#x2F;chrisgseaton&#x2F;status&#x2F;619885182104043520</a><p><a href="https:&#x2F;&#x2F;x.com&#x2F;chrisgseaton&#x2F;status&#x2F;619888649866448896" rel="nofollow">https:&#x2F;&#x2F;x.com&#x2F;chrisgseaton&#x2F;status&#x2F;619888649866448896</a>
          • mattgodbolt7 hours ago
            I dunno he can honestly be quite a jerk sometimes
          • f1shy10 hours ago
            AKA you get exactly the opposite…
        • nickysielicki10 hours ago
          Whether they get the question exactly right and can pinpoint the specific compiler passes or algebraic properties responsible for reductions like this is totally irrelevant and not what you’re actually looking for or asking about. It’s a very good jumping point for a conversation about optimization and testing whether they’re the type of developer who has ever looked at the assembly produced in their hotpath or not.<p>Anyone who dumbly suggests that loops in source code will always result in loops in assembly doesn’t have a clue. Anyone who throws their hands up and says, “I have no idea, but I wonder if there’s some loop invariant or algebraic trick that can be used to optimize this, let’s think about it out loud for a bit” has taken a compiler class and gets full marks. Anyone who says, “I dunno, let’s see what godbolt does and look through the llvm-opt pane” gets an explicit, “hire this one” in the feedback to the hiring manager.<p>It’s less about what they know and more about if they can find out.
          • scuff3d7 hours ago
            So in other words, it isn&#x27;t &quot;basic and essential optimizations&quot; that you would expect even a junior engineer to know (as your comment implies), but a mechanism to trigger a conversation to see how they think about problems. In fact, it sounds like something you <i>wouldn&#x27;t</i> expect them to know.
            • nickysielicki6 hours ago
              I didn’t write the GP comment. I wouldn’t call this basic and essential, but I would say that compilers have been doing similar loop simplifications for quite some time. I’d expect any mid to senior developer with C&#x2F;C++ on their resume to at least consider the possibility that the compiler can entirely optimize away a loop.<p>&gt; In fact, it sounds like something you wouldn&#x27;t expect them to know.<p>I’d go a step further, I don’t think <i>anyone</i>, no matter how experienced they are, can confidently claim that optimized assembly will or won’t be produced for a given loop. That’s why the best answer above is, “I dunno”. If performance really matters, you have to investigate and confirm that you’re getting good code. You can have an intuition for what you think <i>might</i> happen, and that’s a useful skill to have on its own, but it’s totally useless if you don’t also know how to confirm your suspicions.
              • mgaunard5 hours ago
                My question is in the context of doing those optimizations yourself, understanding what can be done to make the code more efficient and how to code it up, not the compiler engineering to make that happen.
                • nickysielicki4 hours ago
                  Yikes, gross. That’s like an option of last resort IMO. I’d rather maintain the clean loop-based code unless I had evidence that the compiler was doing the wrong thing and it was in my critical path.
                  • mgaunard4 hours ago
                    The compiler is only able to perform optimizations that leave observable behaviour unchanged.<p>For example it can only parallelize code which is inherently parallelizable to begin with, and unless you design your algorithm with that in mind, it's unlikely to be.<p>My belief is that it's better to be explicit, be it with low-level or high-level abstractions.
      • mgaunard5 hours ago
        My interview aims to assess whether the candidate understands that the dependency of each iteration on the previous one prevents effective utilization of a superscalar processor, knows the ways to overcome that, and whether the compiler is able to optimize that automatically, and if so when it absolutely cannot and why.<p>I generally focus more on sum of arbitrary data, but I used to also ask about a formulaic sum (linear to constant time) as an example of something a compiler is unlikely to do.<p>My thinking is that I expect good engineers to be able to do those optimizations themselves rather than rely on compilers.
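One way to break the loop-carried dependency described above is to keep several independent accumulators and combine them at the end. The sketch below shows only the shape of that transformation in Python; it will not make Python itself faster, and the choice of four chains is an arbitrary assumption.

```python
def sum_single_chain(values):
    """Every addition depends on the previous one: one long dependency chain."""
    total = 0
    for v in values:
        total += v
    return total


def sum_four_chains(values):
    """Four independent partial sums that a superscalar core could overlap."""
    acc = [0, 0, 0, 0]
    main = len(values) - len(values) % 4
    for i in range(0, main, 4):
        acc[0] += values[i]
        acc[1] += values[i + 1]
        acc[2] += values[i + 2]
        acc[3] += values[i + 3]
    tail = 0
    for v in values[main:]:  # leftover elements when len is not a multiple of 4
        tail += v
    return (acc[0] + acc[1]) + (acc[2] + acc[3]) + tail
```

For integers the two functions always agree. For floating point the reordering can change the rounded result, because floating-point addition is not associative: `(0.1 + 0.2) + 0.3` and `0.1 + (0.2 + 0.3)` differ in the last bit. That is the usual reason a compiler will not apply this rewrite to floats without permission.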
    • phh10 hours ago
      Since GCC is lacking such an essential optimization, you should consider having one of your junior interviewees contribute this basic optimization to mainline.
    • yeasku11 hours ago
      For Matt, the creator of Compiler Explorer, those are surprises.<p>For you they are essentials.<p>You and the juniors you hire must have deeper knowledge than he does.
      • porise11 hours ago
        You don&#x27;t have to be an expert in compiler design to make godbolt in fairness, although he does know a lot.<p>I spend a lot of time looking at generated assembly and there are some more impressive ones.
        • yeasku11 hours ago
          As I said, you must have deeper knowledge than he does.<p>It would be great if you shared it with the world like Matt does instead of being smug about it.
    • hypeatei11 hours ago
      What type of positions are you interviewing for? Software development is a big tent and I don&#x27;t think this would be pertinent in a web dev interview, for example.
    • bayesnet12 hours ago
      To provide the solution to the second part of the question, there is no closed-form solution. Since floating point math is not associative, there’s no O(1) optimization that can be applied that preserves the exact output of the O(n) loop.
      • zipy12411 hours ago
        Technically there is a closed-form solution as long as the answer is less than 2^24 for a float32 or 2^53 for a float64, since below those limits every integer can be represented exactly by a floating-point number, and integer addition carried out in floating point is exact as long as the result stays below those caps. I doubt a compiler would catch that one, but it technically could do the optimisation and get a bit-identical answer. Of course, this would not hold if the result were initialised to a non-integer number.
        • bayesnet11 hours ago
          A very good point! I didn’t think of that.
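The float32 claim above can be checked with a short experiment. Python floats are 64-bit, so this sketch emulates float32 by round-tripping every intermediate value through `struct`; the helper names are assumptions made for the illustration.

```python
import struct


def to_f32(x: float) -> float:
    """Round a Python float to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def f32_loop_sum(n: int) -> float:
    """Accumulate 0 + 1 + ... + (n-1), rounding to float32 after every add."""
    acc = to_f32(0.0)
    for i in range(n):
        acc = to_f32(acc + to_f32(float(i)))
    return acc


def closed_form(n: int) -> int:
    """Exact integer value of the same sum."""
    return n * (n - 1) // 2
```

With n = 5000 the exact sum is below 2^24, every partial sum is an exactly representable integer, and the emulated float32 loop matches the closed form. Past the cap that guarantee is gone: 2^24 + 1 itself is not representable in float32 and rounds to 2^24.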
      • dist-epoch10 hours ago
        This is why you have options like -ffast-math, to allow more aggressive but not 100% identical outcome optimizations.
    • f1shy10 hours ago
      I’m pretty sure making an algorithm that converts loops to closed forms (I’m sure it detects much more than just a summation) is a little bit complicated.<p>Maybe you have much more experience than Mr Godbolt in compilers.
    • xandrius11 hours ago
      Nothing is surprising once you know the answer. It takes some mental gymnastics to put yourself in someone else&#x27;s shoes before they discovered it and thus making it less &quot;basic&quot;.
    • rramadass9 hours ago
      Everyone knows the Gauss Summation formula for sum of n integers i.e. n*(n+1)&#x2F;2 but it is just nice to see it in GCC vs. Clang.
    • cratermoon11 hours ago
      <a href="https:&#x2F;&#x2F;xkcd.com&#x2F;1053&#x2F;" rel="nofollow">https:&#x2F;&#x2F;xkcd.com&#x2F;1053&#x2F;</a>
  • andrepd11 hours ago
    I&#x27;m actually surprised that gcc <i>doesn&#x27;t</i> do this! If there&#x27;s one thing compilers do well is pattern match on code patterns and replace with more efficient ones; just try pasting things from Hacker&#x27;s Delight and watch it always canonicalise it to the equivalent, fastest machine code.
    • nikic11 hours ago
      This particular case isn&#x27;t really due to pattern matching -- it&#x27;s a result of a generic optimization that evaluates the exit value of an add recurrence using binomial coefficients (even if the recurrence is non-affine). This means it will work even if the contents of the loop get more exotic (e.g. if you perform the sum over x * x * x * x * x instead of x).
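The binomial-coefficient evaluation mentioned above can be illustrated with a little arithmetic. The sketch below is not LLVM's implementation; it only demonstrates the underlying identity in Python. If a loop-carried value follows a polynomial recurrence, its value after n iterations is a fixed combination of the binomial coefficients C(n, 0), C(n, 1), and so on, with the weights given by the forward differences of the first few values. The function names are hypothetical.

```python
from math import comb


def forward_differences(values):
    """Return [f(0), Δf(0), Δ²f(0), ...] from the samples f(0), f(1), ..."""
    coeffs = []
    row = list(values)
    while row:
        coeffs.append(row[0])
        row = [b - a for a, b in zip(row, row[1:])]
    return coeffs


def power_sum_closed_form(power: int):
    """Build a loop-free evaluator for sum(x**power for x in range(n))."""
    # The running sum is a polynomial of degree power + 1 in n,
    # so power + 2 samples determine it completely.
    samples = []
    running = 0
    for n in range(power + 2):
        samples.append(running)  # value of the accumulator before iteration n
        running += n ** power
    coeffs = forward_differences(samples)

    def evaluate(n: int) -> int:
        # Exit value after n iterations: sum of coeffs[j] * C(n, j).
        return sum(c * comb(n, j) for j, c in enumerate(coeffs))

    return evaluate
```

For example, `power_sum_closed_form(1)` reduces to C(n, 2), which is n(n-1)/2, and `power_sum_closed_form(5)` evaluates the sum of fifth powers with seven terms regardless of n. That is why the optimization is not tied to the plain sum-of-integers case.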
    • f1shy10 hours ago
      Doing something like that with a pattern is obvious, but also useless, as it will catch very limited cases. The example presented is one known to have a closed form (it’s believed Gauss even discovered it when he was 6 years old). I’m sure this optimization will catch many other things, so it is not trivial at all.
  • maximgeorge9 hours ago
    [flagged]
  • dist-epoch10 hours ago
    &gt; I love that despite working with compilers for more than twenty years, they can still surprise and delight me.<p>This kind of optimization, complete loop removal and computing the final value for simple math loops, is at least 10 years old.
    • f1shy10 hours ago
      10 years is not a lot; it is almost “yesterday”. Things that have been done in a field for 10 years can still surprise experts in that field. With 30+ years of experience I still find relatively new things that are maybe 15 years old.<p>In topics like compiler optimization, it is not as if there are many books which describe this kind of algorithm.
    • nebezb10 hours ago
      Learning something old can be surprising. Enjoying that learning can be delightful.<p>Seems like the author is both surprised and delighted with an optimization they learned of today. Surely you’ve been in the same situation before.
  • phplovesong11 hours ago
    This exact content was posted a few months ago. Is this AI or just a copy paste job?
    • ForceBru7 hours ago
      You&#x27;re probably thinking of another post (<a href="https:&#x2F;&#x2F;xania.org&#x2F;202512&#x2F;11-pop-goes-the-weasel-er-count" rel="nofollow">https:&#x2F;&#x2F;xania.org&#x2F;202512&#x2F;11-pop-goes-the-weasel-er-count</a>) where an entire loop was optimized to a single instruction
    • mattgodbolt7 hours ago
      This exact content was only posted today? :)