4 comments

  • rbanffy0 minutes ago
    Interesting and fun read - we are well into the terrain of what was completely impossible to do back then. Now I can't wait to see a faster AppleSoft ROM ;-)
  • HarHarVeryFunny56 minutes ago
    If you assume that A * 10 isn&#x27;t going to overflow, so that ASL A moves 0 into the carry flag (so no need for CLC), then instead of using the undocumented RRA opcode, you can just do:<p>sta $00<p>asl a<p>asl a<p>adc $00<p>asl a<p>This is also 7 bytes, but is faster since adc $00 is 3 cycles, vs rra $00 being 5 cycles.<p>The A = max(A, X) example is certainly interesting, but not very useful since it loops through the code twice (very slow) and assumes that $8a is available. The much faster obvious version only adds one byte:<p>stx $00<p>cmp $00<p>bcs done<p>txa<p>done:
    • Aransentin36 minutes ago
      Sure. Note that I picked those examples to demonstrate the two fairly quirky classes of things the optimizer tends to find. If the programmer has different requirements they can specify that, and it&#x27;ll spit out the examples you gave (or something equivalent).
      • HarHarVeryFunny20 minutes ago
        I like the idea of exhaustive search, which the simplicity of the 6502 seems ideally suited for, but the search speed seems a bit limiting. I wonder if there&#x27;s not potential for more generation restriction (e.g. code can only use a specific N bytes of zero page) and heavy search pruning to speed it up? If it could generate optimal 20-30 op sequences in semi-reasonable time that&#x27;d make it very useful.
        • Aransentin13 minutes ago
          Having it only use operations that use a specific set of zero-page addresses is already supported, yep!<p>20-30 ops is probably impossible, unfortunately. The combinatorial explosion is just too enormous.
          • peterfirefly6 minutes ago
            e-graphs have been repeatedly reinvented across decades for many purposes. One of them is superoptimization.<p><a href="https:&#x2F;&#x2F;en.wikipedia.org&#x2F;wiki&#x2F;E-graph" rel="nofollow">https:&#x2F;&#x2F;en.wikipedia.org&#x2F;wiki&#x2F;E-graph</a><p><a href="https:&#x2F;&#x2F;www.cs.cornell.edu&#x2F;courses&#x2F;cs6120&#x2F;2025sp&#x2F;blog&#x2F;superopt&#x2F;" rel="nofollow">https:&#x2F;&#x2F;www.cs.cornell.edu&#x2F;courses&#x2F;cs6120&#x2F;2025sp&#x2F;blog&#x2F;supero...</a><p><a href="https:&#x2F;&#x2F;github.com&#x2F;philzook58&#x2F;awesome-egraphs" rel="nofollow">https:&#x2F;&#x2F;github.com&#x2F;philzook58&#x2F;awesome-egraphs</a>
  • kstrauser1 hour ago
    That’s incredibly clever and a fun read. Well done!<p>I imagine lots of demo coders glancing back and forth between that writeup and their own carefully hand-tuned assembly.
    • Aransentin1 hour ago
      Thank you!<p>Demo coding is indeed the primary usecase for this, and the reason for why I started tinkering on it in the first place. That, and people who make homebrewed NES &#x2F; C64 video games should find it fairly useful for optimizing tight loops and such.
  • vibecoderking9350 minutes ago
    Great