2 comments

  • yobbo 3 hours ago
    Looks very nice, but I can't find numerical gradient checks, which are helpful when verifying that the backward pass is correct: https://github.com/markusheimerl/gpt/blob/main/transformer/attention/attention.c
    • markusheimerl 47 minutes ago
      I deleted the numerical checks a while back, after confirming the backward pass is correct, to keep the code base lean. Running https://github.com/markusheimerl/gpt/blob/main/transformer/attention/test.c is also somewhat of a confirmation that the backward pass is correct, since an analytically incorrect backward pass can't fit perfectly to synthetic data.