This reminds me of <a href="https://dnhkng.github.io/posts/rys/" rel="nofollow">https://dnhkng.github.io/posts/rys/</a><p>David looks into the LLM, finds the layers that do the thinking, duplicates them, and puts the copies back to back.<p>This increases the LLM's scores with basically no overhead.<p>Very interesting read.
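<p>For a rough idea of what that looks like, here's a sketch in PyTorch/transformers (assuming a Hugging Face-style decoder model; the checkpoint and the layer range are stand-ins I picked for illustration, not the ones from the post):<p><pre><code>import copy
import torch.nn as nn
from transformers import AutoModelForCausalLM

# Small stand-in checkpoint; the post works with much larger models.
model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2-0.5B-Instruct")
layers = model.model.layers  # ModuleList of decoder blocks

start, end = 12, 20  # hypothetical "thinking" layer range
expanded = list(layers[:start])
for layer in layers[start:end]:
    expanded.append(layer)
    expanded.append(copy.deepcopy(layer))  # duplicate placed back to back
expanded.extend(layers[end:])

model.model.layers = nn.ModuleList(expanded)
model.config.num_hidden_layers = len(expanded)

# Recent transformers versions track a per-layer index for the KV
# cache, so renumber the blocks after splicing.
for i, layer in enumerate(model.model.layers):
    if hasattr(layer, "self_attn") and hasattr(layer.self_attn, "layer_idx"):
        layer.self_attn.layer_idx = i
</code></pre>
No retraining involved, which is presumably why the overhead is so low: the duplicated blocks just run their existing weights a second time, at the cost of some extra inference compute.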