3 comments

  • fc417fc802 47 minutes ago
    > they lack an explicit architecture for the executive control of attention found in humans

    Deceptive terminology strikes again! The "attention" mechanism in transformers appears (to my understanding at least) to have about as much to do with human attention as the "neurons" in a multi-layer perceptron have to do with biological neurons.

    That said, the core premise of building in something that mimics executive function is an intriguing one (which I assume has been explored before, but it's not something I'm familiar with).
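
For readers unfamiliar with what "attention" means in a transformer, here is a minimal sketch in plain Python of scaled dot-product attention for a single query. It is only an illustration: the function names (`softmax`, `attend`) and the toy vectors are made up for this example, and real models add learned projections, multiple heads, and masking. Mechanically the operation is a softmax-weighted average of value vectors, which is a useful thing to keep in mind when comparing it to human attention.

```python
import math

def softmax(scores):
    """Turn raw scores into non-negative weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    """Scaled dot-product attention for one query vector.

    Returns (output, weights). The output is a weighted average of
    `values`, with weights softmax(query . key / sqrt(d)).
    """
    d = len(query)
    scores = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
        for key in keys
    ]
    weights = softmax(scores)
    dim = len(values[0])
    output = [
        sum(w * v[i] for w, v in zip(weights, values))
        for i in range(dim)
    ]
    return output, weights

# Toy data (hypothetical): the query lines up with the first key,
# so the first value should dominate the average.
keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0]]
output, weights = attend([2.0, 0.0], keys, values)
```

Nothing in this computation selects or inhibits anything in an executive sense; every key always receives some nonzero weight.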
  • ivanvoid 1 hour ago
    This is a nice study, but I don't think it's actually a good argument.
  • quotemstr 1 hour ago
    The first thing I do when I see a paper that claims transformers fundamentally can't do X or Y is to look at the models under test:

    > To evaluate generalizability, we conducted tests of GPT-5 (41), Claude Opus 4.1 (42), and Gemini 2.5 Pro (43) from 2025 September

    The problem with empirical negative results on LLMs is that they can't rule out that the alleged deficiencies disappear with increased scale and the right fine-tuning. It's like saying my dog has trouble with subject-verb agreement, so meat brains are "fundamentally limited in their capacity for grammar".

    I can accept that current LLMs (even the latest generation) might exhibit cognitive gaps similar to those we see in humans with deficient executive function, but I can't accept these gaps as evidence of fundamental limits of the transformer *architecture*. LLMs are universal function approximators. Executive function is a function. Yes, yes, it's well-known that transformers have a circuit complexity limit set by layer count and whatever. The limit disappears once you allow for autoregression. Nobody cares about the limits of AI inside a single forward pass.

    I have high confidence that with the right sort of training, executive function gaps in LLMs can be addressed. I'm not convinced that the problem is the architecture per se.
    • derbOac 8 minutes ago
      You might be completely correct, although my hunch is this is something that would require a change in architecture rather than increases in scale.

      The failure points happen in a fairly simple task (Stroop) with increases in repetition of trials. It's not like the number of colors or color words is increasing, which is the sort of thing I might expect if it had to do with the size of the LLM.

      On the other hand, who knows. I agree that model scale changes make a lot of things a moving target.

      At first I thought this paper was kind of odd, but then I felt like it was maybe onto something important. Intuitively I could see the possibility that whatever is causing this failure in the Stroop task might be related to the tendency of LLMs to be "derailable".
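
To make the setup described above concrete, here is a rough sketch of a Stroop-style trial generator in which the color vocabulary stays fixed and only the number of trials grows. This is not the paper's protocol: the names (`COLORS`, `make_stroop_trials`, `word_reader`) and the three-color vocabulary are assumptions for illustration, and the "responder" is a stand-in that always reads the word rather than naming the ink.

```python
import random

# Fixed vocabulary (assumed); only the trial count varies below.
COLORS = ["red", "green", "blue"]

def make_stroop_trials(n_trials, seed=0):
    """Build n_trials Stroop items: a color word shown in some ink color.

    The correct answer is always the ink color, not the word.
    """
    rng = random.Random(seed)
    trials = []
    for _ in range(n_trials):
        word = rng.choice(COLORS)
        ink = rng.choice(COLORS)
        trials.append(
            {"word": word, "ink": ink, "congruent": word == ink, "answer": ink}
        )
    return trials

def accuracy(trials, responses):
    """Fraction of responses that name the ink color."""
    correct = sum(r == t["answer"] for t, r in zip(trials, responses))
    return correct / len(trials)

def word_reader(trials):
    """Stand-in responder that fails in the classic way: it reads the word."""
    return [t["word"] for t in trials]

# Same color set, increasing sequence length.
results = {}
for n in (10, 100, 1000):
    trials = make_stroop_trials(n)
    results[n] = accuracy(trials, word_reader(trials))
```

Swapping `word_reader` for a function that queries a model would let you plot accuracy against sequence length while holding the vocabulary constant, which is the comparison the comment is pointing at.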