8 comments

  • dwa3592 21 hours ago
    Why weren't these attacks tested on the frontier models? The models they tested these on can also be fooled by poems and rhymes.
  • simonw 22 hours ago
    It concerns me that anyone with anything important to protect might trust what this paper calls "Injection detectors deployed to protect LLM agents" - Llama Guard and the like.

    There are unlimited combinations of tokens that can be used to attack an LLM system. The idea that some kind of "detector" can catch them all just feels inherently absurd to me.
  • buppermint 22 hours ago
    The paper title is a bit misleading. The tested detectors and models here are small and rather dated (Llama 3.1 8B and Gemini Flash 2.0 - these are basically at the level of a modern 1B model), and the actual paper says this only shows vulnerability in small-model systems.
  • BarryMilo 22 hours ago
    This is an "uh oh" moment, isn't it?
  • yurukusa 21 hours ago
    [flagged]
  • [flagged]
  • hottrends 21 hours ago
    [flagged]
  • aaditya79 19 hours ago
    [dead]