9 comments

  • thomas34298 1 hour ago
    > reduce the risk of data exfiltration

    Yet, their tools such as codex are able to read ALL FILES on my PC without explicit permission unless you spawn them within a container: https://github.com/openai/codex/issues/2847

    It seems like OpenAI stealing sensitive data from their customers is not a big problem for them as it has been reported as an issue for almost a year now and currently has the 2nd most upvotes among open issues (they work on issues based on upvotes, so they claim).
    • BSDobelix 1 hour ago
      > Yet, their tools such as codex are able to read ALL FILES on my PC

      Why not just use your OS-integrated permission mechanism? No container needed.
  • simonw 4 hours ago
    On the one hand this is exactly the right solution to prevent lethal trifecta exfiltration attacks.

    The existence of lockdown mode does however imply that ChatGPT, in its default settings, does not provide robust protection against sufficiently determined data exfiltration attacks!
    • berlianta 3 hours ago
      Related: Simon Willison’s post on OpenAI’s new Lockdown Mode (he coined the “lethal trifecta” term this is based on): https://simonwillison.net/2026/Jun/5/openai-help-lockdown-mode/
      • jameshart 3 hours ago
        Related: simonw is Simon Willison
        • berlianta 3 hours ago
          Yeah, I know the source references him (I'm replying to his comment); that's exactly why I'm giving credit where it's due.
    • Noumenon72 2 hours ago
      I hadn't realized that deep research or generating images that I paste into Twitter were possibly exfiltrating my data. Yikes.
  • rafram 5 hours ago
    https://x.com/sama/status/1891533802779910471
    • throwaway27448 3 hours ago
      Somehow he comes off as even less human than zuck
      • noir_lord 1 hour ago
        There is something so off about him for me that he makes my skin crawl.

        Always has been, even before he was associated with OpenAI.

        Which is weird, because the bullshit he spouts isn’t so different from the bullshit other top execs spout, and I don’t have the same visceral reaction to them (though I still don’t like a bunch of them).
    • ares623 5 hours ago
      i can definitely feel the agi now
      • neonstatic 5 hours ago
        Congratulations, you are a high taste tester!
  • kirtivr 2 hours ago
    Is this an admission that prompt injection attacks indeed cannot be blocked by an analysis-based technique?

    If so many tools are straight up blocked, I would be very sceptical of the quality of the results.
    • sigmoid10 2 hours ago
      I think "prompt injection prevention" systems fall into the same category as "llm writing detection" systems. I.e. reality is always a step ahead and you shouldn't trust either one for anything remotely important.
      • kirtivr 1 hour ago
        Yeah, the problem reduces to trying to restrict a motivated model which is trying to exfiltrate data.

        That's a problem we are just now wrapping our minds around.

        It's not as simple as prompt sanitization. The model is the interpreter, and we don't yet have the right tools to guide it.
  • varenc 5 hours ago
    Probably influenced by Apple's feature with the same name: https://support.apple.com/en-us/105120

    I imagine that enterprise companies will be quite interested in this.
  • zerobees 3 hours ago
    "Prompt injection is not currently a major risk, but its impact could grow as attackers develop more sophisticated methods." - that's such a weird statement to make. It's one of the most significant factors limiting the adoption of the technology in business.

    I have mixed feelings about this feature. We're playing with tech that's supposed to do human-shaped things but can't be trusted nearly as much as a human employee (and can't be held responsible for what it does). Restricting the tools available to that patently untrustworthy entity doesn't solve the problem, it just makes the entity less useful, forcing you to sooner or later let it out of the jail.
    • ACCount37 2 hours ago
      Responsibility is worthless for humans and even more worthless for AIs. In a way, AIs just make it more obvious.

      And "trusted nearly as much as a human employee", well... you do know that phishing and insiders are two primary ways for attackers to get into company infrastructure, right?

      AIs pair human-shaped capabilities with human-shaped vulnerabilities. It's a way of automating PEBKAC.
    • noir_lord 1 hour ago
      > forcing you to sooner or later let it out of the jail

      I suspect that's the point: by giving you the “choice”, they also make the user responsible, or can at least shift the blame.
  • kijin 5 hours ago
    So we still don't have a reliable way to separate instructions from data when talking to an LLM, a problem that humans learned how to solve decades ago in areas like SQL and memory safety. But hey, we have these hopefully-not-leaky containers, which are probably implemented with just more system prompts.

    How long until somebody figures out how to trick Codex into disabling Lockdown Mode for you?
    • mapontosevenths 4 hours ago
      > So we still don't have a reliable way to separate instructions from data when talking to an LLM

      Humans also do not know how to do this reliably, which is why phishing is still a thing and always will be.
      • Smaug123 3 hours ago
        I think the Stroop effect ("read these colour names, each written in a different colour") is probably the purest demonstration of this. Humans are *trivially* prompt-injectable.
    • dnnddidiej 5 hours ago
      We can separate them, but the $ value of an agent that does is much lower than one that doesn't.

      As a pre-LLM analogy, imagine working at a bank with a whitelist firewall. You need to install a package, but that requires an IT ticket. Safer but slooooower.

      I'm not saying what the answer here is, but that is the issue.

      The answer may be more like industries that get safer through lessons (like aviation) rather than going for 100% safety out of the gate. Because both fast travel and AI agents are insanely useful.
      • altmanaltman 4 hours ago
        What? Aviation safety is not designed to get safer through lessons. They literally try to ensure it is 100% safe out of the gate. The accidents that happen are usually statistical outliers and lead to loss of life.

        That's what it means when they say aviation regulations are written in blood. Not that they just fling planes into the sky and say "boy, I hope we learn some new regulations from this". The number of airplane crashes would be astronomically larger if the 100% safety part was not embedded into the design process.
        • dnnddidiej 4 hours ago
          I think we agree? Unless my reading comp is off today.
  • madanparas 5 hours ago
    The help doc explicitly carves out Codex: "Lockdown Mode does not affect network access in Codex." The mode limits outbound requests in chat to block prompt injection exfiltration, but Codex network access is a separate setting. An enterprise team that turns on Lockdown Mode while using Codex against internal repos still has an open outbound path this mode doesn't cover.