4 comments

  • WhiteDawn 47 minutes ago
    First you need to get through the safety net. I’ve had many productive gpt5.4 sessions hit an “ethicality” roadblock and pollute the context with multiple rounds of trying to convince it to continue.
  • mertcikla 1 hour ago
    why does this read like an openai ad?
  • nsingh2 2 hours ago
    These plots are terrible. Why is categorical data connected across categories with lines? Why not just use bar plots?

    Like in the "Web Vulns in OSS" plot, white-box data for Opus 4.7 is not available, but the absurd linear interpolation across categories implies it should be near 60.
    • scottyah 1 hour ago
      It's just an ad thinly disguised as useful data.
    • wmf 1 hour ago
      I think the x axis is meant to be time but they screwed it up.
  • strange_quark 1 hour ago
    Wasn't it already confirmed that small open-weight models were able to detect most of the same headline vulns as Mythos? How is this any different?
    • stanfordkid 1 hour ago
      No, they are able to detect errors when pointed at them, but they have a lot of false positives, making them functionally useless for a large unknown codebase. They also can't build and run an exploit post-identification. Mythos can find vulnerabilities (purportedly) and actually validate them by building and running exploits. This makes it functional and usable for hacking.
    • nardons 1 hour ago
      Do you have a source for this? Not doubting it, but I would like to have something concrete the next time the Mythos horse manure is cited.
      • skirmish 6 minutes ago
        Probably this: https://aisle.com/blog/ai-cybersecurity-after-mythos-the-jagged-frontier