What was your approach to benchmarking an adversarial agent?<p>This is an open problem that I came across (in a different domain), as the search space can be really wide. It's hard to measure results for non-trivial tasks.<p>Would be really interested if you can share your eval approach :)
> This won't be made available to anyone and everyone, but we do believe that responsible SMEs and midmarket companies also need access to these tools in order to identify key vulnerabilities in their systems; not just enterprises.<p>So this is the same policy that Anthropic and OpenAI have, it is just based on your criteria rather than theirs.
I think the policy makes sense universally; who would want to give a tool like this to bad actors? But it does leave a big section of the market underserved, particularly when Mythos was made accessible to very large orgs and then Fable was pulled on export grounds.
As soon as I read that I literally scoffed. Doublethink at its finest. Doubleplusungood.
Fantastic. Could you share more details about what post-training a model was like?
Why create an offensive tool rather than a repo-scanning tool?<p>I can't think of any way to safely release an offensive tool publicly.
You need both: scanning for your own code, and pen testing to actually prove the vulnerabilities. Otherwise the results can be very noisy; one thing most tools currently suffer from is that they give you too many false positives.
For the moment. We've gated the pen testing until we resolve the safety debate.