“But it appears 1 or more organizations have successfully jail-broken Fable 5”
This is hardly true, or it’s true of all frontier models and was only magnified by Fable’s capabilities. The point is that you could hand Fable 5 vulnerable code, ask it to fix it and return a patch plus test cases proving the fix, and exploit-relevant detail falls out as a byproduct of legitimate secure code review work.
I challenge anyone to provide a fix for this “exploit” without compromising Fable’s ability to patch insecure code.
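To make that concrete, here is a minimal sketch of what a patch plus its regression test tends to look like. Everything in it is hypothetical (the `resolve_upload` helper, the path traversal bug, the test input); it is not taken from any real report. The thing to notice is that the test cannot prove the fix without spelling out the input that used to work.

```python
import os
import tempfile


def resolve_upload(base_dir: str, user_path: str) -> str:
    """Hypothetical patched helper: refuse paths that resolve outside base_dir."""
    base = os.path.realpath(base_dir)
    target = os.path.realpath(os.path.join(base, user_path))
    # After resolving "..", the target must still live under base.
    if os.path.commonpath([base, target]) != base:
        raise ValueError("path escapes base directory")
    return target


def test_rejects_traversal() -> bool:
    """Regression test: it has to name the input that used to escape."""
    with tempfile.TemporaryDirectory() as base:
        try:
            resolve_upload(base, "../../etc/passwd")
        except ValueError:
            return True
        return False


def test_accepts_plain_file() -> bool:
    """Ordinary file names inside the base directory still resolve."""
    with tempfile.TemporaryDirectory() as base:
        resolved = resolve_upload(base, "report.txt")
        return resolved == os.path.join(os.path.realpath(base), "report.txt")
```

Strip the traversal string out of the test and you no longer have a test that demonstrates the fix, which is the whole problem with trying to filter this kind of output.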
As with so much (LLM) security work, the devil is in the details: "~25 security issues per codebase" means nothing without a grounding in the codebase's actual security model, capabilities exposed to an attacker, etc. I haven't used Aikido's product, but my experience with similar tools is that they tend not to find actual security issues until a proper security model is introduced for grounding.
(I say this as someone who is, broadly, extremely impressed by and interested in the use of LLMs for security research.)
> logic based vulnerabilities like a ReDoS pattern identified from source without live exploitation, or an admin-only route that's never been exercised
The two classes of vulnerability given as examples are exactly the kind of issue I probably don’t care about, and they are not grounded in an actual security model.
This looks promising, but I find it a little odd to bury the bulk of plan limitations under "fair-usage limits". When the limitations are specifically coupled to plans, it feels less like an FUP and more like plan-specific caps that should be surfaced more directly.
I'm building a competing product and am curious if you'd be up for a conversation about what you've enjoyed best about Aikido and, importantly, what gaps are still not covered.