>As originally conceived, this is sort of like a “dangerous capability” eval for steg.
I am actually just about to start building something very similar to this for the AISI’s evals bounty program.