Thank you so much for bringing up that paper and finding the exact page most relevant! I learned a lot reading those pages. You’re a true researcher, take my strong upvote.
My idea consists of a “hammer” and a “nail.” GDM’s paper describes a “hammer” very similar to mine (perhaps superior), but lacks the “nail.”
The fact that the hammer they invented resembles the hammer I invented is evidence in my favour: I’m not badly confused :). I shouldn’t be sad that my hammer invention already exists.[1]
The “nail” of my idea is making the Constitutional AI self-critique behave like a detective, using its intelligence to uncover the most damning evidence of scheming/dishonesty. This detective behaviour helps achieve the premises of the “Constitutional AI Sufficiency Theorem.”
The “hammer” of my idea is reinforcement learning to reward it for good detective work, with humans meticulously verifying its proofs (or damning evidence) of scheming/dishonesty.
It does seem like a lot of my post describes my hammer invention in detail, so that part is no longer novel :/