It could be like verifying solutions on a math test. I'm not sure about the granularity of process-based supervision, but it could be less strange if an AI has to justify how it arrived at an answer rather than just giving out the answer.