I think the skepticism about the protein folder was “we can’t make something effective because we can’t optimize enough / search hard enough”, whereas my skepticism about alignment is “we can’t make something aligned because we can’t aim optimization processes well enough”. Part of why we can’t aim search processes is that we don’t have easily testable proxy measurements that are bound up with alignment strongly enough. What would be the evaluation function for AlignmentFold?