To be somewhat more fair, there are probably thousands of problems with the property that they are much easier to check than they are to solve, and while alignment research is maybe not this, I do think that there’s a general gap between verifying a solution and actually solving the problem.
Another interesting class are problems that are easy to generate but hard to verify.
John Wentworth told me the following delightfully simple example
Generating a Turing machine program that halts is easy, verifying that an arbitrary TM program halts is undecidable.
To be somewhat more fair, there are probably thousands of problems with the property that they are much easier to check than they are to solve, and while alignment research is maybe not this, I do think that there’s a general gap between verifying a solution and actually solving the problem.
The canonical examples are NP problems.
Another interesting class are problems that are easy to generate but hard to verify.
John Wentworth told me the following delightfully simple example Generating a Turing machine program that halts is easy, verifying that an arbitrary TM program halts is undecidable.
Yep, I was thinking about NP problems, though #P problems for the counting version would count as well.