More speculative thoughts:
Perhaps we end up in a regime where this is possible, but expensive enough that we can only afford a bit of it.
If that’s the case, the more sample-efficient the RL is, the better off we are: the fewer RL reward signals we need in total, the more we can afford to invest in the quality of each one.
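To make that trade-off concrete, here is a minimal sketch with made-up numbers, assuming a fixed pool of human review hours (the budget figure and function name are purely illustrative): as the number of RL reward signals shrinks, the review effort available per signal grows in proportion.

```python
# Toy illustration of the oversight budget trade-off (hypothetical numbers).
# With a fixed total amount of careful human review time, fewer RL reward
# signals means more reviewer effort can be spent evaluating each one.

TOTAL_OVERSIGHT_BUDGET_HOURS = 10_000  # assumed fixed budget of review time

def review_hours_per_reward(num_rl_rewards: int) -> float:
    """Hours of careful human evaluation available per reward signal."""
    return TOTAL_OVERSIGHT_BUDGET_HOURS / num_rl_rewards

for n in (1_000_000, 10_000, 100):
    print(f"{n:>9} rewards -> {review_hours_per_reward(n):8.2f} review hours each")
```

Under these assumed numbers, a million reward signals leaves well under a minute of scrutiny per signal, while a hundred signals allows about a hundred hours each.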
Even more speculative thoughts:
As an extreme example, you could imagine an AI that is as sample-efficient as a human child. Suppose an AI company had a virtual copy of von Neumann’s brain at age 5, which they could run at 10x speed, and had to raise him to become an intent-aligned adult.
The resources at their disposal would enable a lot of oversight and thoughtful parenting and teaching, so in principle they could afford to do pretty well at this task for a while. It’s like a school with a 100:1 teacher-student ratio. They might fail in practice, of course.
Once his outputs become sufficiently hard to check, they might have a lot more difficulty. For example, it might be hard to tell whether his advice on the Manhattan Project is subtly wrong in a way designed to sabotage the project.