How learning efficiently applies to alignment research
As we are trying to optimize for actually solving the problem, we should not fall into the trap of learning just to learn. We should instead focus on learning efficiently with respect to how it helps us generate insights that lead to a solution for alignment. This is also the framing we should have in mind when we are building tools for augmenting alignment researchers.
With the above in mind, I expect that part of the value of learning efficiently involves some of the following:
Efficient learning involves being hyper-focused on identifying the core concepts and how they all relate to one another. This mode of approaching things seems like it helps us attack the core of alignment much more directly and bypasses months/years of working on things that are only tangential.
Developing a foundation in a field seems key to generating useful insights. The goal is not to learn everything but to build a foundation that lets you avoid spending far too long on sub-optimal sub-problems or dead ends. Part of the foundation-building process should reduce the time it takes to shape you into an exceptional alignment researcher rather than a knower-of-things.
As John Wentworth says with respect to the Game Tree of Alignment: “The main reason for this exercise is that (according to me) most newcomers to alignment waste years on tackling not-very-high-value sub-problems or dead-end strategies.”
Lastly, many great innovations have not come from unique original ideas. Ideas are passed iteratively among researchers, and it often seems that the greatest ones come from simply merging ideas that were already lying around. Learning efficiently (and storing those learnings for later use) allows you to increase the number of ideas you can merge together. To do that well, you need to improve your ability to identify which ideas are worth storing in your mental warehouse for a future merging of ideas.
My model of (my) learning is that if the goal is sufficiently far, learning directly towards the goal is goodharting a likely wrong metric.
The only method which worked for me for very distant goals is following my curiosity and continuously internalizing new info, such that the curiosity is well informed about current state and the goal.
Curiosity is certainly a powerful tool for learning! I think any learning system which isn’t taking advantage of it is sub-optimal. Learning should be guided by curiosity.
The thing is, sometimes we need to learn things we aren’t so curious about. One insight I learned from studying learning is that you can do specific things to make yourself more curious about a given thing and harness the power that comes with curiosity.
In practice, this means writing down questions about the topic and using them to guide your learning. This seems to be how efficient top students end up learning things deeply in less time. Even for material they care little about, they are able to make themselves curious and be propelled forward by that.
That said, my guess is that goodharting the wrong metric can definitely be an issue, but I’m not convinced that relying on what makes you naturally curious is the optimal strategy for solving alignment. Either way, it’s something to think about!
By the way, I’ve just added a link to a video by a top competitive programmer on how to learn hard concepts. Both the video and the iCanStudy course talk about caring about what you are learning (basically, curiosity). Gaining the skill to care and become curious is an essential part of the most effective learning. However, contrary to popular belief, you don’t have to be completely guided by what makes you naturally curious! You can learn how to become curious about (or care about) any random concept.