What are some important tasks you’ve found too cognitively taxing to get in the flow of doing?
One thing I’d like to consider for Accelerating Alignment is building tools that make it easier to get into the habit of doing cognitively demanding tasks by reducing the cognitive load required to do them. This is part of the reason I think people are getting such big productivity gains from tools like Copilot.
One way I try to think about it is like getting into the habit of playing guitar. I typically tell people to buy an electric guitar rather than an acoustic, because the acoustic is usually much more painful on your fingers. You are already doing the hard task of learning an instrument; reduce the barrier to entry by eliminating one source of friction. And while you’re at it, don’t put your guitar in a case or somewhere out of your way; make it ridiculously easy to just pick up and play. In this example the cost isn’t cognitive, but it is still a tax that produces friction.
It is possible that we could have many more people tackling the core of alignment if it were less mentally demanding to get to that point and contribute to a solution. It’s possible that the friction around some tasks makes people more likely to opt for what is easy (and what potentially leads to fake progress on a solution to alignment). One such example might be understanding some difficult math. Another might be communicating your research in a way that is understandable to others.
I think it’s worth thinking in this frame when coming up with ways to accelerate alignment research by augmenting researchers.
For developing my hail mary alignment approach, the dream would be to load enough of the idea’s context into an LLM that it could babble suggestions (since the whole doc won’t fit in the context window, maybe randomize which parts beyond the intro are included, for diversity?), then have it self-critique those suggestions automatically in bulk across separate threads and surface the most promising implementations of the idea to me for review. In the perfect case, I’d be able to converse with the model about the ideas, have that be not totally useless, and pump good chains of thought back into the fine-tuning set.
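The scaffolding for that loop is simple enough to sketch. Here is a minimal version in Python, assuming a hypothetical `llm(prompt)` helper standing in for whatever completion API you’d actually call, and treating the doc as an intro plus a list of sections; the prompt formats, the "Score: N" convention, and all the function names are my own assumptions, not a real library.

```python
import random
import re

# Hypothetical helper: stand-in for whichever completion API you actually use.
def llm(prompt: str) -> str:
    raise NotImplementedError("wire this up to your model of choice")

def sample_context(intro: str, sections: list[str], k: int = 3) -> str:
    """Always keep the intro; randomize which other sections fill the window."""
    chosen = random.sample(sections, min(k, len(sections)))
    return "\n\n".join([intro] + chosen)

def parse_score(critique: str) -> int:
    """Pull a 'Score: N' line out of the critique; 0 if the model didn't comply."""
    match = re.search(r"Score:\s*(\d+)", critique)
    return int(match.group(1)) if match else 0

def babble_and_prune(intro: str, sections: list[str],
                     n_babbles: int = 20, top_k: int = 5):
    candidates = []
    for _ in range(n_babbles):
        context = sample_context(intro, sections)
        suggestion = llm(
            f"{context}\n\nPropose one concrete way to implement this idea:"
        )
        # Critique in a fresh call so the model isn't anchored on its own babble.
        critique = llm(
            f"Idea:\n{suggestion}\n\nCritique this idea harshly, "
            "then rate it 1-10 on a final line formatted 'Score: N'."
        )
        candidates.append((parse_score(critique), suggestion, critique))
    # Surface only the most promising suggestions for human review.
    candidates.sort(key=lambda c: c[0], reverse=True)
    return candidates[:top_k]
```

The scaffolding is the easy part; whether this is useful hinges on the context sampling surfacing the right parts of the doc and on the self-critique pass actually discriminating good babbles from bad ones.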