I’m thinking of a slightly different plan than “increase the rate of people being able to think seriously about the problem”: I’d like to convince people who already understand the problem to accept that a pause is unlikely, and that alignment is not known to be impossibly hard even on short timelines. …
...Getting entirely new people to understand the hard parts of the problem and then acquire all of the technical skills or theoretical subtleties is another route. I haven’t thought as much about that one because I don’t have a public platform…
I think it’s useful to think of “rate of competent people thinking seriously about the right problems” as, like, the “units” of success for various flavors of plans here. There are different bottlenecks.
I currently think the rate-limiting reagent is “people who understand the problem”. And I think that’s in turn rate-limited on:
“the problem is sort of wonky and hard with bad feedback loops, and there’s a cluster of attitudes and skills you need to get any traction sitting with and grokking the problem.”
“we don’t have much ability to evaluate progress on the problem, which in turn means it’s harder to provide a good funding/management infrastructure for it.”
Better education can help with your first problem, although that pulls people who understand the problem away from working on it.
I agree that the difficulty of evaluating progress is a big problem. One solution is to just fund more alignment research. I am dismayed if it’s true that Open Phil is holding back available funding because they don’t see good projects. Just fund them and get more donations later, when the whole world is properly more freaked out. If it’s bad research now, at least those people will spend some time thinking about and debating what might be better research.
I’d also love to see funding aimed directly at people understanding the whole problem, including the several hard parts. It is a lot easier to evaluate whether someone is learning a curriculum than whether they’re doing good research. Exposing people to a lot of perspectives and arguments, and essentially paying and forcing them to think hard about it, should at least improve their choice of research and their understanding of the problem.
I definitely agree that understanding the problem is the rate-limiting factor. I’d argue that it’s not just the technical problem you need to understand, but the surrounding factors, e.g. how likely a pause or slowdown is, and how soon we’re likely to reach AGI on the default path. I’m afraid some of our best technical thinkers understand the technical problem but are confused about how unlikely it is that any approach other than direct LLM descendants will be the first critical attempt at aligning AGI. But the arguments for or against that are quite complex.