Having written all this down in one place, it’s hard not to feel some hopelessness about whether all of these problems can be made legible to the relevant people, even with a maximum plausible effort.
I think that a major focus should be on prioritizing these problems based on how plausible a story you can tell for a catastrophic outcome if the problem remains unsolved, conditional on an AI that is corrigible and aligned in the ordinary sense.
I suppose coming up with such a clear catastrophe story for a problem is more or less the same thing as legibilizing it, which reinforces my point from the previous thread: a priori, it seems likely to me that illegible problems won’t tend to be as important to solve.
The longer a problem has been floating around without anyone generating a clear catastrophe story for it, the greater the probability we should assign that it’s a “terminally illegible” problem which just won’t cause a catastrophe if it’s unsolved.
Maybe it would be good to track how much time has been spent attempting to come up with a clear catastrophe story for each problem, so people can get a sense of when diminishing research returns are reached for a given problem? Perhaps researchers who make attempts should leave a comment in this thread indicating how much time they spent trying to generate catastrophe stories for each problem?
Perhaps it’s worth concluding with a point from a discussion between @WillPetillo and myself under the previous post: a potentially more impactful approach (compared to trying to make illegible problems more legible) is to make key decisionmakers realize that important safety problems that are illegible to them (and even to their advisors) probably exist, and that it is therefore very risky to make highly consequential decisions (such as about AI development or deployment) based only on the status of legible safety problems.
I still think the best way to do this is to identify at least one problem which initially seemed esoteric and illegible, and eventually acquired a clear and compelling catastrophe story. Right now this discussion all seems rather hypothetical. From my perspective, the problems on your list seem to fall into two rough categories: legible problems which seem compelling, and super-esoteric problems like “Beyond Astronomical Waste” which don’t need to be solved prior to the creation of an aligned AI. Off the top of my head, I haven’t noticed a lot of problems moving from one category to the other by my lights? So just speaking for myself, this list hasn’t personally convinced me that esoteric and illegible problems should receive a much larger share of scarce resources, although I admit I only took a quick skim.
Thanks for making this list!