Personal fit. Surely, some people have tried working on s-risks in different roles for some substantial period of time but haven’t found an angle from which they can contribute given their particular skills.
Related to the “personal fit” explanation: I’d argue that the skills required to best reduce s-risks overlap substantially with the skills needed to make progress on alignment (see here).
At least, I think this goes for directly AI-related s-risks, which I consider the most concerning, though I put significantly lower probabilities on them than you do.
For s-risks conditioned on humans staying in control over the future, we maybe wouldn’t gain much from explicitly modelling AI takeoff and engaging in all the typical longtermist thought. So some things that reduce future disvalue don’t have to look like longtermism? For instance, common-sense ways to improve society’s rationality, coordination abilities, and values. (Maybe there’s a bit of leverage to gain from thinking explicitly about how AI will change things.) The main drawbacks to those types of interventions are that (1) the disvalue at stake might be smaller than the disvalue from directly AI-related s-risks conditional on those scenarios playing out, and (2) how society thinks and what we value only matters if humans actually stay in control over the future, which arguably seems pretty unlikely.
Yeah… When it comes to the skill overlap, having alignment research aided by future pre-takeoff AIs seems dangerous. Having s-risk research aided that way seems less problematic to me. That might make it accessible (now or in a year) to people who have struggled with alignment research. I also wonder whether there is maybe still more time for game-theoretic research on s-risks than there is in alignment. The s-risk-related problems might be easier, so they can perhaps still be solved in time. (NNTR, just thinking out loud.)