Oh, I already completely agree with that. But quite frankly I don’t have the skills to contribute to AI development meaningfully in a technical sense, or the right kind of security mindset to think anyone should trust me to work on safety research. And of course, all the actual plans I’ve seen anyone talk about are full of holes, and many seem to rely on something akin to safety-by-default for at least part of the work, whether they admit it or not. Which I hope ends up not being true, but if someone decides to roll the dice on the future that way, then it’s best to try to load the dice at least a little with higher-quality writing on what humans think and want for themselves and the future.
And yeah, I agree you should be worried about this getting so many upvotes, including mine. I sure am. I place this kind of writing under why-the-heck-not-might-as-well. There aren’t anywhere near enough people or enough total competence trying to really do anything to make this go well, but there are enough that new people trying more low-risk things is likely to be either irrelevant or net-positive. Plus I can’t really imagine ever encountering a plan, even a really good one, where this isn’t a valid rejoinder:
Agreed! This is net useful. As long as nobody relies on it. Like every other approach to alignment, to differing degrees.
WRT you not having the skills to help: if you are noting holes in plans, you are capable of helping. Alignment has not been reduced to a technical problem; it has many open conceptual problems, ranging from society-level questions to more technical, fine-grained theory. Spotting holes in plans and clearly explaining why they are holes is among the most valuable work. As far as I know, nobody has a full plan that works even if the technical part is done well. So helping with plans is absolutely crucial.
Volunteer effort on establishing and improving plans is among the most important work. We shouldn’t assume that the small teams within orgs are going to do this conceptual work adequately. It should be open-sourced and have as much volunteer help as possible. As long as it’s effort toward deconfusion, and it’s reasonably well-thought-out and communicated, it’s net helpful, and this type of effort could make the difference.