But overall, I think working on alignment is largely more urgent. Being able to understand what’s going on at all inside a neural net, and advocating that companies be required to understand what’s going on before developing new/bigger/better models, seems like a convergent goal relevant to both human extinction and astronomical suffering.
Fwiw, Lukas’s comment links to a post arguing against that, and I 100% agree with it. I think “Alignment will solve s-risks as well anyway” is one of the most untrue and harmful widespread memes in the EA/LW community.
Nod (fyi I vaguely remembered that comment but couldn’t find it a second time while I was writing my own answer)
I do think “AI targeted at optimizing a good goal” is more likely to near-miss if precautions aren’t taken, and I do think that’s quite important. I was careful not to say “alignment automatically solves s-risks”; I said it was a convergent goal that seemed more important to me overall. I do think that’s a reasonable thing to disagree on.
I suppose my shooting range metaphor falls short here. Maybe alignment is like teaching a kid to be an ace race car driver, and s-risks are accidents on normal roads. There, too, it depends on the details whether the ace race car driver will drive safely on normal roads.