I got to 43% p(Doom) by picking a very imprecise 50% based on feels. Then, every few weeks, something would happen in the news, I would get more or less worried, and I would adjust the number a few percent up or down. For a while it was up in the 70s, and now it's down to 41%. I feel the adjustments are more intellectually defensible than the original choice of number. So the precision of the figure does not reflect its accuracy.
My last two adjustments:
I move p(Doom) up every time something that was predicted years ago as part of a doom scenario actually happens. Most recently, it was a measurement of the rate at which Claude proposes evil courses of action. It had been gradually increasing over the last few versions, and then suddenly dropped to zero. Did Claude become perfectly moral? No, it got smart enough to know when it was being tested, and was always going to be nice in that situation. I predicted this in something like 2002. It was creepy to see it happen.
I moved p(Doom) down when a bunch of prominent people signed a statement that we shouldn’t build superintelligences. The issue seems to be getting some traction, like nuclear disarmament did in the 1950s. It’s very preliminary, but moving in the right direction.