307th comments on I Would Have Solved Alignment, But I Was Worried That Would Advance Timelines

307th 21 Oct 2023 9:01 UTC
5 points
4
> I just think it’s extraordinarily important to be doing things on a case-by-case basis here. Like, let’s say I want to work at OpenAI, with the idea that I’m going to advocate for safety-promoting causes, and take actions that are minimally bad for timelines.

Notice that this is phrasing AI safety and AI timelines as two equal concerns that are worth trading off against each other. I don’t think they are equal, and I think most people would have far better impact if they completely struck “I’m worried this will advance timelines” from their thinking and instead focused solely on “how can I make AI risk better”.

I considered talking about why I think this is the case psychologically, but for the piece I felt it was more productive to focus on the object level arguments for why the tradeoffs people are making are bad. But to go into the psychological component a bit:

-Loss aversion: The fear of making AI risk worse is greater than the joy of making it better.

-Status quo bias: Doing something, especially something like working on AI capabilities, is seen as giving you responsibility for the problem. We see this with rhetoric against AGI labs—many in the alignment community will level terrible accusations against them, all while having to admit when pressed that it is plausible they are making AI risk better.

-Fear undermining probability estimates: I don’t know if there’s a catchy phrase for this one, but I think it’s real. The impacts of any actions you take will be very muddy, indirect, and uncertain, because this is a big, long-term problem. When you are afraid, this makes you view uncertain positive impacts with suspicion and makes you see uncertain negative impacts as more likely. So people doubt tenuous contributions to AI safety like “AI capability researchers worried about AI risk lend credibility to the problem, thereby making AI risk better”, but view tenuous contributions to AI risk like “you publish a capabilities paper, thereby speeding up timelines, making AI risk worse” as plausible.
- Steven Byrnes 21 Oct 2023 12:14 UTC
  6 points
  2
  > Notice that this is phrasing AI safety and AI timelines as two equal concerns that are worth trading off against each other. I don’t think they are equal, and I think most people would have far better impact if they completely struck “I’m worried this will advance timelines” from their thinking and instead focused solely on “how can I make AI risk better”.
  This seems confused in many respects. AI safety is the thing I care about. I think AI timelines are a factor contributing to AI safety, via having more time to do AI safety technical research, and maybe also other things like getting better AI-related governance and institutions. You’re welcome to argue that shorter AI timelines, other things equal, do not make safe & beneficial AGI less likely; i.e., you can argue for: “Shortening AI timelines should be excluded from cost-benefit analysis because it is not a cost in the first place.” Some people believe that, although I happen to strongly disagree. Is that what you believe? If so, I’m confused. You should have just said it directly. It would make almost everything in this OP beside the point, right? I understood this OP to be taking the perspective that shortening AI timelines is bad, but the benefits of doing so greatly outweigh the costs, and the OP is mainly listing out various benefits of being willing to shorten timelines.
  Putting that aside, “two equal concerns” is a strange phrasing. The whole idea of cost-benefit analysis is that the costs and benefits are generally not equal, and we’re trying to figure out which one is bigger (in the context of the decision in question).
  If someone thinks that shortening AI timelines is bad, then I think they shouldn’t and won’t ignore that. If they estimate that, in a particular decision, they’re shortening AI timelines infinitesimally, in exchange for a much larger benefit, then they shouldn’t ignore that either. I think “shortening AI timelines is bad but you should completely ignore that fact in all your actions” is a really bad plan. Not all timeline-shortening actions have infinitesimal consequence, and not all associated safety benefits are much larger. In some cases it’s the other way around—massive timeline-shortening for infinitesimal benefit. You won’t know which it is in a particular circumstance if you declare a priori that you’re not going to think about it in the first place.
  > …psychologically…
  I think another “psychological” factor is a deontological / Hippocratic Oath / virtue kind of thing: “first, do no harm”. Somewhat relatedly, it can come across as hypocritical if someone is building AGI on weekdays and publicly advocating for everyone to stop building AGI on weekends. (I’m not agreeing or disagreeing with this paragraph, just stating an observation.)
  > We see this with rhetoric against AGI labs—many in the alignment community will level terrible accusations against them, all while having to admit when pressed that it is plausible they are making AI risk better.
  I think you’re confused about the perspective that you’re trying to argue against. Lots of people are very confident, including “when pressed”, that we’d probably be in a much better place right now if the big AGI labs (especially OpenAI) had never been founded. You can disagree, but you shouldn’t put words in people’s mouths.
  - 307th 21 Oct 2023 14:05 UTC
    1 point
    0
    The focus of the piece is on the cost of various methods taken to slow down AI timelines, with the thesis being that across a wide variety of different beliefs about the merit of slowing down AI, these costs aren’t worth it. I don’t think it’s confused to be agnostic about the merits of slowing down AI when the tradeoffs being taken are this bad.
    
    Views on the merit of slowing down AI will be highly variable from person to person and will depend on a lot of extremely difficult and debatable premises that are nevertheless easy to have an opinion on. There is a place for debating all of those various premises and trying to nail down what exactly the benefit is of slowing down AI; but there is also a place for saying “hey, stop getting pulled in by that bike-shed and notice how these tradeoffs being taken are not worth it given pretty much any view on the benefit of slowing down AI”.
    
    > I think you’re confused about the perspective that you’re trying to argue against. Lots of people are very confident, including “when pressed”, that we’d probably be in a much better place right now if the big AGI labs (especially OpenAI) had never been founded. You can disagree, but you shouldn’t put words in people’s mouths.
    
    I was speaking from experience, having seen this dynamic play out multiple times. But yes, I’m aware that others are extremely confident in all kinds of specific and shaky premises.