But yeah also I think that AGIs will be by default way better than humans at this sort of stuff.
What are your reasons for thinking this? (Sorry if you already explained this and I missed your point, but it doesn’t seem like you directly addressed my point that if AGIs learn from or defer to humans, they’ll be roughly human-level at this stuff?)
When you say “the top tier of rational superintelligences exploits everyone else” I say that is analogous to “the most rational/clever/capable humans form an elite class which rules over and exploits the masses.” So I’m like yeah, kinda sorta I expect that to happen, but it’s typically not that bad?
I think it could be much worse than current exploitation, because technological constraints prevent current exploiters from extracting full value from the exploited (have to keep them alive for labor, can’t make them too unhappy or they’ll rebel, monitoring for and repressing rebellions is costly). But with superintelligence and future/acausal threats, an exploiter can bypass all these problems by demanding that the exploited build an AGI aligned to itself and let it take over directly.
I agree that if AGIs defer to humans they’ll be roughly human-level, depending on which humans they are deferring to. If I condition on really nasty conflict happening as a result of how AGI goes on Earth, a good chunk of my probability mass (possibly the majority of it?) is on this scenario. (Another big chunk, possibly bigger, is the “humans knowingly or unknowingly build naive consequentialists and let them rip” scenario, which is scarier because, as far as I can tell, it could be even worse than the average human.) Like I said, I’m worried.
If AGIs learn from humans though, well, it depends on how they learn, but in principle they could be superhuman.
Re: analogy to current exploitation: Yes, there are a bunch of differences which I am keen to study, such as that one. I’m more excited about research agendas that involve thinking through analogies like this than about what people interested in this topic seem to do by default, which is to think about game theory, Nash bargaining, and the like. Though I do agree that both approaches are useful and complementary.