Yair Halberstadt 24 Jun 2022 6:38 UTC
10 points
If an army of human-level AGIs could work together superhumanly fast to solve problems we currently can’t, then combined they would effectively be an AGI, and we would have to make sure they were aligned with us first.
- joshc 24 Jun 2022 18:21 UTC
  1 point
  I know that this is a common argument against amplification, but I’ve never found it super compelling.
  
  People often point to evil corporations to show that unaligned behavior can emerge from aligned humans, but I don’t think this analogy is very strong. Humans in fact do not share the same goals and are generally competing with each other over resources and power, which seems like the main source of inadequate equilibria to me.
  If everyone in the world was a copy of Eliezer, I don’t think we would have a coordination problem around building AGI. They would probably have an Eliezer government that is constantly looking out for emergent misalignment and suggesting organizational changes to squash it. Since everyone in this world is optimizing for making AGI go well and not for profit or status among their Eliezer peers, all you have to do is tell them what the problem is and what they need to do to fix it. You don’t have to threaten them with jail time or worry that they will exploit loopholes in Eliezer law.
  
  I think it is quite likely that I am missing something here, and it would be great if you could flesh this argument out a little more or direct me towards a post that does.
Daniel Kokotajlo 24 Jun 2022 16:30 UTC
2 points
typo: Ajaya should be Ajeya
- joshc 25 Jun 2022 1:20 UTC
  1 point
  Thanks!
Ericf 24 Jun 2022 12:09 UTC
1 point
Humans are each trained on a tiny, unique subset of the available training data. I would expect multiple instances of the same AI software, trained on close to the same data, to think very similarly to each other and not to provide more creative capability than a single AI with more bandwidth.
- joshc 24 Jun 2022 17:45 UTC
  1 point
  That’s a good point. I guess I don’t expect this to be a big problem because:
  1. I think 1,000,000 copies of myself could still get a heck of a lot done.
  2. The first human-level AGI might be way more creative than your average human. It would probably be trained on data from billions of humans, so all of those different ways of thinking could be latent in the model.
  3. The copies can potentially diverge. I’m expecting the first transformative model to be stateful and be able to meta-learn. This could be as simple as giving a transformer read and write access to an external memory and training it over longer time horizons. The copies could meta-learn on different data and different sub-problems and bring different perspectives to the table.