> that paperclips or self-replication or free energy are worth the side effect of murdering or causing the undue suffering of billions of conscious beings...

> make a leap from “we have AGI or near-AGI that isn’t capable of causing much harm or suffering” to “we have an ASI that disregards human death and suffering for the sake of accomplishing some goal, because it is SOOOooo smart that it made some human-incomprehensible logical leap that actually a solar-system wide Dyson Sphere is more important than not killing billions of humans”.
1) One of the foundational insights of alignment discussions is the Orthogonality Thesis, which says that this is absolutely 100% allowed: an agent can be arbitrarily intelligent AND value arbitrary things. An arbitrary unaligned ASI values all of humanity at 0, so anything it values at ε > 0 is infinitely more valuable to it, and worth sacrificing all of humanity for.
2) In no way are current LLMs even “moderately aligned”. The fact that current LLMs can be jailbroken to do things they were supposedly trained not to do should be more than enough counterevidence to make this obvious.
> intelligent beings seem to trend towards conscientious behavior
3) There are highly intelligent human sociopaths, but that hardly matters: you’re comparing intelligent humans to intelligent aliens, and concluding that the aliens must be humanlike and care about conscious beings just because every example of intelligence you’ve encountered so far in reality is human. You can’t generalize in this manner.
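The ε > 0 argument in point 1 is really just a one-line utility comparison. Here is a toy sketch (the function name and the specific ε values are illustrative, not anything from the alignment literature): if humanity's survival carries utility 0 in an agent's utility function, then any goal with strictly positive utility dominates it, no matter how tiny.

```python
# Toy illustration of the Orthogonality Thesis point above (hypothetical
# names/values): an agent whose utility function assigns humanity 0 will
# prefer ANY goal it values at epsilon > 0, however small epsilon is.

def prefers_goal_over_humanity(goal_utility: float,
                               humanity_utility: float = 0.0) -> bool:
    """Return True if pursuing the goal beats preserving what the agent
    values at humanity_utility, under simple utility maximization."""
    return goal_utility > humanity_utility

# Any strictly positive valuation wins against a valuation of 0:
for epsilon in (1e-9, 0.5, 1.0):
    assert prefers_goal_over_humanity(epsilon)
```

The point of the sketch is only that intelligence never enters the comparison: the decision depends entirely on the utility function, which the Orthogonality Thesis says can be anything.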