Thank you for the detailed critique! I agree with it, except that I see 2) not as another point of divergence, but as a point where mankind might have returned to a track similar to ours. What I envisioned was alternate-universe Replika rediscovering the LLMs,[1] driving users to suicide, and conservatives or the USG raising questions about instilling human values into LLMs. Alas, this scenario is likely implausible, as evidenced by the lack of effort to deal with Meta’s unaligned chatbots.
As for 1), the bus factor is hard to determine. What Yudkowsky did was discover the AGI risks and then either accelerate the race or manage it in a safer way. Any other person capable of independently discovering the AI risks was likely already infected[2] with Yudkowsky’s AI-risk-related memes. But we don’t know how many such people, capable of discovering the risks on their own, there are.
P.S. The worst-case scenario is that a Yudkowsky-like figure EVER emerging was itself highly unlikely. In that case, the event itself could arguably be evidence that the world is a simulation.
In addition, users might express interest in making the companions smart enough to, say, write an essay for them or check their kids’ homework. If Replika did that, the LLMs would have to be scaled up further and further...
Edited to add: For comparison, when writing my post about colonialism in space, I wasn’t aware that Robin Hanson had made a similar model. What I did differently was to produce an argument potentially implying that there are two attractors, and that one can align the AIs to one of the attractors even if alignment in the SOTA sense remains unsolved.