I would have liked to see those who disagree with this comment engage with it more substantially. One reason I think we're likely to get a warning shot is that LLM-based AIs are pretty consistently overconfident. Also, AI control schemes have a probabilistic chance of catching misaligned AIs.