“Perhaps we should pause widespread rollout of Generative AI in safety-critical domains — unless and until it can be relied on to follow rules with significantly greater reliability.”
This seems clearly correct to me: LLMs should not be in safety-critical domains until we can make a clear case for why things will go well in that situation. I’m not actually aware of anyone using LLMs that way yet, mostly because they aren’t good enough, but I’m sure that at some point it’ll start happening. You could imagine enshrining in regulation that deployments in safety-critical domains require an affirmative safety case showing that the system lowers risk to at or below that of the reasonable alternative.
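To make “at or below the reasonable alternative” concrete, here’s a minimal sketch of the kind of comparison such a regulation might require. Everything in it (the harm-rate numbers, the function name, the margin parameter) is hypothetical and purely illustrative, not drawn from any actual safety-case framework.

```python
# Illustrative sketch only: an "affirmative safety case" gate that compares an
# AI system's estimated harm rate against the best reasonable non-AI alternative.
# All names and numbers are made up for the example.

def deployment_permitted(ai_harm_rate: float, baseline_harm_rate: float,
                         safety_margin: float = 0.0) -> bool:
    """Permit deployment only if estimated risk is at or below the
    reasonable alternative (optionally by some extra margin)."""
    return ai_harm_rate <= baseline_harm_rate - safety_margin

# e.g. a system estimated to cause serious errors in 0.8% of cases,
# versus a human-only baseline of 1.0%:
print(deployment_permitted(ai_harm_rate=0.008, baseline_harm_rate=0.010))  # True
```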
Note that this does not exclude other threats; for instance, misalignment in very capable models could go badly wrong even if they aren’t deployed in safety-critical domains. Lots of threats to consider!
It would surprise me if LLMs weren’t already in use in safety-critical domains, at least depending on one’s definition of safety-critical.
Maybe I’m interpreting the term overly broadly, but for instance, I’d be surprised if governments weren’t already using LLMs as part of their intel-gathering and -analysis operations, which presumably affect some military decisions and (on some margin) who lives or dies. For consequential decisions, you’d of course hope there’s enough oversight that LLM hallucinations don’t trigger attacks or military actions that weren’t justified.
For that specific example, I wouldn’t call it safety-critical in the sense of a domain where you shouldn’t use an unreliable source. Intel work involves lots of noisy and untrustworthy data; indeed, the job is making sense of many conflicting and noisy signals. Adding an LLM to the mix doesn’t strike me as changing things all that much. It’s useful, it (presumably) adds signal, but it’s also wrong sometimes, which is just what all of an analyst’s inputs are like.
Where I would say it crosses a line is if there isn’t a human analyst in the loop. If an LLM analyst were directly providing recommendations for actions that weren’t vetted by a human, yikes, that seems super bad and we’re not ready for that. But I would be quite surprised if that were happening right now.
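To illustrate the line I’m drawing, here’s a minimal sketch of a human-in-the-loop gate where the LLM’s output is advisory only and nothing executes without explicit human sign-off. The Recommendation class and request_human_approval function are hypothetical stand-ins, not a description of any deployed system.

```python
# Hypothetical sketch: LLM recommendations are advisory, and no action is taken
# without an explicit human decision. All names are made up for illustration.
from dataclasses import dataclass

@dataclass
class Recommendation:
    action: str
    rationale: str

def request_human_approval(rec: Recommendation) -> bool:
    """Stand-in for a real review workflow; here it just prompts on the console."""
    print(f"Proposed action: {rec.action}")
    print(f"Rationale: {rec.rationale}")
    return input("Approve? [y/N] ").strip().lower() == "y"

def handle(rec: Recommendation) -> None:
    if request_human_approval(rec):
        print(f"Executing (with human sign-off): {rec.action}")
    else:
        print("Rejected by the human analyst; no action taken.")

handle(Recommendation(action="flag report for further review",
                      rationale="model-generated summary of conflicting sources"))
```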