It would surprise me if LLMs weren’t already in use in safety-critical domains, at least depending on one’s definition of safety critical.
Maybe I’m thinking of the term overly broadly, but for instance, I’d be surprised if governments weren’t already using LLMs as part of their intel-gathering and -analysis operations, which presumably affect some military decisions and (on some margin) who lives or dies. For consequential decisions, you’d of course hope there’s enough oversight that LLM hallucinations don’t cause attacks or military actions that weren’t justified.
For that specific example, I would not call it safety critical in the sense that you shouldn’t use an unreliable source. Intel involves lots of noisy and untrustworthy data; indeed, the job is making sense of lots of conflicting and noisy signals. It doesn’t strike me that adding an LLM to the mix changes things all that much. It’s useful, it adds signal (presumably), but it’s also wrong sometimes, and that’s true of every input an analyst works with.
Where I would say it crosses a line is if there isn’t a human analyst. If an LLM analyst were directly providing recommendations for actions that weren’t vetted by a human, yikes, that seems super bad and we’re not ready for that. But I would be quite surprised if that were happening right now.