Any progress or interest in finding limited uses of AI that would be safe? Like the “tool AI” idea, but designed to be robust. Maybe this is a distraction, but it seems basically possible. For example: a proof-finding AI that, given a math statement, can only output a proof to a separate proof-checking computer, which validates it and prints True/False/Unknown as the only output visible to human eyes. Here “Unknown” could mean that the AI gave a bogus proof, failed to produce a proof of either the statement or its negation, or that the proof checker ran out of time/memory checking it. The proof is destroyed after it’s checked. Is that safe?
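To make the setup concrete, here’s a rough sketch of the gatekeeper I have in mind (all names here are hypothetical; `verify` stands in for a trusted proof kernel, e.g. something Lean- or Coq-like):

```python
# Minimal sketch of the proposed gatekeeper. The checker's verdict is the
# ONLY thing that reaches human eyes; the proof itself never does.
from enum import Enum

class Verdict(Enum):
    TRUE = "True"
    FALSE = "False"
    UNKNOWN = "Unknown"

MAX_PROOF_BYTES = 10_000_000  # stand-in for the time/memory budget

def check(statement: str, claimed_verdict: str, proof: bytes, verify) -> Verdict:
    """Validate the AI's proof with a trusted verifier, then destroy it.

    `verify(statement, claimed_verdict, proof)` is a placeholder for a
    trusted proof checker; it should return True only if `proof` really
    establishes the statement (claimed_verdict == "True") or its
    negation (claimed_verdict == "False").
    """
    try:
        if len(proof) > MAX_PROOF_BYTES:
            return Verdict.UNKNOWN      # resource limit exceeded
        if claimed_verdict not in ("True", "False"):
            return Verdict.UNKNOWN      # AI gave no usable claim
        if verify(statement, claimed_verdict, proof):
            return Verdict(claimed_verdict)
        return Verdict.UNKNOWN          # bogus proof
    finally:
        del proof                       # proof is destroyed, never shown
```

The point of the design is that every failure mode (bogus proof, no proof, resource exhaustion) collapses into the same “Unknown”, so the checker leaks nothing about *why* it failed.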
I can imagine some tiny scenarios where it’s not. The AI could try to communicate with humans even through this limited output channel: e.g. disprove the Riemann hypothesis, and then, if we try to locate non-trivial zeros by binary search, intentionally output “Unknown”s to steer us toward certain non-trivial zeros over others, whose digits could encode something? Or the AI could infer something about our physics from the questions we ask, or from random bit-flips it notices in its machine that tell it about cosmic rays/machine architecture, and come up with some dastardly way to hack other machines even though it’s on a dedicated, air-gapped machine. These aren’t entirely unimaginable, but I’d also think a writer was being lazy if they used this as a plotline; they might as well just have said “magic”.
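As a back-of-envelope illustration of that first worry (my own sketch, nothing established): each True/False/Unknown answer is one of three symbols, so an AI answering n queries adversarially can smuggle out at most n·log2(3) ≈ 1.58n bits, and actually encoding a message in the answer sequence is trivial:

```python
# Upper bound on covert-channel capacity of the 3-symbol output, plus a
# toy demo of packing a ternary message into a sequence of verdicts.
import math

def max_leak_bits(n_queries: int, n_symbols: int = 3) -> float:
    """Most bits an adversarial answerer can encode in n_queries answers."""
    return n_queries * math.log2(n_symbols)

SYMBOLS = ("True", "False", "Unknown")

def encode(trits):    # AI side: choose answers to carry the message
    return [SYMBOLS[t] for t in trits]

def decode(answers):  # human side: read the message back out
    return [SYMBOLS.index(a) for a in answers]
```

So the channel is narrow, but it isn’t zero-width; the question is whether ~1.6 bits per query is enough to do anything dastardly with.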