I’d suggest “AIs are trained, not designed” as a 5-word message to the public. Yes, that does mean that if we catch them doing something they shouldn’t, the best we can do to get them to stop is to let them repeat it, then hit them with the software equivalent of a rolled-up newspaper, tell them “Bad neural net!”, and hope they figure out what we’re mad about. So we have some control, but it’s not like an engineering process. [Admittedly this isn’t quite a fair description of, e.g., Constitutional AI: that basically delegates the rolled-up-newspaper duty to a second AI and gives that one verbal instructions.]
That’s actually a really clear mental image. For those conversations where I have a few sentences instead of five public-facing words, I might use it.