Would an elegant implementation of AGI be safe? What about an elegant and deeply understood implementation?
My impression is that it’s possible for an elegant and deeply understood implementation to block particular failure modes. An AI with a tightly bounded probability distribution can’t be Pascal-mugged, for example.
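(As a toy illustration of that point, not from the original answer: one informal reading of “tightly bounded probability distribution” is an agent that rounds probabilities below some floor down to zero, so an astronomical payoff attached to a negligible probability can’t dominate its expected-utility calculation. A minimal sketch under that assumption, with made-up numbers:)

```python
# Toy sketch: a "prob_floor" acts as the tight bound on the probability
# distribution. Probabilities below the floor are treated as exactly zero,
# so the mugger's huge-payoff, tiny-probability offer contributes nothing.

def best_option(options, prob_floor=0.0):
    """Return the option with the highest expected utility.

    options: list of (name, probability_of_payoff, payoff, cost) tuples.
    prob_floor: probabilities below this are rounded down to zero.
    """
    def expected_utility(option):
        _, p, payoff, cost = option
        p = p if p >= prob_floor else 0.0  # the "tight bound"
        return p * payoff - cost
    return max(options, key=expected_utility)

# The mugger's offer: pay 5 utility now for a 1-in-10^30 chance at 10^40 utility.
options = [
    ("refuse the mugger", 1.0, 0, 0),
    ("pay the mugger", 1e-30, 1e40, 5),
]

# A naive expected-utility maximizer pays up: 1e-30 * 1e40 - 5 is enormous.
print(best_option(options))                    # ('pay the mugger', ...)

# An agent that ignores probabilities below, say, 1e-9 refuses.
print(best_option(options, prob_floor=1e-9))   # ('refuse the mugger', ...)
```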
Being well understood isn’t sufficient. Often it just means you know how the AI will kill you.
And “safe” implies blocking all failure modes, not just particular ones.