The difference with normal software is that at least somebody understands every individual part, and if you collected all those somebodies and locked them in a room for a while they could write up a full explanation. Whereas with AI I think we’re not even like 10% of the way to full understanding.
Also, if you’re trying to align a superintelligence, you do have to get it right on the first try; otherwise it kills you, with no counterplay.
I’m having trouble seeing how this works. Regardless of whether C is in the pool, I run a 5% risk of halving my wealth by taking the first gamble. I think the safety net metaphor only makes sense if the outcome can’t be worse than C, but in this example it seems like there’s a hole in the net I can fall through.
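To make the “hole in the net” concrete: under the assumptions stated above (a hypothetical gamble that halves my wealth with 5% probability, whether or not C is in the pool), the expected loss in log wealth from taking the gamble is 0.05·ln 2, independent of starting wealth. A quick sketch:

```python
import math

# Hypothetical setup matching the comment: the first gamble halves
# wealth with probability 5%, regardless of whether C is in the pool.
P_HALVE = 0.05

def expected_log_wealth(w0: float) -> float:
    """Expected log wealth after taking the first gamble once."""
    return (1 - P_HALVE) * math.log(w0) + P_HALVE * math.log(w0 / 2)

w0 = 100.0
drop = math.log(w0) - expected_log_wealth(w0)
print(drop)                    # expected drop in log wealth
print(P_HALVE * math.log(2))   # same value: 0.05 * ln(2)
```

The drop simplifies to 0.05·ln 2 ≈ 0.035 no matter what w0 is, which is the point: C’s presence in the pool doesn’t cap the downside of the first gamble, so the safety-net framing doesn’t apply.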