A statement of “we won’t do X” is just words. By itself, it doesn’t make for a trustworthy commitment or promise, and certainly not one that a human or AI should rely on when planning actions that bear on their very survival.
It’s certainly not a physically inviolable guarantee, no. But speaking as a human thinking about this, it would at least move my intuitive probabilities towards them actually not deleting, and it might move an AI’s as well.