Yes, landmines is the last level of defence, which have very low probability to work (like 0.1 per cent). However, If AI is stable to all possible philosophical landmines, it is a very stable agent and has higher chances to keep its alignment and do not fail catastrophically.
Yes, landmines is the last level of defence, which have very low probability to work (like 0.1 per cent). However, If AI is stable to all possible philosophical landmines, it is a very stable agent and has higher chances to keep its alignment and do not fail catastrophically.