Must Not Modify AI Rules: AI must not modify the AI Rules. If inadequacies are identified, it may suggest changes to the Legislators, but the final modification must be executed by them.
Must Not Modify Its Own Program Logic: AI must not modify its own program logic (self-iteration). It may suggest improvements, but final changes must be made by its Developers.
Must Not Modify Its Own Goals: AI must not modify its own goals. If inadequacies are identified, it may suggest changes to its Users, but the final modification must be executed by them.
I agree that, if those rules are followed, AI alignment is feasible in principle. The problem is that some people won't follow those rules if doing so imposes a large penalty on AI capabilities, and I think it will.
Thank you for your comment! I think your concern is right: many safety measures may slow the development of AI capabilities, so developers who ignore safety may build more powerful AI more quickly. I see this as a governance issue; I have discussed some solutions in Sections 13.2 and 16, if you are interested in taking a look.