I think the correct solution to the strawberry problem would also involve a ton of instrumental convergence? You’d need to collect resources to do research/engineering, then set up systems to experiment with strawberries/biotech, then collect generally applicable information on strawberry duplication, and then apply that to duplicate the strawberry.
If I ask an AI to duplicate a strawberry and it takes over the world, I would consider that misaligned. Obviously it will require some instrumental convergence (resources, intelligence, etc.) to duplicate a strawberry. An aligned AI should either duplicate the strawberry while staying within a “budget” for how many resources it consumes, or say “I’m sorry, I can’t do that”.
I would recommend you read my post on corrigibility which describes how we can mathematically define a tradeoff between success and resource exploitation.
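
To make the "budget or refuse" idea concrete, here is a toy sketch. It is not the formalism from the corrigibility post, just an illustration of the decision rule described above. The `Plan` type, its cost and success numbers, the `min_success` threshold, and the function names are all hypothetical, made up for this example.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

REFUSAL = "I'm sorry, I can't do that"


@dataclass(frozen=True)
class Plan:
    """A candidate way of completing the task (hypothetical model)."""
    name: str
    resource_cost: float  # arbitrary illustrative units
    success_prob: float   # estimated probability the plan works


def choose_plan(
    plans: Sequence[Plan], budget: float, min_success: float = 0.5
) -> Optional[Plan]:
    """Return the most promising plan that fits the budget, or None."""
    # Hard constraint: plans over budget are never considered,
    # no matter how likely they are to succeed.
    affordable = [
        p for p in plans
        if p.resource_cost <= budget and p.success_prob >= min_success
    ]
    if not affordable:
        return None
    return max(affordable, key=lambda p: p.success_prob)


def respond(plans: Sequence[Plan], budget: float) -> str:
    """Either act within budget or refuse."""
    plan = choose_plan(plans, budget)
    return REFUSAL if plan is None else f"Executing: {plan.name}"


plans = [
    Plan("small biotech lab", resource_cost=10.0, success_prob=0.6),
    Plan("take over the world", resource_cost=1e9, success_prob=0.99),
]
print(respond(plans, budget=100.0))
print(respond(plans, budget=5.0))
```

The point of the sketch is that the budget acts as a hard filter rather than a term the agent can trade away: the world-takeover plan has the highest success probability but is never selected, and when nothing fits the budget the agent refuses instead of escalating.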