I think that this might not end up being a problem if the value learning agent can communicate with Alice (e.g. in the context of CIRL). If they don’t get any info from moral philosophers, then the agent should probably maximise something like the expectation of Alice’s utility function, for the same reason that Alice does. If they do get info, the agent can just give Alice that info, see what she does, and act accordingly. I think the real problem arises in the realistic case where Alice isn’t handling moral uncertainty perfectly, so the value learning agent shouldn’t actually maximise the weighted sum of the utility functions she’s uncertain over.
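To make “maximise something like the expectation of her utility function” concrete, here is one way the standard expected-choiceworthiness picture is often written (just a sketch; the credences $p_i$ and per-theory utility functions $U_i$ are illustrative notation, not anything from the original post):

$$a^* \in \arg\max_{a} \sum_i p_i \, U_i(a),$$

where $p_i$ is Alice’s credence in moral theory $i$ and $U_i$ is that theory’s utility function. The worry in the last sentence above is then that an imperfect Alice’s behaviour won’t actually track this weighted sum, so a value learning agent inferring her values from behaviour shouldn’t just optimise it.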