I think it’s absolutely feasible, but my idea of what a solution looks like is probably in a minority (if I had to guess, maybe of ~30%?)
All you have to do is understand what it is you mean by the AI fulfilling human values, in a way that can be implemented in the architecture and training procedure of a prosaic AI. Easy peasy, lemon squeezy.
The rest of the feasible-ers are mostly Paulians right now, who want to solve the problem without having to understand that complicated human-values thing. Typically by trusting in humans and giving them big awesome planning powers, or using their oversight and feedback to choose good things.
Thank you for the reply!