What’s worrisome about our current situation, however, is that we’re putting far more effort into making AIs smart than into making them benevolent.
‘Smart’ is too broad and encompasses too many aspects, most of which we are not putting any effort into either. In particular, we put essentially zero effort into motivating AIs to act on the real world. There is AIXI, which wants its reward button pressed but would gladly destroy itself, yet there is no equally formalized Clippy that wants real-world paperclips and won’t destroy itself because that would undermine paperclip production. Sure, there is the self-driving car that gets from point A to point B in minimum time, but it can’t and won’t search for ways to destroy obstacles in its path. It doesn’t really have the goal of moving a physical box of metal from point A to point B; it is solving equations that do not capture the task well enough for ‘kill everyone to reduce traffic congestion’ to even be a possible solution. In magical stories, a genie asked to move metal from point A to point B could do something unintended; in reality, you have to be awfully specific just to get the thing to work at all, and if you wanted it to look for ways to kill everyone to reduce traffic congestion, you would have to be more specific still.
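To make the point concrete, here is a minimal sketch (a toy Dijkstra planner, not any real car’s software; the graph and function names are made up for illustration). The only “actions” the planner knows about are the edges of a road graph, so solutions like ‘remove the obstacle’ or ‘kill everyone’ are not even representable in its solution space, no matter how long the search runs:

```python
import heapq

def shortest_route(graph, start, goal):
    """Dijkstra over road segments; graph maps node -> [(neighbor, seconds), ...].

    The search can only ever return a sequence of edges that exist in `graph`.
    Anything outside that space is not a candidate solution at all.
    """
    frontier = [(0.0, start, [start])]
    seen = set()
    while frontier:
        cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return cost, path
        if node in seen:
            continue
        seen.add(node)
        for neighbor, seconds in graph.get(node, []):
            if neighbor not in seen:
                heapq.heappush(frontier, (cost + seconds, neighbor, path + [neighbor]))
    return None  # no route: the planner just fails, it does not invent new actions

roads = {"A": [("B", 30), ("C", 10)], "C": [("B", 15)], "B": []}
print(shortest_route(roads, "A", "B"))  # (25.0, ['A', 'C', 'B'])
```

If the road is blocked, this kind of optimizer returns no route; getting it to consider acting on the obstacle would require explicitly adding such actions to its model, which is exactly the “you have to be awfully specific” point.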
Also, ‘smart’ does in fact encompass restrictions. It’s easy to build an AI that enumerates every possible solution, but it won’t be ‘smart’ on any given hardware. On fixed hardware, ‘smartness’ implies restricting the solution space. On any given hardware, an AI that, when the task is driving a car, ponders how to kill everyone or how to build a hyperdrive (a trillion irrelevancies it won’t have the computing time to make any progress on) is dumber than one that does not.
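A quick sketch of the fixed-hardware point (illustrative numbers and a made-up setup, not a claim about any real system): give two searchers the same budget of node expansions. One explores only task-relevant states; the other also fans out into irrelevant branches, standing in for pondering hyperdrives. Under the same budget, the unrestricted one is the dumber driver:

```python
from collections import deque

def search(start, goal, neighbors, budget):
    """Breadth-first search capped at `budget` node expansions.

    Returns the number of steps to the goal, or None if compute runs out.
    """
    frontier, seen, expanded = deque([(start, 0)]), {start}, 0
    while frontier and expanded < budget:
        state, depth = frontier.popleft()
        expanded += 1
        if state == goal:
            return depth
        for nxt in neighbors(state):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, depth + 1))
    return None  # budget exhausted before reaching the goal

# Task-relevant moves only: step along the road, positions 0..20.
relevant = lambda pos: [pos + 1] if pos < 20 else []
# Same moves, plus a fan of irrelevant branches at every state.
unrestricted = lambda pos: relevant(pos) + [1000 + 10 * pos + k for k in range(9)]

budget = 100
print(search(0, 20, relevant, budget))      # 20 -- reaches the goal within budget
print(search(0, 20, unrestricted, budget))  # None -- budget wasted on irrelevancies
```

The restriction of the solution space is not a safety add-on here; it is precisely what makes the searcher effective on the hardware it has.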