As for my own work for SI, I’ve been trying to avoid the assumption that there will necessarily be a hard takeoff right away, and to push somewhat in a direction that also considers the possibility of a safe singularity through an initial soft takeoff and more heuristic AGIs. (I do think that there will be a hard takeoff eventually, but an extended, softer takeoff preceding it doesn’t seem impossible.) For example, this is from the most recent draft of the Responses to Catastrophic AGI Risk paper:
As a brief summary of our views: in the medium term, we think that the proposals of AGI confinement (section 4.1), Oracle AI (section 5.1), and motivational weaknesses (section 5.6) show promise in helping create safer AGIs. What these proposals have in common is that, although they could help a cautious team of researchers create an AGI, they are not solutions to the problem of AGI risk: they do not prevent others from creating unsafe AGIs, nor are they sufficient to guarantee the safety of sufficiently intelligent AGIs. Regulation (section 3.3) and “merge with machines” (section 3.4) proposals could also help reduce AGI risk somewhat. In the long run, we will need the ability to guarantee the safety of freely-acting AGIs. For this goal, value learning (section 5.2.5) would seem like the most reliable approach if it could be made to work, with human-like architectures (section 5.3.4) a possible alternative that seems less reliable but possibly easier to build. Formal verification (section 5.5) seems like a very important tool for helping to ensure the safety of our AGIs, regardless of the exact approach we choose.
Here, “human-like architectures” also covers approaches such as OpenCog. To me, a two-pronged approach that both develops a formal theory of Friendliness and works with the folks who design heuristic AGIs to make them safer would seem like the best bet. Not only would this help make the heuristic designs safer, it could also give SI folks the kinds of skills that would be useful in actually implementing their formally specified FAI later on.