Proposal for “Open Problems in Friendly AI”

lukeprog 1 Jun 2012 2:06 UTC

33 points

Series: How to Purchase AI Risk Reduction

One more project SI is considering...

When I was hired as an intern for SI in April 2011, one of my first proposals was that SI create a technical document called Open Problems in Friendly Artificial Intelligence. (Here is a preview of what the document would be like.)

When someone becomes persuaded that Friendly AI is important, their first question is often: “Okay, so what’s the technical research agenda?”

So You Want to Save the World maps out some broad categories of research questions, but it doesn’t explain what the technical research agenda is. In fact, SI hasn’t yet explained much of the technical research agenda.

Much of the technical research agenda should be kept secret for the same reasons you might want to keep secret the DNA for a synthesized supervirus. But some of the Friendly AI technical research agenda is safe to explain so that a broad research community can contribute to it.

This research agenda includes:

Second-order logical version of Solomonoff induction.
Non-Cartesian version of Solomonoff induction.
Construing utility functions from psychologically realistic models of human decision processes.
Formalizations of value extrapolation. (Like Christiano’s attempt.)
Microeconomic models of self-improving systems (e.g. takeoff speeds).
...and several other open problems.

The goal would be to define the open problems as formally and precisely as possible. Some will be more formalizable than others, at this stage. (As a model for this kind of document, see Marcus Hutter’s Open Problems in Universal Induction and Intelligence.)

Nobody knows the open problems in Friendly AI research better than Eliezer, so it would probably be best to approach the project this way:

Eliezer spends a month writing an “Open Problems in Friendly AI” sequence for Less Wrong.
Luke organizes a (fairly large) research team for presenting these open problems with greater clarity and thoroughness, in the mainstream academic form.
These researchers collaborate for several months to put together the document, involving Eliezer when necessary.
SI publishes the final document, possibly in a journal.

Estimated cost:

2 months of Eliezer’s time.
150 hours of Luke’s time.
$40,000 for contributed hours from staff researchers, remote researchers, and perhaps domain experts (as consultants) from mainstream academia.


Open Problems

Vladimir_Nesov 1 Jun 2012 21:32 UTC
18 points
0
The implicit analogy drawn in the introduction between Eliezer Yudkowsky and both Henri Poincaré and David Hilbert gives a bad arrogance vibe.
Normal_Anomaly 1 Jun 2012 12:18 UTC
13 points
0
This is the sort of thing I want to see more of from SI: both the technical research agenda info and the regular posts by Luke about what SI is doing right now. Thanks!
RobertLumley 1 Jun 2012 2:59 UTC
9 points
0
Might it be worth tagging each of these potential proposals with a certain tag so we could look at them all and evaluate them comparatively?
- lukeprog 1 Jun 2012 3:15 UTC
  5 points
  0
  Parent
  Here is a table of contents.
Wei Dai 6 Jun 2012 0:31 UTC
7 points
0
- Second-order logical version of Solomonoff induction.
I’m not sure this is the right problem. See this post I just made.
- Non-Cartesian version of Solomonoff induction.
UDT seems to solve this well enough that I no longer consider it a major open problem. Is this not your or Eliezer’s evaluation?
Nisan 2 Jun 2012 15:49 UTC
7 points
0
These problems sound really really interesting.
asr 1 Jun 2012 5:28 UTC
2 points
0
Interesting direction.

Couple small questions:
- Who is the intended audience for this document? Would you be able to name specific researchers who you are hoping to influence?
- As an alternative formulation, what’s the community in which you hope to publish this?
- Why is Eliezer-time measured in months, and Luke-time in hours?
- Do you expect to involve folks who haven’t previously been involved with SIAI? If so, when?
- How large a research team / author list would you expect the final version to have? Is fairly large “5” or “15”?
- Solvent 1 Jun 2012 6:24 UTC
  5 points
  0
  Parent
  
  Why is Eliezer-time measured in months, and Luke-time in hours?
  
  That’s a good question, especially considering that 250 hours is on the order of months (6 weeks at 40 hours/week, or 4 weeks at 60 hours/week).
  
  EDIT: Units confusion
  - lukeprog 1 Jun 2012 9:45 UTC
    4 points
    0
    Parent
    Oops, I meant 150 hours for me.
    
    Eliezer’s time is measured in months because he tracks his time in days not hours, so I have an easier time predicting how many days (which I can convert to months) something will take Eliezer to complete, rather than how many hours it will take him to complete.
  - faul_sname 1 Jun 2012 17:32 UTC
    2 points
    0
    Parent
    I get 6 weeks at 40 hours/week.
    - Solvent 3 Jun 2012 23:33 UTC
      0 points
      0
      Parent
      Yep, thanks for that.
- lukeprog 1 Jun 2012 16:16 UTC
  1 point
  0
  Parent
  
  Who is the intended audience for this document?
  
  Every smart person who is fairly persuaded to care about AI risk and then asks us, “Okay, so what’s the technical research agenda?” This is a lot of people.
  
  what’s the community in which you hope to publish this?
  
  It doesn’t matter much. It’s something we would email to particular humans who are already interested.
  
  Why is Eliezer-time measured in months, and Luke-time in hours?
  
  See here.
  
  Do you expect to involve folks who haven’t previously been involved with SIAI? If so, when?
  
  Possibly, e.g. domain experts in micro-econ. When we need them.
  
  How large a research team / author list would you expect the final version to have? Is fairly large “5” or “15”?
  
  My guess is 10-ish.
[deleted] 29 Jul 2015 10:03 UTC
0 points
0

Much of the technical research agenda should be kept secret for the same reasons you might want to keep secret the DNA for a synthesized supervirus. But some of the Friendly AI technical research agenda is safe to explain so that a broad research community can contribute to it.

I’m uncomfortable with this.

Since 2012, has MIRI updated its stance on self-censoring of the AI research agenda, and can this be demonstrated with reference to formerly censored material or otherwise?

If not, are there alternative Friendly AI focused organisations that accept donations and censor differently or don’t censor?

Thanks for your disclosures Lukeprog, I appreciate the general candor and accountability. It was also nice to read that you were an SI intern in 2011 - quickly you rose to the top! :)
private_messaging 1 Jun 2012 12:25 UTC
−13 points
0

Second-order logical version of Solomonoff induction. Non-Cartesian version of Solomonoff induction.

This begs the question: do you even know what Solomonoff induction is? (edit: to be honest my best guess is that you don’t even know the terms with which to know the terms with which to know the terms… a couple dozen layers deep, with which to know what it is. The topic is pretty complicated, but looks pretty simple)

Construing utility functions

If you manage to construct a utility function (and by construct, I mean formally define in mathematics, building it from elementary operations) that actually defines the real-world quantities for an agent to maximize (as opposed to finding maxima of functions in the abstract mathematical sense), that will be a step towards robot apocalypse and away from the current approaches, which simply won’t work the way you guys think a utility maximizer would work and are therefore safe (in the sense of not leading to the doom scenarios that you predict will arise from ‘utility maximization’). (I am pretty sure you won’t manage to construct it, though, and even if you do, nobody competent enough to implement it would be dumb enough to do so.)