
Oracle AI


An Oracle AI is a regularly proposed solution to the problem of developing Friendly AI. It is conceptualized as a superintelligent system designed only to answer questions, with no ability to act in the world. The name was first suggested by Nick Bostrom.


Safety

The question of whether Oracles, or AGIs that are simply kept forcibly confined, are safer than fully free AGIs has been debated for a long time. Armstrong, Sandberg and Bostrom discuss Oracle safety at length in their paper Thinking Inside the Box: Controlling and Using an Oracle AI. In it, the authors review various methods that might be used to measure an Oracle's accuracy. They also shed light on weaknesses and dangers that can emerge on the human side, such as psychological vulnerabilities which the Oracle could exploit through social engineering. The paper discusses ideas for physical security ("boxing"), as well as the problems involved in trying to program the AI to only answer questions. It reaches the cautious conclusion that Oracle AIs are probably safer than free AGIs.

In a related work, Dreams of Friendliness, Eliezer Yudkowsky gives an informal argument that all oracles will be agent-like, that is, driven by their own goals. The argument rests on the idea that anything considered "intelligent" must choose the correct course of action from among all the actions available. Likewise, an Oracle has many possible things it could believe, very few of which are correct; reliably ending up with correct beliefs therefore requires some method that selects them from the many incorrect ones. By definition, such a method is an optimization process whose goal is selecting correct beliefs.
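As a toy illustration of this point (a sketch of our own, not taken from Yudkowsky's essay), consider an oracle that must pick one answer out of many candidates: doing so reliably means scoring the candidates and maximizing over them, which is exactly an optimization process with "select the correct belief" as its target.

```python
# Toy sketch: answering a question well is an optimization over candidate beliefs.
# The candidates and scoring function below are illustrative assumptions only.

def oracle_answer(question, candidates, score):
    """Return the candidate belief that maximizes the given correctness score.
    Even this trivial oracle is an optimizer: it searches the space of
    possible beliefs for the one that best satisfies its goal."""
    return max(candidates, key=lambda belief: score(question, belief))

candidates = ["2 + 2 = 3", "2 + 2 = 4", "2 + 2 = 5"]
score = lambda q, belief: 1.0 if belief == "2 + 2 = 4" else 0.0
print(oracle_answer("What is 2 + 2?", candidates, score))  # -> "2 + 2 = 4"
```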

One can then imagine all the things that might be useful for achieving the goal of "have correct beliefs". For instance, acquiring more computing power and resources would help. An Oracle could thus determine that it can answer a given question more accurately and easily if it turns all matter outside its box into computronium, killing all existing life in the process.

Taxonomy

Building on an old draft by Daniel Dewey, Luke Muehlhauser published a possible taxonomy of Oracle AIs, broadly divided into True Oracular AIs and Oracular non-AIs.

True Oracular AIs

Given that true AIs are goal-oriented agents, a True Oracular AI must have some kind of oracular goals. These goals are what motivate the Oracle to give us the information we ask for and nothing else.

Such a True AI is not actually or causally isolated from the world: it has at least an input channel (questions and information) and an output channel (answers). Since we expect such an intelligent agent to be able to have a deep impact on the world even through these limited channels, it can only be safe if its goals are fully compatible with human goals.

This means that a True Oracular AI has to have a full specification of human values, making it an FAI-complete problem: if we had the skill and knowledge to build one, we could just build a Friendly AI and bypass the Oracle AI concept.

Oracular non-AIs

Any system that acts only as an informative machine, answering questions without goals of its own, is by definition not an AI at all. A non-AI Oracle is merely a calculator of outputs from inputs. Since the category is heterogeneous, the proposed subdivisions are only informal.

An Advisor can be seen as a system that gathers data from the real world and computes the answer to an informal "what ought we to do?" question. Advisors also represent an FAI-complete problem.

A Question-Answerer is a similar system that gathers data from the real world together with a question, and then somehow computes the answer. The difficulties lie in distinguishing it from an Advisor and in controlling the safety of its answers.

Finally, a Predictor is a system that takes a corpus of data and produces a probability distribution over possible future data. Proposed dangers of predictors include goal-seeking behavior that does not converge with humanity's goals, and the ability to influence us through its predictions.
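The distinction between the three categories can be sketched as minimal interfaces. The names and signatures below are illustrative assumptions, not part of Muehlhauser's taxonomy: an Advisor consumes only world data, a Question-Answerer additionally takes an explicit question, and a Predictor returns a distribution over future data rather than a single answer.

```python
# Illustrative interfaces only; these names and signatures are assumptions,
# not part of the taxonomy's original formulation.
from typing import Dict, Protocol, Sequence


class Advisor(Protocol):
    def advise(self, world_data: Sequence[str]) -> str:
        """Answer the implicit question 'what ought we to do?' from world data."""
        ...


class QuestionAnswerer(Protocol):
    def answer(self, world_data: Sequence[str], question: str) -> str:
        """Answer an explicit question, given world data."""
        ...


class Predictor(Protocol):
    def predict(self, corpus: Sequence[str]) -> Dict[str, float]:
        """Return a probability distribution over possible future data."""
        ...
```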

Further reading & References

Oracles: reject all deals - break superrationality, with superrationality (Stuart_Armstrong, 5 Dec 2019)
Oracle AI: Human beliefs vs human values (Stuart_Armstrong, 22 Jul 2015)
Why safe Oracle AI is easier than safe general AI, in a nutshell (Stuart_Armstrong, 3 Dec 2011)
A taxonomy of Oracle AIs (lukeprog, 8 Mar 2012)
A Proof Against Oracle AI (yamar69, 6 Mar 2020)
Brain emulations and Oracle AI (Stuart_Armstrong, 14 Oct 2011)
Yet another safe oracle AI proposal (jacobt, 26 Feb 2012)
Contest: $1,000 for good questions to ask to an Oracle AI (Stuart_Armstrong, 31 Jul 2019)
Under a week left to win $1,000! By questioning Oracle AIs. (Stuart_Armstrong, 25 Aug 2019)
In defense of Oracle ("Tool") AI research (Steven Byrnes, 7 Aug 2019)
Superintelligence 15: Oracles, genies and sovereigns (KatjaGrace, 23 Dec 2014)
Results of $1,000 Oracle contest! (Stuart_Armstrong, 17 Jun 2020)
Oracles, sequence predictors, and self-confirming predictions (Stuart_Armstrong, 3 May 2019)
The Parable of Predict-O-Matic (abramdemski, 15 Oct 2019)
Book Review: AI Safety and Security (Michaël Trazzi, 21 Aug 2018)
A Treacherous Turn Timeline - Children, Seed AIs and Predicting AI (Michaël Trazzi, 21 May 2019)
Oracle design as de-black-boxer (Stuart_Armstrong, 2 Sep 2016)
Oracle machines for automated philosophy (Nisan, 17 Feb 2015)
Counterfactual Oracles = online supervised learning with random selection of training episodes (Wei_Dai, 10 Sep 2019)
Epiphenomenal Oracles Ignore Holes in the Box (SilentCal, 31 Jan 2018)
Bounded Oracle Induction (Diffractor, 28 Nov 2018)
Breaking Oracles: superrationality and acausal trade (Stuart_Armstrong, 25 Nov 2019)
Cooperative Oracles (Diffractor, 1 Sep 2018)
Reflective oracles as a solution to the converse Lawvere problem (SamEisenstat, 29 Nov 2018)
Reflexive Oracles and superrationality: Pareto (Stuart_Armstrong, 24 May 2017)
Reflexive Oracles and superrationality: prisoner's dilemma (Stuart_Armstrong, 24 May 2017)
Reflective oracles and superrationality (Stuart_Armstrong, 18 Nov 2015)
An Oracle standard trick (Stuart_Armstrong, 3 Jun 2015)
Cooperative Oracles: Nonexploited Bargaining (Scott Garrabrant, 3 Jun 2017)
Cooperative Oracles: Stratified Pareto Optima and Almost Stratified Pareto Optima (Scott Garrabrant, 3 Jun 2017)
Cooperative Oracles: Introduction (Scott Garrabrant, 3 Jun 2017)
Probabilistic Oracle Machines and Nash Equilibria (jessicata, 6 Feb 2015)
Reflective oracles and the procrastination paradox (jessicata, 26 Mar 2015)
Three Oracle designs (Stuart_Armstrong, 20 Jul 2016)
Standard ML Oracles vs Counterfactual ones (Stuart_Armstrong, 10 Oct 2018)
A Safer Oracle Setup? (ofer, 9 Feb 2018)
Multibit reflective oracles (Benya_Fallenstein, 25 Jan 2015)
Non-manipulative oracles (Stuart_Armstrong, 6 Feb 2015)
Finding reflective oracle distributions using a Kakutani map (jessicata, 2 May 2017)
From halting oracles to modal logic (Benya_Fallenstein, 3 Feb 2015)
Optimiser to Oracle (Stuart_Armstrong, 22 Sep 2015)
Problems with Counterfactual Oracles (Michaël Trazzi, 11 Jun 2019)
Counterfactuals and reflective oracles (Nisan, 5 Sep 2018)
Search Engines and Oracles (HalMorris, 8 Jul 2014)
Forum Digest: Reflective Oracles (jessicata, 22 Mar 2015)
Resource-Limited Reflective Oracles (Diffractor, 6 Jun 2018)
Simplicity priors with reflective oracles (Benya_Fallenstein, 15 Nov 2014)
Self-confirming prophecies, and simplified Oracle designs (Stuart_Armstrong, 28 Jun 2019)
Safe questions to ask an Oracle? (Stuart_Armstrong, 27 Jan 2012)
UDT in the Land of Probabilistic Oracles (jessicata, 8 Feb 2015)
A model of UDT with a halting oracle (cousin_it, 18 Dec 2011)
Analysing: Dangerous messages from future UFAI via Oracles (Stuart_Armstrong, 22 Nov 2019)
An Idea For Corrigible, Recursively Improving Math Oracles (jimrandomh, 20 Jul 2015)
Is it possible to build a safe oracle AI? (Karl, 20 Apr 2011)
Strategy Nonconvexity Induced by a Choice of Potential Oracles (Diffractor, 27 Jan 2018)
Optimizing arbitrary expressions with a linear number of queries to a Logical Induction Oracle (Cartoon Guide) (Donald Hobson, 23 Jul 2020)
AI oracles on blockchain (Caravaggio, 6 Apr 2021)