
Debate (AI safety technique)

Debate is a proposed technique for allowing human evaluators to get correct and helpful answers from experts, even if the evaluators are not themselves experts or able to fully verify the answers [1]. The technique was suggested as part of an approach to building advanced AI systems that are aligned with human values, and to safely applying machine learning techniques to problems that are high-stakes but not well-defined (such as advancing science or increasing a company’s revenue) [2, 3].

Writeup: Progress on AI Safety via Debate

5 Feb 2020 21:04 UTC
95 points
16 comments · 33 min read · LW link

AI Safety via Debate

ESRogs
5 May 2018 2:11 UTC
40 points
12 comments · 1 min read · LW link
(blog.openai.com)

Thoughts on AI Safety via Debate

Vaniver
9 May 2018 19:46 UTC
88 points
12 comments · 6 min read · LW link

[Question] How should AI debate be judged?

abramdemski
15 Jul 2020 22:20 UTC
48 points
27 comments · 6 min read · LW link

Splitting Debate up into Two Subsystems

Nandi
3 Jul 2020 20:11 UTC
13 points
5 comments · 4 min read · LW link

Thoughts on “AI safety via debate”

G Gordon Worley III
10 May 2018 0:44 UTC
34 points
4 comments · 5 min read · LW link

Comparing AI Alignment Approaches to Minimize False Positive Risk

G Gordon Worley III
30 Jun 2020 19:34 UTC
6 points
0 comments · 9 min read · LW link

An overview of 11 proposals for building safe advanced AI

evhub
29 May 2020 20:38 UTC
153 points
29 comments · 38 min read · LW link

Three mental images from thinking about AGI debate & corrigibility

steve2152
3 Aug 2020 14:29 UTC
49 points
35 comments · 4 min read · LW link

Synthesizing amplification and debate

evhub
5 Feb 2020 22:53 UTC
39 points
10 comments · 4 min read · LW link

Parallels Between AI Safety by Debate and Evidence Law

Cullen_OKeefe
20 Jul 2020 22:52 UTC
10 points
1 comment · 2 min read · LW link
(cullenokeefe.com)

AI Safety Debate and Its Applications

VojtaKovarik
23 Jul 2019 22:31 UTC
39 points
5 comments · 12 min read · LW link

New paper: (When) is Truth-telling Favored in AI debate?

VojtaKovarik
26 Dec 2019 19:59 UTC
33 points
7 comments · 5 min read · LW link
(medium.com)