
Humans Consulting HCH


Humans Consulting HCH (HCH) is a recursive acronym describing a setup in which a human can consult simulations of themselves to help answer questions. The concept comes up in discussions of the iterated amplification proposal for solving the alignment problem.

It was first described by Paul Christiano in his post Humans Consulting HCH:

Consider a human Hugh who has access to a question-answering machine. Suppose the machine answers question Q by perfectly imitating how Hugh would answer question Q, if Hugh had access to the question-answering machine.

That is, Hugh is able to consult a copy of Hugh, who is able to consult a copy of Hugh, who is able to consult a copy of Hugh…

Let’s call this process HCH, for “Humans Consulting HCH.”
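
To make the recursion concrete, here is a minimal sketch in Python (not from the original post; the function ask_human, the consult callback, and the depth limit are illustrative assumptions). In the idealized definition the human is a perfect imitation and the recursion is unbounded; the cutoff below exists only so the sketch terminates.

# Hypothetical sketch of the HCH setup described above.
# ask_human stands in for a human (or a perfect imitation of one) who
# answers a question and may pose sub-questions via the consult callback.

def ask_human(question, consult):
    # Placeholder: a real human could call consult(sub_question) any
    # number of times before answering; here we just return a dummy string.
    return f"answer to {question!r}"

def hch(question, depth_limit=3):
    # A (simulated) human answers the question while being able to consult
    # further copies of the same setup -- "Humans Consulting HCH".
    def consult(sub_question):
        if depth_limit == 0:
            # Artificial cutoff: the idealized definition has none.
            return ask_human(sub_question, consult=lambda q: "(no further copies)")
        return hch(sub_question, depth_limit - 1)
    return ask_human(question, consult)

print(hch("How should we answer question Q?"))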

Humans Consulting HCH
paulfchristiano, 25 Nov 2018, 33 points, 9 comments, 1 min read

HCH and Adversarial Questions
David Udell, 19 Feb 2022, 15 points, 7 comments, 26 min read

Relating HCH and Logical Induction
abramdemski, 16 Jun 2020, 50 points, 4 comments, 5 min read

Paul’s research agenda FAQ
zhukeepa, 1 Jul 2018, 126 points, 74 comments, 19 min read, 1 review

Epistemology of HCH
adamShimi, 9 Feb 2021, 17 points, 2 comments, 10 min read

HCH Speculation Post #2A
Charlie Steiner, 17 Mar 2021, 42 points, 7 comments, 9 min read

Garrabrant and Shah on human modeling in AGI
Rob Bensinger, 4 Aug 2021, 60 points, 10 comments, 47 min read

Can HCH epistemically dominate Ramanujan?
zhukeepa, 23 Feb 2019, 34 points, 4 comments, 2 min read

HCH is not just Mechanical Turk
William_S, 9 Feb 2019, 42 points, 6 comments, 3 min read

A guide to Iterated Amplification & Debate
Rafael Harth, 15 Nov 2020, 75 points, 12 comments, 15 min read

Predicting HCH using expert advice
jessicata, 28 Nov 2016, 7 points, 2 comments, 4 min read

HCH as a measure of manipulation
orthonormal, 11 Mar 2017, 1 point, 7 comments, 1 min read

Clarifying Factored Cognition
Rafael Harth, 13 Dec 2020, 23 points, 2 comments, 3 min read

Idealized Factored Cognition
Rafael Harth, 30 Nov 2020, 34 points, 6 comments, 11 min read

FC final: Can Factored Cognition schemes scale?
Rafael Harth, 24 Jan 2021, 17 points, 0 comments, 17 min read

Rant on Problem Factorization for Alignment
johnswentworth, 5 Aug 2022, 89 points, 51 comments, 6 min read

Weak HCH accesses EXP
evhub, 22 Jul 2020, 16 points, 0 comments, 3 min read

[Question] What’s wrong with these analogies for understanding Informed Oversight and IDA?
Wei Dai, 20 Mar 2019, 35 points, 3 comments, 1 min read

[Question] What are the differences between all the iterative/recursive approaches to AI alignment?
riceissa, 21 Sep 2019, 30 points, 14 comments, 2 min read

Universality Unwrapped
adamShimi, 21 Aug 2020, 29 points, 2 comments, 18 min read

Towards formalizing universality
paulfchristiano, 13 Jan 2019, 27 points, 19 comments, 18 min read

AIS 101: Task decomposition for scalable oversight
Charbel-Raphaël, 25 Jul 2023, 27 points, 0 comments, 19 min read (docs.google.com)

Meta-execution
paulfchristiano, 1 Nov 2018, 20 points, 1 comment, 5 min read

Universality and the “Filter”
maggiehayes, 16 Dec 2021, 10 points, 2 comments, 11 min read

Mapping the Conceptual Territory in AI Existential Safety and Alignment
jbkjr, 12 Feb 2021, 15 points, 0 comments, 26 min read