
Andrew_Critch (Andrew Critch)

Karma: 4,002

This is Dr. Andrew Critch’s professional LessWrong account. Andrew is the CEO of Encultured AI, and works ~1 day/week as a Research Scientist at the Center for Human-Compatible AI (CHAI) at UC Berkeley. He also spends around half a day per week volunteering for other projects such as the Berkeley Existential Risk Initiative and the Survival and Flourishing Fund. Andrew earned his Ph.D. in mathematics at UC Berkeley studying applications of algebraic geometry to machine learning models. During that time, he cofounded the Center for Applied Rationality and SPARC. Dr. Critch has been offered university faculty and research positions in mathematics, mathematical biosciences, and philosophy, and has worked as an algorithmic stock trader at Jane Street Capital’s New York City office and as a Research Fellow at the Machine Intelligence Research Institute. His current research interests include logical uncertainty, open source game theory, and mitigating race dynamics between companies and nations in AI development.

My May 2023 priorities for AI x-safety: more empathy, more unification of concerns, and less vilification of OpenAI

Andrew_Critch · 24 May 2023 0:02 UTC
272 points
39 comments · 8 min read · LW link

What Multipolar Failure Looks Like, and Robust Agent-Agnostic Processes (RAAPs)

Andrew_Critch · 31 Mar 2021 23:50 UTC
271 points
64 comments · 22 min read · LW link · 1 review

Slow motion videos as AI risk intuition pumps

Andrew_Critch · 14 Jun 2022 19:31 UTC
237 points
41 comments · 2 min read · LW link · 1 review

Some AI research areas and their relevance to existential safety

Andrew_Critch · 19 Nov 2020 3:18 UTC
204 points
37 comments · 50 min read · LW link · 2 reviews

Consciousness as a conflationary alliance term for intrinsically valued internal experiences

Andrew_Critch · 10 Jul 2023 8:09 UTC
190 points
46 comments · 11 min read · LW link

Power dynamics as a blind spot or blurry spot in our collective world-modeling, especially around AI

Andrew_Critch · 1 Jun 2021 18:45 UTC
182 points
26 comments · 6 min read · LW link

Acausal normalcy

Andrew_Critch · 3 Mar 2023 23:34 UTC
175 points
30 comments · 8 min read · LW link

«Boundaries», Part 1: a key missing concept from utility theory

Andrew_Critch · 26 Jul 2022 23:03 UTC
158 points
32 comments · 7 min read · LW link

“Pivotal Act” Intentions: Negative Consequences and Fallacious Arguments

Andrew_Critch · 19 Apr 2022 20:25 UTC
138 points
55 comments · 7 min read · LW link · 1 review

Modal Fixpoint Cooperation without Löb’s Theorem

Andrew_Critch · 5 Feb 2023 0:58 UTC
133 points
32 comments · 3 min read · LW link

Intergenerational trauma impeding cooperative existential safety efforts

Andrew_Critch · 3 Jun 2022 8:13 UTC
128 points
29 comments · 3 min read · LW link

GPT can write Quines now (GPT-4)

Andrew_Critch · 14 Mar 2023 19:18 UTC
111 points
30 comments · 1 min read · LW link

Announcing Encultured AI: Building a Video Game

18 Aug 2022 2:16 UTC
103 points
26 comments · 4 min read · LW link

Pivotal outcomes and pivotal processes

Andrew_Critch · 17 Jun 2022 23:43 UTC
95 points
31 comments · 4 min read · LW link

«Boundaries», Part 3a: Defining boundaries as directed Markov blankets

Andrew_Critch · 30 Oct 2022 6:31 UTC
86 points
20 comments · 15 min read · LW link

«Boundaries», Part 2: trends in EA’s handling of boundaries

Andrew_Critch · 6 Aug 2022 0:42 UTC
81 points
14 comments · 7 min read · LW link

“Tech company singularities”, and steering them to reduce x-risk

Andrew_Critch · 13 May 2022 17:24 UTC
75 points
11 comments · 4 min read · LW link

«Boundaries», Part 3b: Alignment problems in terms of boundaries

Andrew_Critch · 14 Dec 2022 22:34 UTC
72 points
7 comments · 13 min read · LW link

Curating “The Epistemic Sequences” (list v.0.1)

Andrew_Critch · 23 Jul 2022 22:17 UTC
65 points
12 comments · 7 min read · LW link

TASRA: A Taxonomy and Analysis of Societal-Scale Risks from AI

Andrew_Critch · 13 Jun 2023 5:04 UTC
63 points
1 comment · 1 min read · LW link