
Andrew_Critch (Andrew Critch)

Karma: 4,002

This is Dr. Andrew Critch’s professional LessWrong account. Andrew is the CEO of Encultured AI, and works ~1 day/week as a Research Scientist at the Center for Human-Compatible AI (CHAI) at UC Berkeley. He also spends around half a day per week volunteering for other projects such as the Berkeley Existential Risk Initiative and the Survival and Flourishing Fund. Andrew earned his Ph.D. in mathematics at UC Berkeley studying applications of algebraic geometry to machine learning models. During that time, he cofounded the Center for Applied Rationality and SPARC. Dr. Critch has been offered university faculty and research positions in mathematics, mathematical biosciences, and philosophy, and has worked as an algorithmic stock trader at Jane Street Capital’s New York City office and as a Research Fellow at the Machine Intelligence Research Institute. His current research interests include logical uncertainty, open source game theory, and mitigating race dynamics between companies and nations in AI development.

My May 2023 priorities for AI x-safety: more empathy, more unification of concerns, and less vilification of OpenAI

Andrew_Critch · 24 May 2023 0:02 UTC
272 points
39 comments · 8 min read · LW link

What Multipolar Failure Looks Like, and Robust Agent-Agnostic Processes (RAAPs)

Andrew_Critch · 31 Mar 2021 23:50 UTC
271 points
64 comments · 22 min read · LW link · 1 review

Slow motion videos as AI risk intuition pumps

Andrew_Critch · 14 Jun 2022 19:31 UTC
237 points
41 comments · 2 min read · LW link · 1 review

Some AI research areas and their relevance to existential safety

Andrew_Critch · 19 Nov 2020 3:18 UTC
204 points
37 comments · 50 min read · LW link · 2 reviews

Consciousness as a conflationary alliance term for intrinsically valued internal experiences

Andrew_Critch · 10 Jul 2023 8:09 UTC
190 points
46 comments · 11 min read · LW link

Power dynamics as a blind spot or blurry spot in our collective world-modeling, especially around AI

Andrew_Critch · 1 Jun 2021 18:45 UTC
182 points
26 comments · 6 min read · LW link

Acausal normalcy

Andrew_Critch · 3 Mar 2023 23:34 UTC
175 points
30 comments · 8 min read · LW link

«Boundaries», Part 1: a key missing concept from utility theory

Andrew_Critch · 26 Jul 2022 23:03 UTC
158 points
32 comments · 7 min read · LW link

“Pivotal Act” Intentions: Negative Consequences and Fallacious Arguments

Andrew_Critch · 19 Apr 2022 20:25 UTC
138 points
55 comments · 7 min read · LW link · 1 review

Modal Fixpoint Cooperation without Löb’s Theorem

Andrew_Critch · 5 Feb 2023 0:58 UTC
133 points
32 comments · 3 min read · LW link

Intergenerational trauma impeding cooperative existential safety efforts

Andrew_Critch · 3 Jun 2022 8:13 UTC
128 points
29 comments · 3 min read · LW link

GPT can write Quines now (GPT-4)

Andrew_Critch · 14 Mar 2023 19:18 UTC
111 points
30 comments · 1 min read · LW link

Announcing Encultured AI: Building a Video Game

18 Aug 2022 2:16 UTC
103 points
26 comments · 4 min read · LW link

Pivotal outcomes and pivotal processes

Andrew_Critch · 17 Jun 2022 23:43 UTC
95 points
31 comments · 4 min read · LW link

«Boundaries», Part 3a: Defining boundaries as directed Markov blankets

Andrew_Critch · 30 Oct 2022 6:31 UTC
86 points
20 comments · 15 min read · LW link

«Boundaries», Part 2: trends in EA’s handling of boundaries

Andrew_Critch · 6 Aug 2022 0:42 UTC
81 points
14 comments · 7 min read · LW link

“Tech company singularities”, and steering them to reduce x-risk

Andrew_Critch · 13 May 2022 17:24 UTC
75 points
11 comments · 4 min read · LW link

«Boundaries», Part 3b: Alignment problems in terms of boundaries

Andrew_Critch · 14 Dec 2022 22:34 UTC
72 points
7 comments · 13 min read · LW link

Curating “The Epistemic Sequences” (list v.0.1)

Andrew_Critch · 23 Jul 2022 22:17 UTC
65 points
12 comments · 7 min read · LW link

TASRA: A Taxonomy and Analysis of Societal-Scale Risks from AI

Andrew_Critch · 13 Jun 2023 5:04 UTC
63 points
1 comment · 1 min read · LW link