Andrew_Critch

Karma: 5,258

This is Dr. Andrew Critch’s professional LessWrong account. Andrew is the CEO of Encultured AI and works about one day per week as a Research Scientist at the Center for Human-Compatible AI (CHAI) at UC Berkeley. He also spends about half a day per week volunteering for other projects, such as the Berkeley Existential Risk Initiative and the Survival and Flourishing Fund. Andrew earned his Ph.D. in mathematics at UC Berkeley, studying applications of algebraic geometry to machine learning models. During that time, he cofounded the Center for Applied Rationality and SPARC. Dr. Critch has been offered university faculty and research positions in mathematics, mathematical biosciences, and philosophy. He has also worked as an algorithmic stock trader at Jane Street Capital’s New York City office and as a Research Fellow at the Machine Intelligence Research Institute. His current research interests include logical uncertainty, open source game theory, and mitigating race dynamics between companies and nations in AI development.

Promoting enmity and bad vibes around AI safety

Andrew_Critch, 9 Mar 2026 0:53 UTC

35 points

38 comments, 4 min read

Andrew_Critch’s Shortform

Andrew_Critch, 1 Mar 2026 18:45 UTC

8 points

111 comments, 1 min read

Schelling Goodness, and Shared Morality as a Goal

Andrew_Critch, 28 Feb 2026 4:25 UTC

127 points

61 comments, 41 min read

Cognitive Biases Contributing to AI X-risk — a deleted excerpt from my 2018 ARCHES draft

Andrew_Critch, 3 Dec 2024 9:29 UTC

48 points

2 comments, 5 min read

LLM chatbots have ~half of the kinds of “consciousness” that humans believe in. Humans should avoid going crazy about that.

Andrew_Critch, 22 Nov 2024 3:26 UTC

86 points

53 comments, 5 min read

My motivation and theory of change for working in AI healthtech

Andrew_Critch, 12 Oct 2024 0:36 UTC

188 points

40 comments, 14 min read, 1 review

Reformative Hypocrisy, and Paying Close Enough Attention to Selectively Reward It.

Andrew_Critch, 11 Sep 2024 4:41 UTC

53 points

13 comments, 3 min read

Safety isn’t safety without a social model (or: dispelling the myth of per se technical safety)

Andrew_Critch, 14 Jun 2024 0:16 UTC

368 points

41 comments, 4 min read, 3 reviews

New contractor role: Web security task force contractor for AI safety announcements

Ethan Ashkie and Andrew_Critch

9 Oct 2023 18:36 UTC

11 points

0 comments, 2 min read

(survivalandflourishing.com)

Consciousness as a conflationary alliance term for intrinsically valued internal experiences

Andrew_Critch, 10 Jul 2023 8:09 UTC

218 points

60 comments, 11 min read, 2 reviews

TASRA: A Taxonomy and Analysis of Societal-Scale Risks from AI

Andrew_Critch, 13 Jun 2023 5:04 UTC

64 points

1 comment, 1 min read

My May 2023 priorities for AI x-safety: more empathy, more unification of concerns, and less vilification of OpenAI

Andrew_Critch, 24 May 2023 0:02 UTC

268 points

39 comments, 8 min read

Job Opening: SWE to help build signature vetting system for AI-related petitions

Ethan Ashkie and Andrew_Critch

20 May 2023 19:02 UTC

52 points

0 comments, 1 min read

GPT can write Quines now (GPT-4)

Andrew_Critch, 14 Mar 2023 19:18 UTC

112 points

30 comments, 1 min read

Acausal normalcy

Andrew_Critch, 3 Mar 2023 23:34 UTC

203 points

40 comments, 8 min read, 1 review

Payor’s Lemma in Natural Language

Andrew_Critch, 2 Mar 2023 12:22 UTC

69 points

0 comments, 2 min read

Modal Fixpoint Cooperation without Löb’s Theorem

Andrew_Critch, 5 Feb 2023 0:58 UTC

136 points

34 comments, 3 min read, 1 review

Löbian emotional processing of emergent cooperation: an example

Andrew_Critch, 17 Jan 2023 5:59 UTC

23 points

0 comments, 8 min read

A Löbian argument pattern for implicit reasoning in natural language: Löbian party invitations

Andrew_Critch, 1 Jan 2023 17:39 UTC

23 points

8 comments, 7 min read

Löb’s Lemma: an easier approach to Löb’s Theorem

Andrew_Critch, 24 Dec 2022 2:02 UTC

38 points

17 comments, 3 min read