
Andrew_Critch

Karma: 4,847

This is Dr. Andrew Critch’s professional LessWrong account. Andrew is the CEO of Encultured AI and works ~1 day/week as a Research Scientist at the Center for Human-Compatible AI (CHAI) at UC Berkeley. He also spends about half a day per week volunteering for other projects such as the Berkeley Existential Risk Initiative and the Survival and Flourishing Fund. Andrew earned his Ph.D. in mathematics at UC Berkeley, studying applications of algebraic geometry to machine learning models. During that time, he cofounded the Center for Applied Rationality and SPARC. Dr. Critch has been offered university faculty and research positions in mathematics, mathematical biosciences, and philosophy. He has also worked as an algorithmic stock trader at Jane Street Capital’s New York City office and as a Research Fellow at the Machine Intelligence Research Institute. His current research interests include logical uncertainty, open source game theory, and mitigating race dynamics between companies and nations in AI development.

Cognitive Biases Contributing to AI X-risk — a deleted excerpt from my 2018 ARCHES draft

Andrew_Critch · Dec 3, 2024, 9:29 AM
48 points
2 comments · 5 min read

LLM chatbots have ~half of the kinds of “consciousness” that humans believe in. Humans should avoid going crazy about that.

Andrew_Critch · Nov 22, 2024, 3:26 AM
76 points
53 comments · 5 min read

My motivation and theory of change for working in AI healthtech

Andrew_Critch · Oct 12, 2024, 12:36 AM
178 points
37 comments · 14 min read

Reformative Hypocrisy, and Paying Close Enough Attention to Selectively Reward It.

Andrew_Critch · Sep 11, 2024, 4:41 AM
53 points
11 comments · 3 min read

Safety isn’t safety without a social model (or: dispelling the myth of per se technical safety)

Andrew_Critch · Jun 14, 2024, 12:16 AM
357 points
38 comments · 4 min read

New contractor role: Web security task force contractor for AI safety announcements

Oct 9, 2023, 6:36 PM
11 points
0 comments · 2 min read
(survivalandflourishing.com)

Consciousness as a conflationary alliance term for intrinsically valued internal experiences

Andrew_Critch · Jul 10, 2023, 8:09 AM
214 points
54 comments · 11 min read · 2 reviews

TASRA: A Taxonomy and Analysis of Societal-Scale Risks from AI

Andrew_Critch · Jun 13, 2023, 5:04 AM
64 points
1 comment · 1 min read

My May 2023 priorities for AI x-safety: more empathy, more unification of concerns, and less vilification of OpenAI

Andrew_Critch · May 24, 2023, 12:02 AM
268 points
39 comments · 8 min read

Job Opening: SWE to help build signature vetting system for AI-related petitions

May 20, 2023, 7:02 PM
52 points
0 comments · 1 min read

GPT can write Quines now (GPT-4)

Andrew_Critch · Mar 14, 2023, 7:18 PM
112 points
30 comments · 1 min read

Acausal normalcy

Andrew_Critch · Mar 3, 2023, 11:34 PM
195 points
36 comments · 8 min read · 1 review

Payor’s Lemma in Natural Language

Andrew_Critch · Mar 2, 2023, 12:22 PM
62 points
0 comments · 2 min read

Modal Fixpoint Cooperation without Löb’s Theorem

Andrew_Critch · Feb 5, 2023, 12:58 AM
134 points
34 comments · 3 min read · 1 review

Löbian emotional processing of emergent cooperation: an example

Andrew_Critch · Jan 17, 2023, 5:59 AM
23 points
0 comments · 8 min read

A Löbian argument pattern for implicit reasoning in natural language: Löbian party invitations

Andrew_Critch · Jan 1, 2023, 5:39 PM
23 points
8 comments · 7 min read

Löb’s Lemma: an easier approach to Löb’s Theorem

Andrew_Critch · Dec 24, 2022, 2:02 AM
30 points
16 comments · 3 min read

«Boundaries», Part 3b: Alignment problems in terms of boundaries

Andrew_Critch · Dec 14, 2022, 10:34 PM
72 points
7 comments · 13 min read

Open technical problem: A Quinean proof of Löb’s theorem, for an easier cartoon guide

Andrew_Critch · Nov 24, 2022, 9:16 PM
58 points
35 comments · 3 min read · 1 review

«Boundaries», Part 3a: Defining boundaries as directed Markov blankets

Andrew_Critch · Oct 30, 2022, 6:31 AM
90 points
20 comments · 15 min read