Working on making AI systems reason safely about decision theory and acausal interactions, and on eliciting their conceptual reasoning abilities, in collaboration with Caspar Oesterheld and Emery Cooper.
I try to check LessWrong only infrequently. If you wanna get in touch, you can use my admonymous link and leave your email address there so I can reply to you! (If you don’t include some contact details in your message, I can’t reply.)
You can also just send me thoughts and questions anonymously!
How others can help me
Be interested in working on/implementing ideas from research on acausal cooperation! Or connect me with people who might be.
How I can help others
Ask me about acausal stuff!
Or any of my background: Before doing independent research, I worked for the Center on Long-Term Risk on s-risk reduction projects (hiring, community building, and grantmaking). Previously, I was a guest manager at the EA Infrastructure Fund (2021), did some research for 1 Day Sooner on Human Challenge Trials for Covid vaccines (2020), did the summer research fellowship at FHI, writing about IDA (2019), worked a few hours a week for CEA on local groups mentoring for a few months (2018), and helped a little bit with organizing EA Oxford (2018/19). I studied PPE at Oxford (2018-2021) and psychology in Freiburg (2015-2018).
I also have things to say about mental health and advice for taking a break from work.
Hello! Just chiming in here to add a few things.
As one of the “academics” (scare quotes because I’m unfortunately quite academically uncredentialed), I just wanna note that our entire team actually cares a lot about this beyond its being an intellectual exercise, in case people read this as implying the opposite! I think of the entire point of my career as making decision theory and acausal dynamics go well in AIs (while also hopefully helping with automating alignment as a massive positive side effect), and I think others on the team feel similarly.
On a different note, I think the difference between FDT and empirically updateless EDT often feels overblown on LessWrong. I don’t actually know of cases where the two very clearly come apart, so I’m often confused when people seem confident in FDT’s superiority. There are cases where I’m not sure what FDT would do, so the two might come apart there, but if so, I don’t think those cases have been clearly articulated.
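To make the “they usually agree” point concrete, here’s a toy Newcomb’s-problem calculation (the predictor accuracy and payoff numbers are just illustrative assumptions, not anything from the discussion above). In this canonical case, the conditional probabilities EDT uses and the subjunctive dependence FDT uses give the same expected values, so both recommend one-boxing; this is only a sketch of why the two often coincide, not a claim about the harder cases.

```python
# Toy Newcomb's problem, purely illustrative (accuracy and payoffs are made-up assumptions).
ACCURACY = 0.99      # P(the predictor foresaw the choice you actually make)
BIG = 1_000_000      # opaque box, filled iff one-boxing was predicted
SMALL = 1_000        # transparent box, always present

def expected_value(action: str) -> float:
    # For EDT, p_big is the probability of the favorable prediction *conditional on* the action;
    # for FDT, it is the probability under subjunctive dependence of the prediction on your
    # decision procedure. In Newcomb's problem these numbers coincide.
    p_big = ACCURACY if action == "one-box" else 1 - ACCURACY
    return p_big * BIG + (SMALL if action == "two-box" else 0)

for action in ("one-box", "two-box"):
    print(action, expected_value(action))
# one-box 990000.0
# two-box 11000.0   -> both EDT and FDT recommend one-boxing here.
```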