My name is Mikhail Samin (@Mihonarium on Twitter/X, @misha on Telegram).
Humanity’s future can be enormous and awesome; losing it would mean our lightcone (and maybe the universe) losing most of its potential value.
I have takes on what seems to me to be the very obvious, shallow stuff about technical AI notkilleveryoneism; still, many AI Safety researchers have told me our conversations improved their understanding of the alignment problem.
I’m running two small nonprofits: AI Governance and Safety Institute and AI Safety and Governance Fund. Learn more about our results and donate: aisgf.us/fundraising
I took the Giving What We Can pledge to donate at least 10% of my income for the rest of my life or until the day I retire (why?).
In the past, I launched the most-funded crowdfunding campaign in the history of Russia (it was to print HPMOR! We printed 21,000 copies, 63,000 books in total) and founded audd.io, which allowed me to donate >$100k to EA causes, including >$60k to MIRI.
[Less important: I’ve also started a project to translate 80,000 Hours, a career guide that helps people find a fulfilling career that does good, into Russian. The impact and the effectiveness aside, for a year I was the head of the Russian Pastafarian Church: a movement claiming to be a parody religion, with 200,000 members in Russia at the time, trying to increase the separation between religious organisations and the state. I was a political activist and a human rights advocate. I studied relevant Russian and international law and wrote appeals that won cases against the Russian government in court; I was able to protect people from unlawful police action. I co-founded the Moscow branch of the “Vesna” democratic movement, coordinated election observers in a Moscow district, wrote dissenting opinions for members of electoral commissions, helped Navalny’s Anti-Corruption Foundation, helped Telegram with internet censorship circumvention, and participated in and organized protests and campaigns. The large-scale goal was to build a civil society and turn Russia into a democracy through nonviolent resistance. That goal wasn’t achieved, but some of the more local campaigns were successful. It felt important and was also mostly fun, except for being detained by the police. I think it’s likely the Russian authorities would imprison me if I ever visited Russia.]
These are questions I suggested people ask themselves after reading my post on Anthropic.
When I was writing the post back in November, I did not know about their RSP plans. I did, though, try to communicate my understanding of the leadership’s thinking: the moment a commitment prevents Anthropic from training or deploying a model, they would get rid of that commitment.
Fundamentally, decision-making at Anthropic has never been based on such commitments; these commitments are a PR instrument, not something that actually binds how Anthropic is structured or run.
I am sad this happened; I would be much happier in a world where I was proven wrong, again and again, about all of my AI-related beliefs; but I also hope people at Anthropic are capable of taking the outside view and updating.
I wonder if it might help to imagine yourself at OpenAI, believing Sam Altman, whom you know to be very good at telling you what you want to hear, and trying to figure out where to find evidence for what’s actually true about OpenAI’s decision-making and governance.
Apply that to Dario and to Anthropic.