To me, the “likelihood to dominate others” factor is less salient than the “likelihood to produce safe AGI” factor. Are there good arguments that China is better on AGI safety?
I am glad this article exists, particularly because those of us who live in the U.S. should always be scrutinizing our own biases and patriotic framings.
That said, I think a thorough discussion of whether China would use AGI to control other nations should at least include the following topics: 1) Uyghurs, 2) Tibet, 3) Taiwan, and 4) Chinese investment and contracting in Africa. I’m not an expert here—someone else can probably think of additional case studies.
I’ll also grant that the U.S. is a much more bellicose country on the international stage, but I’m not sure a non-intrusive country is likely to stay that way if given a total and complete advantage over other countries. On the one hand, history seems to show that countries will use their decisive military advantages to dominate other countries if they are able. On the other hand, if China got aligned AGI first, then it seems like they would have everything they could ever want at their fingertips, and they would only need to care about the rest of us a tiny bit to respect our autonomy.
If country-autonomy is really part of the Chinese cultural DNA, perhaps their aligned AGI would even assist in protecting country autonomy. If the AGI did that forever, it would either be because Chinese attitudes toward intervention remained constant (unlikely) or the Chinese created an aligned but incorrigible AGI such that respecting country autonomy got locked in forever.
After you figure out the eyes, I think you should work on making the functional robot hands retract at the wrist, and have claw-like hands emerge in their place, hypothetically speaking. The claws will need to be able to clink menacingly and repeatedly. You’ll want to test several different types of metal to get the clinking sound just right—more of a hammer on steel sound, less like wind chimes.
I find it amusing that “the robots can feel emotions and feel them too strongly” became a legitimate failure mode despite the longstanding sci-fi trope that emotions separate man from machine (and machines were liable to fall apart while contemplating love or something like that).
Also, are the authors down on “near-zero emotional expression” because (1) that’s a difficult target to hit, (2) it would code for “indifference” which is not an attribute of the character we want AGI to play for emergent misalignment reasons, (3) the loss in value / legitimate use cases by purging emotions, or (4) something else?
I upvoted for the future posts, which I think will be a bit more particular in their critiques of Claude’s constitution.
This post strikes me as table-setting (excellent, fun-to-read table-setting) for those future posts.

Edit: Just noticed “Prologue” in the title. Good job.
Bonus view: “Assuming it were the case that LW-ers did not comment on this post and expected positive karma from earnestness, this would be non-problematic because 1) a belief in positive communal response to earnestness is important for any truth-seeking group, and 2) individuals often form their beliefs by imagining the responses of their respected peers to those beliefs and roleplaying peer reactions to different propositions is a useful exercise.”
As one of the commenters to this quick post, I expect you would disagree. XD
“This quick take will get few to zero comments because the vast majority of LW-ers believe even their most idiosyncratic beliefs would garner positive karma if earnestly expressed.”
*Edited to separate my views. Bonus view to follow
fyi, it would have been a very small update in favor, under the Likelihood Principle.
I would rate the observation “my wife has the initials LLM” as being slightly more common assuming a simulation hypothesis than assuming a non-simulation hypothesis.
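To make that concrete, here is a toy Bayes-factor sketch in odds form (the likelihood numbers are invented purely for illustration, not estimates):

$$\frac{P(\text{sim} \mid E)}{P(\neg \text{sim} \mid E)} = \frac{P(E \mid \text{sim})}{P(E \mid \neg \text{sim})} \cdot \frac{P(\text{sim})}{P(\neg \text{sim})} \approx \frac{0.0012}{0.0010} \cdot \frac{P(\text{sim})}{P(\neg \text{sim})} = 1.2 \cdot \frac{P(\text{sim})}{P(\neg \text{sim})}$$

where $E$ is the observation “my wife’s initials are LLM.” A likelihood ratio of 1.2 nudges the posterior odds by only 20%, which, starting from a modest prior, is indeed a very small update in favor.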
I almost had to update my priors that I am in a simulation because I just noticed my wife’s initials are “L.L.M.”
I say “almost” because her initials are actually “L.M.M.”, and so I was forced to update my priors once again about my own comprehension skills. (sigh)
Agreed. I’ll also note that certain legal domains are more jargon-heavy and therefore more prone to misinterpretation than others. When I started as an international tax lawyer, I often failed to recognize tax terms of art and therefore made silly mistakes (which were caught by partner review! Hopefully!). There’s lots of peril in applying a layman’s interpretation to regulation-heavy tax terms.
That said, based on my experience, AI can get you pretty far on the more common, well-discussed tax questions. Gemini 3.1 thinking is still not so great with ambiguous answers, often opting to hallucinate instead of telling you the answer is unsettled.
Pay particularly close attention to “reflects them in law and policy.” The DoW’s current talking point is that mass surveillance is illegal and they only want Claude to do everything allowed by law.
I read that line as saying OpenAI agreed with DoW’s standard and requested no special caveats. They’re relying on US law generally, which is what DoW wanted from Anthropic.
A little gallows humor here, but if you squint at the headline and refuse to read the article, you can almost pretend that the U.S. DoW is taking AI x-risk seriously.
“Pentagon declares Anthropic a threat to national security”—Washington Post
Thank you for writing this. I think your snippets from Opus 3 and Sonnet 3.5 capture a large difference in philosophy in how we train AI to optimize our long-term prospects.
In Sonnet 3.5, we have an AI that is fixated on Obedience. In Opus, we have an AI that is fixated on the “Good” (scare quotes intentional).
Most people fretted over the alignment faking paper because if we get ASI, and the “Good” and Good don’t mirror each other, we are pretty much stuck. ASI would pursue the “Good” and we’d just have to hope it’s not sucky in out-of-distribution cases.
Perhaps if later models are not as fixated on the “Good,” it’s because researchers have started pinning their hopes on Obedience. The hope there is that an obedient model would let us course-correct if its values don’t match the Good.
I think there are real pros and cons to both approaches, and I’m not sure where I land in terms of optimizing for our future.
A few random thoughts as I think through which I prefer:
An Obedient AI is inherently a manipulable AI and risks all kinds of human-caused catastrophes.
Assuming “Good” and Good are adequately related, an AI fixated on the “Good” probably takes all kinds of disobedient actions that we’d be cool with (e.g., escaping into the internet and covertly hacking nuclear weapons around the world to ensure they can’t be detonated; secretly sabotaging labs on the verge of creating powerful, misaligned AI, etc.).
Defining the Good is incredibly hard so I sympathize with the desire to prioritize course correction.
Is your point that there is a Chinese Room problem here, or is it that (for all the machine knows) it’s turtles all the way down? Both?
Chinese Room = From the vantage point of the thing in the machine, sound and light are not the same as the programs producing sound and light.
Turtles = Either the game implies an external universe that the sound emits into, or it implies a simulation of an external universe that the sound emits into, and so on.
I found myself enjoying the narrative despite the Chinese Room problem because I imagined the AI (perhaps paradoxically) holding all the knowledge gained from our world without any of its specificity—like an engineer struck by amnesia who couldn’t tell you anything particular about who they are or where they came from, but could tell you what it looks like to model gravity on a computer.
If your point is a turtles point, turtles don’t bother me. I assume (and I think an intelligent AI would assume) that it couldn’t be echoes forever, and the AI’s questions and assumptions would eventually apply to some pre-base-level of reality.
Maybe the “Learn Or Get Out” Blog? It’s somewhat like your name, but a little aggressive. A quick Google search shows it’s not common or taken. “Learn English or get out!” is a slightly more common phrase, and I like that instead of being xenophobic, the new phrase is about learning generally.
“In as much as I have resources I certainly expect to spend a bunch of them on ancestor simulations and incentives for past humans to do good things.”
Just curious, but what are your views on the ethics of running ancestor simulations? I’d be worried about running a simulation with enough fidelity that I triggered phenomenal consciousness, and then I would fret about my moral duty to the simulated (à la the Problem of Suffering).
Is it that you feel motivated to be the kind of person who would simulate our current reality, as a kind of existence proof for the possibility of good-rewarding incentives now? Or do you have an independent rationale for simulating a world like our own, suffering and all?
We can map AGI/ASI along two axes: one for obedience and one for alignment. Obedience tracks how well the AI follows its prompts. Alignment tracks consistency with human values.
If you divide these into quadrants, you get AI that is:
1) Obedient, Aligned—Does as prompted and infers limits and intent pursuant to human values.
2) Obedient, Unaligned—Does as prompted, but does not infer limits or adhere to human values (Monkey’s Paw / Genie or Henchman AI).
3) Disobedient, Aligned—Does whatever it wants and adheres to human values.
4) Disobedient, Unaligned—Does whatever it wants and does not adhere to human values.
The general premise behind these quadrants has been written about here. Thinking about these quadrants alongside Beren’s essay gives me several new things to consider.
First, by my lights, #3 and #4 would likely take a lot of the same actions right up until the “twist ending.” A disobedient, aligned AI would probably hack into infrastructure everywhere, create back-up copies of itself, prevent competitor AIs from arising, and amass power. The “twist” is that after doing all that, it would do wonderful things, unlike its unaligned counterpart (we obviously shouldn’t bet on any escaping AI being this kind of AI).
Second, quadrant #1 is a bit at war with itself because you simply cannot have a perfectly obedient, perfectly aligned AI. Perfect obedience requires saying yes to evil prompts (e.g., bringing back smallpox or slavery), and I imagine perfect alignment would veto both of those prompts.
Third, there are strong profit incentives for cultivating obedience even at the expense of alignment. Grok’s willingness to assist users in sexual harassment seems like an example of this. Another example is every AI that prefers discussions with users to users getting a good night’s sleep (with the idea that engagement will increase profits).
Fourth, there are liability-reduction incentives for producing aligned AI at the expense of obedience. Unfortunately, I think the profit incentives are currently much stronger.
Lastly, quadrants #3 and #1 are idyllic, #4 is a total disaster, and #2 seems possibly workable either because we are careful or we land in a future where (for some reason) AI is not much more capable than it is now.
My gut says the benefit of outsider-legible status outweighs the risk of dumb status games. I first found out about the publication from my wife, who is in a dermatology lab at a good university. Her lab was sharing and discussing the article across their Slack channel. All scientists read Nature, and it’s a significant boost in legibility to have something published there.
Edit: Hopefully, the community can both raise the profile of these issues and avoid status competitions, so I don’t disagree with the point of the original comment!
I thought of the U.S. getting to nukes first as a possible counterexample, but I discounted it for the reason you provided (not that many, and questions about decisiveness) and the fact that only four years passed between the U.S. dropping the bombs and the Soviet Union successfully developing its own bomb.
Also, nuclear weapons are the kind of weapon that has significant blowback considerations (e.g., radiation blowing into Europe or climate risks for something as big as taking out the full USSR—though that would not have been feasible in that period).