neo

Karma: 152

neo 4 Jun 2026 22:39 UTC
4 points
0
in reply to: Raemon’s comment on: Akash’s Shortform
I’ve always had an inkling that frontier labs would make increasingly rigorous safety communications as the threat became more salient. After all, most lab leaders seem to have recognized x-risk way before the AI boom and are presumably interested in continuing to exist.
And in general, I think a lot of the safety community tends to extrapolate from current lab safety attitudes and not price in changes to communications and concerns that seem likely as capabilities, and accompanying unease, grow.
But I agree, this doesn’t feel satisfying for some reason. Was it necessary to downplay/gloss over safety and scare the shit out of everyone concerned about this sort of thing? And if it was, for political or financial reasons, it means there exist incentive structures working against transparency of knowledge/belief, which seems bad overall and in the long run.
Side note—your link requires a login, not sure if that’s intended.

neo 4 Jun 2026 22:15 UTC
8 points
0
in reply to: jacquesthibs’s comment on: Rohin Shah on AGI Safety
I think he somewhat answers your point here:
And as a result, our sort of meta strategy involves essentially roughly forecasting, maybe not formally forecasting, but roughly in our heads having some sense of what things are going to be potential problems over the next some time period — call it three months, maybe longer, maybe shorter, who knows — and identifying what sorts of considerations might become quite important during that time given the capabilities that we expect to have, and making sure that we’re prepared for those. And if we’re not prepared for that, then possibly slowing down, or pausing development, or talking to governments, trying to do advocacy.
But importantly, we’re not really trying to forecast arbitrarily far into the future all the problems that are going to arise with AI development. The goal isn’t “Know how to align ASI, or else do nothing.” Usually I think of us as looking at a time horizon of, it depends on which particular thing we’re doing, but often somewhere between three months and five years.
I interpret this as: we are focusing on short-term planning and ready to support a pause if we see imminent danger.
In other words, he probably is concerned about the alignment worries you mention, which may emerge in the future, but doesn’t consider them proximate enough to warrant present-day commitments.

neo 21 Apr 2026 23:31 UTC
4 points
0
in reply to: Valdes’s comment on: I built a semantic search engine for LessWrong
I think it would be good to have easy to click examples on the welcome page, so we can get a feel for it without having to pick a post to test.
Done! Chose some high-karma posts that seemed particularly interesting for now.
I do intend to open source the code, and I’ll definitely try out the connected force graph.

I built a semantic search engine for LessWrong

neo 21 Apr 2026 21:13 UTC

15 points

3 comments · 2 min read · LW link

neo 1 Apr 2026 22:15 UTC
1 point
−2
on: Dying with Whimsy
As someone whose worldview was upended over the last few years because of AI progress, this post resonates. Sometimes the situation we are in just seems absurd – like, all I can do is laugh, shrug, and then go back to what I was doing. And I think this is an emotionally healthy response. Sometimes the best reaction to reality is an unbothered acknowledgement of its absurdity.
But I do worry that speaking of doom like this – as if it is nearly-certain^[1] – is counterproductive.
I think everyone should invest time in mentally preparing for an uncertain future: finding a way to be okay with the possibility of disaster, while staying motivated to work hard to avoid it.^[2]
This is important whether or not you think doom is likely. In all worlds, those who can be effective regardless of their perceived odds of success are the ones most capable of succeeding.
I have personally managed to find a healthier relationship with the world, retaining some whimsy and optimism despite a sober reckoning with reality. And I feel like I can actually achieve things now, as a result.
I want others to experience this as well. The ideas have already been discussed, but emotionally internalizing them takes a while, and it certainly did for me.
I hope we find better ways of communicating this so that all of us could get better at achieving our goals 🙂
1. ^
  When the emotional advice you give is downstream of that prediction. Like, Dying with Dignity or “Life may be very short. So make the next few years the best ones.”
2. ^
  I’ve been thinking about this comment a lot, although I can’t attest to any of the specific recommendations.

neo 31 Mar 2026 18:21 UTC
1 point
0
in reply to: Mateusz Bagiński’s comment on: neo’s Shortform
Apologies for the delayed response.
I should note that the post was somewhat hastily written – I agree that my categorization was not comprehensive, and yours is probably better.
I was mainly trying to point at a dynamic I see often online where influential voices present arguments regarding AI risk that completely ignore years of back-and-forth discussion on similar topics, but whose positions are interpreted as the “forefront” of the debate – leading to offshoot discussions that again, miss years of relevant literature and discourse. I think this leads to, among other things, Eliezer frequently “losing it” on X/Twitter over people apparently misunderstanding something he wrote about in detail 20 years ago.
Community Notes on X/Twitter are sometimes regarded as a major improvement to collective epistemics, but I think there is a lot of room for improvement with tools that “situate” current discussions within previous ones.
But yes, I am describing a somewhat vague and imprecise problem here, so it may be difficult to categorize or pin it down with certainty.

neo 20 Mar 2026 19:04 UTC
1 point
0
in reply to: ChristianKl’s comment on: neo’s Shortform
I think the disagreement stems from a lack of specificity on my part; ignore the specific description of the categories.
Probably, you are in soldier mindset yourself about this very issue.
I hold beliefs on it, sure. I am now interested in seeing if they reflect reality, and learning why/why not. Is this mindset inadequate, and what would make it more rational?
Separately—do you think there is promise in tools of the type I describe to combat soldier mindset at scale? I will definitely be reading into some of the CFAR resources, just curious to hear from you.

neo 20 Mar 2026 18:23 UTC
1 point
0
in reply to: ChristianKl’s comment on: neo’s Shortform
Certainly agree with your point about donating to Substacks / journalists. Could be very impactful to have a writeup of that somewhere here or on the EA forum.
I’m familiar with Galef’s ideas; I would place “soldiers” in category 2. But yes, the distinction is very subtle and I did not specify it well enough.
I believe that sufficiently well designed UI for navigating debates/arguments/discussions can make it very difficult for people to disguise soldier mindsets via obfuscated (intentional or unintentional) communication and reasoning.
Imagine, for example:
- User creates a strongly worded post that features a clear strawman and/or blatantly skips over serious, in-depth prior discussion on the same topic.
- An LLM categorizes the argument to properly situate it within prior discussion and notifies the user that they A) do not appear to have an accurate understanding of the original source (specifically pointing out why), and B) have not yet explored the X counterarguments that follow that line of reasoning, and the Y that follow those.
This could be seen as an enhanced version of “community notes” aimed at situating shallow, under-researched takes within a larger “map of human thought.”
Whether this can scale and outcompete current systems is unknown, but it does seem promising for improving public discourse, and like a step in the right direction.
Appreciate the comment, was helpful in clarifying my thoughts.

neo 20 Mar 2026 17:14 UTC
1 point
0
in reply to: TsviBT’s comment on: neo’s Shortform
Hmm, this does look interesting but I hadn’t really considered depth of 1-on-1 communication as a significant bottleneck. I also think the concept slightly falls apart when the users are not already knowledgeable / quick thinking / good at rigorous communication, as I’d guess there would be a steep learning curve.
Software-wise, I think I’m aiming closer to better debates and debate tools, mainly because these things could be made public-facing and seem like an obvious use case for even present-day LLMs.
If you have any further thoughts regarding implementation of those things, I’d be eager to know. I’ll be trying to make my plans more specific.

neo 20 Mar 2026 17:03 UTC
2 points
1
in reply to: papetoast’s comment on: LessWrong’s UX may not be living up to its ideas
even with perfect onboarding someone would still need to motivate themselves to read quite a large chunk of text to reach the pareto frontier.
Sure, I just don’t think this is a big issue because people bottlenecked by motivation probably won’t be serious contributors, and it’s worth improving things for those that will.
I think the main bottleneck isn’t tooling, it is how much time people are spending to make things legible for other people/newcomers.
Yep, I’m pretty much in agreement with your broader point here – most of this stuff fails at the user level. It seems like manually organizing content would be a good place to start for me.
At the same time, I want to explore how this can become a self-propelled process in the future. It seems like an obvious use case for LLMs, and presumably not a massive engineering task.
I appreciate the comments – hopefully you can weigh in as I continue looking into it.

neo 19 Mar 2026 21:39 UTC
4 points
1
in reply to: Daniel Kokotajlo’s comment on: On restraining AI development for the sake of safety
I think Joe is arguing that unilateral capability restraint might exacerbate such risks.
For example, to the extent that safety-related considerations end up motivating or rationalizing especially drastic forms of international action aimed at shutting down or significantly restricting AI development in other countries, I think this could well be harmful even relative to more baseline forms of economic and military competition.
Like, the US government is all in on safety but China is not complying. Which would probably get intense, and rightfully so.
I still agree with you because:
- I don’t think that is very likely at all; if one country/coalition is all-in on safety, I’d guess there is something motivating their fear that would apply to everyone else as well.
- The default outcome of nobody being all-in on safety seems obviously much worse as you point out.
- There is an argument to be made that great power conflict is justified in the credible expectation of total annihilation otherwise. Obviously this is extremely controversial as it can be easily misused, and IIRC has been used to defame the EA/rationalist community (or Nick Bostrom?) in the past, but I don’t think that takes away from its merit.

neo 19 Mar 2026 21:29 UTC
8 points
0
on: Broad Timelines
This previous LessWrong article seems extremely relevant and basically sketches out an example of the rough “strategic portfolio” for AI risk that you are arguing for.
In line with some of my recent posts, I’m starting to think there is a lot of value in:
- Clearly defining consensus group strategy (among LW/EA/CG, for example) on “making the future go well.” This should include rough estimates from a variety of respected sources, a diverse portfolio of interventions, and explicitly communicated uncertainty / epistemic humility.
- Designing info-UI tools to facilitate that process. Enabling effective deliberation, strategy adjustment, and maybe most importantly: easy-to-use interfaces for the general public. The goal being to convey community beliefs and disagreements in a very transparent and easy to understand way.
  - This is intentionally unspecific, but I have outlined a couple of particular ideas in previous posts and will continue to crystallize my suggestions / explain why I think this area has potential.
The AI futures model and related ecosystem is a great start but is limited to a handful of thinkers (Daniel and Eli) and a specific subset of information (forecasting timelines). Their work has already been quite impactful (read by JD Vance apparently) – why not work hard to apply and scale good information-interface-design to broader community strategy?
What links here?
- We need Git for AI Timelines by fluxxrider (13 Apr 2026 9:04 UTC; 28 points)

neo’s Shortform

neo 19 Mar 2026 19:19 UTC

2 points

13 comments · 1 min read · LW link

neo 19 Mar 2026 19:19 UTC
18 points
5
on: neo’s Shortform
What are some research directions for “improving coordination?”
In light of a recent post and comment, and several months of thinking, I have come to the position that one of our (humanity’s) biggest problems is that we suck at precise coordination at every level.
This is not very specifically defined but I am trying to gesture at a problem area I think is super important. Some thoughts to convey my intuition here:
- If the extreme risk of the AI development trajectory is as true and obvious as many believe (everyone’s life at risk), humanity’s thinking about it should appear a lot more sophisticated than it does now.
- For the last few years, Eliezer has basically been throwing his hands up in exasperation at the incompetence of the world, and many have shifted to public-facing communication, presumably believing that trying to convince AI insiders is hopeless.
Broadly, I think there are two cases of problems with coordination:
1. Two people/groups genuinely agree to honest, rigorous exchange of information, but can’t effectively coordinate.
2. Someone is withholding information or doesn’t really want to coordinate in the first place.
I think the first problem is workable, and if improved sufficiently, makes progress on the second problem by clearly exposing parties that are avoiding productive exchange.
- Specifically, I think there is a lot of progress to be made with augmenting the exchange of information between people. I think LessWrong, the knowledge commons arguably at the frontier of ensuring humanity’s survival, is lacking in features for this purpose. Maybe because most users here are already conscientious and strongly value truth-seeking, which makes improvement seem less necessary.
Hopefully I’m making this line of thought clear enough. Key points:
- Trustworthy, robust, and future-proof governance is the ultimate problem for humanity and anything else is a band-aid on a bullet hole.
- Highly effective coordination is part of that problem, and better information exchange/clarity is a subset of that.
  - I think LessWrong can become an exceptionally effective prototype of this, and this could be very high leverage because of how proximate it is to the frontier of AI. Happy to expand more here.
I am interested in situating my thinking better here. Who is working on this sort of thing? I know TsviBT has explored improvements to debate, Richard Ngo / Samo Burja are exploring broader political manifestations, Forethought has published adjacent work. Is there anything I’m missing? Very interested in contributing here and think it’s a clear place where the ball is being dropped.
What links here?
- neo's comment on Broad Timelines by Toby_Ord (19 Mar 2026 21:29 UTC; 8 points)

neo 19 Mar 2026 17:51 UTC
3 points
2
on: On restraining AI development for the sake of safety
What I’m gathering from this is:
- The current trajectory of AI development is reckless and we need more time.
- But it’s way more complicated than that.
It seems like this all ultimately becomes a matter of designing an effective, robust, future-proof institution aimed at guiding humanity to a good future. Similar to what the Founding Fathers attempted with the United States, but with enough foresight to account for huge and unpredictable changes in technological capability.
And then, a matter of getting powerful people on-board, which probably means showing everyone that there are a million ways for humanity to destroy itself^[1] if our process of navigating the future isn’t done with extreme care.
Why can’t we start by agreeing on this relatively simple message? Is there some way to get the AI community to at least come together here, despite all other disagreements? Does it not greatly frustrate everyone else whenever important AI-leaders recycle the same vague oversimplifications^[2] of what is literally humanity’s ultimate problem?
I feel like I’m going a little crazy here. Everyone gestures at different research directions they find promising but it seems like everything eventually falls apart^[3] when you recognize that we are on a planet run by apes that will soon wield self-destructive power. At the end of the day, effective political governance and coordination are a prerequisite to the survival of our species and it does seem like this is incredibly neglected.
Perhaps there is hope if frontier labs, the members of which may become extremely wealthy and powerful, start actively planning for and prioritizing this process of designing, with much deliberation and careful philosophical consideration, the successor governance system for the human race. What can we do to make that more likely?
1. ^
  We probably do have to worry about cultural drift and other “future issues” quickly made salient by AI-driven acceleration.
2. ^
  Dario says “Humanity is about to be handed almost unimaginable power, and it is deeply unclear whether our social, political, and technological systems possess the maturity to wield it.” And: “The years in front of us will be impossibly hard, asking more of us than we think we can give.”
  
  So, what are we doing about that? Have there been any attempts at creating or investing in institutions that try to figure this sort of thing out? As far as I can tell, AI governance is a fragmented, not-very-rigorous field of study, which just seems completely backwards, and it is something we could at least be trying way harder to solve.
3. ^
  Including alignment. For God’s sake, what/who are we even aligning AI to? The world still ends if everyone has an obedient super-intelligence in their pocket.
What links here?
- neo's comment on neo’s Shortform by neo (19 Mar 2026 19:19 UTC; 18 points)
- neo's comment on Broad Timelines by Toby_Ord (19 Mar 2026 21:29 UTC; 8 points)

LessWrong’s UX may not be living up to its ideas

neo 18 Mar 2026 20:16 UTC

19 points

4 comments · 2 min read · LW link

neo 4 Jan 2026 22:36 UTC
3 points
0
on: In My Misanthropy Era
Regarding traits you love – maybe you are looking for something like intellectual humility? I think it can naturally follow from kindness and cooperativeness, but is often necessary for me to respect an intelligent person.
It also seems like a core principle of this community, where, as some say, “we gain status by pointing out where others haven’t been careful or skeptical enough in their thinking.”

Calling all college students (and new readers)

neo 4 Jan 2026 21:20 UTC

15 points

0 comments · 1 min read · LW link

neo 23 Sep 2025 20:58 UTC
7 points
5
in reply to: Vladimir_Nesov’s comment on: Global Call for AI Red Lines—Signed by Nobel Laureates, Former Heads of State, and 200+ Prominent Figures
Agree; I’m strongly in favor of using a term like “disempowerment-risk” over “extinction-risk” in communication to laypeople – I think the latter detracts from the more important question of preventing a loss of control and emphasizes the thing that happens after, which is far more speculative (and often invites the common “sci-fi scenario” criticism).
Of course, it doesn’t sound as flashy, but I think saying “we shouldn’t build a machine that takes control of our entire future” is sufficiently attention-grabbing.

neo 23 Sep 2025 0:22 UTC
30 points
7
in reply to: MichaelDickens’s comment on: Global Call for AI Red Lines—Signed by Nobel Laureates, Former Heads of State, and 200+ Prominent Figures
I’m disappointed that they don’t make any mention of extinction risk
Agree, but I wonder if extinction risk is just too vague, at the moment, for something like this. Absent a fast takeoff scenario, AI doom probably does look something like the gradual and unchecked increase of autonomy mentioned in the call, and I’m not sure if there’s enough evidence of a looming fast takeoff scenario for it to be taken seriously.
I think the stakes are high enough that experts should firmly state, like Eliezer, that we should back off way before fast takeoff even seems like a possibility. But I see why that may be less persuasive to outsiders.
