plex

Karma: 5,437

I have signed no contracts or agreements whose existence I cannot mention.

plex 17 Jun 2026 16:25 UTC
9 points
0
in reply to: leogao’s comment on: leogao’s Shortform
In the right places 10k can do a lot: I funded an initial upskilling grant of that size for someone who went on to discover glitch tokens. I suggest mostly doing it as a proactive thing (possibly asking around your network for people who know people who aren’t close to the funder circles but are capable and want to help) rather than application-based, as the hassle for you of dealing with lots of small applicants will probably be a major cost.

plex 11 Jun 2026 14:41 UTC
11 points
4
on: Sequent: scale and automation for higher confidence in alignment
Genuinely promising agenda! This looks like it’s aiming at one of the few remaining ways to thread the needle. Added to the map.
Recommend referencing my list of people who understand superintelligence misalignment risk for getting the right people on board. Especially think that @the gears to ascension would be a good fit, based on her being probably the LLM whisperer who has worked most extensively with LLMs to try and advance alignment theory and having a solid understanding of most of the ambitious theory approaches around.
Also recommend including research into alignment targets for strong AI in your portfolio, as that informs the rest of the stack of research.

plex 22 Apr 2026 8:58 UTC
15 points
4
in reply to: Ben Pace, the Vacationing Vagabond’s comment on: Evil is bad, actually (Vassar and Olivia Schaefer)
That’s pretty understandable. Having pushed through a lot of that, I think it’s something like if your priors are that Vassar is retributive and powerful and willing to break norms, it’s quite hard to think and talk groundedly about him. For anyone who wants to speak out, I suggest spending some time with healthy vibes people who are far from the rationalist space (ideally people who’ve never been part of it) for a week or two, and really letting yourself take in the background sense of safety before trying to post publicly.
Getting this post to the level of grounded I managed here took a surprising amount of intentionality and grounding.

plex 21 Apr 2026 13:31 UTC
59 points
14
on: Evil is bad, actually (Vassar and Olivia Schaefer)
Canary for whether I have been threatened with legal action over this post, and I guarantee that I will post any attempted threats in the comments.

plex 15 Apr 2026 17:35 UTC
16 points
0
in reply to: plex’s comment on: Szeth’s Shortform
(this went well enough that Szeth offered the bounty to me and I requested it go to one of my top AIS charities)

plex 10 Apr 2026 19:03 UTC
15 points
3
in reply to: plex’s comment on: No77e’s Shortform
(also, it’s scary to see three of the people I’d put in the upper tiers of good communication and understanding where we’re at with AI technically get into this intense conflict. I’m going to be thinking on this some and seeing if anything crystalizes which might help specifically, but in the meantime a few more general-purpose posts that might be useful memes for minimizing unhelpful conflict are A Principled Cartoon Guide to NVC, NVC as Variable Scoping, and Why Control Creates Conflict, and When to Open Instead)

plex 10 Apr 2026 18:57 UTC
12 points
12
in reply to: habryka’s comment on: No77e’s Shortform
I endorse you taking the space to figure out how you want to relate and doing what’s right for you, I’ve increasingly updated to thinking that people doing things they’re not wholeheartedly behind tends to be net bad in all sorts of sideways ways, but the effort would be weaker for your loss. Wherever you end up, I appreciate you having taken the strategy of speaking in public about things that usually aren’t in a way that helped clarify the strategic situation for me many times.

plex 10 Apr 2026 15:10 UTC
6 points
0
on: Why Control Creates Conflict, and When to Open Instead
Controlling is especially toxic when applied to the self-model of another person. Non-Violent Communication’s ability to defuse conflict comes mostly from requiring people to talk in a way which makes it difficult to make control-statements about the other’s self-model or unobservable variables.

plex 8 Apr 2026 9:07 UTC
13 points
8
in reply to: 152334H’s comment on: Omne’s Shortform
It starts at sudden death of yourself and everyone else, the destruction of earth and extinction of all biological life, and a sphere of darkness eating nearby stars, and gets worse from there.

plex 7 Apr 2026 16:17 UTC
2 points
0
in reply to: Richard_Ngo’s comment on: ricraz’s Shortform
Responded (and, I think, resolved this) with: Managed vs Unmanaged Agency

plex 5 Apr 2026 22:06 UTC
7 points
0
in reply to: Szeth’s comment on: Szeth’s Shortform
(DMed my top recommendation, someone who used to have pretty bad OCD, helped resolve someone else’s and mostly their own, and is full time doing x-risk reduction work)

plex 5 Apr 2026 22:02 UTC
4 points
−1
in reply to: 1a3orn’s comment on: 1a3orn’s Shortform
I think @Eli Tyre kinda got it from one example in 2023. (second comment)

plex 5 Apr 2026 11:48 UTC
−1 points
2
on: Ten different ways of thinking about Gradual Disempowerment
1. It’s Pythia, the patterns that are predicted to be more effective power-seekers being selected for in the present, causing the entire system to slide towards productivity maximizing at the expense of all other values.

plex 3 Apr 2026 12:19 UTC
4 points
0
in reply to: Mateusz Bagiński’s comment on: Mateusz Bagiński’s Shortform
At some point AI text gets good enough at persuasion to be actively harmful to read, just as text by highly manipulative people can be. Sometime before that it pings your memetic immune system a bunch and most people read this as vaguely wanting to avoid taking in LLM text. I think we might be entering the latter phase. Not confident, but we’ll get there at some point and they do seem at about the relevant capability level in some other domains.

plex 1 Apr 2026 15:33 UTC
2 points
0
in reply to: Luc Brinkman’s comment on: Product Alignment is not Superintelligence Alignment (and we need the latter to survive)
Yeah, this one does feel more memetically powerful in some ways, but something like less collaborative. Agree we’d probably want the pair.

plex 1 Apr 2026 9:19 UTC
3 points
0
in reply to: the gears to ascension’s comment on: Product Alignment is not Superintelligence Alignment (and we need the latter to survive)
I agree that’s a better ontology, this was the post I could write fast as a patch, looking forward to yours!
I might read your half-baked ones and be up for writing or coauthoring the real thing if you want.
Edit: Added a note to the main post pointing at this.

plex 31 Mar 2026 22:23 UTC
11 points
9
in reply to: draganover’s comment on: Product Alignment is not Superintelligence Alignment (and we need the latter to survive)
It may seem unreasonable within the current paradigm, but I think it’s necessary to reach if we get strong superintelligence. You need to have a system that you can’t make destroy the entire system if you want the whole system to remain undestroyed indefinitely.
You’re right that I didn’t explain why each framework fails to plausibly scale to very strong models. Maybe that’s also worth its own post, because there are a lot and each has limits that you need to go a bit into the weeds to see.

plex 31 Mar 2026 20:22 UTC
6 points
0
in reply to: Shmi’s comment on: shminux’s Shortform
The Real Moral Of Newcomb’s Was The Backwards Causality Omega Made Along The Way.

plex 31 Mar 2026 20:12 UTC
6 points
0
on: The state of AI safety in four fake graphs
I responded with a post: Product Alignment is not Superintelligence Alignment (and we need the latter to survive)
Simply put, I think alignment as used here and many other places is a conflation of two quite separate tech trees.

plex 31 Mar 2026 9:18 UTC
2 points
0
in reply to: evhub’s comment on: evhub’s Shortform
Under the Managed vs Unmanaged Agency frame (which I think replaces instrumental vs terminal with a conceptual split that fits reality better), I agree.