I have signed no contracts or agreements whose existence I cannot mention.
plex
This is a canary for whether I have been threatened with legal action over this post; I guarantee that I will post any attempted threats in the comments.
Evil is bad, actually (Vassar and Olivia Schaefer)
(this went well enough that Szeth offered the bounty to me and I requested it go to one of my top AIS charities)
(also, it’s scary to see three of the people I’d put in the upper tiers of good communication and understanding where we’re at with AI technically get into this intense conflict. I’m going to be thinking on this some and seeing if anything crystallizes which might help specifically, but in the meantime a few more general-purpose posts that might be useful memes for minimizing unhelpful conflict are A Principled Cartoon Guide to NVC, NVC as Variable Scoping, and Why Control Creates Conflict, and When to Open Instead)
I endorse you taking the space to figure out how you want to relate and doing what’s right for you. I’ve increasingly updated toward thinking that people doing things they’re not wholeheartedly behind tends to be net bad in all sorts of sideways ways, but the effort would be weaker for your loss. Wherever you end up, I appreciate you having taken the strategy of speaking in public about things that usually aren’t spoken about, in a way that helped clarify the strategic situation for me many times.
Controlling is especially toxic when applied to the self-model of another person. Non-Violent Communication’s ability to defuse conflict comes mostly from requiring people to talk in a way which makes it difficult to issue control-statements about the other’s self-model or unobservable variables.
Why Control Creates Conflict, and When to Open Instead
It starts with the sudden death of yourself and everyone else, the destruction of Earth and the extinction of all biological life, and a sphere of darkness eating nearby stars, and gets worse from there.
I responded, I think resolving this, with: Managed vs Unmanaged Agency
(DMed my top recommendation: someone who used to have pretty bad OCD, helped resolve someone else’s and mostly their own, and is doing x-risk reduction work full time)
I think @Eli Tyre kinda got it from one example in 2023. (second comment)
It’s Pythia: the patterns that are predicted to be more effective power-seekers being selected for in the present, causing the entire system to slide towards productivity maximization at the expense of all other values.
At some point AI text gets good enough at persuasion to be actively harmful to read, just as text by highly manipulative people can be harmful to read. Sometime before that it pings your memetic immune system a bunch, and most people register this as a vague wish to avoid taking in LLM text. I think we might be entering the latter phase. Not confident, but we’ll get there at some point, and models do seem at about the relevant capability level in some other domains.
Yeah, this one does feel more memetically powerful in some ways, but somewhat less collaborative. Agree we’d probably want the pair.
I agree that’s a better ontology, this was the post I could write fast as a patch, looking forward to yours!
I might read your half-baked ones and be up for coauthoring the real thing if you want.
Edit: Added a note to the main post pointing at this.
It may seem unreasonable within the current paradigm, but I think it’s necessary to reach if we get strong superintelligence. If you want the whole system to remain undestroyed indefinitely, you need a system that can’t be made to destroy everything.
You’re right that I didn’t explain why each framework fails to plausibly scale to very strong models. Maybe that’s also worth its own post, because there are a lot of them and each has limits that you need to go a bit into the weeds to see.
I responded with a post: Product Alignment is not Superintelligence Alignment (and we need the latter to survive)
Simply put, I think alignment as used here and many other places is a conflation of two quite separate tech trees.
That’s pretty understandable. Having pushed through a lot of that, I think it’s something like if your priors are that Vassar is retributive and powerful and willing to break norms, it’s quite hard to think and talk groundedly about him. For anyone who wants to speak out, I suggest spending some time with healthy vibes people who are far from the rationalist space (ideally people who’ve never been part of it) for a week or two, and really letting yourself take in the background sense of safety before trying to post publicly.
Getting this post to the level of groundedness I managed here took a surprising amount of intentionality and grounding.