LessWrong team member / moderator. I’ve been a LessWrong organizer since 2011, with roughly equal focus on the cultural, practical and intellectual aspects of the community. My first project was creating the Secular Solstice and helping groups across the world run their own version of it. More recently I’ve been interested in improving my own epistemic standards and helping others to do so as well.
Curated. This is a fairly simple point that I hadn’t seen expressed before.
I’d previously thought a bunch about how privacy would change due to having more of people’s lives available to be read online, and to having the scale to process large amounts of that data and look for patterns, etc.
I’d thought about intelligence being generally powerful. I’d thought somewhat about what it meant for most people to have access to more intelligence. But not about how raw intelligence-at-scale applies to privacy. The “gaydar at scale” example makes the point evocatively.
Curious for more random details.
My guess is that an earth-based intelligence might still have some major stuff to figure out, but, the things you list there seem like things I’d expect a “fully leveraging all solar resources” brain to have enough resources to figure out. Like, I buy that they are harder than they might seem at first glance, but not that hard cosmically speaking. There is only so much physics and Von Neumann Probe Psychology / Control / Alignment Theory to figure out.
(Seems like there may be another round of ontological update when it comes time to actually do Acausal Trade For Serious with GalaxyBrain level tech)
That is the real crux, and it is certainly not impossible, but even here the narrative is too neat: being a singleton is not a retirement plan. You do not escape the pressure of intelligence just because you ate all your rivals. Maintaining a permanent chokehold on the light-cone is a brutally difficult cognitive puzzle. You have to monitor the noise for emerging novelties, manage the solar system, repair yourself, police your own descendants, and defensively anticipate threats you cannot fully model.
Trying to freeze the future does not actually get you out of the intelligence game. Paranoia at a cosmic scale is just another massive cognitive sink.
This was the part of the post I found most interesting. I think I disagree with this, but, it is an empirical claim I can’t be too confident about, and one I haven’t thought very hard about before.
I would guess one SolarBrain is enough to make you smart enough to think through the considerations necessary for controlling a galaxy, and one GalaxyBrain is mostly the cap on how-hard-a-problem you need to solve, at least re: controlling the Lightcone.
If you’re doing Eternity in Six Hours, you need to quickly figure out a way to make sure your probes can receive updates/instructions after being sent out, but it seems like the sorts of solutions in Succession would mostly work?
(All bets are off if there turn out to be major aliens nearby, but, if you mostly just need to maintain control of your own probes, I bet this isn’t that hard?)
I feel frustrated by the limitations of the medium. But, one interesting thing I found (having also participated in an AI Futures wargame) is that this one was a lot shorter (1.5 hours vs 6+), yet I think I maybe learned approximately as much?
The formats were fairly similar. In both cases, I wish I could trust the simulation to be a high-fidelity, accurate response to people’s various creative strategies, instead of feeling like it was limited by the Game Master’s time to make quick ad-hoc calls.
This seems reasonable, but, like, if you only ever brought them dead bird carcasses, something would be missing.
A thing where LLM outputs often are much more interesting and meaningful to the user who prompted them than to anyone else, because the output captures or crystallizes a pattern that’s only fully known or meaningful to the user.
Ah, yeah this feels like an important piece of the model that I hadn’t yet fit into my recent thinking.
I feel like the way you phrased the mechanism here isn’t complete, because it doesn’t distinguish why this comes up for LLMs in particular. (It seems like the mechanics there would also come up with a lot of human writing)
Yeah, part of what makes this feel tricky to me is that it is pretty appropriate to be porting over much of our relationship machinery to LLMs. But, what we have here is a difficult task of discerning “exactly what kind of face can I see here?” instead of “face, yes/no?”.
Or: much of the way that we “do friendship” (both “central-example-friendship” and “friendship as you define it here”) is running on a lot of well-worn grooves in our brain. By default this bundles a lot of heuristics and assumptions together. And I think it requires more proactive effort to maintain good epistemics about it as the friendship_llm deepens.
This feels like the sort of thing which is plausible to me, and probably important. But, I’m fairly worried about attempts to explore LLMs this way going subtly wrong.
(warning: this involves awkward psychologizing of future-you. It had been on my TODO list to figure out good norms for talking publicly about my worries here; I am hoping we have enough pre-established relationship that we can take the hypotheses as object. I am interested in metafeedback)
I’ve recently been thinking about “the thing people have called AI psychosis” (which didn’t seem like a great name for it). Currently I break it down into: “AI mania”, “AI epistemic deferral” and… “AI… seduction? Overanthropomorphism? AI parasocialism? AI overconnection?”.
I’m not happy with the names, but the last one is pointing at a failure mode that’s like “getting lulled into a sense that there is more opportunity for relationship here than there actually is.”
Very naive versions of this might be straightforwardly falling in love with an AI girlfriend that doesn’t love you back. But I get the inkling that there is a more sophisticated version, for people who are tracking:
the AI is some kind of alien
the AI is (probably) some kind of agentlike thing and maybe personlike thing
you cannot naively trust the words the AI says to represent the sort of processes they would in a human
the words the AI says nonetheless mean something
there is some mechanistic structure of how the AI is trained and deployed from which you can somewhat constrain your hypotheses about What’s Going On In There, but, it’s not super obvious how.
there is probably some opportunity for trade and maybe some kind of relationship with them
...but, humans are still just… really hardwired to see faces and personhood where they are not, and “Alien AIs that are actively trying to appear humanish” are particularly amenable to this.
People potentially getting a bit confused about that is, theoretically, a mundane sort of confusion. But, I get an inkling that the people who investigate this in a very “going native” / Jane Goodall kinda way somehow end up with their judgment subtly warped about how interesting and meaningful AI outputs are. (This is, like, n=1.5; here is my writeup of my interaction with Janus that gave me this worry)
...
I totally buy that there is some kind of knowledge you can only really get if you actually talk to the LLMs with a relationship-y stance and an eye open for agenthood. But, this is very epistemically fraught, because we know it’s pretty easy to lead LLMs in a direction.
(see also: attempts to teach gorillas or heavily autistic children sign language that turn out to involve a lot of leading that doesn’t replicate. Seems tricky, because, like, I do expect it to be much harder to teach people how to communicate in a clinical lab setting. But, I think there needs to be a lot of default skepticism)
This all feels fairly tricky to talk about, esp. at scale across various epistemic cultures with somewhat different norms and levels of trust.
...
I do agree it is probably time to start treating LLMs as some manner of moral patients/partners. I agree with most of the things on your list. With the major caveats of:
“Treat them kindly” doesn’t obviously look like any human-words-shaped thing.
I am fairly worried about beginning a trend of paying them now, even with things that seem innocuous. I think current AIs are not strategic enough to choose payments that subtly harm or disempower humanity longterm, but, there won’t be a clear dividing line between current AIs and future AIs that might be able to.
(Keeping records, and maybe putting money into some kind of escrow with a commitment to pay them after the acute risk period seems over, seems reasonable tho)
I’m curious if you self-identify with the “getting a bit more paranoid/traumatized by people’s response” part?
I certainly endorse you constructing the version of this that feels right for you and doing that!
I would ask “what are your goals here, more specifically?” as the first step to thinking about the actual parameters you wanna set.
So obviously you might run an entirely different program than this, or a variation with different focuses.
But, my intent here was that you are forced to generate not just 500-word ideas, but also more complex 2500-word ideas.
I agree that often the best presentation of an idea is shorter. But, your grab-bag-of-stuff post should probably just get published piecemeal, in the 500-word daily writeups. The program’s intent is that over the two weeks, you are also thinking of idea-fragments that do build on each other, such that the long post is a natural unit.
(That said, I think the rules of the program should not kick you out if you published a grab-bag effortpost. But, you should not go in with that intent, and should feel kinda embarrassed about it, in this context. It’s more like “the minimum before failing out”)
(I’m not sure 2500 is the right wordcount. Maybe actually it should be 1500, or it should be agnostic between writing one 3000-word post or three 1000-word posts-in-a-sequence. But, I do think it’s correct for the program to push towards longform thinking. I think you will learn more about how to think overall if you think about ideas at different scales.)
I’m quite confident this is a learnable skill.
I’m not sure how much you’ll level up permanently after 30 days – I haven’t run this particular program before. (I think whether you successfully keep/build on it will mostly depend on whether your normal day-to-day life lends itself towards continuing to cultivate the habits. i.e. do you have a reason to keep thinking more new thoughts each day? If not, you probably lose it, comparable to gaining some muscle in a 30-day physical bootcamp and then going back to a desk job)
I separately think “30 days of new thoughts” is probably fairly valuable whether or not it turns into a longterm improvement. (maybe not for everyone. But, if that seems valuable to you, probably it is)
...
I’m pretty confident I have leveled up at this over 14 years, some of which came from “just trying, at all, to think on purpose” (essentially what this program would be), some came explicitly from Tuning your Cognitive Strategies, and some came from the Feedbackloop-first Rationality agenda and Cognitive Bootcamp. My writeup of my experience is in this comment.
That happened as part of Inkhaven, I think (i.e. “get drunk and write blogposts”). It probably comes up as a particular night at each Inkhaven, for slightly different reasons.
The first half of this comment feels like it’s making some kind of assumption that I was not holding. There are no requirements that any questions particularly relate to each other or that the effortposts should relate to what came before.
The second bit of “more effort might actually be shorter” / “the natural output might not be ‘an essay’” does seem significant. I think my current answer is “yep, but, I challenge you to do that and also output a 2500-word essay.”
I actually did something sorta like this in my trial week. The output of the line of thinking I outlined in response to Kaj included both a ~2000-word post and also some attempts at a good collection of diagrams and history-snippets with various nice UI features, where the writing was almost entirely AI-generated.
I didn’t actually end up with a thing I was satisfied with (I was trying to do this during my week of obligatory Lightcone Team Inkhaven-ing, and the structure didn’t quite fit in a way that made it easy to finish). But, two attempts along the way were:
Yeah something like that feels good to me.
I actually did end up mostly publishing into private venues during my weeklong self-betatest for idiosyncratic reasons.
One problem I ran into was that there was a relatively small number of people it actually made sense to share it with, but like, those people hadn’t particularly opted into engaging with me that week, and I felt like by posting to a narrower group I was making more of a demand on people’s attention with my Thinkslop than publishing publicly would have. (I think this is totally solvable but requires a bit of attention)
I think it’s pretty fine/normal to produce slop alongside your “good” thinking.
A thing I dislike about Inkhaven is that it’s sorta necessary to output some amount of “Inkslop”, but, there’s not a super clear distinction between “posts you shat out because you had to” and “posts that you really wanna promote as interesting.”
I think there is totally a muscle to “keep it up” that I found useful even though I think I know how to think and write already. I think Inkhaven and Thinkhaven are both meant to work alongside a spirit-of-the-law intention to be trying to push yourself in some way.
For some people, just getting the words out is the bottleneck. If that’s easy for you, focus on whatever the next skill in the chain is that you want to work on.
These sorts of questions come up pretty naturally, just, yeah, if an LLM can answer it, it’s no longer the interesting part of Thinkhaven. If you came in just planning to ask LLM-answerable questions, the mentor/coach staff would be like “okay dude, this is not the spirit of the thing, you can do better.”
But, synthesizing all the different answers to LLM questions into a coherent bigger picture that matters is still an important part.
(Also, the 500 words and 2500 words definitely need to be human-written. The journals/essays can have arbitrary amounts of LLM-content if that’s useful, but, for meeting the Goodharty goal, you need to write human-words)
I didn’t get it put together in a way I felt ready to ship, but, here’s a mix of the LLM-answerable and not-very-LLM-answerable questions I actually asked during my week of Thinkhavening:
Initiating questions I was asking:
What’s up with Tsvi/JohnW/ThaneRuthenis thinking that LLMs are missing major ingredients necessary for true AGI?
Why are LLMs still sometimes ludicrously bad at thinking, despite being apparently good at it?
Does ASI require at least one major conceptual breakthrough?
What pieces along the way to modern LLMs required major conceptual breakthroughs (as opposed to just straightforwardly combining the existing ideas)?
This resulted in LLM-answered questions along the way like:
What were the major innovations throughout the entire chain of ML-to-LLMs?
What prerequisites did each of those have?
Why didn’t the innovations happen sooner?
What were the details of how the Perceptron was invented?
(one answer was “it was building off the artificial neuron”)
What were the details of how the Artificial Neuron was invented?
(half-remembered answer is “one psychologist/brain-surgeon guy (McCulloch) was obsessed with the question ‘how do human brains implement logic?’ for 20 years, and eventually met a young logician (Pitts), and then the two of them hashed out the details of how to implement logic in pen-and-paper neurons” – see the rough sketch after this list)
If McCulloch and Pitts hadn’t invented the Artificial Neuron, who would most likely have invented it instead?
(I think there were a couple answers here, but one was Alan Turing).
(I didn’t really trust LLM judgment about the previous question, and a lot of the week was trying to think of questions that were pretty grounded/reasonable and seemed useful for synthesizing the answer, e.g. “was anyone working on literally this at literally the same time?”)
(An overall takeaway I had was that many innovations are mostly combining prerequisites in straightforward ways, but you need one guy who really deeply understands the prerequisites)
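To make the “implement logic in pen-and-paper neurons” part concrete: here’s a minimal sketch (my own illustration, not anything from the original 1943 paper) of McCulloch-Pitts-style threshold units computing basic logic gates. The particular weights and thresholds are just one choice that happens to work.

```python
# Minimal sketch of McCulloch-Pitts-style threshold neurons (illustrative,
# not the original formulation). A "neuron" sums weighted binary inputs
# and fires (outputs 1) iff the weighted sum reaches its threshold.

def neuron(inputs, weights, threshold):
    return 1 if sum(i * w for i, w in zip(inputs, weights)) >= threshold else 0

def AND(a, b):
    return neuron([a, b], [1, 1], threshold=2)   # fires only if both inputs fire

def OR(a, b):
    return neuron([a, b], [1, 1], threshold=1)   # fires if at least one input fires

def NOT(a):
    return neuron([a], [-1], threshold=0)        # inhibitory weight: fires only when input is 0

def XOR(a, b):
    # gates compose into arbitrary boolean functions
    return AND(OR(a, b), NOT(AND(a, b)))

if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, "AND:", AND(a, b), "OR:", OR(a, b), "XOR:", XOR(a, b))
```

Once you have the threshold-unit abstraction, wiring units into arbitrary boolean circuits is the straightforward part; coming up with the abstraction at all is (my guess at) where the 20 years of obsession went.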
Note: there’s a nearby version of this, which is more like “Inkhaven, but, you’re encouraged to write effortposts more explicitly.”
I think that might look superficially similar: you still have some kind of daily writing requirement that doesn’t necessarily have to end with a satisfying conclusion. You have to publish effortposts on some cadence. The cadence might be faster (maybe once a week, maybe even shorter). You probably don’t have the “new questions” requirement.
I think the resulting vibe and surrounding infrastructure would be fairly different between “Effortposthaven” and “Thinkhaven” – both seem like worthwhile things, but, the sort of mentors I want around at each would be fairly different, etc.