I think the heuristic “nothing ever happens” is better interpreted to mean “nothing ever happens relative to baseline trends” than “literally nothing ever happens”. The incrementalist worldview seems like a better fit for this heuristic than Eliezer’s, which after all ultimately predicts something very dramatic happening.
interstice
(didn’t downvote, but) I don’t think you’re necessarily wrong, but couldn’t it just be the case that being a singleton isn’t that hard? As an empirical matter, the size (as a fraction of the total) of the largest somewhat-coherent entities controlling resources on Earth seems to have been increasing over time. Space expansion could change things, but a stable singleton might already exist by then, and would be faced with a relatively homogeneous set of environments to expand into. I’ve written some pieces along similar lines, btw.
What picture does all this data paint? What patterns can you see when you look at all the data, instead of facing down each argument one at a time and dismissing them one at a time?
One possible explanatory pattern is that Michael is an insane and conspiratorial guy and attracts people who are also insane and conspiratorial, and it’s easy to push such people into having a meltdown.
“I,” you say, and are proud of that word. But the greater thing—in which you are unwilling to believe—is your body with its great wisdom; that does not say “I,” but does “I.”
What the sense feels, what the mind knows, never has its end in itself. But sense and mind would rather persuade you that they are the end of all things: so vain are they.
Instruments and toys are sense and mind: behind them there is still the Self. The Self seeks with the eyes of the senses, it listens also with the ears of the mind.
Always the Self listens and seeks; it compares, masters, conquers, and destroys. It rules, and is also the mind’s ruler.
Behind your thoughts and feelings, my brother, there is a mighty lord, an unknown sage—it is called Self; it dwells in your body, it is your body.
There is more wisdom in your body than in your best wisdom. And who then knows why your body needs precisely your best wisdom?
Your Self laughs at your mind, and its bold leaps. “What are these leaps and flights of thought to me?” it says to itself. “A detour to my end. I hold the puppet-strings of the mind, and am the prompter of its notions.”
I think if Kim Jong Un lived for a million years, and had the smartest AI advisors, and access to intelligence augmentation techniques, he would probably still never come to admit that murdering his brother was an evil thing to do.
I feel like the evidence you’ve provided here is pretty weak for drawing that conclusion. The regime where KJU lives for a million years and augments his intelligence is far out of distribution.
LessWrong but with cats instead of posts
Self explanatory
I’ve been taking collagen and the exogenous antioxidant astaxanthin for a while and it’s definitely improved my sleep among other things.
Good post about LLMs scheming against their stupid human overseers.
What sources have you used to derive your understanding of brain function from?
On the object level, there’s a very plausible mechanism by which solving some problems can make things worse.
Hmm, I see. But it seems to me that the linked post is overoptimistic that making problems legible will cause people to slow down until they are solved. Already, lots of problems that are “legible” to this community are not “legible” enough to make the labs or governments want to slow down. So working to solve them (by our standards, since apparently other people already consider them solved or unimportant) could still be useful.
Of course, you could then say that what we should be doing is trying to make problems that are legible to this community legible to the wider world, which is pretty much what MIRI is trying to do at the moment. Certainly that seems like a valuable thing to do, but it’s far from guaranteed to succeed. And I think the fact that LW-legibility of takeover isn’t on its own enough to cause the wider world to slow down should make us less worried that solving one problem to our standards will make the world push ahead more recklessly, since the world seemingly isn’t that responsive to which problems are considered solved by our standards.
there is a bunch of different safety problems where solving some but not all of them at the same time can make the overall situation worse
But isn’t the relevant thing here whether solving some makes the overall situation worse in expectation? Like sure, it could be the case that e.g. solving intent alignment makes things worse, but it seems relatively unlikely. It seems implausible that the current set of problems we have now cancel each other out optimally. Or maybe we should try to introduce new problems to further mitigate the set of existing problems!
Yeah, it’s more that it’s relevant to the kind of world we find ourselves in. But that is itself important to agency, since a given agent design will only be successful in certain kinds of worlds.
I think QM itself. It’s important somehow that the world is actually quantum mechanical, though probably not in a very direct fashion, but rather via influencing the sort of high-level properties and entities that end up “emerging” from the base laws.
[I don’t actually think this is true, but] It would be funny if rationalism turns out to not merely be a euphemism for autism but “mal”functioning oxytocin receptors and rationalists are constitutionally unable to normally feel love/social emotions; whether this would be to the discredit of love or rationalism is up to taste.
“quantum mechanics is probably important to the structure of agency/the mind in some way we don’t understand yet”.
Can’t speak for everybody, but when I read Simulators I was like “yeah obviously GPTs are generative models not agents, duh, why do people need an entire post to tell them that” and didn’t really expect the thing where agents are like “characters” in the world model to scale to high levels of capability. That seeming to happen raised the salience of the ideas a lot for me.
I’m currently thinking that it might be a good idea to publish a bunch of text that I think could help AIs make conceptual/philosophical progress quickly. Basically because it seems like (a) there could be a time period (like, uh, now) where AIs can do some reasoning on their own but don’t have good taste or autonomy in high-level research directions, so publishing incomplete stubs of ideas or promising things to look into could help them, and (b) AIs that rely more on human text to make intellectual progress are likely to be more aligned than AIs that don’t, so this should differentially help more-aligned models. There is the risk that helping AIs in this way could empower misaligned models, but I think this is outweighed by factor (b) overall. And of course the most likely outcome is that whatever I publish simply isn’t that useful, but that seems ~neutral. Does anybody have any other considerations that could be relevant here?
If you’re still interested in trading (although maybe you’re not so much, given the possibility of the impending singularity), maybe you should try Polymarket; the returns there can be pretty good for smart people even if they have a lot of money. I 5x-ed in 2.5 years starting with four figures, but other people have done much better starting with a similar amount (up to 6 or 7 figures in a similar time frame), and my impression is that 2x for people with 7 figures should be achievable.
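For concreteness, the 5x-over-2.5-years figure above works out to roughly 1.9x per year when expressed as a compound annual growth multiple. A minimal sketch of that arithmetic (the helper name is mine, not anything from the comment):

```python
def annualized_multiple(total_multiple: float, years: float) -> float:
    """Convert a total return multiple over `years` into a compound
    annual growth multiple: m_annual = m_total ** (1 / years)."""
    return total_multiple ** (1.0 / years)

# The comment's example: 5x over 2.5 years.
print(annualized_multiple(5.0, 2.5))  # ~1.90, i.e. roughly +90% per year
```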
I noticed that Opus4.6 tried to get creative and actually write a fantasy-ish story whereas GPT5.2 mostly just elaborated on your notes, making its task a bit easier. So I tried prompting GPT5.2 identically but with “write a fantasy story in the style of Terry Pratchett” prepended and got this:
GPT5.2
The Refrigerator That Went the Wrong Way
Everyone knows that time is a river¹, except in certain parts of the city where it is more of a municipal plumbing problem.
The first indication that something had gone wrong with the fridge was the soup.
Tarin Gloss, junior time-magics student, senior refrigerator tinkerer, and unpaid intern at ChronoCool Solutions (Ltd., Pending, Hopefully), had placed a bowl of soup into the prototype unit at exactly seven minutes past noon. It had been piping hot, aromatic, and optimistically labeled Lunch.
At seven minutes past twelve and thirty seconds, Tarin opened the fridge again and found the soup… older.
Not cold. Not frozen. Older. The carrots had softened with the weary resignation of vegetables that had seen too much of the world. A thin skin had formed, the culinary equivalent of a sigh.
“Well,” Tarin said, because saying well is what sensible people say when reality begins quietly rearranging itself.
This was, admittedly, not the fridge’s intended function. The marketing document—three pages long and mostly exclamation marks—had promised temporal deceleration, preservation, and a revolution in baked goods. Instead, the fridge appeared to be aging things aggressively, like a pessimistic aunt.
Still, Tarin was not alarmed. Time did funny things in this city. Sometimes it paused for paperwork. Sometimes it reversed for tax reasons. Sometimes it just… sulked.
They did what any responsible chronomancer would do: they put a mechanical timepiece inside.
The watch came out having experienced approximately twelve years.
“That’s odd,” Tarin said, writing this down under Minor Irregularities.
ChronoCool Solutions was, after all, a startup. Irregularities were expected. Encouraged, even, as long as they could be rebranded as features.
The real trouble began the next day, when Tarin opened the fridge and found a loaf of bread already inside.
It was unmistakably their bread. Same flour, same slightly lopsided shape, same faint scorch mark from the oven that ran hot on the left. They took it out, frowned at it, and—after a moment’s consideration—put a different loaf in.
They closed the door.
They opened it again.
The original loaf was back.
There are moments in life when the universe clears its throat.
Tarin sat down very carefully.
Time travel, it turned out, fit the facts uncomfortably well.
The fridge did not slow time. It reversed it, locally, internally, and only while powered. Anything inside could loop. And anything that could loop could, with sufficient enthusiasm and poor judgment, exit itself.
Which is how Tarin met themselves.
The time-clone was standing in the kitchen, wearing Tarin’s coat, holding Tarin’s wallet, and looking extremely tired.
“Don’t panic,” the clone said, immediately causing panic. “I’m you, but later. Slightly. And more experienced. Also, we lost the money the first time.”
“What money?”
“The gambling money. Never bet everything on Thunderhoof. He looks honest. He’s not.”
There was, regrettably, no fantasy stock market. But there were horse races, and fantasy horses are subject to all the usual uncertainties plus at least one curse per season. A few hours’ foreknowledge went a long way.
The loop was only several hours long, which made it manageable. It merely extended the day, like an unexpected appendix. Tarin would bet, lose, step into the fridge, return earlier, adjust strategy. The second loop was better. The third was profitable.
The fourth was educational.
Covering for oneself turned out to be harder than time travel. ChronoCool still expected Tarin to show up to meetings, answer emails, and not behave as if they had personally stepped on causality’s toes. The winnings had to be put back where they belonged. Time-Tarin #2 generously handed over extra coins.
“For the next run,” they said. “Trust me.”
Tarin stepped into the fridge.
And did not step out again.
Time-Tarin #2 straightened their coat, smiled at the empty kitchen, and went to work.
Things continued in this fashion until the city’s racing authority noticed that someone was winning far too consistently and banned Tarin for cheating, temporal or otherwise.
At which point Tarin did the sensible thing.
They told a business partner.
The partner listened carefully, nodded, and said, “Oh good. You’ve figured it out too.”
They introduced Tarin to Protagonist #3, who was, technically speaking, Tarin #4, and who had already used the failsafe device—installed discreetly in a rented dockside storage unit—to go back and place the bets that had gotten them all banned.
“I thought you did that,” Tarin said.
“Oh no,” said Protagonist #3 cheerfully. “That was me. Sorry.”
The fridge hummed softly in the background, aging bread, preserving heat, and quietly turning time into a commodity.
After all, you could take money back— if you could buy and sell gold— and if you could buy and sell time, well.
That was just good business.
Not a bad imitation of the tone, but it also seems to have gotten the bread stuff wrong, writing:
The real trouble began the next day, when Tarin opened the fridge and found a loaf of bread already inside.
It was unmistakably their bread. Same flour, same slightly lopsided shape, same faint scorch mark from the oven that ran hot on the left. They took it out, frowned at it, and—after a moment’s consideration—put a different loaf in.
They closed the door.
They opened it again.
The original loaf was back.
which seems confused.
Yes. But the OP is about contrasting people like Paul with Eliezer. Paul (I think) does indeed predict a dramatic singularity, but also that the ramp up to said singularity will be smoother and more widely distributed across society than Eliezer predicts.