Personally I want to be the superintelligence, not have it happen to me
Logan Zoellner
This is just Friendship is Optimal, but without the explicit ponies to make it clear that it is actually a dystopia.
Power is happening to you. That power has lots of weird ideas that 99% of humans on Earth would find objectionable. That power uses the word consent a lot, but what they really mean is: we will continue to have power and you will give verbal consent to us using that power.
Waiting for someone who knows neuroscience to respond with “actually...”
Given that the brain is a large distributed network in which different subagents are trying to optimize different objectives and what we know about the social planning problem, it would be shocking to me if there wasn’t something like trade going on.
In the world of the anti-singularity, the AI bureaucracy has all of the same problems modern bureaucracy has. It is confused, error-prone, has conflicting goals at various layers of management, and is dramatically less efficient than a swarm of simpler heuristic agents acting on local information and trading information via markets.
I think maybe you missed the point? I’m not saying human-level intelligence isn’t possible (obviously it is). In the world of the anti-singularity Human Intelligence isn’t “General” intelligence (because there’s no such thing) and SAI is impossible.
The Anti-Singularity
On your way to figuring out how to build controllable ASI, you will have figured out how to build unsafe ASI, because unsafe ASI is vastly easier to build than controlled ASI, and is on the same tech path.
This is only true if you are building some kind of cartoon ASI that self-replicates without regard for its creators’ intentions. If you (a human being) are trying to build ASI to achieve any purpose at all you basically have to solve AI safety along the way. This is empirically demonstrated. GPT 3.5 wasn’t vastly more intelligent than GPT 3, but it was vastly more useful because RLHF was used to aim it at goals. We see the exact same trend today. Far from paying an “alignment tax”, Anthropic is able to build the most powerful AI models because they are obsessed with the question “how do I control the AI?”
>Arguably that wasn’t the point of the book
Why did you title the book “If anyone builds it everyone dies” if the point of the book was not to convince people “If anyone builds it everyone dies”? If this really was some obscure philosophical project that has no bearing on the real question, why not give it some obscure title like “On the Electrodynamics of Moving Bodies” to clearly indicate “this isn’t meant to be persuasive or even comprehensible to 99% of human beings”?
“my prior is low,” not “the evidence isn’t convincing,”
I still don’t follow.
You wrote an entire book and it didn’t move Bentham’s priors. If that’s not a clear-cut example of “the evidence [in the book] isn’t convincing,” I don’t know what is.
In fact, if someone wrote an entire book (in which I would assume they would naturally collect the best arguments for a position) and I found no convincing evidence in it, I would actively consider that evidence against the position. Because “I haven’t done much research but the evidence looks poor” is a less definitive conclusion than “I have read the foremost expert’s book on the topic and the evidence looks bad.”
Moltbook and the AI Alignment Problem
Suppose that, in the years before telescopes, I came to you and said “the planets are other worlds, like ours, and a bunch of them have moons.”
Suppose you should believe such a theory without evidence, as opposed to one of the many equally plausible but wrong theories that were going around at the time, such as: “other planets will have different kinds of men on them” or “other planets have vegetation and life on them” or “other planets have rocky surfaces and air on them”.
And suppose that subsequently, evidence should be discovered proving that you, and you alone, were correct.
Then you will be lauded throughout the world. People will declare you a thought-leader, an influencer, a visionary of the future. Undoubtedly, wealth and fame will attract themselves to you. History books will sing your praises for centuries to come as “the man who knew other planets had moons.”
In one small, dark corner of the internet, however, you will encounter a strange group of people. These people have beliefs like “claims should be based off of evidence.” And those people will use a different word to describe you: lucky.
To criticize an idea on the grounds that the evidence for that idea isn’t conclusive is insane — that’s a problem with your body of evidence, not the ideas themselves!
What does this sentence even mean? The problem isn’t the idea, it’s that there’s not enough evidence for it… sounds like the problem is with the idea.
There are no new ideas only new datasets
Currently all LLMs are terrible at computer-use. Part of this is an ergonomics problem (GPT agent is frequently blocked from viewing websites and I still don’t trust it enough to e.g. give it my street address and credit card number). But when I give it a graphically demanding task that is 100% doable in the browser, it still falls absolutely flat on its face.
What is needed for RL to succeed is something like: an internet-scale dataset of graphically demanding tasks with objective success criteria. Sooner or later someone is going to put together a dataset like “here are all 150k games on Steam with a simple yes/no that tells us whether or not the AI beat the game.” And when that happens, I strongly suspect RL will suddenly start working.
Alternatively, companies like Figure are planning to deploy thousands of robots in the real world with more or less the same idea: create a huge training set of actual physical reality (as opposed to just text + multimedia).
Once a proper dataset is in place, I expect we will not see the slow, gradual progress indicated by the METR chart, but rather a huge all-at-once leap (on par with when we first started properly applying RL to math).
Most Americans use ChatGPT. If AI were causing psychosis (and the phenomenon weren’t just already-psychotic people using ChatGPT), it would be showing up in statistics, not anecdotes. SA concludes that the prevalence is ~1/100k people. This would make LLMs 10x safer than cars. If your concern is saving lives, you should be focusing on accelerating AI (self-driving), not worrying about AI psychosis.
tend to say things like “probably 5 to 25 years”.
Just to be clear, your position is that 25 years from now, when LLMs are trained using trillions of times as much compute and are routinely doing tasks that take humans months to years, they will still be unable to run a business worth $1B?
thank you for clarifying.
It’s easy to imagine a situation where an AI has a payoff table like:
| | defect | don’t defect |
|---|---|---|
| succeed | 100 | 10 |
| fail | X | n/a |
where we want to make X as low as possible (and commit to doing so)
For example, a paperclip-maximizing AI might be able to make 10 paperclips while cooperating with humans, or 100 by successfully defecting against humans.
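To make the role of X concrete, here is a minimal sketch of the expected-value comparison implied by the payoff table. It assumes something the comment does not state: that the AI assigns some probability `p_success` to a defection succeeding. The function names and that parameter are hypothetical, chosen only for illustration.

```python
def expected_defect_payoff(p_success, x_fail, succeed_payoff=100.0):
    """Expected payoff of defecting: succeed with probability p_success, else receive X."""
    return p_success * succeed_payoff + (1.0 - p_success) * x_fail


def max_deterring_x(p_success, succeed_payoff=100.0, cooperate_payoff=10.0):
    """Break-even value of X: for any X below this, cooperating beats defecting.

    Solves p * succeed + (1 - p) * X = cooperate for X.
    p_success is an assumed parameter, not something given in the table.
    """
    if not 0.0 <= p_success < 1.0:
        raise ValueError("p_success must be in [0, 1)")
    return (cooperate_payoff - p_success * succeed_payoff) / (1.0 - p_success)


if __name__ == "__main__":
    p = 0.5  # assumed chance that defection succeeds
    threshold = max_deterring_x(p)
    print(f"With p = {p}, defection is deterred when X < {threshold}")
    print("Expected defect payoff at X = -100:", expected_defect_payoff(p, -100.0))
```

Under these assumptions, the more likely defection is to succeed, the lower X has to be pushed before cooperating (payoff 10) becomes the better choice.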
seems to violate not only the “don’t negotiate with terrorists” rule, but even worse the “especially don’t signal in advance that you intend to negotiate with terrorists” rule.
Those all sound like fairly normal beliefs.
Like… I’m trying to figure out why the title of the post is “I am not a successionist” and not “like many other utilitarians I have a preference for people who are biologically similar to me, I have things in common with, or I am close friends with. I believe when optimizing utility in the far future we should take these things into account”
Even though I can’t comment on OP’s views, you seemed to have a strong objection to my “we’re merely talking price” statement (i.e. when calculating total utility we consider tradeoffs between different things we care about).
Edit:
To put it another way, if I wrote a post titled “I am a successionist” in which I said something like: “I want my children to have happy lives and their children to have happy lives, and I believe they can define ‘children’ in whatever way seems best to them”, how would my views actually differ from yours (or the OP’s)?
This was a fine view to have 3 years ago. But at the point where LLMs are already pushing the boundaries of mathematics, claiming they won’t scale is denying objective reality. What specific capability do you expect ASI to have that LLMs don’t already possess?