If you were to impose a general AI pause, you would also pause the development of AlphaFold successors. There are problems, like developing highly targeted cancer vaccines based on AI models, that are harder than what AlphaFold can currently solve but easier than simulating a whole organ system.
At this point it makes sense to focus biological emulation on the problems that are valuable for medical purposes, since you can get a lot of capital deployed if you provide medical value.
In general, it’s not clear why a superintelligence that comes from “Scanless Whole Brain Emulation” would help with AI safety. Any entity you create this way can copy itself and self-modify to change its cognition in a way that makes it substantially different from humans.
Thank you, these are very thoughtful points and concerns.
You’re right that a general pause on everything AI wouldn’t be wise. My view is that most (but not all) people talking about an AI pause refer only to pausing general-purpose LLMs above a certain level of capability, e.g. o1 or o3. I should have clarified what I meant by “AI pause.”
I agree that companies that want to be profitable should focus on medical products rather than such a moonshot. The idea I wrote here is definitely not an investor pitch; it’s more of an idea for discussion, similar to the FHI’s discussion of Whole Brain Emulation.
AI safety implications
Yes, building any superintelligence is inherently dangerous. But not all superintelligences are equally dangerous!
No self-modifications
In the beginning, the simulated humans should not do any self-modification at all, and should just work like a bunch of normal human researchers (e.g. on AI alignment, or on aligning the smarter versions of themselves). The benefit is that the smartest researchers can be cloned many times, and they might think many times faster.
Gradual self-modifications
The simulated humans can modify a single volunteer to become slightly smarter, while other individuals monitor her. The single modified volunteer might describe her ideal world in detail, and may be subject to a lie detector which actually works.
Why modified humans are still safer than LLMs
The main source of danger is not a superintelligence which kills or harms people out of “hatred” or “disgust” or any human-like emotion. Instead, the main source of extinction is a superintelligence which assigns absolutely zero weight to everything humans cherish, and converts the entire universe into paperclips or whatever its goal is. It does not even spare a tiny fraction for humans to live in.
The only reason to expect LLMs to be safe in any way is that they model human thinking, and are somewhat human-like. But they are obviously far less human-like than actual simulated humans.
A group of simulated humans who modify themselves to become smarter can definitely screw up at some stage, and end up as a bad superintelligence which assigns exactly zero weight to everything humans cherish. The path may be long and treacherous, and success is by no means guaranteed.
However, it is still relatively much more hopeful than having a bunch of o3-like AIs, which undergo progressively more and more reinforcement learning towards rewards such as “solve this programming challenge” or “prove this math statement,” until there is so much reinforcement learning that their thoughts no longer resemble the pretrained LLMs they started off as (which were at least trying to model human thinking).
“You can tell the RL is done properly when the models cease to speak English in their chain of thought”
-Andrej Karpathy
Since pretrained LLMs can never truly exceed human-level intelligence, the only way for reinforcement learning to create an AI with far higher intelligence may be to steer it far from how humans think.
These strange beings are optimized to solve these problems, and occasionally optimized to give lip service to human values. But what their resulting end goals will be is deeply uncertain, and could easily be very far from human values.
Reinforcement learning (e.g. for solving math problems or giving lip service to human values) only controls their behavior and thoughts, not their goals. Their goals, for all we know, are essentially random lotteries, with a mere tendency to resemble human goals (since they started off as LLMs). That tendency gets weaker and weaker as more reinforcement learning is done, and you can only blindly guess whether it will remain strong enough to save us.
I agree that companies that want to be profitable should focus on medical products rather than such a moonshot. The idea I wrote here is definitely not an investor pitch; it’s more of an idea for discussion, similar to the FHI’s discussion of Whole Brain Emulation.
In the beginning, the simulated humans should not do any self-modification at all, and should just work like a bunch of normal human researchers (e.g. on AI alignment, or on aligning the smarter versions of themselves). The benefit is that the smartest researchers can be cloned many times, and they might think many times faster.
That’s like the argument of the people who advocated AI boxing. Connecting AIs to the internet is so economically valuable that it’s done automatically.
The main source of danger is not a superintelligence which kills or harms people out of “hatred” or “disgust” or any human-like emotion. Instead, the main source of extinction is a superintelligence which assigns absolutely zero weight to everything humans cherish
Humans consider avoiding death to have a pretty high weight. Entities that spin up and kill copies on a regular basis are likely to evolve quite different norms about the value of life than humans have. A lot of what humans value comes out of how we interact with the world in an embodied way.
I completely agree with solving actual problems instead of only working on Scanless Whole Brain Emulation :). I also agree that just working on science and seeing what comes up is valuable.
Both simulated humans and other paths to superintelligence will be subject to AI race pressures. My point is that, given the same level of race pressure, simulated humans are safer. Current AI labs are willing to wait months before releasing their AI; the question is whether that is enough.
I didn’t think of that; that is a very good point! They should avoid killing copies, and maybe save them to be revived in the future. I highly suspect that compute is more of a bottleneck than storage space. (You can store the largest AI models on a typical computer hard drive, but you won’t have enough compute to run them.)
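As a rough illustration of that storage-versus-compute gap, here is a back-of-envelope sketch. The parameter count, CPU throughput, and memory bandwidth figures below are assumed round numbers for a frontier-scale open-weight model on an ordinary desktop, not anything taken from the discussion above:

```python
# Back-of-envelope check of the claim above: the weights of a frontier-scale
# model fit on an ordinary hard drive, but a typical desktop cannot run the
# model at a useful speed. All numbers are rough assumptions.

params = 405e9          # assumed parameter count (roughly the scale of the largest open models)
bytes_per_param = 2     # 16-bit weights

weight_bytes = params * bytes_per_param
print(f"Weight storage: {weight_bytes / 1e12:.2f} TB")  # ~0.81 TB, fits on one drive

# Generating one token touches every weight and costs roughly 2 FLOPs per parameter.
flops_per_token = 2 * params

cpu_flops_per_s = 2e11      # assumed desktop CPU throughput (~200 GFLOP/s)
dram_bw_bytes_per_s = 5e10  # assumed DRAM bandwidth (~50 GB/s)

compute_bound_s = flops_per_token / cpu_flops_per_s   # time if limited by arithmetic
memory_bound_s = weight_bytes / dram_bw_bytes_per_s   # time if limited by reading weights

print(f"Seconds per token (compute bound): {compute_bound_s:.1f}")
print(f"Seconds per token (memory bound):  {memory_bound_s:.1f}")
# Either way it comes out to seconds per generated token: the storage is cheap,
# but the compute (and memory bandwidth) to actually run the model is not.
```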
You might also want to read “Truthseeking is the ground in which other principles grow.” Solving actual problems on the way to building up capabilities is a way that keeps everyone honest.