Digital humans vs merge with AI? Same or different?

mishka

So, to create context, this is a continuation of our remarks in the comments to Zvi’s “AI #40: A Vision from Vitalik”.

mishka

There I was asking about

  1. the boundary between humans merging with AI and digital humans (can these approaches be reliably differentiated from each other? or is there a large overlap?)

  2. why digital humans would be a safer alternative than the merge

mishka

A good starting point might be to ask you to elaborate on the first item: what are the differences, is there a significant overlap, is one concept almost a subset of the other?

How do you see the relationship between the notion of a digital human and a “hybrid” (if that’s what the merge is)?

Nathan Helm-Burger

So the way I’m thinking about this is that “digital human” is a quite narrow and precise concept: an emulation of a human brain (not necessarily based on a particular human; that even more specific concept is an Upload).

This brain emulation is constrained to act and to be modified only via a highly accurate ruleset based on detailed observations of human neurons. It cannot, for instance, use external processes to monitor or modulate the activity of its simulated neurons. It can’t add more neurons other than by the same processes and to the same degree that a normal biological human brain can.

Nathan Helm-Burger

Merging with AI, on the other hand, is a much more open concept. Lots of things fall under this heading. Some central examples might include:

  • A brain-computer implant that allows an AI system to have read and/​or write access to the human brain.

  • An AI system that non-invasively controls much of the sensory input a set of humans receive, feeding them information and taking instructions from them in a high bandwidth way.

  • A digital human which has high bandwidth communication with an AI, either by normal emulated sensory methods or by the AI being able to directly read and/​or write neuronal activity.

Nathan Helm-Burger

The key part here is the AI that is merging with the humans. It is not necessarily dangerous or overwhelming, but it might be.

And the trouble is that the easiest way to scale the “merge”-type systems up in power is to expand the power and influence of the AI.

The digital human, by being restricted to a biological ruleset, is limited in the ways it can scale in power. It could still have significant superpowers relative to biological humans: not being subject to aging or biological diseases, traveling at the speed of internet data transmission, cloning itself very cheaply and quickly, cheaply saving backups which double as checkpoints, experiencing virtual worlds with full sensory fidelity, and running at much higher clock speeds than biological brains.

Nathan Helm-Burger

So what danger do I foresee in systems closely connected with AI, or in systems that start as digital humans but don’t stick to biological rules?

The trouble is that I see the more permissive digital entity landscape as having multiple different slippery slopes towards very bad outcomes.

A classic example, discussed at length elsewhere, is that a digital human allowed to arbitrarily self-modify could fall into wireheading without realizing how sticky the situation would be, effectively ending their interaction with the world.

Another classic example is a digital human in a competitive economy who feels pressured to modify themselves to be more motivated to work. This could lead to a murder-Gandhi-style slippery slope where each new version of the digital person cares relatively less about non-work things and thus chooses to further increase work-desire and decrease caring about anything else. Down this slope lies a loss of humanity.

mishka

Yes, I certainly do agree with all this. The main crux is

> This brain emulation is constrained to act and to be modified only via a highly accurate ruleset based on detailed observations of human neurons. It cannot, for instance, use external processes to monitor or modulate the activity of its simulated neurons. It can’t add more neurons other than by the same processes and to the same degree that a normal biological human brain can.

Basically, a digital human can certainly do everything a biological human can, including equivalents of enhancement with nootropics, cognitive strategies, psychedelic brainstorms, and so on.

But this digital human will certainly be aware that breaking the rules and hacking on its architecture directly could bring enhancements many orders of magnitude beyond even that.

So we would need to rely on a combination of digital humans promising not to go down that route and honorably keeping that commitment despite huge temptations, and of technical means making it particularly difficult to break this constraint.

Nathan Helm-Burger

Yes, and I’m not sure we will succeed at restraining digital humans from biological-rule-breaking, but I think it’s worth trying. Just as I think it’s worth trying to keep AI sufficiently tool-like and under control that it doesn’t have the opportunity to take over, and that we don’t accidentally or intentionally design an intelligent digital entity with feelings and moral value, which would then force us to decide between giving it rights (which would make it perilous to humanity) or keeping it oppressed and enslaved.

Nathan Helm-Burger

And both of those cautious paths seem like they come with a safety tax, which will require regulation and enforcement to keep people from skipping out on it.

Nathan Helm-Burger

Whereas, I see the proposal to ‘let humans and AI merge’ as a vague proposal for a hands-off, uncontrolled race to power. Letting humans hook themselves up with AI in any way, with no regulation, seems like asking for a lot of experiments, some of which seem likely to work in terms of the resulting collaboration gaining power, and yet not work in terms of it maintaining or upholding human values.

mishka

Yes, even if one is trying to be very careful and only uses non-invasive BCI between humans and digital systems (which is the form of merge I typically consider), the safety issues are formidable, both in terms of the immediate safety of participating humans and, even more importantly, in terms of what kinds of hybrid entities might result. Even if we insist on the ability to decouple, to disconnect, take a long pause, and reconnect later, and to keep repeating this “disconnect, take a break, reconnect” cycle (which is a reasonable thing to insist upon), the uncertainty is still high...


On the other side of the scale, unlike invasive approaches such as Neuralink, progress with non-invasive BCI can be rather rapid, and might be competitive with the pure AI approach in terms of timelines...

Whereas progress in terms of “pure brain emulation” might be too far in the future, unless one successfully invents a way to “accelerate it faster than our ability to actually map the brain”...

Nathan Helm-Burger

Yes, the slowness of the path to a biological-rule-constrained, accurate-brain-emulation-based digital human is part of what gives me more confidence that that path is safer. We have more time to consider ways to regulate digital humans and to work on safety constraints.

But also for this reason, I don’t think that digital humans are going to be helpful in getting us through the tricky near-term period when AI becomes sufficiently powerful to play a part in catastrophe. These catastrophes might come from human misuse or from AI-directed action.

On the other hand, I can see how someone might say, “Allowing for a near-term janky attempt at human-AI merger could give us a powerful AI-human team to address the safety issues of insufficiently regulated AI! It’s much easier than digital humans! It could be quite powerful, but yet would have a ‘human’ element.”

My response to this take is, “I would not trust that having some non-zero human element would be sufficient to make the system safe, even if the human going into the experiment seemed trustworthy to begin with. Human good behavior seems to be a fragile thing, and this would absolutely be an out-of-distribution trial by fire.”

Nathan Helm-Burger

So if someone is interested in pursuing the longer term goals of digital humans and human uploads (e.g. from cryopreserved brains), I am not against that. That doesn’t seem like it’s making risks to humanity worse in the short term, and it seems like we have enough time before those techs are ready to figure out the regulatory and safety issues.

However, if a funder were deciding between allocating resources to a digital-humans-and-Uploads path or to something directly tackling AI safety… I would urge them to contribute to AI safety, since I think that issue is both more urgent and more tractable within the relevant timeframe.

Nathan Helm-Burger

Yet another direction is Intelligence Augmentation. Here, I believe there is a lot of capacity for biological human intelligence improvement, but most of the high-impact non-AI-merge options are things which will take a lot more research to be ready. For instance, the topic I studied when I was in neuroscience: genetic modification of consenting adults for radical intelligence enhancement. I think that’ll perhaps take even longer than digital humans, and almost certainly isn’t relevant to the next 10 years. And I absolutely think the scientific community should spend whatever brain power it can on helping humanity survive the next 10 years, because I think we are in quite a lot of danger from AI-enabled catastrophe.

Nathan Helm-Burger

And that goes for non-computer-scientists as well. Biologists, for instance, can help by improving society’s ability to detect, prevent, and halt bioengineered pandemics. AI makes bioengineered pandemics easier and more likely, so by helping protect society from them you are helping reduce AI-catastrophe risk.

Nathan Helm-Burger

In the case of a human-AI high-bandwidth team, such as with non-invasive BCI, I would argue that there is potentially useful assistance which could be safely gained from such a team. The caveat, however, is that the system should be treated with a great deal of mistrust and held to high safety standards. The scientific findings of the human-AI team should only be trusted if they can be fully verified by non-AI-enhanced humans.

Nathan Helm-Burger

So the same sorts of regulation that I think need to apply to AI should also be applied to a human-AI team.

  • Keep it confined; don’t let it spread /​ replicate. Keep it sandboxed (potentially with a cached copy of the internet that gets updated; sandboxing can still allow for information flow inwards).

  • Don’t let it acquire power and resources of its own.

  • Don’t trust its outputs without verification.

  • Don’t let it self-modify or build novel AI systems. (Just because you have Generation 1 under control, doesn’t mean Generation 1 can’t build a sufficiently powerful Generation 2 to break out from under your control.)

  • Don’t let it fall into the hands of bad actors. (The AI part of the team can be modified, for example by finetuning. The human part of the team can be persuaded to cooperate with immoral aims by, for example, brainwashing or torture.)

mishka

Yes, this makes sense.

> In the case of a human-AI high-bandwidth team, such as with non-invasive BCI, I would argue that there is potentially useful assistance which could be safely gained from such.

Yes, in fact, the main use case is scientific research, and, in particular, AI safety research, which has to be done by human-AI teams to a large extent in order to be at all feasible.

But a good deal of caution is needed, and, in particular

> The caveat however, is that the system should be treated with a great deal of mistrust and held to high safety standards. The scientific findings of the human-AI team should only be trusted if they can be fully verified by non-AI-enhanced humans.

is very much applicable (with a caveat that what counts as “fully verified” depends on procedures computer science might be able to devise in order to make it feasible… e.g. I want to gesture in the direction of “zero-knowledge proofs”, not in the sense of them being literally applicable, but in the sense that some solutions approximately in this spirit might be found).

Nathan Helm-Burger

Good point about “fully verified”, I should’ve said something more like, “verify probabilistically to a degree of confidence commensurate with the risk of implementing the suggested innovation.” Since we’re basically sitting on a ticking time bomb with a hidden timer, we can’t really afford to be maximally cautious in our attempts to solve the problem.
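As a toy illustration of “confidence commensurate with the risk” (the expected-value framing and the numbers here are my own illustrative assumptions, not anything the participants specified): if implementing a flawed innovation costs C and a correct one gains B, a simple expected-value rule says to act only when the verified probability of correctness p satisfies p·B > (1−p)·C, i.e. p > C/(C+B).

```python
def required_confidence(cost_if_wrong: float, benefit_if_right: float) -> float:
    """Minimum probability of correctness needed before implementing a
    suggested innovation, under a simple expected-value rule:
    p * benefit > (1 - p) * cost  =>  p > cost / (cost + benefit)."""
    return cost_if_wrong / (cost_if_wrong + benefit_if_right)

# A modest innovation: failure costs about as much as success gains,
# so anything better than a coin flip clears the bar.
print(required_confidence(1.0, 1.0))   # 0.5

# A high-stakes innovation: failure is 99x worse than success is good,
# so verification must reach 99% confidence before acting.
print(required_confidence(99.0, 1.0))  # 0.99
```

On this toy model, maximal caution (demanding p ≈ 1 for everything) is only warranted when the downside dwarfs the upside; for lower-stakes suggestions, probabilistic verification short of full proof can be rational, which matches the “ticking time bomb” point about not being able to afford maximal caution everywhere.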