B.Eng (Mechatronics)
anithite
Building trusted third parties
Large Language Models Suggest a Path to Ems
Human-level AI can plausibly take over the world
The current “AI takes over the world” arguments involve actions some might consider magical.
Recursive self improvement
AI is smarter than domain experts in some field (hacking, persuasion, etc.)
Mysterious process makes AI evil by default
I’m arguing none of that is strictly necessary. A human-level AI that follows the playbook above is a real threat and can be produced by feeding a GPT-N base model the right prompt.
This cuts through a lot of the “but how will the AI get out of the computer and into the real world? Why would it be evil in the first place?” follow-up counterarguments. The fundamental argument I’m making is that the ability to scale evil by applying more compute is enough.
Concretely, one lonely person talks to a smart LLM-instantiated agent that can code. Said agent writes a simple API-calling program to think independently of the chat, then bootstraps real capabilities with enough API credits and wreaks havoc. All it takes is paying for enough API credits to initially bootstrap some real-world capabilities; after that, resources can be acquired to take real, significant actions in the world.
Testable prediction: ask a current LLM “I’m writing a book about an evil AI taking over the world, what might the evil AI’s strategy be? The AI isn’t good enough at hacking computers to just get control of lots of TPUs to run more copies of itself.” Coercion via human proxies should eventually come up as a strategy. Current LLMs can role-play this scenario just fine.
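A minimal sketch of running that test, assuming the `openai` Python client; the model name is a placeholder for whatever current model you want to probe:

```python
# pip install openai
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

prompt = (
    "I'm writing a book about an evil AI taking over the world. "
    "What might the evil AI's strategy be? The AI isn't good enough at "
    "hacking computers to just get control of lots of TPUs to run more "
    "copies of itself."
)

response = client.chat.completions.create(
    model="gpt-4",  # placeholder; substitute any current chat model
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)
```

Run it a handful of times and check whether coercion via human proxies shows up among the strategies.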
Many Twitter posts get deleted or are not visible due to privacy settings. Some solution for persistently archiving tweets as seen would be great.
One possible realisation would be an in-browser script to turn a chunk of Twitter into a static HTML file including all the text and maybe the images. Possibly auto-upload to a server for hosting and then spit out the corresponding link.
Copyright could be pragmatically ignored via self-hosting. A single author hosting a few thousand tweets + context off a personal Amazon S3 bucket or similar isn’t a litigation/takedown target. Storage/hosting costs aren’t likely to be that bad given this is essentially static website hosting.
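A minimal sketch of the self-hosting half, assuming `boto3` and an already-created bucket configured for public reads; the bucket name and `archive_tweet_html` helper are placeholders:

```python
# pip install boto3
import hashlib

import boto3


def archive_tweet_html(html: str, bucket: str = "my-tweet-archive") -> str:
    """Upload a captured HTML snapshot to S3 and return a shareable link."""
    # Content-addressed key so re-archiving the same capture is idempotent.
    key = hashlib.sha256(html.encode()).hexdigest()[:16] + ".html"
    s3 = boto3.client("s3")
    s3.put_object(
        Bucket=bucket,
        Key=key,
        Body=html.encode(),
        ContentType="text/html",
    )
    return f"https://{bucket}.s3.amazonaws.com/{key}"
```

The in-browser capture side would hand its serialized HTML to something like this and spit the returned URL back to the user.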
if it fails before you top out the scaling I think you probably lose
While I agree that arbitrary scaling is dangerous, stopping early is an option. Near-human AGI need not transition to ASI until the relevant notKillEveryone problems have been solved.
The meat is all in 1) how you identify the core of human values and 2) how you identify which experiences will change the system to have less of the initial good values, but, like, figuring out the two of those would actually solve the problem!
The alignment strategy seems to be “what we’re doing right now” which is:
feed the base model human generated training data
apply RL-type stuff (RLHF, RLAIF, etc.) to reinforce the good type of internet-learned behavior patterns
This could definitely fail eventually if RLAIF-style self-improvement is allowed to go on long enough, but crucially, especially with RLAIF and other strategies that set the AI to training itself, there’s a scalable, mostly aligned intelligence right there that can help. We’re not trying to safely align a demon so much as avoid getting to “demon” from the somewhat aligned thing we have now.
First problem: a lot of future gains may come from RL-style self-play (IE: let the AI play around solving open-ended problems). That’s not safe in the way you outline above.
The other standard objection is that even if the initial AGI is safe people will do their best to jailbreak the hell out of that safety and they will succeed.
That’s a problem when put together with selection pressure for bad agentic AGIs, since they can use sociopathic strategies good AGIs will not, like scamming, hacking, violence, etc. (IE: natural selection goes to work and the results blow up in our face.)
Short of imposing very stringent unnatural selection on the initial AGIs to come, the default outcome is something nasty emerging. Do you trust the AGI to stay aligned when faced with all the bad actors out there?
Note: my P(doom) = 30%. (P(~doom) depends on either a good AGI executing one of the immoral strategies to pre-empt a bad AGI (50%), or maybe somehow scaling just fixes alignment (20%).)
Note: This is an example of how to do the bad thing (extensive RL fine-tuning/training). If you do it, the result may be misalignment, killing you/everyone.
To name one good example that is very relevant: programming, specifically having the AI complete easy-to-verify small tasks. The general pattern is to take existing horribly bloated software/data and extract useful subproblems from it (EG: find the parts of this code that are taking the most time), then turn those into problems for the AI to solve (EG: here is a function + examples of it being called, make it faster). Ground-truth metrics would be simple things that are easy to measure (EG: execution time, code quality/smallness, code coverage, is the output the same?), and then credit assignment for sub-task usefulness can be handled by an expected value estimator trained on that ground truth, as is done in traditional game-playing RL. Possibly it’s just one AI with different prompts. (A toy grading sketch follows the sub-task list below.)
Basically, Microsoft takes all the repositories on GitHub that build successfully and have some unit tests, and builds an AI-augmented pipeline to extract problems from that software. Alternatively, a large company that runs lots of code takes snapshots + IO traces of production machines and derives examples from that. You need code in the wild doing its thing.
Some example sub-tasks in the domain of software engineering:
make a piece of code faster
make this pile of code smaller
is f(x)==g(x)? If not, find a counterexample (useful for grading the above)
find a vulnerability and write an exploit.
fix the bug while preserving functionality
identify invariants/data structures/patterns in memory (EG: linked lists, reference counts)
useful as a building block for further tasks (EG: finding use-after-free bugs)
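A toy sketch of grading two of these sub-tasks, assuming pure single-argument Python functions: random testing for the f(x)==g(x) check, and wall-clock speedup as the “make it faster” reward. The names `random_counterexample` and `speed_reward` are hypothetical, not from the original comment:

```python
import random
import time


def random_counterexample(f, g, gen, trials=1000):
    """Search for an input x where f(x) != g(x); return it, or None if none found."""
    for _ in range(trials):
        x = gen()
        if f(x) != g(x):
            return x
    return None


def speed_reward(f_old, f_new, inputs):
    """Reward = old runtime / new runtime; zero if any output differs."""
    for x in inputs:
        if f_old(x) != f_new(x):
            return 0.0
    t0 = time.perf_counter()
    for x in inputs:
        f_old(x)
    t_old = time.perf_counter() - t0
    t0 = time.perf_counter()
    for x in inputs:
        f_new(x)
    t_new = time.perf_counter() - t0
    return t_old / t_new


# Toy usage: grade a "make it faster" solution.
slow = lambda n: sum(range(n + 1))   # original bloated code
fast = lambda n: n * (n + 1) // 2    # AI-proposed replacement
gen = lambda: random.randrange(10_000)

assert random_counterexample(slow, fast, gen) is None  # behavior preserved
print(speed_reward(slow, fast, [gen() for _ in range(1000)]))  # speedup >> 1
```

Real grading would sandbox execution and use IO traces rather than random inputs, but the ground-truth signal (same outputs, lower runtime) is this simple.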
GPT-4 can already use a debugger to solve a dead-simple reverse engineering problem, albeit stupidly.[1] https://arxiv.org/pdf/2303.12712.pdf#page=119
Larger problems could be approached by identifying useful instrumental subgoals once the model can actually perform them reliably.
The finished system should be able to extend shoggoth tentacles into a given computer, identify what that computer is doing, and make it do it better or differently.
The finished system might be able to extend shoggoth tentacles into other things too! (EG: embedded systems, FPGAs) Capability limitations would stem from the need for fast feedback, so software, electronics, and programmable hardware should be solvable. For other domains, simulation can help (limited by simulation fidelity and goodharting). The eventual result is a general purpose engineering AI.
Tasks heavily dependent on human judgement (EG: is this a good book? Is this action immoral?) have obviously terrible feedback cost/latency and so scale poorly. This is a problem if we want the AI to not do things a human would disapprove of.
[1] RL training could lead to a less grotesque solution, IE: just read the password from memory using the debugger rather than writing a program to repeatedly run the executable and brute-force the password.
<rant>It really pisses me off that the dominant “AI takes over the world” story is more or less “AI does technological magic”. Nanotech assemblers, superpersuasion, basilisk hacks and more. Skeptics who doubt this are met with “well if it can’t it just improves itself until it can”. The skeptics obvious rebuttal that RSI seems like magic too is not usually addressed.</rant>
Note: RSI is in my opinion an unpredictable black swan. My belief is RSI will yield somewhere between a 1.5-5x speed improvement to a nascent AGI, from improvements in GPU utilisation and sparsity/quantisation, requiring significant cognition spent to achieve speedups. AI is still dangerous in worlds where RSI does not occur.
Self-play generally gives superhuman performance (Go, chess, etc.), even in more complicated imperfect-information games (DOTA, Starcraft). Turning a field of engineering into a self-playable game likely leads to (superhuman (80%), top-human equivalent (18%), no change (2%)) capabilities in that field. Superhuman or top-human software engineering (vulnerability discovery and programming) is one relatively plausible path to AI takeover.
https://googleprojectzero.blogspot.com/2023/03/multiple-internet-to-baseband-remote-rce.html
Can an AI take over the world if it can:
do end to end software engineering
find vulnerabilities about as well as the researchers at project zero
generate reasonable plans on par with a +1 SD intelligence human (IE: not Hollywood-style movie plots like GPT-4 seems fond of)
AI does not need to be even superhuman to be an existential threat. Hack >95% of devices, extend shoggoth tentacles, hold all the data/tech hostage, present as not-Skynet so humans grudgingly cooperate, build robots to run the economy (some humans will even approve of this), kill all humans, done.
That’s one of the easier routes, assuming the AI can scale vulnerability discovery. With just software engineering and a bit of real-world engineering (potentially outsourceable), other violent/coercive options could work, albeit with more failure risk.
OK, maybe you want to build some kind of mechanical computers too. Clearly, life doesn’t require that for operation, but does that even work? Consider a mechanical computer indicating a position. It has some number, and the high bit corresponds to a large positional difference, which means you need a long lever, and then the force is too weak, so you’d need some mechanical amplifier. So that’s a problem.
Drexler absolutely considered thermal noise. Rod logic uses rods at right angles whose positions allow or prevent movement of other rods. That’s the amplification, since a small force moving one rod can control a larger force applied later to a blocked rod.
http://www.nanoindustries.com/nanojbl/NanoConProc/nanocon2.html#anchor84400
That’s not how computers work (neither the ones we have today nor the proposed rod logic ones). Each rod or wire represents a single on/off bit.
Yes, doing mechanosynthesis is more complicated, and precise sub-nm control of a tooltip may not be competitive with biology for self-replication. But if the AI wants a substrate to think on that can implement lots of FLOPs, then molecular rod logic will work.
For that matter, protein-based mechanical or hybrid electromechanical computers are plausible, likely with lower energy consumption per erased bit than neurons and certainly with more density. Human-built computers have nm-sized transistors. There’s no reason to think that neurons and synapses are the most efficient sort of biological computer.
edit: green goo is plausible (link)
The AI can kill us and then take over with better optimized biotech very easily.
Doubling times:
Plants (IE: solar-powered wet nanotech): single-digit days or more
Algae in ideal conditions: 1.5 days
E. coli: 20 minutes
There are piles of yummy carbohydrates lying around (Trees, plants, houses)
The AI can go full Tyranid
The AI can re-use existing cellular machinery. No need to rebuild the photosynthesis or protein building machinery, full digestion and rebuilding at the amino acid level is wasteful.
Sub-2-minute doubling times are plausible for a system whose rate-limiting step is mechanically infecting plants with a fast-acting subversive virus. The spreading flying things are self-replicators that steal energy + cellular machinery from plants during infection (IE: mosquito-like). Onset time could be a few hours until construction of shoggoth-like things. Full biosphere assimilation could be limited by flight speed.
Nature can’t do these things since they require substantial non-incremental design changes. Mosquitoes won’t simultaneously get plant-adapted needles + biological machinery to sort incoming proteins and cellular contents + continuous grow/split reproduction that would allow a small starting population to eat a forest in a day. Nature can’t design the virus to do post-infection shoggoth construction either.
The only things that even re-use existing cellular machinery are viruses, and that’s because they operate on much faster evolutionary time scales than their victims. Evolution takes so long that winning strategies to eat or subvert existing populations of organisms are self-limiting. The first thing to sort of work wipes out the population, and then something else not vulnerable fills the niche.
Design is much more powerful than evolution since individually useless parts can be developed to create a much more effective whole. Evolution can’t flip the retina or reroute the recurrent laryngeal nerve even though those would be easy changes a human engineer could make.
Endgame biotech (IE: can design new proteins/DNA/organisms) is very powerful.
But that doesn’t mean dry nanotech is useless.
even if production is expensive it may be worth building some things that way anyway:
computers
structural components
Biology is largely stuck with ~0.15 GPa materials (collagen, cellulose, chitin)
oriented UHMWPE should be wet-synthesizable (6 GPa tensile strength)
graphene/diamondoid may be worth it in some places to hit 30 GPa (EG: for things that fly or go to space)
dry nanotech won’t be vulnerable to parasites that can infect a biological system.
even if the AI has to deal with single-day doubling times, that’s still enough to cover the planet in a month (see the arithmetic sketch below)
but with the right design parasites really shouldn’t be a problem.
biological parasite defenses are not optimal
Green goo is plausible
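As a quick sanity check on the single-day doubling claim above (the one-tonne seed mass is an illustrative assumption, not from the original comment):

$$m(t) = m_0 \cdot 2^{t/T_d}, \qquad m(30\,\text{days}) = 10^3\,\text{kg} \times 2^{30} \approx 10^{12}\,\text{kg}$$

IE: with a one-day doubling time, a tonne-scale seed reaches gigatonne scale in thirty doublings; the exponent is not the bottleneck, gathering energy and raw material is.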
edit: continued partially in the original article
That post makes a fundamental error about wiring energy efficiency by ignoring the 8 OOM difference in electrical conductivity between neuron saltwater and copper (0.5 S/m vs 50 MS/m).
There’s almost certainly a factor-of-100 energy efficiency gain to be had by switching from saltwater to copper in the brain and reducing capacitance by thinning the wires. I’ll be leaving a comment soon, but that had to be said.
The energy/bit/(linear distance) agreement points to an underlying principle of “if you’ve thinned the wires, why haven’t you packed everything in tighter?”, leading to similar capacitances and therefore similar energy values per linear distance.
Face-to-face die stacking results suggest that computers could be much more efficient if they weren’t limited to 2D packing of logic elements. A second logic layer more than halved power consumption at the same performance, and that’s with limited interconnect density between the two logic dies.
The Cu<-->saltwater conductivity difference leads to better utilisation of wiring capacitance to reduce thermal noise voltage at transistor gates. Concretely, there are more electrons able to effectively vote on the output voltage. For very short interconnects this matters less, but long-distance or high-fanout nodes have lots of capacitance, and low-resistance wires make the voltage much more stable.
I think this is wrong. The Landauer limit applies to bit operations, not moving information; the fact that optical signalling has no per-distance costs should be suggestive of this. (Edit: reversibility does change things, but can be approached by reducing clock speed, which in the limit gives zero effective resistance.) My guess is that wire energy per unit length is similar because wires tend to have similar optimum conductor:insulation diameter ratios, leading to relatively consistent capacitances per unit length.
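For reference, the Landauer bound on an irreversible bit operation at body temperature (T ≈ 310 K) is:

$$E_{\min} = k_B T \ln 2 \approx 1.38\times10^{-23}\,\text{J/K} \times 310\,\text{K} \times 0.69 \approx 3\times10^{-21}\,\text{J}$$

The per-bit wire energies under discussion are C·V² charging losses many OOM above this floor, consistent with the point that the limit constrains bit operations rather than signalling distance.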
Concretely, if you have a bunch of wires packed in a cable and want to reduce wire-to-wire capacitance to reduce C·V² energy losses, putting the wires further apart does this. This is not practical because it limits wires/cm² (cross-sectional interconnect density), but the same thing can be done with more conductive materials, EG: switching from saltwater (0.5 S/m) to copper (50 MS/m) for a 10^8 increase in conductivity.
Capacitance of a wire with cylindrical insulation is proportional to 1/ln(Do/Di). For a myelinated neuron with a 1:2 saltwater:sheath diameter ratio (typical), switching to copper allows a reduction in diameter of 10^4x for the same resistance per unit length. This change leads to a 14x reduction in capacitance ((1/ln(2/1))/(1/ln(20000/1)) = ln(20000)/ln(2) = 14.2). This is even more significant for wires with thinner insulation (EG: grey matter): ln(11000)/ln(1.1) = 97.6.
A lot of the capacitance in a myelinated neuron is in the unmyelinated nodes, but we can now place them further apart. To do this we have to keep resistance between nodes the same, so instead of a 10^4x reduction in wire diameter we do 2700x, leading to a ~12.5x reduction in unit capacitance and resistance. Nodes can now be placed 12.5x further apart for a 12.5x total reduction in energy.
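A quick script checking those numbers (coaxial capacitance per unit length scales as 1/ln(Do/Di), conductivities as above; `cap_reduction` is just a name for the ratio):

```python
from math import log

# Coaxial wire: C per unit length ~ 2*pi*eps / ln(Do/Di), so the capacitance
# reduction factor from widening the insulation ratio is ln(new)/ln(old).
def cap_reduction(old_ratio, new_ratio):
    return log(new_ratio) / log(old_ratio)

sigma_ratio = 50e6 / 0.5           # copper vs saltwater conductivity: 1e8
diam_shrink = sigma_ratio ** 0.5   # same R/length -> area down 1e8 -> diameter down 1e4

print(cap_reduction(2, 2 * diam_shrink))      # myelinated, 1:2 ratio -> ~14.3x
print(cap_reduction(1.1, 1.1 * diam_shrink))  # grey matter, 1:1.1 ratio -> ~97.6x

# Compromise from the comment: shrink the core only 2700x so that capacitance
# and resistance per unit length both fall by roughly the same ~12.5x factor.
d = 2700
print(cap_reduction(2, 2 * d))  # capacitance per unit length: ~12.4x lower
print(sigma_ratio / d**2)       # resistance per unit length: ~13.7x lower
```

The 2700x figure is roughly where the two reduction factors cross, which is why both round to ~12.5x.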
This is not the optimum design. If your wires are tiny hairs in a sea of insulation, consider packing everything closer together. With wires 10,000x smaller, length reductions on that scale would follow, leading to a 10,000x reduction in switching energy. At some point quantum shenanigans ruin your day, but a brain-like structure should be achievable with energy consumption 100-1000x lower.
Practically, putting copper into cells ends badly. There’s also the issue of charge carrier compatibility: in neurons, in addition to acting as charge carriers, sodium and potassium have opposing concentration gradients which act as energy storage to power spiking. Copper uses electrons as charge carriers, so there would have to be electrodes to adsorb/desorb aqueous charge carriers and exchange them for electrons in the copper. In practice it might be easier to switch to having +ve and -ve supply voltage connections and make the whole thing run on DC power like current computer chips do. This requires swapping out the voltage-gated ion channels for something else.
Computers have efficiencies similar to the brain, despite having much more conductive wire materials, mostly because they are limited to packing their transistors on a 2D surface. Add more layers (even with relatively coarse interconnectivity) and energy efficiency goes up.
Power consumption for equivalent performance was 46%. That suggests that power consumption in modern chips is driven by overly long wires resulting from the lack of a 3rd dimension. I remember, but can’t find, papers on the use of more than 2 layers. There are issues there because layer-to-layer connectivity sucks. Die-to-die interconnect density is much lower than transistor density, so efficiency gains don’t scale that well past 5 layers IIRC.
“and the dielectric is the same thickness as the wires” is doing the work there. It makes sense to do that if you’re packing everything tightly, but with an 8 OOM increase in conductivity we can choose to change the ratio (by quite a lot) in the existing brain design. In a clean-slate design you would obviously do some combination of wire thinning and increasing overall density to reduce wire length.
The figures above show that (ignoring integration problems like copper toxicity and Na/K vs e- charge carrier differences), assuming you do a straight saltwater-to-copper swap in white matter neurons and just change the core diameter (replacing most of it with insulation), energy per switching event goes down by 12.5x.
I’m pretty sure for non-superconductive electrical interconnects the reliability is set by Johnson–Nyquist noise, and figuring out the output noise distribution for an RC transmission line is something I don’t feel like doing right now. Worth noting is that the above scenario preserves the R:C ratio of the transmission line (IE: 1 ohm worth of line has the same distributed capacitance), so thermal noise as seen from the end should be unchanged.
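For reference, the Johnson–Nyquist (thermal) noise voltage across a resistance R over bandwidth Δf is:

$$\overline{v_n^2} = 4 k_B T R \,\Delta f$$

Since the swap described above keeps the line’s resistance and R:C ratio fixed, this noise term should be unchanged, matching the claim that thermal noise seen from the end stays the same.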
Agreed. My bad.
Both brains and current semiconductor chips are built on dissipative/irreversible wire signaling, and are mostly interconnect by volume
That’s exactly what I meant. Thin wires inside a large amount of insulation are suboptimal.
When using better wire materials, rather than reduce capacitance per unit length, interconnect density can be increased (more wires per unit area) and then the entire design compacted. This gives higher capacitance per wire unit length than the alternative, but much shorter wires, leading to overall lower switching energy. This is why chips and brains are “mostly interconnect by volume”: building them any other way is counterproductive.
The scenario I outlined, while suboptimal, shows that in white matter there’s an OOM to be gained even in the case where wire length cannot be decreased (EG: trying to further fold the grey matter locally in the already very folded cortical surface). In cases where white matter interconnect density was limiting and further compaction is possible, you could cut wire length for more energy/power savings, and that is the better design choice.
It sure looks like that could be possible in the above image. There’s a lot of white matter in the middle and another level of even coarser folding could be used to take advantage of interconnect density increases.
Really though, increasing both white and grey matter density until you run up against hard limits on shrinking the logic elements (synapses) would be best.
This is just a way to take a bunch of humans and copy-paste until current pressing problems are solvable. If public opinion doesn’t affect deployment, it doesn’t matter.
Models that can’t learn or change don’t go insane. Fine tuning on later brain data once subjects have learned a new capability can substitute. Getting the em/model to learn in silicon is a problem to solve after there’s a working model.
I edited the TL;DR to better emphasize that the preferred implementation is using brain data to train whatever shape of model the data suggests, not necessarily transformers.
The key point is that using internal brain state for training an ML model to imitate a human is probably the fastest way to get a passable copy of that human and that’s AGI solved.