Donald Hobson
MMath Cambridge. Currently studying postgrad at Edinburgh.
Also, there are big problems with the idea of patents in general.
If Alice and Bob each invent and patent something, and you need both ideas to be a useful product, then if Alice and Bob can’t cooperate, nothing gets made. This becomes worse the more ideas are involved.
It’s quite possible for a single person to patent something, and to not have the resources to make it (at least not at scale) themselves, but also not trust anyone else with the idea.
Patents (and copyright) ban a lot of productive innovation in the name of producing incentives to innovate.
Arguably the situation where innovators have incentive to keep their idea secret and profit off that is worse. But the incentives here are still bad.
How about
When something is obviously important with hindsight, pay out the inventors. (An innovation-prize-type structure: say, look at all the companies doing X and split some fraction of their tax revenue between the inventors of X.) This is done by tracing backwards from the widely used product, not tracing forwards from the first inventor. If you invent something but write it up in obscure language and it gets generally ignored, and someone else reinvents and spreads the idea, that someone gets most of the credit.
Let inventors sell shares of the form “1% of any prize I receive for this invention”.
Do x-rays only interact with close-in electrons?
I would expect there to be some subtle effect where the x-ray happened to hit an outer electron and knock it in a particular way.
For that matter, x-ray diffraction can tell you all sorts of things about crystal structure. I think you can detect a lot, with enough control of the x-rays going in and out.
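(For reference, my addition rather than part of the original comment: the standard relation behind the diffraction claim is Bragg’s condition, which ties the spacing of crystal planes to the angles at which reflected x-rays interfere constructively.)

$$n\lambda = 2d\sin\theta$$

Measuring the angles $\theta$ at a known wavelength $\lambda$ gives the plane spacings $d$, which is most of what “telling you about crystal structure” amounts to.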
make the AI produce the AI safety ideas which not only solve alignment, but also yield some aspect of capabilities growth along an axis that the big players care about, and in a way where the capabilities are not easily separable from the alignment.
So firstly, in this world capability is bottlenecked by chips. There isn’t a runaway process of software improvement happening yet. And this means there probably aren’t large, easy software capability improvements lying around.
Now “making capability improvements that are actively tied to alignment somehow” sounds harder than making any capability improvement at all. And you don’t have as much compute as the big players. So you probably don’t find much.
What kind of AI research would make it hard to create a misaligned AI anyway?
A new more efficient matrix multiplication algorithm that only works when it’s part of a CEV maximizing AI?
The big players do care about having instruction-following AIs,
Likely somewhat true.
and if the way to do that is to use the AI safety book, they will use it.
Perhaps. Don’t underestimate sheer incompetence. Someone pressing the run button to test that the code works so far, when they haven’t programmed the alignment bit yet. Someone copying and pasting in an alignment function but forgetting to actually call the function anywhere. A misspelled variable name that happens to be the name of another variable. Nothing is idiot-proof.
I mean, presumably alignment is fairly complicated, and it could all go badly wrong because of the equivalent of one malfunctioning O-ring. Or what if someone finds a much more efficient approach that’s harder to align?
Possible alternatives.
AI can make papers as good as the average scientist, but wow is it slow. Total AI paper output is less than total average scientist output, even with all available compute thrown at it.
AI can write papers as good as the average scientist. But a lot of progress is driven by the most insightful 1% of scientists. So we get ever more mediocre incremental papers without any revolutionary new paradigms.
AI can make papers as good as the average scientist. For AI safety reasons, this AI is kept rather locked down and not run much. Any results are not trusted in the slightest.
AI can make papers as good as the average scientist. Most of the peer review and journal process is also AI automated. This leads to a Goodharting loop. All the big players are trying to get papers “published” by the million. Almost none of these papers will ever be read by a human. There may be good AI safety ideas somewhere in that giant pile of research. But good luck finding them in the massive piles of superficially plausible rubbish. If making a good paper becomes 100x easier, but making a rubbish paper becomes a million times easier, and telling the difference becomes 2x easier, the whole system gets buried in mountains of junk papers.
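Putting rough numbers on that last sentence (purely illustrative, using only the factors quoted above): if good output scales up 100x, junk scales up 10^6x, and the filter improves 2x, then

$$\frac{\text{good}}{\text{junk}}\bigg|_{\text{after}} \approx \frac{100}{10^{6}}\cdot\frac{\text{good}}{\text{junk}}\bigg|_{\text{before}} = 10^{-4}\cdot\frac{\text{good}}{\text{junk}}\bigg|_{\text{before}},$$

and the 2x better filter only claws back a factor of two, leaving the reader roughly 5,000x worse off per paper they have to sift.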
AIs can do and have done AI safety research. There are now some rather long and technical books that present all the answers. Capabilities is now a question of scaling up chip production (which has slow engineering bottlenecks). We aren’t safe yet. When someone has enough chips, will they use that AI safety book or ignore it? What goal will they align their AI to?
There are probably highly effective anti-cancer methods which have a modest performance overhead.
The world contains a huge number of cameras, and a lot of credulous people.
If you search for any weird blip you can’t explain, you find a lot of them.
The “UFO” videos all have different sizes and characteristics.
If you think most of the videos have a non-aliens explanation, the number of videos offers almost no evidence.
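One way to make that precise (my framing, not the original comment’s): if each individual video is about as likely to exist in a world without aliens as in a world with them, the combined likelihood ratio over N videos is

$$\prod_{i=1}^{N}\frac{P(\text{video}_i\mid\text{aliens})}{P(\text{video}_i\mid\text{no aliens})}\approx 1^{N}=1,$$

so piling up more such videos barely moves the posterior.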
A mole of flops.
That’s an interesting unit.
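For a sense of scale (my arithmetic, assuming an exaflop-class machine): a mole of flops is Avogadro’s number of floating-point operations, so

$$\frac{6.0\times10^{23}\ \text{FLOP}}{10^{18}\ \text{FLOP/s}}\approx 6\times10^{5}\ \text{s}\approx 7\ \text{days of compute}.$$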
Physics myths vs. reality.
Myth: Ball bearings are perfect spheres.
Reality: The ball bearings have slight lumps and imperfections due to manufacturing processes.
Myth: Gravity pulls things straight down at 9.8 m/s/s.
Reality: Gravitational force varies depending on local geology.
You can do this for any topic. Everything is approximations. The only question is whether they are good approximations.
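To give one concrete sense of how good the gravity approximation is (standard figures, my addition): measured surface gravity ranges roughly over

$$g \approx 9.78\ \text{m/s}^2\ (\text{equator})\ \text{to}\ 9.83\ \text{m/s}^2\ (\text{poles}),$$

a spread of about half a percent, with local geology contributing much smaller anomalies on top of that.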
I’m not sure what that’s supposed to mean.
Why lift dirt when you can push it sideways?
I suppose the particle size of condensed rock could theoretically be smaller than RAB particles and thus require a lower pressure drop, but that’s not necessarily the case.
Particle size seems like an important factor here. You don’t know it and I don’t know it either.
But presumably that’s one of the factors that QUAZE will be working on. Possibly something fancy like injecting a small electrostatic charge so the hot rock specks repel each other.
When rock gets hot and pressures get high, a hole will slowly close as rock flows inward.
When rock gets very hot in a microwave beam, it expands. Pressure could get very high. Will it quickly flow outwards, meaning that removing the dirt isn’t needed?
With a single hole, thermal conductivity is a limiting factor. The rock around the hole cools down before much power is produced.
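A back-of-envelope version of that claim (my numbers, not the article’s; every value below is an assumption), using steady radial conduction into a single hole:

```python
# Rough sketch: steady radial heat conduction into one borehole, to show why
# rock conductivity limits the power a single hole can deliver.
# All values are assumptions for illustration.
import math

k = 3.0              # W/(m*K)  assumed thermal conductivity of granite
dT = 200.0           # K        assumed temperature drop from far rock to the hole wall
r_hole = 0.1         # m        assumed hole radius
r_far = 100.0        # m        assumed radius at which rock is still at far-field temperature
hot_length = 1000.0  # m        assumed length of the hot section of the hole

# Conduction per metre of hole through a cylindrical shell of rock
q_per_m = 2 * math.pi * k * dT / math.log(r_far / r_hole)   # ~550 W/m
total_mw = q_per_m * hot_length / 1e6                       # ~0.5 MW thermal
print(f"{q_per_m:.0f} W per metre of hole, {total_mw:.2f} MW thermal in total")
```

Half a megawatt of heat from a kilometre of hot hole is tiny next to a power station, which is the sense in which a single conduction-limited hole doesn’t produce much.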
Are you assuming that QUAZE, with their microwave tech, will use a less efficient form of 1 hole geothermal when you know that most geothermal plants use something more efficient?
In general, this is the sort of article that can be written, whether a tech is feasible or not.
You name engineering problems, and don’t discuss whether they are show stoppers or manageable. You guess at a way things might be done, do calculations, and declare it impractical. (The calculations are for a naive / stupidly designed version of the tech that is indeed impractical.)
If AI labs are slamming on the recursive self-improvement ASAP, it may be that Autonomous Replicating Agents are irrelevant. But that’s an “ARA can’t destroy the world if AI labs do it first” argument.
ARA may well have more compute than AI labs. Especially if the AI labs are trying to stay within the law, and the ARA is stealing any money/compute that it can hack its way into. (Which could be >90% of the internet if it’s good at hacking.)
there will be millions of other (potentially misaligned) models being deployed deliberately by humans, including on very sensitive tasks (like recursive self-improvement).
Ok. That’s a world model in which humans are being INCREDIBLY stupid.
If we want to actually win, we need to both be careful about deploying those other misaligned models, and stop ARA.
Alice: That snake bite looks pretty nasty, it could kill you if you don’t get it treated.
Bob: That snake bite won’t kill me, this hand grenade will. Pulls out pin.
I propose a layer-cake model of AI. An AI consists of 0 or more layers of general optimizer, followed by 1 layer of specific tricks.
(I won’t count the human programmer as a layer here)
For example, if you hardcode an algorithm to recognize writing, designing the algorithms by hand, expert-system style, then you have 0 layers of general optimizer.
If you have a standard CNN, the gradient descent is an optimization layer, and below that is specific details about what letters look like.
In this picture, there is a sense in which you aren’t missing any insights about intelligence in general. The idea of gradient descent is the intelligence-in-general part. And all that the weights of the network contain is specific facts about what shape letters are.
(Although these specific facts are stored in a pretty garbled format. And this doesn’t tell you which specific facts will be learned)
If you used an evolutionary algorithm over Tensor maths, and you evolved a standard gradient descent neural network, this would have 2 general optimization layers.
If neural networks become more general/agentic (as some LLMs might already be, a little bit), then those neural nets are starting to contain an internal general optimization algorithm, along with the specifics.
This general algorithm should be the same type of thing as gradient descent. It might be more efficient or have more facts hard-coded in or a better prior. It might be insanely contrived and complicated. But I think, if we had these NN-found algorithms, alignment would still be the same type of problem.
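As a toy illustration of the layer counting (my sketch, not something from the comment; all numbers are made up): an outer evolutionary search over the settings of an inner gradient-descent learner, which in turn stores the task-specific facts in its weights.

```python
# Toy layer-cake: layer 2 = evolution, layer 1 = gradient descent,
# layer 0 = the specific facts (here, the weights of a linear model).
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w + 0.1 * rng.normal(size=200)

def inner_loss(lr, steps=100):
    """Inner optimizer: gradient descent fits the task-specific weights."""
    w = np.zeros(3)
    for _ in range(steps):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return float(np.mean((X @ w - y) ** 2))

# Outer optimizer: a crude evolutionary search over the inner learning rate.
population = list(rng.uniform(1e-3, 0.5, size=8))
for _ in range(10):
    parents = sorted(population, key=inner_loss)[:4]   # keep the best half
    children = [min(0.9, lr * float(np.exp(rng.normal(0, 0.2)))) for lr in parents]
    population = parents + children

best = min(population, key=inner_loss)
print(f"evolved learning rate {best:.3f}, final loss {inner_loss(best):.4f}")
```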
I don’t think the failure of evolution is evidence that alignment is impossible or even hard.
Evolution wasn’t smart enough to realize that alignment was a problem. It put the retina backwards. Evolution can do really dumb things. The failure of evolution is consistent with a world where alignment takes 2 lines of python and is obvious to any smart human who gives the problem a few hours thought.
“Smart humans haven’t solved it yet” gives a much stronger lower bound on difficulty than evolution’s failure. At least if alignment is the sort of problem best solved with simple general principles (where humans are better), as opposed to piling on the spaghetti code (where evolution can sometimes beat humans).
I think the marginal value of OpenAI competence is now negative. We are at a point where they have basically no chance of succeeding at alignment, and further incompetence makes it more likely that the company doesn’t produce anything dangerous. Making any AGI at all requires competence and talent, and an environment that isn’t a political cesspool.
You can make work out, if you are prepared to make your mathematics even more deranged.
So let’s look at
Think of the not as but as some infinitesimal times some unknown function .
If that function is then we get which is finite, so multiplied by it becomes infinitesimal.
If then we get and as we know because
So this case is the same as before.
But for we get which doesn’t converge. The infinite largeness of this sum cancels with the infinitesimally small size of (Up to an arbitrary finite constant).
So
Great. Now let’s apply the same reasoning to
. First note that this is infinite, it’s , so undefined. Can we make this finite? Well, think of as actually being and in this case, take
For the final term, the smallness of epsilon counteracts having to sum to infinity. For the first and middle term, the sum is
Which is
Now
So we have
The first term is negligible. So
Note that the can be ignored, because we have for arbitrary (finite) C as before.
Now is big, but it’s probably less infinite than somehow. Let’s just group it into the and hope for the best.
advanced ancient technology is such a popular theme
Well, one reason is that it’s a good way to produce plot-relevant artefacts. It’s hard to have dramatic battles over some object when a factory is churning out more.
True. But for that you need there to exist another mind almost identical to yours except for that one thing.
In the question “how much of my memories can I delete while retaining my thread of subjective experience?” I don’t expect there to be an objective answer.
The point is, if all the robots are a true blank slate, then none of them is you. Because your entire personality has just been forgotten.
Who knows what “meditation” is really doing under the hood.
Let’s set up a clearer example.
Suppose you are an uploaded mind, running on a damaged robot body.
You write a script that deletes your mind, running a bunch of no-ops before rebooting a fresh blank baby mind with no knowledge of the world.
You run the script, and then you die. That’s it. The computer running no-ops “merges” with all the other computers running no-ops. If the baby mind learns enough to answer the question before checking whether its hardware is broken, then it considers itself to have a small probability of the hardware being broken. And then it learns the bad news.
Basically, I think forgetting like that without just deleting your mind isn’t something that really happens. I also feel like, when arbitrary mind modifications are on the table, “what will I experience in the future” returns Undefined.
Toy example. Imagine creating loads of near-copies of yourself, with various changes to memories and personality. Which copy do you expect to wake up as? Equally likely to be any of them? Well just make some of the changes larger and larger until some of the changes delete your mind entirely and replace it with something else.
Because the way you have set it up, it sounds like it would be possible to move your thread of subjective experience into any arbitrary program.
Gold is high value per mass, but has a lot of price transparency and competition.