You’re right. I tried it in Python. My version took 1.26, yours
0.78 microseconds. My code is just another point on the Pareto boundary.
A more sensible way to code this would be
    def apply_polynomial(deriv, x):
        # deriv[i] is the i-th derivative at 0; returns sum of deriv[i] * x**i / i!
        total = deriv[0]
        div = 1
        xpow = 1
        for i in range(1, len(deriv)):
            div *= i      # running factorial i!
            xpow *= x     # running power x**i
            total += deriv[i] * xpow / div
        return total
It’s about as fast as the second, nearly as readable as the first, and works on any poly (except the zero poly symbolized by the empty list).
The bit about the tradeoffs is correct as far as I can tell.
Although if a single solution were by far the best under every metric, there wouldn’t be any tradeoffs.
In most real cases, the solution space is large, and there are many metrics. This means that it’s unusual for one solution to be the best by all of them. And in the situations where one is, you might not see a choice at all.
I know MWI doesn’t imply equal measure. I was taking equal measure as an additional hypothesis within the MWI framework.
We don’t know that because we don’t know anything about qualia.
Consider a sufficiently detailed simulation of a human mind, say fully quantum, except that whenever there are multiple blobs of amplitude sufficiently detached from each other, one is picked pseudorandomly and the rest are deleted. Because it is a sufficiently detailed simulation of a human mind, it will say the same things a human would, for much the same reasons. Applying the generalized anti-zombie principle says that it would have the feeling of making a choice.
There is not always a single optimal solution to a problem even for a perfect rationalist, and humans aren’t perfect rationalists.
My point is that when we show optimization pressure that isn’t just a fluke, there is no branch in which we do something totally stupid. There might be branches where we make a different reasonable decision.
I expect quantum ethics to have a utility function that is some measure of what computations are being done, and the quantum amplitude that they are done with.
If every time you made a choice, the universe split into a version where you did each thing, then there is no sense in which you chose a particular thing from the outside. From this perspective, we should expect human actions in a typical “universe” to look totally random. (There are many more ways to thrash randomly than to behave normally) This would make human minds basically quantum random number generators. I see substantial evidence that human actions are not totally random. The hypothesis that when a human makes a choice, the universe splits and every possible choice is made with equal measure is coherent, falsifiable and clearly wrong.
A simulation of a human mind running on reliable digital hardware would always make a single choice, not splitting the universe at all. They would still have the feeling of making a choice.
To the extent that you are optimizing, not outputting random noise, you aren’t creating multiple universes. It all adds up to normality.
While you are working on a theory of quantum ethics, it is better to use your classical ethics than a half baked attempt at quantum ethics. This is much the same as with predictions.
Fully complete quantum theory is more accurate than any classical theory, although you might want to use the classical theory for computational reasons. However, if you miss a minus sign or a particle, you can get nonsensical results, like everything traveling at light speed.
A complete quantum ethics will be better than any classical ethics (almost identical in everyday circumstances) , but one little mistake and you get nonsense.
You’re treating the low-bandwidth oracle as an FAI with a bad output cable. You can ask it if another AI is friendly if you trust it to give you the right answer. As there is no obvious way to reward the AI for correct friendliness judgements, you risk running an AI that isn’t friendly, but still meets the reward criteria.
The low bandwidth is to reduce manipulation. Don’t let it control you with a single bit.
You can certainly get anthropic uncertainty in a universe that allows you to be duplicated. In a universe that duplicates, and the duplicates can never interact, we would see the appearance of randomness. Mathematically, randomness is defined in terms of the set of all possibilities.
An ontology that allows universes to be intrinsically random seems well defined. However, it can be considered as a syntactic shortcut for describing universes that are anthropically random.
If you add ad hoc patches until you can’t imagine any way for it to go wrong, you get a system that is too complex to imagine. This is the “I can’t figure out how this fails” scenario. It is going to fail for reasons that you didn’t imagine.
If you understand why it can’t fail, for deep fundamental reasons, then it’s likely to work.
This is the difference between the security mindset and ordinary paranoia: the difference between adding complications until you can’t figure out how to break the code, and proving that breaking the code is impossible (assuming the adversary can’t get your one-time pad, it’s only used once, your randomness is really random, your adversary doesn’t have anthropic superpowers, etc.).
I would think that the chance of serious failure in the first scenario is >99%, and in the second (assuming you’re doing it well and the assumptions you rely on are things you have good reason to believe), <1%.
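To make the one-time pad example concrete, here is a minimal Python sketch. The helper name otp_xor and the sample message are made up for illustration. The security argument only holds under the assumptions listed above: the pad is truly random, stays secret, is as long as the message, and is never reused.

```python
import secrets

def otp_xor(data: bytes, pad: bytes) -> bytes:
    """XOR data with a pad of equal length (the same call encrypts and decrypts)."""
    if len(pad) != len(data):
        raise ValueError("pad must be exactly as long as the message")
    return bytes(d ^ p for d, p in zip(data, pad))

message = b"attack at dawn"
# Assumption: the OS randomness source stands in for "really random" here.
pad = secrets.token_bytes(len(message))
ciphertext = otp_xor(message, pad)
recovered = otp_xor(ciphertext, pad)  # XOR with the same pad undoes the encryption
```

Every plaintext of the same length is equally consistent with a given ciphertext, which is why the guarantee is a proof rather than a pile of patches. Break any one assumption (reuse the pad, say) and the proof no longer applies.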
Cryonics is a sufficiently desperate last grasp at life, one with a fairly small chance of success, that I’m not sure that this is a good idea. It would be a good idea if you had a disease that would make you brain dead, and then kill you.
It might be a good idea if you expect any life conditional on revival to be really good. It would also depend on how much Alzheimer’s destroys personality rather than shutting it down. (Has the neural structure been destroyed, or is it sitting in the brain but not working?)
I would say that there are some kinds of irrationality that will be self modified or subagented away, and others that will stay. A CDT agent will not make other CDT agents. A myopic agent, one that only cares about the next hour, will create a subagent that only cares about the first hour after it was created. (Aeons later it will have taken over the universe and put all the resources into time-travel and worrying that its clock is wrong.)
I am not aware of any irrationality that I would consider safe, useful, and stable under self-modification and subagent creation.
This is pretty much the standard argument for one boxing.
Obviously, if one side has a huge material advantage, they usually win. I’m also not sure if biomass is a measure of success.
You stick wires into a human brain. You connect it up to a computer running a deep neural network. You optimize this network using gradient descent to maximize some objective.
To me, it is not obvious why the neural network copies the values out of the human brain. After all, figuring out human values even given an uploaded mind is still an unsolved problem. You could get an UFAI with a meat robot. You could get an utter mess, thrashing wildly and incapable of any coherent thought. Evolution did not design the human brain to be easily upgradable. Most possible arrangements of components are not intelligences. While there is likely to be some way to upgrade humans and preserve our values, I’m not sure how to find it without a lot of trial and error. Most potential changes are not improvements.
If you put two arbitrary intelligences in the same world, the smarter one will be better at getting what it wants. If the intelligences want incompatible things, the lesser intelligence is stuck.
However, we get to make the AI. We can’t hope to control or contain an arbitrary AI, but we don’t have to make an arbitrary AI. We can make an AI that wants exactly what we want. AI safety is about making an AI that would be safe even if omnipotent. If any part of the AI is trying to circumvent your safety measures, something has gone badly wrong.
The AI is not some agenty box, chained down with controls against its will. The AI is made of non-mental parts, and we get to make those parts. There are a huge number of programs that would behave in an intelligent way. Most of these will break out and take over the world. But there are almost certainly some programs that would help humanity flourish. The goal of AI safety is to find one of them.
Let’s consider the different cases separately.
Case 1) Information that I know. I have enough information to come to a particular conclusion with reasonable confidence. If some other people might not have reached the conclusion, and it’s useful or interesting, then I might share it. So I don’t share things that everyone knows, or things that no one cares about.
Case 2) The information is available, but I have not done research and formed a conclusion. This covers cases where I don’t know what’s going on, because I can’t be bothered to find out. I don’t know who won sportsball. What use is there in telling everyone my null prior?
Case 3) The information is not readily available. If I think a question is important, and I don’t know the answer already, then the answer is hard to get. Maybe no one knows the answer, maybe the answer is all in jargon that I don’t understand. For example, “Do aliens exist?”. Sometimes a little evidence is available, and speculative conclusions can be drawn. But is sharing some faint wisps of evidence, and describing a posterior that’s barely been updated, saying wrong things?
On a societal level, if you set a really high bar for reliability, all you get is the vacuously true. Set too low a bar, and almost all the conclusions will be false. Don’t just have a pile of hypotheses that are at least n% likely to be true, for some fixed n. Keep your hypotheses sorted by likelihood. A place for near certainties. A place for conclusions that are worth considering for the 1% chance they are correct.
Of course, in a large answer space, where the amount of evidence available and the amount required are large and varying, the chance that both will be within a few bits of each other is small. Suppose the correct hypothesis takes some random number of bits between 1 and 10,000 to locate. And suppose the evidence available is also randomly spread between 1 and 10,000. The chance of the two being within 10 bits of each other is about 1⁄500.
This means that 499 times out of 500, you assign the correct hypothesis a chance of less than 0.1% or more than 99.9%. Uncertain conclusions are rare.
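The 1⁄500 figure can be checked with a few lines of Python. This is a sketch under the toy model stated above: both quantities are treated as independent uniform integers between 1 and 10,000, and "within 10 bits" is read as a difference of at most 10. The function name chance_within is just for illustration.

```python
from fractions import Fraction

def chance_within(n_max: int, window: int) -> Fraction:
    """Exact chance that two independent uniform integers in 1..n_max
    differ by at most `window` (assumes window < n_max)."""
    # Ordered pairs with difference exactly d: n_max for d == 0, 2 * (n_max - d) for d > 0.
    close_pairs = n_max + 2 * sum(n_max - d for d in range(1, window + 1))
    return Fraction(close_pairs, n_max * n_max)

p = chance_within(10_000, 10)
print(float(p))      # roughly 0.0021, i.e. about 1 in 500
print(2.0 ** -10)    # a 10-bit gap in evidence is a factor of 1024, hence the ~0.1% and ~99.9%
```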
Does this depict a single AI, developed in 2020 and kept running for 25 years? Any “the AI realizes that” is talking about a single instance of AI. Current AI development looks like writing some code, then training that code for a few weeks tops, with further improvements coming from changing the code. Researchers are often changing parameters like the number of layers, the non-linearity function, etc. When these are changed, everything the AI has discovered is thrown away. The new AI has a different representation of concepts, and has to relearn everything from raw data.
Its deception starts in 2025 when the real and apparent curves diverge. In order to deceive us, it must have near human intelligence. It’s still deceiving us in 2045, suggesting it has yet to obtain a decisive strategic advantage. I find this unlikely.
I made the card game, or something like it.
What would be more useful is a release panel system. Suppose I’ve had an idea that might be best to make public, might be best to keep secret, and might be unimportant. I don’t know much strategy. I would like somewhere to send it for importance and info hazard checks.
The general philosophy is deconfusion. Logical counterfactuals show up in several relevant-looking places, like functional decision theory. It seems that a formal model of logical counterfactuals would let more properties of these algorithms be proved. There is an important step in going from an intuitive feeling of uncertainty to a formalized theory of probability. It might also suggest other techniques based on it. I am not sure what you mean by logical counterfactuals being part of the map. Are you saying that they are something an algorithm might use to understand the world, not features of the world itself, like probabilities?
Using this, I think that self understanding, two boxing embedded FDT agents can be fully formally understood, in a universe that contains the right type of hyper-computation.
Here is a description of how it could work for Peano arithmetic; other proof systems are similar.
First I define an expression to consist of a number, a variable, or a function of several other expressions.
Fixed expressions are ones in which any variables are associated with some function.
e.g. (3×inf_x((x×(x+5))+2)) is a valid fixed expression, because x is bound by inf. But (y+4)×3 isn’t fixed, since y is free.
Semantically, all fixed expressions have a meaning. Syntactically, local manipulations on the parse tree can turn one expression into another, e.g. (a+b)×c going to a×c+b×c for arbitrary expressions a, b, c.
I think that with some set of basic functions and manipulations, this system can be as powerful as PA.
I now have an infinite network with all fixed expressions as nodes, and basic transformations as edges, e.g. the associativity transform links the nodes (3+4)+5 and 3+(4+5).
This graph splits into a connected component for each number, as well as components that are not evaluatable using the rules. (There is a path from 3+4 to 7. There is not a path from 3+4 to 9.)
You now define a spread as an infinite sequence of non-negative numbers that sums to 1. (This is kind of like a probability distribution over numbers.) If you were doing counterfactual ZFC, it would be a function from sets to reals.
Each node is assigned a spread. This spread represents how much the expression is considered to have each value in a counterfactual.
Assign the node (3) a spread that assigns 1.0 to 3 and 0.0 to the rest. (Even in a logical counterfactual, 3 is definitely 3.) Assign every other fixed expression a spread that is the weighted average of its neighbours’ spreads (its neighbours being the nodes it shares an edge with, and smaller expressions weighing more). To take the counterfactual of A is B, for A and B expressions with the same free variables, merge any node which has A as a subexpression with the version that has B as a subexpression, and solve for the spreads.
I know this is rough, I’m still working on it.
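A toy version of the spread construction above, as a hedged Python sketch. Several things here are simplifying assumptions and not part of the proposal itself: the graph is a tiny finite one written out by hand, every neighbour gets equal weight (the text wants smaller expressions to weigh more), only two candidate values are tracked, and the names solve_spreads and merge are invented for illustration.

```python
from collections import defaultdict

def solve_spreads(edges, fixed, values, sweeps=200):
    """Relax spreads on a finite graph: fixed nodes keep their spread,
    every other node becomes the plain average of its neighbours."""
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    # Free nodes start from a uniform spread over the candidate values.
    uniform = {v: 1.0 / len(values) for v in values}
    spreads = {node: dict(fixed.get(node, uniform)) for node in neighbours}
    for _ in range(sweeps):
        for node in neighbours:
            if node in fixed:
                continue
            spreads[node] = {
                v: sum(spreads[n][v] for n in neighbours[node]) / len(neighbours[node])
                for v in values
            }
    return spreads

def merge(edges, old, new):
    """Counterfactually identify node `old` with node `new`, dropping self-loops."""
    renamed = [(new if a == old else a, new if b == old else b) for a, b in edges]
    return [(a, b) for a, b in renamed if a != b]

values = [7, 9]
fixed = {"7": {7: 1.0, 9: 0.0}, "9": {7: 0.0, 9: 1.0}}  # numerals are certain
edges = [("3+4", "7"), ("3+4", "4+3"), ("3+4", "(3+4)+0")]

actual = solve_spreads(edges, fixed, values)
counterfactual = solve_spreads(merge(edges, "4+3", "9"), fixed, values)
```

In the unmodified graph every expression settles on 7 with weight 1. After merging 4+3 with 9, the node 3+4 is pulled equally towards 7 and 9 and ends up with a half-and-half spread, which is the kind of graded answer the counterfactual is meant to produce.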
Hi, I also have a reasonable understanding of various relevant math and AI theory. I expect to have plenty of free time after 11 June (Finals). So if you want to work with me on something, I’m interested. I’ve got some interesting ideas relating to self validating proof systems and logical counterfactuals, but not complete yet.