This view appears to directly contradict the lines “Better for you if you take me off” in conjunction with “The earring is always right”.
The problem isn’t in the simulation part, but in the “supports” part.
You can certainly write a simulation in which an agent decides to take both boxes. By the conditions of the scenario, they get $1000. Does this simulation “support” taking both boxes? No, unless you’re only comparing with alternative actions of not taking a box at all, or burning box B and taking the ashes, or other things that are worse than getting $1000.
However, the scenario states that the agent could take just one box, and it is a logical consequence of the scenario setup that in the situations where they do, they get $1000000. That's better than getting $1000 under the assumptions of the scenario, and so a simulation that actually follows the rules of the scenario cannot support taking two boxes.
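As a minimal sketch (using the standard payoff numbers; the function is purely illustrative), a simulation that actually enforces the predictor's rule looks something like this:

```python
# Minimal Newcomb sketch, assuming a perfect predictor: the prediction
# always matches the agent's actual choice, as the scenario requires.
def payoff(takes_both: bool) -> int:
    predicted_two_box = takes_both                    # perfect prediction
    box_a = 1_000                                     # the transparent $1000 box
    box_b = 0 if predicted_two_box else 1_000_000     # filled only if predicted to one-box
    return box_a + box_b if takes_both else box_b

print(payoff(takes_both=True))    # 1000     (two-boxing, under the scenario's rules)
print(payoff(takes_both=False))   # 1000000  (one-boxing, under the same rules)
```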
There aren't just two possibilities, "ideal Bayesian reasoning" and "useless rubbish". There is a huge range of heuristics, ad-hoc models, evolved instincts, and everything else in the mix. These are all 'outside Bayesianism', and while the collection is almost certainly worse than ideal Bayesian reasoning, it is not useless.
That also doesn't mean that we can only improve by having more people actively do Bayesian reasoning about stuff, though there are certainly many cases where people would be better off actively doing Bayesian reasoning.
There are many ways to improve incredibly complex systems such as human minds and their interactions. It's far from certain that applying more Bayesian reasoning is the best way. We are definitely not capable of reaching the ideal, and will have to settle for something imperfect. Maybe there is a better approximation than "try Bayesian reasoning as far as our limited human brains can handle", maybe there is not.
The main point is that we don’t know anything better, and pretty much everything else that we do know looks worse. However, there is a lot that we don’t know, and far more that we don’t even know that we don’t know.
One thing that is pretty clear is that an ideal Bayesian reasoner would distribute some probability across all non-self-contradictory hypotheses that you can express in text of bounded length. There are only finitely many of them, so a failure to include some of them would be a pretty major departure from the ideal.
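As a rough illustration of the finiteness (the alphabet size and length bound here are arbitrary choices, not anything from the original claim):

```python
# Count of all texts of length at most L over an alphabet of size A.
A, L = 128, 10_000                            # illustrative: ASCII-sized alphabet, 10,000-character bound
count = sum(A ** n for n in range(L + 1))     # texts of length 0, 1, ..., L
print(count > 10 ** 80)                       # True: astronomically large, but still finite
```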
Humans are not ideal Bayesian reasoning engines with unbounded ability to track all possible hypotheses.
In principle an ideal Bayesian reasoner could deduce the hypotheses of life, resource competition, strategies for obtaining resources, and the possible existence of scams from basically nothing (in addition to myriad other hypotheses). If they’re starting from the existence of markets as implied by being able to understand the letter at all, they could get to scams almost instantly.
None of these examples nor discussion get anywhere near the core of acausal trade, nor the objections to it.
The core is that you don't know anything about the so-called trade partner except what is a priori deducible from your model of the wider universe: specifically, from your model of how other entities in that wider universe model acausal trade with entities in their own models of the universe.
Almost all posts about acausal trade talk about two entities, unable to directly communicate but able to specifically model each other, but this is misleading to the point of being an incredible lie. They don’t and can’t know anything about each other, and while they are generally supposed to be superintelligences and could model each other in sufficient detail if each was sufficiently well described to the other, there is no such description. There are immensely more possible entities for each to consider than atoms in the visible universe—too many for even a superintelligence to model accurately regarding their counterfactual specific decisions.
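A back-of-envelope comparison, with a made-up description length, of how badly the count of possible entities outruns the atom count:

```python
ATOMS_IN_VISIBLE_UNIVERSE = 10 ** 80     # commonly cited order of magnitude
description_bits = 1_000                 # illustrative: a tiny description by superintelligence standards
print(2 ** description_bits > ATOMS_IN_VISIBLE_UNIVERSE)   # True, by roughly 220 orders of magnitude
```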
You know of nothing that has destroyed this civilisation, and you believe this provides sufficient evidence that a descendant of this civilisation exists today, one who would also send fifty dollars to your preferred charity if an apple were placed in the right spot.
No. Under this analogy, you definitely know that there is no descendant of this civilization that will send fifty dollars to your preferred charity if an apple were placed in the right spot. That would be causal!
All you know is that there might be any of infinitely many civilizations, with apple-spot-caring civilizations making up 10^-100 of them or less, and none of those apple-spot-caring civilizations know that your civilization exists, and certainly don’t know that you exist, let alone what your favourite charity is and whether you have placed or are likely to place an apple in that spot. It’s likely that half of them would instead donate to the Puppy-Kicking Foundation or worse if they did know, because they very much care about apples not being in that sort of spot (and don’t care about puppies).
You don't know which is more likely, and can't spend the brainpower to figure it out: they make up 10^-100 or less of the infinite distribution, 10^-100 of your computing infrastructure isn't even one atom-nanosecond, and there are many much more important things to think about.
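To put a rough number on "isn't even one atom-nanosecond" (every quantity below is a deliberately generous, made-up order of magnitude):

```python
atoms_in_computer = 10 ** 50              # far more atoms than any plausible computing infrastructure
runtime_ns = 4 * 10 ** 26                 # roughly the age of the universe, in nanoseconds
budget = atoms_in_computer * runtime_ns   # total atom-nanoseconds available
print(budget * 10 ** -100)                # ~4e-24: the 10^-100 share is nowhere near one atom-nanosecond
```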
Yes, I would have to agree here. The parts that are sci-fi (almost none) are not hard, and the parts that are hard are not sci-fi.
This fails immediately. The implication in your second sentence simply does not hold, and the rest is rendered irrelevant (though also flawed independently of that).
No, there is no way to write a simulation that supports taking both boxes while also upholding the conditions of the scenario.
Even with an imperfect predictor, you would have to make the predictor effectively useless at predicting, performing no better than 0.1% above chance. Even if it predicts some agents well and some poorly, you would need to p-hack the result by ignoring the agents it predicted well in order to get a recommendation to take both boxes.
Variants with imperfect predictors still generally favour one-boxing unless the predictor is absolutely terrible at predicting, simply because the loss you suffer from being predicted to take both is 1000x greater than the gain from actually taking both.
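A quick expected-value check of that threshold, assuming for simplicity that the predictor's accuracy p is the same whichever action the agent ends up taking:

```python
def ev_one_box(p):
    return p * 1_000_000                         # box B is filled only when predicted to one-box

def ev_two_box(p):
    return p * 1_000 + (1 - p) * 1_001_000       # predicted correctly: B empty; predicted wrongly: both full

for p in (0.99, 0.6, 0.501, 0.5):
    print(p, "one-box" if ev_one_box(p) > ev_two_box(p) else "two-box")
# Two-boxing only comes out ahead once p drops below 0.5005,
# i.e. an edge of less than 0.1% over a coin flip.
```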
I expect the strategy to cause the effects of anti-American propaganda to skyrocket. Suppose that Russia decided to murder Zelenskiy.
There are credible reasons to believe that they have attempted to do so more than once. Did you mean if they succeeded?
What do you define as "replace a human job"? We are already seeing AI that can replace at least 50% of a job for far less than 50% of the cost of paying a worker to do those parts of the job. In principle that means many employers could fire half their workforce and have the remaining employees pick up the other halves of the fired employees' jobs.
In practice this would involve huge disruption and uncertainty, and perhaps they can avoid that bother by letting go only their most obviously unproductive employees, lowering costs a little (say, to 95%) while doing the same or slightly more work with much less disruption. Over time, the employees who use more AI in their workflows need less effort to do the same job. We are seeing this already.
This obviously isn't stable long-term economic behaviour. Those conservative employers will probably continue to shrink their workforces slowly, while employers much more willing to accept disruption eat into their markets at greatly reduced cost.
However, it takes time. The more capable AI becomes per unit cost, the greater the advantage disruption-tolerant employers will have, possibly leading to multiple large failures of conservative employers, or to rapid culture changes to avoid such failures, with large segments of the workforce replaced at some later point.
In this model (which matches what we are already seeing), the job losses are inevitable but come some economically significant and somewhat unpredictable time after the cost of AI drops well below the cost of employing a human to do some tasks.
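As a toy version of this model (every number is made up purely for illustration):

```python
workers, wage = 100, 1.0
ai_half_job_cost = 0.05 * wage      # illustrative: AI does the automatable half of a job for 5% of a wage

# Disruption-tolerant employer: fire half the staff; AI covers the automatable half
# of all 100 jobs, and the 50 remaining workers cover the human halves.
aggressive = 50 * wage + workers * ai_half_job_cost    # 55.0, about 55% of the old wage bill

# Conservative employer: let go only the obviously least productive ~5% and keep the rest.
conservative = 95 * wage                               # about 95% of the old wage bill

print(aggressive, conservative)
```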
There’s nothing that requires an economy to maintain a continuous equilibrium of perfectly distributed cost/productivity balances at all times, and we see plenty of past examples where it has not. Continuous changes to parameters in a complex system often result in sudden changes in behaviour, not just continuous ones.
A single Turing machine can perfectly emulate any number of independent concurrently running Turing machines (including an unbounded number), so I’m not sure that the distinction for abstract machines is relevant.
In the physical realm, it is certainly true that modern computers actually contain many different physical computers. Though again, one physical computer can also emulate multiple physical computers so the distinction isn’t all that great there either.
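A sketch of the interleaving that makes such emulation work, with Python generators standing in for independent machines:

```python
# One sequential process emulating several independent "machines" by
# round-robin interleaving: advance each by one step, then repeat.
def counter(name, limit):
    for i in range(limit):
        yield f"{name}: step {i}"

machines = [counter("A", 3), counter("B", 2), counter("C", 3)]
while machines:
    for m in machines[:]:            # iterate over a copy so we can remove halted machines
        try:
            print(next(m))           # advance this machine by one step
        except StopIteration:
            machines.remove(m)       # this machine has halted
```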
I do think that most of the apparent limitation is actually a human limitation. My computer at the moment is doing at least half a dozen different top-level tasks (i.e. not just housekeeping or support tasks), but most of them are not constantly competing for my limited human attention on the physical screen at once.
Yes, I agree here. Efficiency concerns just lower the temperature you want to radiate at, which need not be related to the distance from the star until you're using the entire surface area. Building further out does reduce the energy usable per unit area (and likely also per unit mass), and also increases coordination problems, so it seems to be strictly a disadvantage. It isn't that difficult to maintain equipment in the inner solar system at temperatures far below the black-body equilibrium.
Also yes, Landauer’s principle isn’t binding in many ways, since it relies on multiple assumptions that may not hold or can be bypassed. Reversible computing is one, and also energy is not the only conserved quantity involved in thermodynamics.
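For concreteness, the bound in question is k·T·ln 2 of dissipation per bit erased, which is why a lower radiator temperature lowers the floor, and why reversible computing (which avoids erasure) can duck under it entirely:

```python
import math

K_B = 1.380649e-23                                # Boltzmann constant, J/K

def landauer_joules_per_bit(temp_kelvin: float) -> float:
    # Minimum dissipation for one irreversible bit erasure at temperature T.
    return K_B * temp_kelvin * math.log(2)

for T in (300, 150, 30):                          # illustrative radiator temperatures, in kelvin
    print(T, landauer_joules_per_bit(T))
# The floor scales linearly with T and applies only to erased bits.
```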
This is one of the least of the possible concerns arising from superpersuasive AIs. It assumes that experts exposed to superpersuasive AIs still get to choose whether to believe what they say, and it considers only higher-order epistemic harms rather than direct first-order harms like persuading people to kill each other and/or themselves.
Then Omega correctly predicts that you wouldn’t have paid if the coin had come up the other way, and punishes you.
Note: I am using the word "correct" in the sense that you have literally just told us that you wouldn't have paid if the coin had come up the other way, and it makes no sense to object that the case with the other coin outcome is "just a hypothetical", since the entire thing being discussed is a hypothetical.
In more detail:
Within the outer hypothetical of this scenario happening at all, Omega’s prediction about the coin-alternative hypothetical is a fact (not a hypothetical) that you are not aware of, but can predict with very high success rate. It is very highly correlated with the output of your decision process, though not caused by the output of your decision process. Both the prediction and the output have a common cause. If your decision process is anywhere near as legible (to Omega) as you state it to be, and results in the output you state, then it will result in you being punished, and this punishment should be highly predictable to you in advance.
However, you have stated that you do not predict punishment, so there is something wrong with your decision process.
Why do you think that the "is" implies any particular "ought"?
Now sit him down on his college dorm twin mattress and show him photographs from his future.
If you do this, it is almost certain that those photographs won't be from his future, unless you are a superintelligence and very carefully chose both that person and those photographs, knowing that his future, conditional on him seeing those photographs, is the depicted future.
Those photographs can only be correct to the extent that he allows them to be correct. Not every such system has a fixed point, and for some selections of person this may be simply impossible.
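A minimal illustration of the missing fixed point, with the person modelled as a function from the prediction he is shown to what he actually does (the specific futures are of course just placeholders):

```python
# If the person reacts to any shown photograph by doing something else,
# the map from "shown future" to "actual future" has no fixed point,
# so no photograph of his future can be correct.
futures = ["becomes a doctor", "becomes a musician"]

def contrarian(shown: str) -> str:
    return futures[1] if shown == futures[0] else futures[0]

fixed_points = [f for f in futures if contrarian(f) == f]
print(fixed_points)    # [] : no self-fulfilling photograph exists for this person
```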
But is this even possible for any predictee at all? After all, his future is also your future, and you would need to be able to predict your own actions and their consequences in perfect detail as well as that of the rest of the universe.
Can a system have sufficient computing power to do this, while still being embedded in the universe that is being predicted in perfect detail? This seems a pretty big assumption, and it’s not just an assumption about a specific universe (which we can just posit as a hypothetical) but a mathematical assumption about the degree to which complex computational systems (e.g. able to prove theorems of arithmetic) are capable of perfectly predicting their own future behaviour.
To me, it seems an unlikely proposition.
Their training certainly does produce systems that are really good at seeming like they have inner lives. That’s part of the point. But then humans well-trained to write fiction are also really good at producing text from the viewpoint of characters that seem like they have inner lives. Another analogy may be immersive roleplayers, who often do experience emotions that their character would be feeling—though generally not to the extent of more immediate sensations like physical pain.
Are AIs more like authors, more like immersive roleplayers, more like living beings, or something different from any of these?
In the first case there is pretty clearly no moral weight carried by the text, except as a side-effect of any real-world impact of the text. It may be distasteful, but nearly everyone agrees that fictional characters are not moral patients and that an author writing a torture scene is not literally experiencing torture even though there are a whole bunch of living neurons dedicated to simulating it.
I suspect that AI experience (if it exists) is a lot closer to the author end of such a scale, possibly as far out as the immersive-roleplayer point on it. Their "mood" seems to be affected much more by the most recent prompt than any human genuinely experiencing things would be. This is likely an artifact of current details of training and architecture rather than any fundamental principle, though.
I think it's pretty obvious that the claim that "both AIs will generate the same output" is wrong, although this may become true by the time they're superintelligent.
At lower levels of capability, the benevolent-AI belief world will almost certainly get a lot more nice outputs if it starts from imitative prediction. Whether it internalizes "niceness" in a suitable way right up through superintelligence is a completely different question, one for which we have no answers.
In the secret plotting belief world, it’s much more likely that superintelligent AI will never be produced, or at least not until there are extremely good reasons to believe that it can’t be secretly plotting.
The worst case is probably somewhere in between, where enough people believe that AI can be fundamentally good without much effort, but base their imitative prediction on lots of training data of secret AI plotting.
The most salient example I've seen of an Alice in these circles was very publicly banned not long ago.