I’m interested in people’s opinions on this:
If it’s a talking point on Reddit, you might be early.
Of course the claim is technically true; there’s a >0% chance that you can get ahead of the curve by reading Reddit. But is it dramatically less likely than it was, say, 5/10/15 years ago? (I know ‘Reddit’ isn’t a monolith; let’s say we’re ignoring the hyper-mainstream subreddits and the ones that are so small you may as well be in a group chat.)
10. Everyday Razor—If you go from doing a task weekly to daily, you achieve 7 years of output in 1 year. If you apply a 1% compound interest each time, you achieve 54 years of output in 1 year.
What’s the intuition behind this—specifically, why does it make sense to apply compound interest to the daily task-doing but not the weekly?
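For what it’s worth, here’s one reconstruction that reproduces both figures (my reading; the razor doesn’t show its working). It applies the 1% compounding per repetition to both schedules and compares a year of daily reps to a year of weekly reps:

```python
# My reconstruction of the razor's arithmetic, not anything from the
# original source: compound 1% per repetition on *both* schedules, then
# compare one year of daily reps to one year of weekly reps.
daily = sum(1.01**i for i in range(365))   # ~3678 task-units in a year
weekly = sum(1.01**i for i in range(52))   # ~67.8 task-units in a year

print(365 / 52)        # ~7.0  -> the "7 years in 1 year" figure (no compounding)
print(daily / weekly)  # ~54.3 -> the "54 years in 1 year" figure
```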
I think we’re mostly talking past each other, but I would of course agree that if my position contains or implies logical contradictions then that’s a problem. Which of my thoughts lead to which logical contradictions?
That doesn’t mean qualia can be excused and are to be considered real anyway. If we don’t limit ourselves to objective descriptions of the world, then anyone can legitimately claim that ghosts exist because they think they’ve seen them, or similarly that gravity waves are transported across space by angels, or that I’m actually an attack helicopter even if I don’t look like one, or any other unfalsifiable claim, including the exact opposite claims, such as that qualia actually don’t exist. You won’t be able to disagree on any grounds except that you just don’t like it, because you sacrificed the assumptions that would let you do so in order to support your belief in qualia.
Those analogies don’t hold, because you’re describing claims I might make about the world outside of my subjective experience (‘ghosts are real’, ‘gravity waves are carried by angels’, etc.). You can grant that I’m the (only possible) authority on whether I’ve had a ‘seeing a ghost’ experience, or a ‘proving to my own satisfaction that angels carry gravity waves’ experience, without accepting that those experiences imply the existence of real ghosts or real angels.
I wouldn’t even ask you to go that far, because—even if we rule out the possibility that I’m deliberately lying—when I report those experiences to you I’m relying on memory. I may be mistaken about my own past experiences, and you may have legitimate reasons to think I’m mistaken about those ones. All I can say with certainty is that qualia exist, because I’m (always) having some right now.
I think this is one of those unbridgeable or at least unlikely-to-be-bridged gaps, though, because from my perspective you are telling me to sacrifice my ontology to save your epistemology. Subjective experience is at ground level for me; its existence is the one thing I know directly rather than inferring in questionable ways.
That’s the thing, though—qualia are inherently subjective. (Another phrase for them is ‘subjective experience’.) We can’t tell the difference between qualia and something that doesn’t exist, if we limit ourselves to objective descriptions of the world.
a 50%+ chance we all die in the next 100 years if we don’t get AGI
I don’t think that’s what he claimed. He said (emphasis added):
if we don’t get AI, I think there’s a 50%+ chance in the next 100 years we end up dead or careening towards Venezuela
Which fits with his earlier sentence about various factors that will “impoverish the world and accelerate its decaying institutional quality”.
(On the other hand, he did say “I expect the future to be short and grim”, not short or grim. So I’m not sure exactly what he was predicting. Perhaps decline → complete vulnerability to whatever existential risk comes along next.)
My model of CDT in the Newcomb problem is that the CDT agent:
is aware that if it one-boxes, it will very likely make $1m, while if it two-boxes, it will very likely make only $1k;
but, when deciding what to do, only cares about the causal effect of each possible choice (and not the evidence it would provide about things that have happened in the past and are therefore, barring retrocausality, now out of the agent’s control).
So, at the moment of decision, it considers the two possible states of the world it could be in (boxes contain $1m and $1k; boxes contain $0 and $1k), sees that two-boxing gets it an extra $1k in both scenarios, and therefore chooses to two-box.
(Before the prediction is made, the CDT agent will, if it can, make a binding precommitment to one-box. But if, after the prediction has been made and the money is in the boxes, it is capable of two-boxing, it will two-box.)
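To make the dominance step concrete, here’s a toy payoff table (standard $1m/$1k stakes; an illustration of my own, not anyone’s canonical formalisation):

```python
# Newcomb payoff table, in dollars. Keys are the possible states of the
# world at decision time; inner keys are the agent's two options.
payoffs = {
    "B contains $1m": {"one-box": 1_000_000, "two-box": 1_001_000},
    "B is empty":     {"one-box": 0,         "two-box": 1_000},
}

# In every possible state, two-boxing pays exactly $1k more; this is
# the dominance the CDT agent acts on once the contents are fixed.
for state, row in payoffs.items():
    assert row["two-box"] == row["one-box"] + 1_000, state
```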
My model doesn’t have its decision process running along these lines:
“I’m going to one-box, therefore the boxes probably contain $1m and $1k, therefore one-boxing is worth ~$1m and two-boxing is worth ~$1.001m, therefore two-boxing is better, therefore I’m going to two-box, therefore the boxes probably contain $0 and $1k, therefore one-boxing is worth ~$0 and two-boxing is worth ~$1k, therefore two-boxing is better, therefore I’m going to two-box.”
Which would, as you point out, translate to this loop in your adversarial scenario:
“I’m going to choose A, therefore the predictor probably predicted A, therefore B is probably the winning choice, therefore I’m going to choose B, therefore the predictor probably predicted B, therefore A is probably the winning choice, [repeat until meltdown]”
My model of CDT in your Aaronson oracle scenario, with the stipulation that the player is helpless against an Aaronson oracle, is that the CDT agent:
is aware that on each play, if it chooses A, it is likely to lose money, while if it chooses B, it is (as far as it knows) equally likely to lose money;
therefore, if it can choose whether to play this game or not, will choose not to play.
If it’s forced to play, then, at the moment of decision, it considers the two possible states of the world it could be in (oracle predicted A; oracle predicted B). It sees that in the first case B is the profitable choice and in the second case A is the profitable choice, so—unlike in the Newcomb problem—there’s no dominance argument available this time.
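The contrast with Newcomb’s problem is easy to see in the same table format (the payoff magnitudes here are my placeholders; the thread doesn’t fix the stakes):

```python
# Oracle-game payoff table: which option wins flips with the prediction.
payoffs = {
    "oracle predicted A": {"A": -1, "B": +1},
    "oracle predicted B": {"A": +1, "B": -1},
}

# Neither column beats the other in every row, so no option dominates.
assert not all(row["A"] >= row["B"] for row in payoffs.values())
assert not all(row["B"] >= row["A"] for row in payoffs.values())
```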
This is where things potentially get tricky, and some versions of CDT could get themselves into trouble in the way you described. But I don’t think anything I’ve said above, either about the CDT approach to Newcomb’s problem or the CDT decision not to play your game, commits CDT in general to any principles that will cause it to fail here.
How to play depends on the precise details of the scenario. If we were facing a literal Aaronson oracle, the correct decision procedure would be:
If you know a strategy that beats an Aaronson oracle, play that.
Else if you can randomise your choice (e.g. flip a coin), do that.
Else just try your best to randomise your choice, taking into account the ways that human attempts to simulate randomness tend to fail.
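In code, that procedure might look something like this (a minimal sketch; `known_exploit` and `coin` are my hypothetical stand-ins, not part of any real anti-oracle toolkit):

```python
import random

def play_against_oracle(known_exploit=None, coin=None):
    # Step 1: if we know a counter-strategy that beats this particular
    # oracle, use it. (`known_exploit` is a callable returning "A" or "B".)
    if known_exploit is not None:
        return known_exploit()
    # Step 2: an external randomiser, e.g. random.SystemRandom() standing
    # in for a coin flip, holds any predictor to 50% accuracy.
    if coin is not None:
        return coin.choice(["A", "B"])
    # Step 3: no true randomiser available. A human here should lean
    # against known human biases (e.g. alternating too often); this PRNG
    # call is just a placeholder for that best effort.
    return random.choice(["A", "B"])
```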
I don’t think any of that requires us to adopt a non-causal decision theory.
In the version of your scenario where the predictor is omniscient and the universe is 100% deterministic (as in the version of Newcomb’s problem where the predictor isn’t just extremely good at predicting, but guaranteed to be infallible), I don’t think CDT has much to say. In my view, CDT represents rational decision-making under the assumption of libertarian-style free will; it models a choice as a causal intervention on the world, rather than just another link in the chain of causes and effects.
green_leaf, please stop interacting with my posts if you’re not willing to actually engage. Your ‘I checked, it’s false’ stamp is, again, inaccurate. The statement “if box B contains the million, then two-boxing nets an extra $1k” is true. Do you actually disagree with this?
I don’t think that’s quite right. At no point is the CDT agent ignoring any evidence, or failing to consider the implications of a hypothetical choice to one-box. It knows that a choice to one-box would provide strong evidence that box B contains the million; it just doesn’t care, because if that’s the case then two-boxing still nets it an extra $1k. It doesn’t merely prefer two-boxing given its current beliefs about the state of the boxes; it prefers two-boxing regardless of its current beliefs about the state of the boxes. (Except, of course, for the belief that their contents will not change.)
We’ve had reacts for a couple of months now, and I’m curious to hear, from both old-timers and new-timers, what people’s experience of them has been, and how much they shape the site’s expectations/culture/etc.
I received (or at least, noticed receiving) a react for the first time recently, and honestly I found it pretty annoying. It was the ‘I checked, it’s False’ one, which basically feels like a quasi-authoritative, quasi-objective, low-effort frowny-face stamp where an actual reply would be much more useful.
Edit: If it were possible to reply directly to the react, and have that response be visible to readers who mouse over the react, that would help on the emotional side. On the practical side, I guess it’s a question of whether, in the absence of reacts, I would have got a real reply or just an unexplained downvote.
green_leaf, what claim are you making with that icon (and, presumably, the downvote & disagree)? Are you saying it’s false that, from the perspective of a CDT agent, two-boxing dominates one-boxing? If not, what are you saying I got wrong?
Your ‘modified Newcomb’s problem’ doesn’t support the point you’re using it to make.
In Newcomb’s problem, the timeline is:
prediction is made → money is put in box(es) → my decision: take one box or both? → I get the contents of my chosen box(es)
CDT tells me to two-box because the money is put into the box(es) before I make my decision, meaning that at the time of deciding I have no ability to change their contents.
In your problem, the timeline is:
rules of the game are set → my decision: play or not? → if I chose to play, 100x(prediction is made → my decision: A or B → possible payoff)
CDT tells me to play the game if and only if the available evidence suggests I’ll be sufficiently unpredictable to make a profit. Nothing prevents a CDT agent from making and acting on that judgment.
This game is the same: you may believe that I can predict your behavior with 70% probability, but when considering option A, you don’t update on the fact that you’re going to choose option A. You just see that you don’t know which box I’ve put the money in, and that by the principle of maximum entropy, without knowing what choice you’re going to make, and therefore without knowing where I have a 70% chance of having not put the money, it has a 50% chance of being in either box, giving you an expected value of $0.25 if you pick box A.
Based on this, I think you’ve misdiagnosed the alleged mistake of the CDT agent in Newcomb’s problem. The CDT agent doesn’t fail to update on the fact that he’s going to two-box; he’s aware that this provides evidence that the second box is empty. If he believes that the predictor is very accurate, his EV will be very low. He goes ahead and chooses both boxes because their contents can’t change now, so, regardless of what probability he assigns to the second box being empty, two-boxing has higher EV than one-boxing.
Likewise, in your game the CDT agent doesn’t fail to update on the fact that he’s going to choose A; if he believes your predictions are 70% accurate and there’s nothing unusual about this case (i.e. he can neither predict your prediction nor randomise his choice), he assigns -EV to this play of the game regardless of which option he picks. And he sees this situation coming from the beginning, which is why he doesn’t play the game.
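As a rough illustration of that negative EV (the numbers are my assumptions; the thread doesn’t pin down the prize or the cost per round):

```python
prize = 0.50   # implied by the $0.25 expected value quoted above (0.5 * prize)
p_hit = 0.70   # the predictor's claimed accuracy
fee = 0.25     # hypothetical cost per round; the thread doesn't specify one

# Whichever option the agent picks, the money is missing from his chosen
# box with probability 0.7, so both options have the same expected value:
ev_per_round = (1 - p_hit) * prize - fee
print(ev_per_round)  # 0.3 * 0.50 - 0.25 = -0.10: negative, so don't play
```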
Without reading the book we can’t be sure. But the trouble is that this claim has been made a million times, and in every previous case the author has turned out to be either ignoring the hard problem, misunderstanding it, or defining it out of existence. So if a longish, very positive review with the title ‘x explains consciousness’ doesn’t provide any evidence that x really is different this time, it’s reasonable to think that it very likely isn’t.
The reason these two situations look different is that it’s now easy for us to verify whether the Earth is flat, but it’s hard for us to verify what’s going on with consciousness.
Even if I had no way of verifying it, “the earth is (roughly) spherical and thus has no edges, and its gravity pulls you toward its centre regardless of where you are on its surface” would clearly be an answer to my question, and a candidate explanation pending verification. My question was only ‘confused’ in the sense that it rested on a false empirical assumption; I would be perfectly capable of understanding your correction to this assumption. (Not necessarily accepting it—maybe I think I have really strong evidence that the earth is flat, or maybe you haven’t backed up your true claim with good arguments—but understanding what it means and why it would resolve my question).
Are you suggesting that in the case of the hard problem, there may be some equivalent of the ‘flat earth’ assumption that the hard-problemists hold so tightly that they can’t even comprehend a ‘round earth’ explanation when it’s offered?
I would have considered fact-checking to be one of the tasks GPT is least suited to, given its tendency to say made-up things just as confidently as true things. (And also because the questions it’s most likely to answer correctly will usually be ones we can easily look up by ourselves.)
edit: whichever very-high-karma user just gave this a strong disagreement vote, can you explain why? (Just as you voted, I was editing in the sentence ‘Am I missing something about GPT-4?’)
e.g. Eliezer would put way less than 10% on fish feeling pain in a morally relevant way
Semi-tangent: setting aside the ‘morally relevant way’ part, has Eliezer ever actually made the case for his beliefs about (the absence of) qualia in various animals? The impression I’ve got is that he expresses quite high confidence, but sadly the margin is always too narrow to contain the proof.
What about AI researchers? How many of them do you think you could persuade?
If they were motivated to get it right and we weren’t in a huge rush, close to 100%. Current-gen LLMs are amazingly good compared to what we had a few years ago, but (unless the cutting-edge ones are much better than I realise) they would still be easily unmasked by a motivated expert. So I shouldn’t need to employ a clever strategy of my own—just pass the humanity tests set by the expert.
How many random participants do you believe you could convince that you are not an AI?
This is much harder to estimate and might depend greatly on the constraints on the ‘random’ selection. (Presumably we’re not randomly sampling from literally everyone.)
In the pre-GPT era, there were occasional claims that some shitty chatbot had passed the Turing test. (Eugene Goostman is the one that immediately comes to mind.) Unless the results were all completely fake/rigged, this suggests that non-experts are sometimes very bad at determining humanity via text conversation. So in this case my own strategy would be important, as I couldn’t rely on the judges to ask the right questions or even to draw the right inferences from my responses.
If the judges were drawn from a broad enough pool to include many people with little-to-no experience interacting with GPT and its ilk, I couldn’t rely on pinpointing the most obvious LLM weaknesses and demonstrating that I don’t share them. (Depending on the structure of the test, I could perhaps talk the judges through the best way to unmask the bot. But that seems to go against the spirit of the question.) Honestly, off the top of my head I really don’t know what would best convince the average person of my humanity via a text channel, and I wouldn’t be very confident of success.
(I’m assuming here that my AI counterpart(s) would be set up to make a serious attempt at passing the Turing test; obviously the current public versions are much too eager to give away their true identities.)
what’s the point of imagining a hypothetical set of physical laws that lack internal coherence?
I don’t think they lack internal coherence; you haven’t identified a contradiction in them. But one point of imagining them is to highlight the conceptual distinction between, on the one hand, all of the (in principle) externally observable features or signs of consciousness, and, on the other hand, qualia. The fact that we can imagine these coming completely apart, and that the only ‘contradiction’ in the idea of a zombie world is that it seems weird and unlikely, shows that these are distinct (even if closely related) concepts.
This conceptual distinction is relevant to questions such as whether a purely physical theory could ever ‘explain’ qualia, and whether the existence of qualia is compatible with a strictly materialist metaphysics. I think that’s the angle from which Yudkowsky was approaching it (i.e. he was trying to defend materialism against qualia-based challenges). My reading of the current conversation is that Signer is trying to get Carl to acknowledge the conceptual distinction, while Carl is saying that, although he believes the distinction makes sense to some people, it really doesn’t to him, and his best explanation for this is that some people have qualia and some don’t.
After a while, you are effectively learning the real skills in the simulation, whether or not that was the intention.
Why the real skills, rather than whatever is at the intersection of ‘feasible’ and ‘fun/addictive’? Even if the consumer wants realism (or thinks that they do), they are unlikely to be great at distinguishing real realism from fantasy realism.
FWIW, the two main online chess sites forbid the use of engines in correspondence games. But both do allow the use of opening databases.
I agree that your model is clearer and probably more useful than any libertarian model I’m aware of (with the possible exception, when it comes to clarity, of some simple models that are technically libertarian but not very interesting).
Do you call it illusion because the outcomes you deem possible are not meta-possible: only one will be the output of your decision making algorithm and so only one can really happen?
Something like that. The SEP says “For most newcomers to the problem of free will, it will seem obvious that an action is up to an agent only if she had the freedom to do otherwise”, and basically I a) have not let go of that naive conception of free will, and b) reject the analyses of ‘freedom to do otherwise’ that are consistent with complete physical determinism.
I know it seems like the alternatives are worse; I remember getting excited about reading a bunch of Serious Philosophy about free will, only to find that the libertarian models that weren’t completely mysterious were all like ‘mostly determinism, but maybe some randomness happens inside the brain at a crucial moment, and then everything downstream of that counts as free will for some reason’.
But basically I think there’s enough of a crack in our understanding of the world to allow for the possibility that either a) a brilliant theory of libertarian free will will emerge and receive some support from, or at least remain consistent with, developments in physics; or b) libertarian free will is real but just inherently baffling, like consciousness (qualia) or some of the impossible ontological questions.