This thread made me sign up: it’s a big enough sign that you apparently care about new people. If you’re willing to go to that length to get people to sign up, I figured I could create an account. Maybe that ought to lower the barrier for me to participate.
Walking away from problems in traffic (like when you have a near miss because someone else made a steering mistake) is usually a lot better than getting into a heated argument about what an idiot this other person is for not noticing you even though you had your lights on and everything. Walking away works if you’re not likely to interact with the other person in the future. Walking away also works if you’re not likely to interact with the other person in the context of X in the future.
As always, there is a middle path where sometimes walking away is good and sometimes it isn’t, but “that will literally never solve the problem” is only correct if you see “the problem” as “the grievance that has just occurred”.
Something which wasn’t clear to me after looking around a bit: it seems the recent-comments bar at the right is cached, and I saw some comments with a pink border. Does the pink border mean they’re new?
This adds so much more to my LW experience. Reading open threads just became doable, rather than an exercise in trying to remember which parts of the discussion I’d already seen and which I hadn’t. … Although I’m not seeing a pink border on everything whenever I look at an old page, so I think that part of the explanation is false. That, or there is a bug somewhere...
About voting—how does voting on the open thread post itself work? The post is pretty much always the same, so why does it get voted up anyway? Is it about the quality of the comments?
The idea of learning styles as “fits better to a specific person” wasn’t interesting to me; instead I took it as inspiration for a natural division of “ways people could learn this thing in general”.
As for publication bias, I don’t think anyone published their research. … but if there had been a really interesting result, I bet someone would have tried to get their research published somehow.
It doesn’t look like a silly question; steelmanned to some degree, it would be “do you have any evidence of this? Because if that were true, I’d want to end that practice in my organization”. I prefer systems where the burden of proof is on the accuser, and whilst you don’t need payslips with “content upvoter” as the job title, some explanation would be nice.
It’s perfectly possible to speak the truth whilst being intellectually dishonest, you two could be arguing past each other—“You’re engaging in shady business practices!” “There’s no fraud here.”
I can’t even begin to model myself as “liking” smoking: it leaves a disgusting smell that clings to everything, and even being near second-hand smoke makes breathing uncomfortable. If I try to model myself as someone who likes smoking, the result no longer resembles me, because I’ve been altered beyond recognition.
Add to that that it seems to be a problem without a correct answer, and I have massive problems when it comes to understanding this sort of thing. “Yes” seems to be the preferred option: given that there is no statement that you prefer smoking-without-cancer over smoking-with-cancer, “you prefer to smoke” plus “some cancer-related facts that you may or may not have an opinion about” adds up to “go smoke already”. But that isn’t straightforwardly the correct answer, because if you take another worldview and look at the problem, “to smoke is to admit that you have this genetic flaw, and thus you have cancer”.
This question seems to have the same thing going on: pick one! A) “everyone is tortured” or B) “everyone gets a dust speck”. But wait, there are some numbers going on in the background where there’s either a lot of clones of you or only one of you, and if everyone gets tortured then there’s only one of you. It is left unsaid here that torture is far, far worse than the dust speck for a single individual, but the issue remains: I see “do a really, really bad thing” versus “do a meh thing”, and then some fancy attempts to trip up various logic systems. What about the logic that, hey, A is always worse than B? … I guess you could fix this by having OTHER people present, so that it’s “you get tortured” versus “you and everyone else (3^^^3 people) get a dust speck”… but then there’d be loopholes in the region of “yes, but my preferences favor a world where there are people other than me, so I’ll take torture if that means I get to exist in such a world”.
As for one-box/two-box, I’d open B up, and if it was empty I’d take the contents of A home. If it contained the cash, well, I dunno. I guess I’d leave the $1000 behind, if the whole “if you take both then B is empty” idea was true. Maybe it’s false. Maybe it’s true! Regardless, I just got a million bucks, and an extra $1000 is not all that much after receiving a whole million. (Yes, you could do stuff with that money, like buying malaria nets or something, but I am not an optimal rational agent, my thinking capacity is limited, and I’d rather bank the $1M than get tripped up over $1000 because I got greedy.) … Weirdly enough, if you change the numbers so that A contained $1000 and B contained $1001, I’d open up B first… and then, regardless of seeing the money, I’d take A home too.
Feel free to point out the holes in my thinking. I’d prefer examples that are not too “out there”, because my answers tend to be based not on the numbers but on all the circumstances around them: that $1M would let me work on what I want to work on for the rest of my life, and that $1000 would cut the time I’d need to spend working before doing what I wanna do by about a month (or three weeks).
I get the feeling maybe this ought to be two comments, one on the main thread and one here. But they’re too entangled.
That is one well-plucked mulberry bush.
I went looking around on Wikipedia and found Kavka’s toxin puzzle, which seems to be about “you can get a billion dollars if you intend to drink this poison tomorrow evening (it will hurt a lot for a whole day, similar to the worst torture imaginable, but otherwise leave no lasting effects); I’ll pay you tonight”… but there I don’t get the paradox either. What’s stopping you from creating a sub-agent (informing a friend) with the task of convincing you not to drink AFTER you’ve gotten the money? … Possibly by force. Possibly by saying things in such a manner that you don’t know that he knows he has to do this. Possibly with a whole lot of actors. Like scheduling a text (“I am perfectly fine, there is nothing wrong with me”) to parents and friends, to be sent tomorrow morning.
Of course, this relies on my ability to raise the probability of intervention, but that seems like an easier challenge than engaging in willful doublethink… … Or you could perhaps add various chemicals to your food the next day. I know I can be committed to an idea (I will do this task tonight), come home, eat dinner, and then be totally uncommitted (that task can wait, I will play games first).
… A billion is a lot of money; perhaps I’d drink the poison and then have a hired person drug me into a coma, to be awoken the next day? You could hire a lot of medical staff with that kind of money.
Yet I get the feeling that all these “creative” solutions are not really allowed. Why is that?
How is it not possible? When force is allowed, the hired people could simply physically restrain me. I’d fight them tooth and nail, their vastly superior training would have me on the floor within a minute, and after that I’d be kept separated from the vial of toxin for the remainder of the day. … Although I guess “separation for a period of time”-based arguments rely on you not being so obsessive AND pedantic that you still care about it the next day. Being really passionate about something and then dropping the issue the next day because the window of opportunity has closed is … unlikely to occur, so my solution might end up making me rich but leaving me in the loony bin.
I think a better argument against my ideas is logistics: how could I acquire everything I need in a span of (at most) 23 hours? (The wording is such that tonight, as the day turns, you must intend to take the poison.) A middle-class worker generally doesn’t have ties to any mercenaries, and payment isn’t given until the morning after your intent has to be formed.
I get your point, though—convincing someone to later convince you already carries massive penalties (“Why are you acting so weird?”), the situation carries massive penalties (“And you believe this guy?”, “For HOW MUCH?!”)...
My argument basically rests on turning the whole thing into a game: “Design a puzzle you cannot get out of. Then, a few minutes before midnight (to be safe), start doing your utmost best to break this puzzle.”
If I intend to do my best at an exam tomorrow, but stay up late playing games, does this somehow negate my intention to do well on my exam?
By the original problem statement, I have to have the intention of taking the poison AT midnight. Rephrased: when it is midnight, I must intend to take the poison the next day. BEFORE midnight, I am allowed to have OTHER intentions. I intend to use that time to set up hurdles for myself, and then to try my hardest. It would be especially helpful if these hurdles also include things like tricking myself into believing it won’t actually hurt (via a tranquilizer to put me under straight afterward, for instance).
I know it sounds like doublethink, but that’s only if you think there is no difference between me before midnight and me after midnight.
It is indeed a million, whoops. Thanks for explaining in detail the purpose of such questions. I find that I get into “come up with a clever answer” mode faster if the question has losses: not getting money is “meh”, but a day’s worth of excruciating pain in exchange for money, well, that needs a workaround!
As for the puzzle itself, I don’t know if I can form such an intention… but I seem to be really good at it in real life. I call it procrastinating. I make a commitment that fails to account for time discounting and then I end up going to bed later than I wanted. After dinner I intended to go to bed early; at midnight I wanted to see another episode. So apparently it’s possible.
You seem to be confusing goals and value systems—even without a goal, the UFAI risk is not gone.
Maybe it is not right to anthropomorphize, but take a human who is (acting) absolutely clueless, and give them choices. They’ll pick something and stick to it. Questioned about it, they’ll say something like “I dunno, I think I like that option”. This is how I’d imagine something without a goal would act: maybe it is consistent, maybe it will pick things it likes, but it doesn’t plan ahead and doesn’t try to steer its actions toward a goal.
For an AI, that would be a totally indifferent AI. I think it would just sit idle or do random actions. If you then give it a bad value system, and ask it to help you, you’ll get “no” back. Helping people takes effort. Who’d want to spend processor cycles on that?
...
On the other hand, perhaps goals and value systems are actually the same thing; having a value system means you’ll have goals (“envisioned preferred world states” versus “preferred world states”), so you cannot not have goals whilst having a value system. In that case, you’d have an AI without values. This, I think, is likely to result in one of two options on contact with a human who gives it an order to follow: it could not care and do nothing (it stays idle… forever, not even acting in self-preservation because, again, it has no values), or it accepts the order and just goes along. That’d be dangerous, because this has basically no brakes; if it does whatever you ask of it, without regard for human values… I hope you didn’t ask for anything complex. “World peace” would resolve very nastily, as would “get me some money” (it is stolen from your neighbors… or maybe it brings you your wallet), and things like “get me a glass of water” can be interpreted in so many ways that being handed a piece of ice in the shape of a drinking glass is on the positive side of the possible results.
That’s the crux of it, I think. Without a value system, there are no brakes. There might also not be any way to get the AI to do anything. But with a value system that is flawed, there might be no brakes in a scenario where we’d want the AI to stop. Or the AI wouldn’t entertain requests that we’d want it to do. So a lot of research goes into this area to make sure we can make the AI do what we want it to do in a way that we’re okay with.
Pretty much this; if we adjust the numbers to “A: 20 cents, or B: a 25% chance of 100 cents”, then I’d take option B, but scale it up to “A: $200,000, or B: a 25% chance of $1,000,000”, and I’d take option A. Because $0 is 0 points, $1 million is something like 4 points, and $200,000 is about 2 points.
Human perception of scale for money is not linear (but not logarithmic either… not log base 10, anyway; maybe log something else). And since I’m running on this flawed hardware...
Some of it was pointed out already as “prospect theory”, but that seems to be more about the perception of probability than about the perception of the actual reward.
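To make that preference flip concrete, here’s a toy sketch. It’s my own illustration, not anything established here: I’m assuming log utility over total wealth with a made-up baseline of $20,000, but under those assumptions the very same function prefers the gamble at cents scale and the sure thing at dollars scale:

    import math

    def expected_utility(outcomes, wealth=20_000.0):
        # u(x) = log(wealth + x); both the log form and the $20k baseline are assumptions
        return sum(p * math.log(wealth + x) for p, x in outcomes)

    # Cents scale: a sure $0.20 vs a 25% shot at $1.00 -> take the gamble
    print(expected_utility([(1.0, 0.20)]) <
          expected_utility([(0.75, 0.0), (0.25, 1.00)]))        # True

    # Dollar scale: a sure $200,000 vs a 25% shot at $1,000,000 -> take the sure thing
    print(expected_utility([(1.0, 200_000)]) >
          expected_utility([(0.75, 0.0), (0.25, 1_000_000)]))   # True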
Maybe it doesn’t help when you’re the only one, but that doesn’t matter; your species is one that has multiple children, so couldn’t a mutation that small have occurred in multiple children? … And if that’s too high a complexity penalty, there’s an alternative: say it is a trait that spread during a resource boom in a population (a resource boom makes it likely for even disadvantaged mutations to survive), and then individuals with the trait managed to find each other and become more fit together?
… Just conjecture, though.
Maybe I just don’t get it, but offering me the option AFTER you’ve told me that it makes no difference makes it a pointless option. I get the feeling there’s a single step missing from your explanation.
From what I’m reading, there are three things that can happen (expected losses sketched after the list)...
You spend $100. A prophet comes to you and tells you that you will lose $10,000 in the future, and then...
1) looks at you more closely, and coughs “wait, you’re not the person I was looking for.”
2) tells you something that sounds plausibly true, but turns out to be false, costing you $10,000 by overpaying for your next house.
1 and 2 each happen with 50% probability, if you have spent the $100.
If you don’t spend $100, then
3) A prophet comes to you and tells you that you will lose $10,000 in the future, and then afterwards, as you sputter “why”, he tells you this plausibly-true thing that turns out to be false, costing you $10,000 by overpaying for your next house. (As for what the thing is, it’s either something that makes you spend $10,000 carefully rationalizing your decision to buy a house, or $10,000 in overbidding costs.)
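Filling in those numbers (this is my reading of the setup, not something you stated outright): spending the $100 gives an expected loss of $100 + 0.5 × $10,000 = $5,100, while not spending gives a certain loss of $10,000. So if the $100 really bought a 50% chance of escape, it would be an obvious bargain. The problem is what follows.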
But… you’ve just told me that a prophet came to me and told me I will lose $10,000 in the future. I am already on path 3. There is no going back. Time CAN create time loops, but there is no cause for it to do so in your explanation. You yourself walled it off by stating the prophecy was self-fulfilling and that you could spend the $100 “if the prophecy weren’t immutably correct” (said in a manner implying that it is immutably correct).
You have given me a button, but the button is disabled. I can’t take any actions.
Also, there’s something lurking in your description which might (I really am unsure) imply that if I spend the $100, the world may become inconsistent and therefore disappear. Basically, replace path 1 with “universe ends”. Which would make spending the $100 really bad, since losing $10,000 is preferable to destroying your own universe.
I understand the normal version of Newcomb’s perfectly fine, I understand the normal version of counterfactual mugging (or at least the wiki version of it) perfectly fine, and I get that the transparent boxes are mostly the same if you follow the logic, but in this case the choice is presented AFTER you’ve picked the boxes. “Here are two boxes. Would you like one or both?” “Both, please.” “Okay, also I’d like to inform you that if you pick both, you don’t get the million. No backsies.”
Saying that this is predicted in advance is weird, because there is no possibility of a meaningful loop: the moment of timeline separation is AFTER the choice has been made. The choice is set in stone. There is no possible change. You can pay, but it won’t change a thing. Unless you were somehow determined to pay people in scenarios like this—which requires knowledge of scenarios like this.
In the original version, something happens, and then the losing you is contacted and asked whether you’d want to pay. And you’d be able to choose at that point, and even think about it. And then it turns out this was all a simulation and because you paid, the winning real you gets paid.
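To spell out the arithmetic behind that version (these are the standard counterfactual-mugging numbers, nothing new of mine): a policy of always paying loses $100 in the tails-world and gains $10,000 in the heads-world, for an expected 0.5 × (−$100) + 0.5 × $10,000 = $4,950 per coinflip, versus exactly $0 for never paying. That’s why agreeing to pay is supposed to win.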
In this version, we could make it work by taking the result of the previous simulation (“I flipped a coin, you lost; pay me $100, or I won’t pay you the $10,000 you’d have gotten if you had won”), and then going through the prophet, who either says you’re fine if the losing you paid, or that you’re not fine if the losing you didn’t pay.
But what we cannot do is simulate this and loop it on itself. You are doomed in the future. You are always doomed in the future. There is no possibility of you being not doomed in the future. But, if you pay, then there is a possibility that you are not doomed in the future. That’s a contradiction right there. If I accept that the statement about my unchanging future is false, then I’ll pay because then I can go from 100% doomed to 50% doomed. If I accept that the statement about changing my future is false, then I won’t pay, because you’re a snake oil salesman, your cure will do me no good.
To fix this, the wording needs to be altered so that there is no contradiction, and so that paying the money has a clear effect of reducing the chance.
In short, I think this problem relies too much on UDT’s ability to magically teleport between possible situations, and fails to leave a path for Time to take.
Hi there! I didn’t sign up before because this community tends to have already said what I want to say most of the time anyway, and because signup hurdles are a thing and the lack of OpenID support frustrated me.
I’ve been reading LW intermittently for about one and a half years now; whilst integrating these concepts into my life is something I tend to find hard, I have picked some of them up, specifically anchoring effects and improving my ability to spot “the better action”. It’s still hard to actually take such actions; I’ll find myself coming up with a better plan of action and then executing the inferior plan anyway.
I’ve been horrified at a few of my past mistakes; one of them was accidental p-hacking. (Long story!)
One of the things I had to do for my college degree was performing research. I picked a topic (learning things) and got asked to focus on a key area (I picked best instructional method for learning how to play a game). We had to use two data collection methods; I wanted to do an experiment because that was cool, and I added a survey because if I’m going to have to ask lots of people to do something for me, I might as well ask those same people to do something else. Basically I’m lazy.
My experiment consisted of a few levels (15) in which you have to move a white box to various shapes by dragging it about. I had noticed that teaching research focused on “reading”, “doing”, “listening” and “seeing” types (I forget the specific words: something like kinesthetic, auditory, visual… learning). So I translated those to “written text”, “imagery”, “sounds and spoken text”, and “interactivity” to model reading, seeing, listening and doing respectively.
Then I made each level test a combination of learning methods. First “learning by doing” only. Here’s a box. Here’s a green circle. Here’s a red star. Go.
Most people passed it in either 5 seconds or a full minute. That was after I added a dotted background, so that you’d see a clear white box and not a black rectangle, and the text “this is level 1, experiment!”. Without that text, some people would think the level was still loading. I didn’t include the playtesters in the research result data.
After that it showed you 4 colored shapes with an arrow underneath, and a “next” button below that. Hitting next moves you to level 2, where a white box is in the center of the screen and various colored shapes surround it. Dragging the white box over the wrong shape sends you back to the screen with the 4 colored shapes and the arrow. This was supposed to be “imagery”.
The next screen after that was an audio icon and a “next” button. I had recorded myself naming various colored shapes, so at this screen people would hear something like “black circle, red triangle, blue star, green square”. The idea being that you’d have to remember several instructions and act upon them. Hitting the next button brings you to the surrounded white box again. Each level had a different distribution of shapes, to prevent memorizing the locations.
Then the 4th level was just text instructions (“drag the white box over the green circle, then the red star …”).
After that came combinations: voiced text, text where I had also put the shapes as images on the screen, shapes plus a voice saying what they were… For interactivity, I skipped the instruction screen and just had text appear in the center of the screen; the text changes when you perform the correct action (otherwise the level resets). This simulates tutorials like “press C to crouch” whenever you hit the first crouch obstacle.
I had recorded the time spent on the instruction screen, the total time for each level, and, per attempt, the time between each progress step or failure. So: 1.03 seconds to touch the first shape, 0.7 to touch the second, 0.3 to touch a wrong third shape; then 0.5 to touch the first, 0.4 to touch the second, 0.8 to touch the third, and 1.0 to touch the fourth and complete the level.
The idea was that I could use this to see how “efficient” people were at understanding the instructions, both in speed and correctness.
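If it helps to picture the data, each logged touch looked roughly like the following sketch. The field names here are made up for illustration; they’re not the real schema:

    from dataclasses import dataclass

    @dataclass
    class TouchEvent:
        # All field names are illustrative, not the actual column names
        participant_id: int
        level: int                 # 1..15
        attempt: int               # a wrong touch resets the level and bumps this
        step: int                  # which shape in the instructed sequence
        correct: bool              # False marks the touch that caused the reset
        seconds_since_last: float  # e.g. 1.03 to reach the first shape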
(FYI, N=75 or so, out of a gaming forum with 700 users)
Then I committed my grave sin: I took the data, took Excel’s correlation function, and basically compared various columns until I got something with a nice R. That was after trying the few things I’d expected to find and seeing uninteresting results.
I “found” that apparently showing text and images in an interactive, “learn as you go” form was best; audio didn’t help much, as it was too slow, and interactivity worked as a force multiplier but did poorly on its own.
But these findings are likely to be total bogus, because, well, I basically compared statistics until I found something with a low chance of occurring randomly.
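You can watch the same sin manufacture “findings” out of pure noise. A minimal sketch (the 20 columns and the seed are arbitrary choices of mine; the row count matches my N of roughly 75):

    import numpy as np

    rng = np.random.default_rng(0)
    noise = rng.normal(size=(75, 20))  # 75 "participants", 20 columns of pure noise

    # Correlate every pair of columns and keep the best-looking R, Excel-style
    pairs = [(i, j) for i in range(20) for j in range(i + 1, 20)]  # 190 comparisons
    best_r = max(abs(np.corrcoef(noise[:, i], noise[:, j])[0, 1]) for i, j in pairs)
    print(best_r)  # typically lands around 0.3-0.4 despite there being nothing to find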
… What scares me is not that I did this. What scares me is that I turned this in, got told off for “not including everything I checked”, thought this was a stupid complaint (because look, I found a correlation!), voiced said opinion, and still got a passing grade (7/10) anyway. And then thought, “Look, I am a fancy researcher.”
I could dig it up if people were interested—the experiment is in English, the research paper is in Dutch, and the data is in an SQL database somewhere.
This is probably a really long post now, so I’ll write more if needed instead of turning this into a task to be pushed down todo lists forever.