I love the landmine metaphor—it blows up in your face and it’s left over from some ancient war.
Giles
Hypothesis: people eat a poor diet because their decision-making ability is most impaired right at the point where they are deciding what to eat next.
I was about to kick myself for not checking last year’s answers to all the probability questions (I don’t feel I’ve received much new information or insights that should cause me to change my mind, so I felt I should have averaged my current subjective estimate with last year’s).
But then I found that my subjective estimates were remarkably stable (with a possible slight drift towards 50%)! Not sure what to make of that. I was going to post my answers here to illustrate, but wasn’t sure if that violated protocol because of anchoring. (People should really take the survey before reading any of the comments in any case.)
P.S. I took the survey.
Minor points on survey phrasing...
P(Global catastrophic risk) should be P(Not Global catastrophic risk)
You say in part 7 that research is allowed, but you don’t say in part 8 (the calibration question about the year) that research is disallowed.
The true prisoner’s dilemma article doesn’t appear to give any information about the cognitive algorithms the opponent is running. For that reason I answered noncommittally, and I’m not sure how useful the question is for distinguishing people with CDTish versus TDTish intuitions.
Similarly, in torture versus dust specks I answered “not sure”, not so much because of moral uncertainty as because the problem is underspecified. What’s the baseline? Is everybody’s life perfect except for the torture or dust specks specified, or is the distribution more like today’s world, with a broad range of experiences ranging from basically OK to torture?
I might have given an inflated answer for “Hours on the Internet”: I’m on the computer and the computer is on the Internet, but that doesn’t necessarily mean I’m actively using the Internet at all times.
Are people keen to have these in text format? I’ve transcribed a couple of them (it’s one of the SIAI volunteer tasks) and I want to know if it’s worth carrying on.
small-scale power shits
You should probably fix this typo.
Wow, anchoring! That one didn’t even occur to me!
As part of Singularity University’s acquisition of the Singularity Summit, we will be changing our name and …
OK, this is big news. Don’t know how I missed this one.
I can imagine that if you design an agent by starting off with a reinforcement learner, and then bolting some model-based planning stuff on the side, then the model will necessarily need to tag one of its objects as “self”. Otherwise the reinforcement part would have trouble telling the model-based part what it’s supposed to be optimizing for.
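To make that concrete, here’s a minimal sketch of the kind of thing I have in mind (Python; the object names and the predict/reward stubs are entirely made up, not any real architecture):

```python
# Hypothetical sketch of a reinforcement learner with a model-based planner
# bolted on. The world model tags one object as "self" so the planner knows
# whose predicted outcomes to score with the learned reward function.

class WorldModel:
    def __init__(self):
        self.objects = {}      # object name -> state
        self.self_tag = None   # which object the planner treats as "me"

    def add_object(self, name, state, is_self=False):
        self.objects[name] = state
        if is_self:
            self.self_tag = name

def plan(model, candidate_actions, predict, reward):
    # Simulate each action and score the *self* object's predicted state
    # with the reward function supplied by the reinforcement-learning part.
    return max(candidate_actions,
               key=lambda a: reward(predict(model, model.self_tag, a)))

# Toy usage: two objects, one tagged as self; predict and reward are stubs.
world = WorldModel()
world.add_object("rock", {"x": 0})
world.add_object("robot", {"x": 0}, is_self=True)
best = plan(world, ["left", "right"],
            predict=lambda m, who, a: {"x": -1 if a == "left" else 1},
            reward=lambda state: state["x"])
print(best)   # "right"
```

The only point of the sketch is that without something like self_tag, the reinforcement part has no way of telling the planner whose predicted outcomes it is supposed to be optimizing.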
“Why If Your AGI Doesn’t Take Over The World, Somebody Else’s Soon Will”
i.e. however good your safeguards are, it doesn’t help if:
another team can take your source code and remove the safeguards (and they may well have incentives to do so)
multiple discovery means that your AGI invention will soon be followed by 10 independent ones, at least one of which will lack the necessary safeguards
EDIT: “safeguard” here means any design feature put in to prevent the AGI obtaining singleton status.
This was (more or less) the discussion topic at the last Toronto meetup. Here’s what we discussed (NOTE: these are minutes of a LW meetup, so don’t expect them to be 100% on-topic). Also see the wiki page with previous discussion threads.
No definitive game idea yet, but lots of interesting suggestions came up.
Harry Potter & the Methods of Rationality: The Game.
Suggestion was not to play Harry, as he already knows everything (or a lot, anyway). Instead you have to deal with Harry.
Real-time strategy game where you recruit units rather than building them
either bully or befriend units or just make your team the most fun (Draco/Hermione/Harry strategies)
can also just pay units but that obviously means you need to acquire more resources
units are aligned to different factions, and you can signal loyalty to one faction to gain their support at the expense of the other
Epistemic rationality: the game
discover how the game mechanics work as you go along
Is it possible to procedurally generate the laws of physics? (e.g. number of dimensions, gravitational constant etc. randomly generated; toy sketch below)
I was worried that most combinations of physical laws would be unplayable for one reason or another, and unlike real life we don’t have the anthropic principle to help us out
game of science, e.g. building atoms into molecules with different laws of physics
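For what it’s worth, here’s a toy sketch of what “procedurally generate the laws of physics” could look like at the crudest level (the constants and the playability check are all made up):

```python
import random

def sample_universe(rng):
    # Sample a "universe" as a handful of made-up physical constants.
    return {
        "dimensions": rng.choice([2, 3, 4]),
        "gravity": rng.lognormvariate(0, 1),       # arbitrary units
        "friction": rng.uniform(0.0, 1.0),
        "light_speed": rng.lognormvariate(3, 1),
    }

def playable(u):
    # Stand-in for the hard part: most combinations will be unplayable,
    # and unlike real life there's no anthropic principle to filter them.
    return u["dimensions"] <= 3 and 0.1 < u["gravity"] < 10

rng = random.Random(0)
candidates = (sample_universe(rng) for _ in range(1000))
universes = [u for u in candidates if playable(u)]
print(len(universes), universes[0] if universes else None)
```

The playable() predicate is doing all the real work here, which is exactly the worry above: without an anthropic principle we have to write that filter ourselves.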
Extreme cooperation
Both players controlling same character
(Alien hand syndrome)
The Elephant and its Rider: The Game. One player assigned the role of rationalising the other’s behaviour.
The opposite of this is a single player controlling two characters but with the same controls (e.g. you can’t move one character right without making the other character fall off a cliff). Not really LWish but might be fun.
Bayes Theorem game
Murder mystery/court case
Based on real-life court cases?
Game where you have to program your own rewards
The way gameplay is set up, you are motivated to achieve far goals but not near ones
Gameplay is too frustrating unless you can calibrate visual rewards so that you get rewarded for doing vaguely the right thing
This was my idea but I don’t know whether the concept even makes sense
Game teaches you real world stuff incidentally.
NPCs prone to different biases
First use your Bayesianness to work out who has which bias (toy sketch below)
Then work out how to use characters’ biases to defeat them or persuade them to join your side
Can we simulate a biased player as well?
Close off elements of the dialogue tree depending on what your biases are supposed to be.
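To make the “work out who has which bias” mechanic concrete, a toy sketch (the bias names, observations and probabilities are all invented for illustration):

```python
# The player's belief about which bias an NPC has is just a Bayesian update
# over bias hypotheses, driven by P(observed behaviour | bias).

PRIORS = {"anchoring": 1 / 3, "sunk_cost": 1 / 3, "confirmation": 1 / 3}

LIKELIHOODS = {  # P(observation | bias), made-up numbers
    "quotes_first_price_heard": {"anchoring": 0.7, "sunk_cost": 0.2, "confirmation": 0.3},
    "refuses_to_abandon_plan":  {"anchoring": 0.2, "sunk_cost": 0.8, "confirmation": 0.4},
    "ignores_counterevidence":  {"anchoring": 0.3, "sunk_cost": 0.3, "confirmation": 0.8},
}

def update(belief, observation):
    # Multiply each hypothesis by its likelihood, then renormalise.
    unnormalised = {b: p * LIKELIHOODS[observation][b] for b, p in belief.items()}
    total = sum(unnormalised.values())
    return {b: p / total for b, p in unnormalised.items()}

belief = dict(PRIORS)
for obs in ["quotes_first_price_heard", "ignores_counterevidence"]:
    belief = update(belief, obs)
print(belief)   # the player's current estimate of which bias this NPC has
```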
(My idea for a reprogramming-your-own-brain game. More precisely, reprogramming the interface between input devices and what your character does. Not really LW.)
Time Portal
Like Portal except jumping through the portal displaces you in time (one way is forwards, the other backwards)
Again not really LWish.
Fix moral system
Existing games tend to be kick puppy vs. feed puppy
Sometimes you should get good results from bad actions
Trolley problem: the game
Game that starts off like the Sims and ends up like Civilization
Weaponising apparently innocent game mechanics, e.g. stealing all the fire alarms from someone’s house and then cooking something
Existing games which we like and/or which came up in the discussion:
Braid
Portal
Limbo
Fez
Phoenix Wright
Psychonauts
A flash game called Chronotron
Sid Meier’s Alpha Centauri
Some game which was like Ikaruga but cooperative?
Mass Effect
I’ve given 2x the amount that I gave on previous funding drives. You seem to be maturing as an organization and I’ve also been reasonably impressed by the attitude on display in the responses to Holden’s criticisms.
Something about this page bothers me—the responses are included right there with the criticisms. It just gives off the impression that a criticism isn’t going to appear until lukeprog has a response to it, or that he is going to write the criticism in a way that makes it easy to respond to, or something.
Maybe it’s just me. But if I wanted to write this page, I would try and put myself into the mind of the other side and try to produce the most convincing smackdown of the intelligence explosion concept that I could. I’d think about what the responses would be, but only so that I could get the obvious responses to those responses in first. In other words, aim for DH7.
The responses could be collected and put on another page, or included here when this page is a bit more mature. Does anyone think this approach would help?
Agree with purchasing non-sketchiness signalling and utilons separately. This is especially important if, like jkaufman, a lot of your value comes from being an effective altruist role model.
Agree that if diversification is the only way to get the elephant to part with its money then it might make sense.
Similarly, if you give all your donations to a single risky organization and they turn out to be incompetent then it might demotivate your future self. So you should hedge against this, which again can be done separately from purchasing the highest-expected-value thing.
Confused about what to do if we know we’re in a situation where we’re behaving very differently from rational agents but aren’t sure exactly how. I think this is the case with purchasing xrisk reduction, and with failure to reach Aumann agreement between aspiring effective altruists. To what extent do the rules still apply?
Lots of valid reasons for diversification can also serve as handy rationalizations. Diversification feels like the right thing to do—and hey, here are the reasons why! I feel like diversification should feel like the wrong thing to do, and then possibly we should do it anyway but sort of grudgingly.
What evidence justified a prior strong enough as to be updated on a single paragraph
I can’t speak for lukeprog, but I believe that “update” is the wrong word to use here. If we acted like Bayesian updaters then compartmentalization wouldn’t be an issue in the first place. I.J. Good’s paragraph, rather than providing evidence, seems to have been more like a big sign saying “Look here! This is a place where you’re not being very Bayesian!”. Such a trigger doesn’t need to be written in any kind of formal language—it could have been an offhand comment someone made on a completely different subject. It’s simply that (to an honest mind), once attention is drawn to an inconsistency in your own logic, you can’t turn back.
That said, lukeprog hasn’t actually explained why his existing beliefs strongly implied an intelligence explosion. That wasn’t the point of this post, but, like you, I’d very much like to see such a post. I’m interested in trying to build a Bayesian case for or against the intelligence explosion (and other singularity-ish outcomes).
You’re right that there’s a problem obtaining evidence for or against beliefs about the future. I can think of three approaches:
Expertology—seeing what kinds of predictions have been made by experts in the past, and what factors are correlated with them being right.
Models—build a number of parameterizable models of the world which are (necessarily) simplified but which are at least capable of modeling the outcome you’re interested in. Give a prior for each and then do a Bayesian update according to how well that model predicts the past. (A rough sketch of what I mean appears below.)
There might be an intuitive heuristic along the lines of “if you can’t rule it out then it might happen”, but I don’t know how to formalize that or make it quantitative.
So I’m interested in whether these can be done without introducing horrible biases, whether anyone’s tried them before, and whether there are any other approaches I’ve missed.
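To make the “Models” approach a bit more concrete, here’s a rough sketch of what I have in mind; the two models, their numbers and the “track record” data are all toy stand-ins:

```python
import math

def log_likelihood(model, past_data):
    # How well does this model retrodict the past?
    return sum(math.log(model["predict_past"](x)) for x in past_data)

def posterior_weights(models, past_data):
    logs = [math.log(m["prior"]) + log_likelihood(m, past_data) for m in models]
    top = max(logs)
    unnorm = [math.exp(l - top) for l in logs]   # subtract max for stability
    total = sum(unnorm)
    return [u / total for u in unnorm]

def forecast(models, past_data, outcome):
    # Mix each model's prediction about the future, weighted by its posterior.
    weights = posterior_weights(models, past_data)
    return sum(w * m["predict_future"](outcome) for w, m in zip(weights, models))

# Two toy models of "how often do bold technology forecasts come true?"
models = [
    {"prior": 0.5, "predict_past": lambda x: 0.2 if x else 0.8, "predict_future": lambda o: 0.2},
    {"prior": 0.5, "predict_past": lambda x: 0.6 if x else 0.4, "predict_future": lambda o: 0.6},
]
past_data = [True, False, False, True, False]   # made-up track record
print(forecast(models, past_data, "intelligence explosion by 2100"))
```

Whether anything like this can be made to work with non-toy models is exactly the question.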
Everyone should take the survey before reading any more comments, in case they contain anchors etc.
I took the survey. My estimates will be very poorly calibrated (I haven’t done much in the way of calibration/estimation exercises) but I’m hoping they’ll at least be good enough for wisdom-of-the-crowds purposes and more useful than just leaving blank.
Minor quibble: shouldn’t “p(xrisk)” be “p(NOT xrisk)”? Just worried about people in a hurry not reading the question properly.
The added bonus is they can’t answer back.
Is this just a case of the utility function not being up for grabs? muflax can’t explain to me why wireheading counts as a win, and I can’t explain to muflax why wireheading doesn’t count as a win for me. At least, not using the language of rationality.
It might be interesting to get a neurological or evo-psych explanation for why non-wireheaders exist. But I don’t think this is what’s being asked here.
Was the buyer sane enough to realise that it probably wasn’t a power crystal, or just sane enough to realise that if he pretended it wasn’t a power crystal he’d save $135?
Is that amount of raising-the-sanity waterline worth $135 to Tony?
I would guess it’s guilt-avoidance at work here.
(EDIT: your thanks to Tony are still valid though!)
Eliezer Yudkowsky will never have a mid-life crisis.