We’d talked about getting a dump out as well, and your plan sounds great to me! The LW team should get back to you with a list at some point (unless they think of a better idea).
I asked Eliezer if it made sense to cross-post this from Arbital, and did the cross-posting when he approved. I’m sorry it wasn’t clear that this was a cross-post! I intended to make this clearer, but my idea was bad (putting the information on the sequence page) and I also implemented it wrong (the sequence didn’t previously display on the top of this post).
This post was originally written as a nontechnical introduction to expected utility theory and coherence arguments. Although it begins in medias res stylistically, it doesn’t have any prereqs or context beyond “this is part of a collection of introductory resources covering a wide variety of technical and semitechnical topics.”
Per the first sentence, the main purpose is for this to be a linkable resource for conversations/inquiry about human rationality and conversations/inquiry about AGI:
So we’re talking about how to make good decisions, or the idea of ‘bounded rationality’, or what sufficiently advanced Artificial Intelligences might be like; and somebody starts dragging up the concepts of ‘expected utility’ or ‘utility functions’. And before we even ask what those are, we might first ask, Why?
There have been loose plans for a while to cross-post content from Arbital to LW (maybe all of it; maybe just the best or most interesting stuff), but as I mentioned downthread, we’re doing more cross-post experiments sooner than we would have because Arbital’s been having serious performance issues.
I assume you mean ‘no one has this responsibility for Arbital anymore’, and not that there’s someone else who has this responsibility.
Arbital has been getting increasingly slow and unresponsive. The LW team is looking for fixes or work-arounds, but they aren’t familiar with the Arbital codebase. In the meantime, I’ve been helping cross-post some content from Arbital to LW so it’s available at all.
MIRI folks are the most prominent proponents of fast takeoff, and we unfortunately haven’t had time to write up a thorough response. Oli already quoted the quick comments I posted from Nate and Eliezer last year, and I’ll chime in with some of the factors that I think are leading to disagreements about takeoff:
Some MIRI people (Nate is one) suspect we might already be in hardware overhang mode, or closer to that point than some other researchers in the field believe.
MIRI folks tend to have different views from Paul about AGI, some of which imply that AGI is more likely to be novel and dependent on new insights. (Unfair caricature: Imagine two people in the early 20th century who don’t have a technical understanding of nuclear physics yet, trying to argue about how powerful a nuclear-chain-reaction-based bomb might be. If one side were to model that kind of bomb as “sort of like TNT 3.0” while the other is modeling it as “sort of like a small Sun”, they’re likely to disagree about whether nuclear weapons are going to be a small v. large improvement over TNT. Note I’m just using nuclear weapons as an analogy, not giving an outside-view argument “sometimes technologies are discontinuous, ergo AGI will be discontinuous”.)
This list isn’t at all intended to be sufficiently detailed or exhaustive.
I’m hoping we have time to write up more thoughts on this before too long, because this is an important issue (even given that we’re trying to minimize the researcher time we put into things other than object-level deconfusion research). I don’t want MIRI to be a blocker on other researchers making progress on these issues, though — it would be bad if people put a pause on hashing out takeoff issues for themselves (or put a pause on alignment research that’s related to takeoff views) until Eliezer had time to put out a blog post. I primarily wanted to make sure people know that the lack of a substantive response doesn’t mean that Nate+Eliezer+Benya+etc. agree with Paul on takeoff issues now, or that we don’t think this disagreement matters. Our tardiness is because of opportunity costs and because our views have a lot of pieces to articulate.
That counts! :) Part of why I’m asking is in case we want to build a proper LW glossary, and Rationality Cardinality could at least provide ideas for terms we might be missing.
Are there any other OK-quality rationalist glossaries out there? https://wiki.lesswrong.com/wiki/Jargon is the only one I know of. I vaguely recall there being one on http://www.bayrationality.com/ at some point, but I might be misremembering.
The wiki glossary for the sequences / Rationality: A-Z ( https://wiki.lesswrong.com/wiki/RAZ_Glossary ) is updated now with the glossary entries from the print edition of vol. 1-2.
New entries from Map and Territory:
anthropics, availability heuristic, Bayes’s theorem, Bayesian, Bayesian updating, bit, Blue and Green, calibration, causal decision theory, cognitive bias, conditional probability, confirmation bias, conjunction fallacy, deontology, directed acyclic graph, elan vital, Everett branch, expected value, Fermi paradox, foozality, hindsight bias, inductive bias, instrumental, intentionality, isomorphism, Kolmogorov complexity, likelihood, maximum-entropy probability distribution, probability distribution, statistical bias, two-boxing
New entries from How to Actually Change Your Mind:
affect heuristic, causal graph, correspondence bias, epistemology, existential risk, frequentism, Friendly AI, group selection, halo effect, humility, intelligence explosion, joint probability distribution, just-world fallacy, koan, many-worlds interpretation, modesty, transhuman
A bunch of other entries from the M&T and HACYM glossaries were already on the wiki; most of these have been improved a bit or made more concise.
One option that’s smaller than link posts might be to mention in the AF/LW version of the newsletter which entries are new to AIAF/LW as far as you know; or make comment threads in the newsletter for those entries. I don’t know how useful these would be either, but it’d be one way to create common knowledge ‘this is currently the one and only place to discuss these things on LW/AIAF’.
Possible compromise idea: send everyone their karma upvotes along with downvotes regularly, but send the upvotes in daily batches and the downvotes in monthly batches. Having your downvotes sent to you at known, predictable times rather than in random bursts, and having the updates occur less often, might let users take in the relevant information without having it totally dominate their day-to-day experience of visiting the site. This also makes it easier to spot patterns and to properly discount very small aversive changes in vote totals.
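A minimal sketch of that batching rule, in case it helps make the proposal concrete. Everything here is hypothetical: the `VoteEvent` record, the `build_digest` function, and the choice of the first of the month as the downvote delivery day are illustrative assumptions, not anything from the LW codebase.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class VoteEvent:
    """Hypothetical record of one karma change on a user's post or comment."""
    target: str  # identifier of the post/comment that was voted on
    delta: int   # karma change; positive counts as an upvote, anything else as a downvote


def build_digest(pending, today):
    """Split pending vote events into (send_now, keep_holding).

    Assumption: the digest runs once per day. Upvotes always go out in
    that day's batch; downvotes are held until the first of the month,
    so they arrive at a known, predictable time.
    """
    send_downvotes = today.day == 1  # assumed monthly delivery day
    send_now, keep_holding = [], []
    for event in pending:
        if event.delta > 0 or send_downvotes:
            send_now.append(event)
        else:
            keep_holding.append(event)
    return send_now, keep_holding
```

Running `build_digest` daily and feeding the held events back in as the next day's `pending` would give the daily-upvote, monthly-downvote cadence. Note that under this sketch a strong upvote being reduced to a normal upvote shows up as a negative `delta` and so gets held with the downvotes.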
On the whole, I’m not sure how useful this would be as a sitewide default. Some concerns:
It’s not clear to me that karma on its own is all that useful or contentful. Ray recently noted that a comment of his had gotten downvoted somewhat, and that this had been super salient and pointed feedback for him. But I’m pretty sure that the ‘downvote’ Ray was talking about was actually just me turning a strong upvote into a normal upvote for minor / not-worth-independently-tracking reasons. Plenty of people vote for obscure or complicated or just-wrong reasons.
The people who get downvoted the most are likely to have less familiarity with LW norms and context, so they’ll be especially ill-equipped to extract actionable information from downvotes. If all people are learning is ‘<confusing noisy social disapproval>’, I’m not sure that’s going to help them very much in their journey as a rationalist.
Upvotes tend to be a clearer signal in my experience, while needing to meet a lower bar. (Cf.: we have a higher epistemic bar for establishing a norm ‘let’s start insulting/criticizing/calling out our colleagues whenever they make a mistake’ than for establishing a norm ‘let’s start complimenting/praising/thanking our colleagues whenever they do something cool’, and it would be odd to say that the latter is categorically bad in any environment where we don’t also establish the former norm.)
I’m not confident of what the right answer is; this is just me laying out some counter-considerations. I like Mako’s comment because it’s advocating for an important value, and expressing a not-obviously-wrong concern about that value getting compromised. I lean toward ‘don’t make downvotes this salient’ right now. I’d like more clarity inside my head about how much the downvote-hiding worry is shaped like ‘we need to make downvotes more salient so we can actually get the important intellectual work done’ vs. ‘we need to make downvotes more salient so we can better symbolize/resemble Rationality’.
Hi! I am a biased MIRI person, but I quite dig all the things you mentioned. :)
I like this shortform feed idea!
Yeah, strong upvote to this point. Having an Arbital-style system where people’s probabilities aren’t prominently timestamped might be the worst of both worlds, though, since it discourages updating and makes it look like most people never do it.
I have an intuition that something socially good might be achieved by seeing high-status rationalists treat ass numbers as ass numbers, brazenly assign wildly different probabilities to the same proposition week-by-week, etc., especially if this is a casual and incidental thing rather than being the focus of any blog posts or comments. This might work better, though, if the earlier probabilities vanish by default and only show up again if the user decides to highlight them.
(Also, if a user repeatedly abuses this feature to look a lot more accurate than they really were, this warrants mod intervention IMO.)
Also, if you do something Arbital-like, I’d find it valuable if the interface encourages people to keep updating their probabilities later as they change. E.g., some (preferably optional) way of tracking how your view has changed over time. Probably also make it easy for people to re-vote without checking (and getting anchored by) their old probability assignment, for people who want that.
One small thing you could do is to have probability tools be collapsed by default on any AIAF posts (and maybe even on the LW versions of AIAF posts).
Also, maybe someone should write a blog post that’s a canonical reference for ‘the relevant risks of using probabilities that haven’t already been written up’, in advance of the feature being released. Then you could just link to that a bunch. (Maybe even include it in the post that explains how the probability tools work, and/or link to that post from all instances of the probability tool.)
Another idea: Arbital had a mix of (1) ‘specialized pages that just include a single probability poll and nothing else’; (2) ‘pages that are mainly just about listing a ton of probability polls’; and (3) ‘pages that have a bunch of other content but incidentally include some probability polls’.
If probability polls on LW mostly looked like 1 and 2 rather than 3, then that might make it easier to distinguish the parts of LW that should be very probability-focused from the parts that shouldn’t. I.e., you could avoid adding Arbital’s feature for easily embedding probability polls in arbitrary posts (and/or arbitrary comments), and instead treat this more as a distinct kind of page, like ‘Questions’.
You could still link to the ‘Probability’ pages prominently in your post, but the reduced prominence and site support might cause there to be less social pressure for people to avoid writing/posting things out of fears like ‘if I don’t provide probability assignments for all my claims in this blog post, or don’t add a probability poll about something at the end, will I be seen as a Bad Rationalist?’
I’ve never checked my karma total on LW 2.0 to see how it’s changed.
I am most worried that this will drastically increase the clutter of comment threads and make things a lot harder to parse. In particular if the order of the reacts is different on each comment, since then there is no reliable way of scanning for the different kinds of information.
I like the reactions UI above, partly because separating it from karma makes it clearer that it’s not changing how comments get sorted, and partly because I do want ‘agree’/‘disagree’ to be non-anonymous by default (unlike normal karma).
I agree that the order of reacts should always be the same. I also think every comment/post should display all the reacts (even just to say ‘0 Agree, 0 Disagree...’) to keep things uniform. That means I think there should only be a few permitted reacts: maybe start with just ‘Agree’ and ‘Disagree’, then wait 6+ months and see if users are especially clamoring for something extra.
I think the obvious other reacts I’d want to use sometimes are ‘agree and downvote’ + ‘disagree and upvote’ (maybe shorten to Agree+Down, Disagree+Up), since otherwise someone might not realize that one and the same person is doing both, which loses a fair amount of this thing I want to be fluidly able to signal. (I don’t think there’s much value to clearly signaling that the same person agreed and upvoted or disagreed and downvoted a thing.)
I would also sometimes click both the ‘agree’ and ‘disagree’ buttons, which I think is fine to allow under this UI. :)
*disagrees with and approves of this relevant, interesting, and non-confused comment*