A tricky thing here is that there’s an element of cognitive distortion in how most people evaluate these questions, and play-acting at “this distortion makes sense” can worsen the distortion (even as it wins more trust from people who have the distortion).
If it turned out to be a good idea to try to speak to this perspective, I’d recommend first meditating on a few reversal tests. Like: “Hmm, I wouldn’t feel any need to add a disclaimer here if the text I was recommending were The Brothers Karamazov, though I’d want to briefly say why it’s relevant, and I might worry about the length. I’d feel a bit worried about recommending a young adult novel, even an unusually didactic one, because people rightly expect YA novels to be optimized for less useful and edifying things than the ‘literary classics’ reference class. The insights tend to be shallower and less common. YA novels and fanfiction are similar in all those respects, and they provoke basically the same feeling in me, so I can maybe use that reversal test to determine what kinds of disclaimers or added context make sense here.”
On Slack, the Thumbs Up, OK, and Horns hand signs meet all my minor needs for thanking people.
(If I want to express stronger gratitude than that, I’d rather write it out.)
Can’t individuals just list ‘Reign of Terror’ and then specify in their personalized description that they have a high bar for terror?
We’d talked about getting a dump out as well, and your plan sounds great to me! The LW team should get back to you with a list at some point (unless they think of a better idea).
I asked Eliezer whether it made sense to cross-post this from Arbital, and did the cross-posting when he approved. I’m sorry it wasn’t clear that this was a cross-post! I intended to make that clearer, but my idea (putting the information on the sequence page) was bad, and I also implemented it wrong (the sequence didn’t previously display at the top of this post).
This post was originally written as a nontechnical introduction to expected utility theory and coherence arguments. Although it begins in medias res stylistically, it doesn’t have any prereqs or context beyond “this is part of a collection of introductory resources covering a wide variety of technical and semitechnical topics.”
Per the first sentence, the main purpose is for this to be a linkable resource for conversations/inquiry about human rationality and about AGI:
So we’re talking about how to make good decisions, or the idea of ‘bounded rationality’, or what sufficiently advanced Artificial Intelligences might be like; and somebody starts dragging up the concepts of ‘expected utility’ or ‘utility functions’. And before we even ask what those are, we might first ask, Why?
There have been loose plans for a while to cross-post content from Arbital to LW (maybe all of it; maybe just the best or most interesting stuff), but as I mentioned downthread, we’re doing more cross-post experiments sooner than we would have because Arbital’s been having serious performance issues.
I assume you mean ‘no one has this responsibility for Arbital anymore’, and not that there’s someone else who has this responsibility.
Arbital has been getting increasingly slow and unresponsive. The LW team is looking for fixes or work-arounds, but they aren’t familiar with the Arbital codebase. In the meantime, I’ve been helping cross-post some content from Arbital to LW so it’s available at all.
MIRI folks are the most prominent proponents of fast takeoff, and we unfortunately haven’t had time to write up a thorough response. Oli already quoted the quick comments I posted from Nate and Eliezer last year, and I’ll chime in with some of the factors that I think are leading to disagreements about takeoff:
Some MIRI people (Nate is one) suspect we might already be in hardware overhang mode, or closer to that point than some other researchers in the field believe.
MIRI folks tend to have different views from Paul about AGI, some of which imply that AGI is more likely to be novel and dependent on new insights. (Unfair caricature: Imagine two people in the early 20th century, neither with a technical understanding of nuclear physics yet, arguing about how powerful a nuclear-chain-reaction-based bomb might be. If one side models that kind of bomb as “sort of like TNT 3.0” while the other models it as “sort of like a small Sun”, they’re likely to disagree about whether nuclear weapons are going to be a small vs. large improvement over TNT. Note that I’m just using nuclear weapons as an analogy, not giving an outside-view argument “sometimes technologies are discontinuous, ergo AGI will be discontinuous”.)
This list isn’t intended to be detailed or exhaustive.
I’m hoping we have time to write up more thoughts on this before too long, because this is an important issue (even given that we’re trying to minimize the researcher time we put into things other than object-level deconfusion research). I don’t want MIRI to be a blocker on other researchers making progress on these issues, though — it would be bad if people put a pause on hashing out takeoff issues for themselves (or put a pause on alignment research that’s related to takeoff views) until Eliezer had time to put out a blog post. I primarily wanted to make sure people know that the lack of a substantive response doesn’t mean that Nate+Eliezer+Benya+etc. agree with Paul on takeoff issues now, or that we don’t think this disagreement matters. Our tardiness is because of opportunity costs and because our views have a lot of pieces to articulate.
That counts! :) Part of why I’m asking is in case we want to build a proper LW glossary, and Rationality Cardinality could at least provide ideas for terms we might be missing.
Are there any other OK-quality rationalist glossaries out there? https://wiki.lesswrong.com/wiki/Jargon is the only one I know of. I vaguely recall there being one on http://www.bayrationality.com/ at some point, but I might be misremembering.
The wiki glossary for the sequences / Rationality: A-Z ( https://wiki.lesswrong.com/wiki/RAZ_Glossary ) is now updated with the glossary entries from the print edition of vols. 1–2.
New entries from Map and Territory:
anthropics, availability heuristic, Bayes’s theorem, Bayesian, Bayesian updating, bit, Blue and Green, calibration, causal decision theory, cognitive bias, conditional probability, confirmation bias, conjunction fallacy, deontology, directed acyclic graph, elan vital, Everett branch, expected value, Fermi paradox, foozality, hindsight bias, inductive bias, instrumental, intentionality, isomorphism, Kolmogorov complexity, likelihood, maximum-entropy probability distribution, probability distribution, statistical bias, two-boxing
New entries from How to Actually Change Your Mind:
affect heuristic, causal graph, correspondence bias, epistemology, existential risk, frequentism, Friendly AI, group selection, halo effect, humility, intelligence explosion, joint probability distribution, just-world fallacy, koan, many-worlds interpretation, modesty, transhuman
A bunch of other entries from the M&T and HACYM glossaries were already on the wiki; most of these have been improved a bit or made more concise.
One option that’s smaller than link posts might be to mention in the AF/LW version of the newsletter which entries are new to AIAF/LW as far as you know, or to make comment threads in the newsletter for those entries. I don’t know how useful these would be either, but it’d be one way to create common knowledge that ‘this is currently the one and only place to discuss these things on LW/AIAF’.
Possible compromise idea: keep sending everyone their karma upvotes and downvotes regularly, but send the upvotes in daily batches and the downvotes in monthly batches. Having your downvotes arrive at known, predictable times rather than in random bursts, and having the updates occur less often, might let users take in the relevant information without having it totally dominate their day-to-day experience of visiting the site. This would also make it easier to spot patterns and to properly discount very small aversive changes in vote totals.
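To make the proposal concrete, here’s a minimal sketch of the daily/monthly split (all names hypothetical; this isn’t how LW actually implements notifications):

```python
from datetime import date

# Hypothetical sketch of the proposed batching scheme, not LW's
# actual notification code: upvotes flush daily, downvotes monthly.

def batch_vote_notifications(pending_votes, today):
    """Split pending vote events into (send_today, still_held).

    `pending_votes` is a list of dicts like {"post": ..., "delta": +1}.
    Upvotes go out in every daily batch; downvotes only go out on the
    first of the month, so negative feedback arrives at a known,
    predictable time instead of in random bursts.
    """
    send_today, still_held = [], []
    for vote in pending_votes:
        if vote["delta"] > 0:
            send_today.append(vote)      # daily upvote batch
        elif today.day == 1:
            send_today.append(vote)      # monthly downvote batch
        else:
            still_held.append(vote)      # hold until month start
    return send_today, still_held

# Example: on the 15th, only the upvote is delivered.
votes = [{"post": "A", "delta": +1}, {"post": "B", "delta": -1}]
print(batch_vote_notifications(votes, date(2019, 3, 15)))
```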
On the whole, I’m not sure how useful this would be as a sitewide default. Some concerns:
It’s not clear to me that karma on its own is all that useful or contentful. Ray recently noted that a comment of his had gotten downvoted somewhat, and that this had been super salient and pointed feedback for him. But I’m pretty sure that the ‘downvote’ Ray was talking about was actually just me turning a strong upvote into a normal upvote for minor / not-worth-independently-tracking reasons. Plenty of people vote for obscure or complicated or just-wrong reasons.
The people who get downvoted the most are likely to be the least familiar with LW norms and context, so they’ll be especially ill-equipped to extract actionable information from downvotes. If all they’re learning is ‘<confusing noisy social disapproval>’, I’m not sure that will help them much in their journey as rationalists.
Upvotes tend to be a clearer signal in my experience, while needing to meet a lower bar. (Cf.: we have a higher epistemic bar for establishing a norm ‘let’s start insulting/criticizing/calling out our colleagues whenever they make a mistake’ than for establishing a norm ‘let’s start complimenting/praising/thanking our colleagues whenever they do something cool’, and it would be odd to say that the latter is categorically bad in any environment where we don’t also establish the former norm.)
I’m not confident about what the right answer is; this is just me laying out some counter-considerations. I like Mako’s comment because it’s advocating for an important value, and expressing a not-obviously-wrong concern about that value getting compromised. I lean toward ‘don’t make downvotes this salient’ right now. I’d like more clarity inside my head about how much the downvote-hiding worry is shaped like ‘we need to make downvotes more salient so we can actually get the important intellectual work done’ vs. ‘we need to make downvotes more salient so we can better symbolize/resemble Rationality’.
! Hi! I am a biased MIRI person, but I quite dig all the things you mentioned. :)
I like this shortform feed idea!
Yeah, strong upvote to this point. Having an Arbital-style system where people’s probabilities aren’t prominently timestamped might be the worst of both worlds, though, since it discourages updating and makes it look like most people never do it.
I have an intuition that something socially good might be achieved by seeing high-status rationalists treat ass numbers as ass numbers, brazenly assign wildly different probabilities to the same proposition week-by-week, etc., especially if this is a casual and incidental thing rather than being the focus of any blog posts or comments. This might work better, though, if the earlier probabilities vanish by default and only show up again if the user decides to highlight them.
(Also, if a user repeatedly abuses this feature to look a lot more accurate than they really were, this warrants mod intervention IMO.)
Also, if you do something Arbital-like, I’d find it valuable if the interface encouraged people to keep updating their probabilities later as they change. E.g., some (preferably optional) way of tracking how your view has changed over time. Probably also make it easy to re-vote without checking (and getting anchored by) your old probability assignment, for people who want that.
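Here’s a minimal sketch of what that could look like under the hood (hypothetical names throughout; not Arbital’s or LW’s actual schema), with timestamped estimates, anchoring-free re-votes, and old estimates hidden unless the user opts in:

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List

@dataclass
class Estimate:
    probability: float            # in [0, 1]
    timestamp: datetime           # when the estimate was recorded

@dataclass
class ClaimVote:
    history: List[Estimate] = field(default_factory=list)
    show_history: bool = False    # user opts in to showing old estimates

    def revote(self, probability: float) -> None:
        """Record a new estimate without surfacing the old one,
        so the voter isn't anchored by their previous answer."""
        self.history.append(Estimate(probability, datetime.now()))

    def displayed(self) -> List[Estimate]:
        """Earlier estimates vanish by default; only the latest is
        shown unless the user chooses to highlight their track record."""
        return self.history if self.show_history else self.history[-1:]
```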