# ozziegooen (Ozzie Gooen)

Karma: 2,738

I’m currently researching forecasting and epistemics as part of the Quantified Uncertainty Research Institute.

• (I originally posted this to Goodreads)

TL;DR: A good book with mass appeal that helps people care more about being accurate. Fairly easy to read, which makes it easy to recommend to many people.

I’ve met Julia a few times and am friendly with her. I’d be happy if this book does well, and expect that to lead to a (slightly) more reasonable world.

That said, in the interest of having a Scout Mindset, I want to be honest about my impression.

The Scout Mindset is the sort of book I’m both happy with and frustrated by. I’m frustrated because this is a relatively casual overview of what I wish were a thorough academic specialty. I felt similarly about The Life You Can Save when it was released.

Another way of putting this: I was hoping for an academic work, but this is better thought of as a journalistic work. It reminds me of Vice documentaries (which I like a lot) and Malcolm Gladwell (in a nice way), rather than Superforecasting or The Elephant in the Brain. That said, journalistic works make their own unique contributions to the literature; it’s just a very different sort of work.

I just read through the book on Audible and don’t have notes. To write a really solid review would take more time than I have now, so instead, I’ll leave scattered thoughts.

1. The main theme of the book is the dichotomy of “The Scout Mindset” vs. “The Soldier Mindset”, and more specifically, why the Scout Mindset is (almost always?) better than the Soldier Mindset. Put differently, we have a bunch of books about “how to think accurately”, but surprisingly few on “you should even try thinking accurately.” Sadly, this latter part has to be stated, but that’s how things are.

2. I was expecting a lot of references to scientific studies, but there seemed to be much more text devoted to stories and a few specific anecdotes. The main studies I recall were a few seemingly small psychological studies, which at this point I’m fairly suspicious of. One small note: I found it odd that Elon Musk was described multiple times as something like an exemplar of honesty. I agree with the particular examples pointed to, but Elon Musk is notorious for making explicitly overconfident statements.

3. Motivated reasoning is a substantial and profound topic. I believe it already has many books detailing not only that it exists, but why it’s beneficial and harmful in different settings. The Scout Mindset didn’t seem to engage with much of this literature. It argued that “The Scout Mindset is better than the Soldier Mindset”, but that seems like an intense simplification of the landscape. Lies are a much more integral part of society than I think they are given credit for here, and removing them would be a very radical action. If you could go back in time and strongly convince particular people to be atheistic, that could be fatal.

4. The most novel part to me was the last few chapters, on “Rethinking Identity”. This section seems particularly inspired by the blog post Keep Your Identity Small by Paul Graham, but of course, goes into more detail. I found the mentioned stories to be a solid illustration of the key points and will dwell on these more.

5. People close to Julia’s work have heard much of this before, but maybe half or so seemed rather new to me.

6. As a small point, if the theme of the book is the benefits of always being honest, the marketing seemed fairly traditionally deceptive. I wasn’t sure what to expect from the cover and quotes. I could easily see potential readers getting the wrong impression from the marketing materials, and there seems to have been little effort to make the actual value of the book clear up front. There’s nothing that reads, “This book is aiming to achieve X, but doesn’t do Y and Z, which you might have been expecting.” I’d guess that Julia didn’t have much control over the marketing.

• Thanks for the reasoning here. I also don’t want to deter people from purchasing these books; I imagine that if people really wanted, they could write the dates on them manually.

That said—

To better explain my intuitions here:

Five years from now, I’ll care about whether the essays came out in 2018 or in 2017 if I’m trying to find a particular one in a book, or recommend one to another person. Ordering by date is really simple to remember compared to other kinds of naming one could use. When going between different books, the date is particularly relevant because names and concepts will change over time. I’d hope that 10 years from now much of the 2018 content will look antiquated.

If you’re just aiming for “timeless and good quality posts” (this sounds like the value proposition for the readers you are referring to), then I don’t understand the need to only choose ones from 2018. Many good ones came out before 2018 that I imagine would be interesting to readers. That said, if you plan on releasing them on yearly intervals later I’d imagine some restriction might be necessary. Or, it could be that whenever a few topics seem to have come full circle or be in a good place for a book, you publish a book focused on those topics.

I agree that “LessWrong Review 2018” sounds strange, but there are other phrasings that could work with 2018 in them. Many academic periodicals (including in fields like Philosophy, which are at least as timeless as LessWrong content) have yearly collections. With those, I don’t assume I need to read all of the old ones before reading the current year; that would take quite a while (this becomes more obvious after a few are out). I imagine the name could be something like “LessWrong Highlighted Content: 2018” or “The Best of LessWrong: 2018”.

It’s very possible that there’s a kind of “free pass” for the first 1-3 years, if this is a repeating thing, and then you could start adding the year. It’s not that big a deal if there are just 2-3 of these, but I imagine it will get annoying if there are 5+ (and by that time it will be more obvious whether or not it’s an issue).

• There are a few curated communities you can join and begin predicting in now. Note that you must log in to Foretold before accessing these pages.

Elizabeth Van Nostrand will be evaluating several statements from the book The Unbound Prometheus. Predict how she will judge these statements. You can earn up to $65 per predicted question.

EA Survey 2019 & 2020 Instructions Document: Predict questions about the upcoming EA surveys. There are two rounds, with multiple cash prizes each.

Apple Inc. Updates: Predict things about Apple’s new product announcements and stock price.

Slate Star Codex 2019: Scott Alexander made several predictions at the beginning of 2019. Even though 2019 is mostly over, there’s still some uncertainty left.

LessWrong: Forecast the karma of this post, and several other things. Feel free to make new questions for posts or parameters you may be interested in.

• As someone who’s part of these social communities, I can confirm that Leverage was definitely a topic of discussion for a long time among Rationalists and Effective Altruists. That said, the discussion often went something like, “What’s up with Leverage? They seem so confident, and take in a bunch of employees, but we have very little visibility.” I think I had basically that same conversation about them around 10 times.

As people from Leverage have said, several Rationalists/EAs were very hostile around the topic of Leverage, particularly in the last ~4 years. (I’ve heard stories of people getting shouted at at a conference just for saying they worked at Leverage.) On the other hand, they definitely had support from a few Rationalist/EA orgs and several higher-ups of different kinds.

They’ve always been secretive, and some of the few public threads didn’t go well for them, so it’s not too surprising to me that they’ve had a small LessWrong/EA Forum presence. I’ve personally very much enjoyed staying mostly away from the controversy, though very arguably I made a mistake there.
(I should also note that I had friends who worked at or close to Leverage, I attended ~2 events there early on, and I applied to work there around 6 years ago.)

• # Questions around Making Reliable Evaluations

Most existing forecasting platform questions are very clearly verifiable:

• “Who will win the next election?”

• “How many cars will Tesla sell in 2030?”

But many of the questions we care about are much less verifiable:

• “How much value has this organization created?”

• “What is the relative effectiveness of AI safety research vs. bio risk research?”

One attempted solution would be to have an “expert panel” assess these questions, but this opens up a bunch of issues. How could we know how much to trust this group to be accurate, precise, and understandable?

The topic of “how can we trust that a person or group can give reasonable answers to abstract questions” is quite generic and abstract, but it’s a start. I’ve decided to investigate this as part of my overall project on forecasting infrastructure, and I’ve recently been working with Elizabeth on some high-level research. I believe this general strand of work could be useful both for forecasting systems and for the broader evaluations that are important in our communities.

## Early concrete questions in evaluation quality

One concrete topic that’s easily studiable is evaluation consistency. If the most respected philosopher gives wildly different answers to “Is moral realism true?” on different dates, it makes you question the validity of their belief. Or perhaps their belief is fixed, but we can determine that there was significant randomness in the processes that determined it. Daniel Kahneman apparently thinks a version of this question is important enough to be writing his new book on it.

Another obvious topic is the misunderstanding of terminology.
If an evaluator understands “transformative AI” very differently from the people reading their statements about transformative AI, they may make statements that get misinterpreted.

These are two specific examples of questions, but I’m sure there are many more. I’m excited to better understand existing work in this overall space, and to get a better sense of where things stand and what the right next questions are to ask.

• If the main thing that separates this book from the 2019 and 2020 books is that it’s the collection of posts from 2018, it’s counterintuitive to me that that’s not the prominent feature of the title. Other “journals of the year” often make the year really prominent. I feel like 5 years from now I’m going to have trouble remembering that “A Map That Reflects the Territory” refers to the 2018 edition, while some other equally elegant but abstract name refers to the 2019 edition.

If you do go with really premium books especially, I’d recommend considering making the date the prominent bit. Honestly, I expect to remember the “LessWrong”-ness from the branding (which is distinct), so the year seems like the most important part to me. That said, I feel like I’m not exactly in the target audience (I generally don’t prefer physical books), so it would come down to the preferences of others. I realize you’ve probably thought about this a lot and have reasons; just giving my 2 cents.

• +1 for the detail. Right now there’s very little like this explained publicly (or accessible in other ways to people like myself). I found this really helpful. I agree that the public discussion on the topic has been quite poor.

• There’s an “EA Mental Health Navigator” now to help people connect to the right care: https://eamentalhealth.wixsite.com/navigator

I don’t know how good it is yet. I just emailed them last week, and we set up an appointment for this upcoming Wednesday. I might report back later as things progress.
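Circling back to the evaluation-consistency idea in the shortform above: here is a minimal toy sketch of how one might quantify the consistency of an evaluator's repeated judgments. The function name, the 1/(1+stdev) scoring formula, and the example numbers are all invented for illustration; nothing here comes from an actual forecasting platform.

```python
import statistics

def consistency_score(repeated_judgments):
    """Toy consistency measure for an evaluator who answers the same
    question (e.g. a probability for "Is moral realism true?") on
    several different dates. Returns 1.0 for perfectly stable answers,
    and approaches 0 as the answers become more scattered."""
    if len(repeated_judgments) < 2:
        raise ValueError("need at least two judgments to measure consistency")
    return 1.0 / (1.0 + statistics.stdev(repeated_judgments))

# An evaluator whose repeated probabilities are stable across dates...
stable = consistency_score([0.62, 0.60, 0.63, 0.61])
# ...versus one whose answers swing widely between dates.
erratic = consistency_score([0.10, 0.85, 0.30, 0.70])

print(round(stable, 3), round(erratic, 3))
assert stable > erratic
```

A real version would also need to control for genuine belief updates between dates (new evidence should be allowed to move the answer), which this sketch deliberately ignores.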
• I found this article interesting: https://www.thegentlemansjournal.com/25-iconic-moments-that-define-the-21st-century-thus-far/

It lists several events that caused large celebrations. However, you can notice a pattern:

2008 — Barack Obama wins the 2008 election, becoming the first African American President

2011 — Commandos conduct a raid in Pakistan, which ends with the killing of Osama bin Laden

2012 — The US rover, Curiosity, takes a selfie on Mars

2014 — Malala Yousafzai becomes the youngest ever recipient of a Nobel Prize

2015 — Same-sex marriage is legalised across all fifty states in the USA

Almost all were political or nontechnical. Personally, I think most kinds of modern technology are highly incremental, and have recently been treated with suspicion. I could also imagine that real technological change has slowed down a fair bit (especially outside of AI), as has been discussed extensively.

• As someone who’s been close to these groups: some had a few related issues, but Leverage seemed much more extreme along many of these dimensions to me. However, there are now something like 50 small EA/rationalist groups out there, and I am legitimately worried about quality control.

• I’m really sorry if I hurt or offended you. I assumed that a brief description of where I was at would be preferred to not replying at all. I was clearly incorrect about that.

I disagree with some of your specific implications, but I’m fairly sure you’d disagree with my responses. I could easily imagine that you’ve already predicted them well enough and wouldn’t find them very informative, particularly given what I could write in a few sentences.

This isn’t unusual for me. I try to stay out of almost all online discussion. I have things to do, and I’m sure you have things to do as well. Online discussion is costly, and it’s especially costly when people know very little about each other[1] and the conversation topic (White Fragility) is as controversial as this one.
[1]: I know almost nothing about you. I feel like I’d have a very difficult time feeling comfortable saying things in ways I can predict you’d be receptive to, or that you wouldn’t actively attack me for. I’ve found it difficult to model people online, particularly people I barely know, and this could easily lead to problems of several different kinds. It’s very possible that none of this applies to you, but it would take a fair amount of discussion for me to find that out and feel safe in my impressions of you. This also applies to all the other people I don’t know who might be watching this conversation or jump in at any point.

• I very much agree about the worry. My original comment was meant to make the easiest case quickly, but I think more extensive cases apply too. For example, I’m sure there have been substantial problems even at the other notable orgs, and in expectation we should expect there to continue to be. (I’m not saying this based on particular evidence about these orgs; it’s more that the base rate for similar projects seems bad, and these orgs don’t strike me as clearly above these issues.)

One solution (of a few) that I’m in favor of is to just have more public knowledge about the capabilities and problems of orgs. I think it’s pretty easy for orgs of about any quality level to seem exciting to new people and recruit them or take advantage of them. Right now, some orgs have poor reputations among those “in the know” (generally for producing poor-quality output), but this isn’t made apparent publicly.[1] One solution is to have specialized systems that actually present negative information publicly; these could be public rating or evaluation systems.
This post by Nuno was partially meant as a test of this: https://forum.effectivealtruism.org/posts/xmmqDdGqNZq5RELer/shallow-evaluations-of-longtermist-organizations

Another thing to do, of course, would be some amount of evaluation and auditing of all these efforts, above and beyond what even those currently “in the know” have. I think that in the case of Leverage, there really should have been a deep investigation a few years ago, perhaps after a separate setup to flag possible targets of investigation. Back then things were much more disorganized and more poorly funded, but now we’re in a much better position for similar efforts going forward.

[1] I don’t particularly blame them; consider the alternative.

• I’m really happy to see this become public! Personally, I find PDFs nicer than paper books for multiple reasons (they can be listened to, and are easier to annotate and keep). Was there anything in particular that convinced the team to make this public at this point?

• ## Experimental predictability and generalizability are correlated

A criticism of having people attempt to predict the results of experiments is that this will be near impossible. The idea is that experiments are highly sensitive to parameters, and these would need to be deeply understood for predictors to have a chance at being more accurate than an uninformed prior. For example, in a psychological survey, it would be important that the predictors knew the specific questions being asked, details about the population being sampled, many details about the experimenters, et cetera.

One counter-argument is not to claim that prediction will be easy in many cases, but rather that if these experiments cannot be predicted in a useful fashion without very substantial amounts of time, then these experiments probably aren’t going to be very useful anyway. Good scientific experiments produce results that are generalizable.
For instance, a study on the effectiveness of a malaria intervention on one population should give us useful information (probably for use in forecasting) about its effectiveness on other populations. If it doesn’t, then its value would be limited. It would really be more of a historical statement than a scientific finding.

Possible statement from a non-generalizable experiment: “We found that intervention X was beneficial within statistical significance for a population of 2,000 people. That’s interesting if you’re interested in understanding the histories of these 2,000 people. However, we wouldn’t recommend inferring anything from this about other groups of people, or about these 2,000 people going forward.”

## Formalization

One possible way of starting to formalize this is to imagine experiments (assuming internal validity) as mathematical functions. The inputs would be the parameters and details of how the experiment was performed, and the outputs would be the main findings that the experiment produced. If the experiment has internal validity, then observers should predict that if an identical (but subsequent) experiment were performed, it would produce identical findings. We could also say that if we took a probability distribution over the chances of every possible set of findings being true, the differential entropy of that distribution would be ~0, as smart forecasters would recognize which set is correct with ~100% probability.

### Generalizability

Now, for an experiment to be generalizable, we would hopefully be able to perturb the inputs in a minor way and still have the entropy be low. Note that the important thing is not that the outputs be unchanged, but rather that they remain predictable. For instance, a physical experiment on the basics of mechanical velocity may be performed on data with velocities of 50-100 miles/hour.
This experiment would be useful not only if future experiments described situations with similar velocities, but rather if future experiments on velocity could be better predicted, no matter the specific velocities used.

Writing an experiment as a function $f$ from inputs $x$ to findings $f(x)$, we can describe a perturbation of $x$ as $x + \epsilon$. Thus, hopefully, for low values of $\epsilon$, the differential entropy of forecasters’ distribution over the findings $f(x + \epsilon)$ will remain near 0.

So perhaps generalizability can be defined as something like: the ability of predictors to better predict the results of similar experiments upon seeing the results of a particular experiment, for increasingly wide definitions of “similar”.

### Predictability and Generalizability

I could definitely imagine trying to formalize predictability better in this setting, or more specifically, formalizing the concept of “do forecasters need to spend a lot of time understanding the parameters of an experiment?” In this case, that could look something like modeling how the amount of uncertainty forecasters have about the inputs correlates with their uncertainty about the outputs.

The general combination of predictability and generalizability would add an assumption: if forecasters require a very high degree of information about the inputs of an experiment in order to predict its outputs, then it’s less likely they can predict (with high confidence) the results of future experiments with significant changes, even once they see the results of said experiment. Admittedly, this isn’t the definition of predictability that people are likely used to, but I imagine it correlates well enough.

### Final Thoughts

I’ve been experimenting more with trying to formalize concepts like this, so I’d be quite curious to get feedback on this work. I’m a bit torn; on one hand I appreciate formality, but on the other this is decently messy and I’m sure it will turn off many readers.

• I’m curious, what kinds of events follow Chatham House Rules? I’d never heard of them until now.
Is it just official events from Chatham House itself, or have other organizations been using them?

• A few quick thoughts:

1) This seems great, and I’m impressed by the agency and speed.

2) From reading the comments, it seems like several people were actively afraid of how Leverage could retaliate. I imagine it’s similar for accusations/whistleblowing against other organizations. I think this is both very, very bad and unnecessary; as a whole, the community is much more powerful than individual groups, so it seems poorly managed when the community is scared of a specific group. Resources should be spent to cancel this out. In light of this, if more money were available, it seems easy to justify a fair bit more. Even better could be something like, “We’ll help fund lawyers in case you’re attacked legally, or anti-harassment teams if you’re harassed or trolled.” This is similar to how the EFF helps with cases of small people/groups being attacked by big companies. I don’t mean to complain; I think any steps here, especially such quick ones, are fantastic.

3) I’m afraid this will get lost in this comment section. I’d be excited about a list of “things to keep in mind” like this being repeatedly made prominent somehow. For example, I could imagine that at community events or similar, there could be materials like “Know your rights, as a Rationalist/EA”, which flag how individuals can report bad actors and behavior.

4) Obviously a cash prize can encourage lying, but I think this can be decently managed. (It’s a small community, so with good moderation, $15K would be very little compared to the social stigma that would come from being found out to have destructively lied for $15K.)
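As a toy companion to the “Experimental predictability and generalizability are correlated” sketch above: the claim that a generalizable experiment keeps forecast entropy low under input perturbations can be illustrated numerically. The Gaussian forecast model, the linear uncertainty-growth rule, and the constants below are all invented assumptions for illustration, not part of the original argument.

```python
import math

def gaussian_entropy(sigma):
    # Differential entropy (in nats) of a Normal(mu, sigma^2) forecast
    # distribution: 0.5 * ln(2 * pi * e * sigma^2).
    return 0.5 * math.log(2 * math.pi * math.e * sigma ** 2)

def forecast_sigma(epsilon, k):
    # Toy model: forecaster uncertainty grows with the size of the input
    # perturbation epsilon; k captures how quickly generalizability decays
    # (small k = generalizable experiment, large k = fragile one).
    base = 0.05  # residual uncertainty even for an exact replication
    return base + k * abs(epsilon)

# A "generalizable" experiment (small k) keeps forecast entropy low even
# for moderate perturbations; a "fragile" one (large k) does not.
for k, label in [(0.1, "generalizable"), (5.0, "fragile")]:
    entropies = [gaussian_entropy(forecast_sigma(e, k)) for e in (0.0, 0.5, 1.0)]
    print(label, [round(h, 2) for h in entropies])
```

At zero perturbation both cases have identical (low) entropy, matching the internal-validity assumption; the two curves only separate as the perturbation grows, which is the sense in which predictability under perturbation and generalizability coincide here.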

• The books look very pretty, nice work.

Is this content from 2018 specifically, or is it taken from all of historic LessWrong? My impression was that this was from the 2018 review, but I don’t see anything about that in the description above.

If it is from the 2018 review, do you have ideas on how you will differentiate the 2019/2020/etc. versions?

• I was recently pointed to the Youtube channel Psychology in Seattle. I think it’s one of my favorites in a while.

I’m personally more interested in workspace psychology than relationship psychology, but my impression is that they share a lot of similarities.

Emotional intelligence gets a bit of a bad rap due to its fuzzy nature, but I’m convinced it’s one of the top few things for most people to get better at. I know lots of great researchers and engineers who repeat a bunch of the same failure modes, causing severe organizational and personal problems.

Emotional intelligence books and training typically seem quite poor to me. The alternative format here of “let’s just show you dozens of hours of people interacting with each other, and point out all the fixes they could make” seems much better than most books or lectures I’ve seen.

This YouTube series does an interesting job of exactly that. There’s a whole bunch of “let’s watch this reality TV show, then give our take on it.” I’d be pretty excited about there being more things like this posted online, especially in other contexts.

Relatedly, I think the potential of reality TV is fairly underrated in intellectual circles, but that’s a different story.