Got it, that’s helpful. Thank you!
Very clear presentation! As someone outside the field who likes to follow along, I very much appreciate these clear conceptual frameworks and explanations.
I did however get slightly lost in section 1.2. At first reading I was expecting this part:
which we will contrast with the outer alignment problem of eliminating the gap between the base objective and the intended goal of the programmers.
to say, “… gap between the behavioral objective and the intended goal of the programmers.” (In which case the inner alignment problem would be a subcomponent of the outer alignment problem.)
On second thought, I can see why you’d want to have a term just for the problem of making sure the base objective is aligned. But to help myself (and others who think similarly) keep this all straight, do you have a pithy term for “the intended goal of the programmers” that’s analogous to base objective, mesa objective, and behavioral objective?
Would meta objective be appropriate?
(Apologies if my question rests on a misunderstanding or if you’ve defined the term I’m looking for somewhere and I’ve missed it.)
Apparently the author is a science writer (makes sense), and it’s his first book:
I’m a freelance science writer. Until January 2018 I was science writer for BuzzFeed UK; before that, I was a comment and features writer for the Telegraph, having joined in 2007. My first book, The Rationalists: AI and the geeks who want to save the world, for Weidenfeld & Nicolson, is due to be published spring 2019. Since leaving BuzzFeed, I’ve written for the Times, the i, the Telegraph, UnHerd, politics.co.uk, and elsewhere.
Someone wrote a book about us:
Overall, they have sparked a remarkable change. They’ve made the idea of AI as an existential risk mainstream; sensible, grown-up people are talking about it, not just fringe nerds on an email list. From my point of view, that’s a good thing. I don’t think AI is definitely going to destroy humanity. But nor do I think that it’s so unlikely we can ignore it. There is a small but non-negligible probability that, when we look back on this era in the future, we’ll think that Eliezer Yudkowsky and Nick Bostrom — and the SL4 email list, and LessWrong.com — have saved the world. If Paul Crowley is right and my children don’t die of old age, but in a good way — if they and humanity reach the stars, with the help of a friendly superintelligence — that might, just plausibly, be because of the Rationalists.
Figuring out what’s up with that seems like a major puzzle of our time.
Would be curious to hear more about your confusion and why it seems like such a puzzle. Does “when you aggregate over large numbers of things, complex lumpiness smooths out into boring sameness” not feel compelling to you?
If not, why not? Maybe you can confuse me too ;-)
In English the theorem says that the probability we should expect to assign to the true value of H after observing the true value of D is greater than or equal to the expected probability we assign to the true value of H before observing the value of D.
I have a very basic question about notation—what tells me that H in the equation refers to the true hypothesis?
Put another way, I don’t really understand why that equation has a different interpretation than the conservation-of-expected-evidence equation: E[P(H = h_i | D)] = P(H = h_i).
In both cases I would interpret it as talking about the expected probability of some hypothesis, given some evidence, compared to the prior probability of that hypothesis.
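To make my confusion concrete, here’s a quick simulation of how I’m currently reading the two statements. The coin-bias setup and all of the numbers are my own toy assumptions, not anything from the post:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup (my own, not from the post): two hypotheses about a coin's bias,
# a uniform prior, and a single flip as the data D.
hypotheses = [0.2, 0.8]        # P(D = heads | H = h_i)
prior = np.array([0.5, 0.5])   # P(H = h_i)

def posterior(d):
    """P(H | D = d) by Bayes' rule; d is 1 for heads, 0 for tails."""
    likelihood = np.array([h if d == 1 else 1 - h for h in hypotheses])
    unnorm = likelihood * prior
    return unnorm / unnorm.sum()

# Conservation of expected evidence: fix one hypothesis h_0 and average
# P(H = h_0 | D) over D drawn from the prior predictive. This equals the prior P(H = h_0).
p_heads = sum(p * h for p, h in zip(prior, hypotheses))
expected_posterior_h0 = p_heads * posterior(1)[0] + (1 - p_heads) * posterior(0)[0]
print(expected_posterior_h0, prior[0])   # 0.5 and 0.5

# My reading of the theorem: draw the true H from the prior, draw D from that H,
# and average the probability the posterior assigns to that true H.
baseline = sum(p * p for p in prior)  # expected probability assigned to the true H before seeing D
n = 100_000
total = 0.0
for _ in range(n):
    i = rng.choice(len(hypotheses), p=prior)   # the true hypothesis
    d = int(rng.random() < hypotheses[i])      # the true data, generated from it
    total += posterior(d)[i]                   # probability now assigned to the truth
print(total / n, baseline)                     # ~0.68 versus 0.5
```

In the simulation the two quantities do come apart (0.5 versus ~0.68), so presumably the expectation in the theorem is taken over the joint distribution of (H, D) rather than over D alone for a fixed h_i — but I don’t see what in the notation tells me that.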
I think I’ve commented on your newsletters a few times, but haven’t commented more because it seems like the number of people who would read and be interested in such a comment would be relatively small, compared to a comment on a more typical post.
I am surprised you think this. Don’t the newsletters tend to be relatively highly upvoted? They’re one of the kinds of links that I always automatically click on when I see them on the LW front page.
Maybe I’m basing this too much on my own experience, but I would love to see more discussion on the newsletter posts.
For freedom-as-arbitrariness, see also: Slack
If your car was subject to a perpetual auction and ownership tax as Weyl proposes, bashing your car to bits with a hammer would cost you even if you didn’t personally need a car, because it would hurt the rental or resale value and you’d still be paying tax.
I don’t think this is right. COST stands for “Common Ownership Self-Assessed Tax”. The self-assessed part refers to the idea that you personally state the value you’d be willing to sell the item for (and pay tax on that value). Once you’ve destroyed the item, presumably you’d be willing to part with the remains for a lower price, so you should just re-state the value and pay a lower tax.
It’s true that damaging the car hurts the resale value and thus costs you (in terms of your material wealth), but this would be true whether or not you were living under a COST regime.
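To spell out how I’m picturing the self-assessment part, here’s a toy calculation. The 7% rate and the car values are purely illustrative assumptions on my part, not figures from Weyl’s proposal:

```python
# A minimal sketch of the self-assessed tax mechanism as I understand it.
TAX_RATE = 0.07  # hypothetical annual COST rate, chosen only for illustration

def annual_cost_tax(self_assessed_value: float, rate: float = TAX_RATE) -> float:
    """Tax owed on an asset whose owner declares they'd sell it at this price."""
    return self_assessed_value * rate

working_car = 10_000  # the price you declare (and must honor) while the car runs
wrecked_car = 500     # the price you re-declare after taking a hammer to it

print(annual_cost_tax(working_car))  # 700.0 per year
print(annual_cost_tax(wrecked_car))  # 35.0 per year
```

So the ongoing tax tracks the declared value and falls once the car is wrecked and re-assessed; what you lose is the resale value itself, which you’d lose with or without COST.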
Whatever ability IQ tests and math tests measure, I believe that lacking that ability doesn’t have any effect on one’s ability to make a good social impression or even to “seem smart” in conversation.
That section of Sarah’s post jumped out at me too, because it seemed to be the opposite of my experience. In my (limited, subject-to-confirmation-bias) experience, how smart someone seems to me in conversation seems to match pretty well with how they did on standardized tests (or other measures of academic achievement). Obviously not perfectly, but way way better than chance.
I would also expect that, courtesy of things like Dunning-Kruger, people towards the bottom will be as bad at estimating their IQ as they are at estimating their competence at any particular thing.
FWIW, the original Dunning-Kruger study did not show the effect that it’s become known for. See: https://danluu.com/dunning-kruger/
In two of the four cases, there’s an obvious positive correlation between perceived skill and actual skill, which is the opposite of the pop-sci conception of Dunning-Kruger.
I’m not totally sure I’m parsing this sentence correctly. Just to clarify, “large firm variation in productivity” means “large variation in the productivity of firms” rather than “variation in the productivity of large firms”, right?
Also, the second part is saying that on average there is productivity growth across firms, because the productive firms expand more than the less productive firms, yes?
Not sure exactly what you mean by “numerical simulation”, but you may be interested in https://ought.org/ (where Paul is a collaborator), or in Paul’s work at OpenAI: https://openai.com/blog/authors/paul/ .
Just had a call with Nick Bostrom who schooled me on AI issues of the future. We have a lot of work to do.
This same candidate (whom the markets currently give a 5% chance of being the Democratic nominee) also wants to create a cabinet-level position to monitor emerging technology, especially AI:
Advances in automation and Artificial Intelligence (AI) hold the potential to bring about new levels of prosperity humans have never seen. They also hold the potential to disrupt our economies, ruin lives throughout several generations, and, if experts such as Stephen Hawking and Elon Musk are to be believed, destroy humanity.
...As President, I will…

* Create a new executive department – the Department of Technology – to work with private industry and Congressional leaders to monitor technological developments, assess risks, and create new guidance. The new Department would be based in Silicon Valley and would initially be focused on Artificial Intelligence.
* Create a new Cabinet-level position of Secretary of Technology who will be tasked with leading the new Department.
* Create a public-private partnership between leading tech firms and experts within government to identify emerging threats and suggest ways to mitigate those threats while maximizing the benefit of technological innovation to society.
It seems to me that perhaps the major difference between active/concentrated curiosity and open/diffuse curiosity is how much of an expectation you have that there’s one specific piece of information you could get that would satisfy the curiosity. (And for this reason the “concentrated” and “diffuse” labels do seem somewhat apt to me.)
Active/concentrated curiosity is focused on finding the answer to a specific question, while open/diffuse curiosity seeks to explore and gain understanding. (And that exploration may or may not start out with its attention on a single object/emotion/question.)
See also my comment here on non-exploitability.
Nitpick: I think the intro example would be clearer if there were explicit numbers of grapes/oranges rather than “some”. Nothing is surprising about the original story if Beatriz got more oranges from Deion than she gave up to Callisto. (Or gave away fewer grapes to Deion than she received from Callisto.)
Unless I missed it, neither this comment nor the main post explains why you ultimately decided in favor of karma notifications. You’ve listed a bunch of cons—I’m curious what the pros were.
Was it just an attempt to achieve this?
I want new users who show up on the site to feel rewarded when they engage with content
Great long-form interview with Andrew Yang here: Joe Rogan Experience #1245 - Andrew Yang.