Mass_Driver

Karma: 3,379

Mass_Driver Sep 1, 2021, 10:05 PM
4 points
0
on: Introduction to Reducing Goodhart
I appreciate how much detail you’ve used to lay out why you think a lack of human agency is a problem—compared to our earlier conversations, I now have a better sense of what concrete problem you’re trying to solve and why that problem might be important. I can imagine that, e.g., it’s quite difficult to tell how well you’ve fit a curve if the context in which you’re supposed to fit that curve is vulnerable to being changed in ways whose goodness or badness is difficult to specify. I look forward to reading the later posts in this sequence so that I can get a sense of exactly what technical problems are arising and how serious they are.
That said, until I see a specific technical problem that seems really threatening, I’m sticking by my opinion that it’s OK that human preferences vary with human environments, so long as (a) we have a coherent set of preferences for each individual environment, and (b) we have a coherent set of preferences about which environments we would like to be in. Right, like, in the ancestral environment I prefer to eat apples, in the modern environment I prefer to eat Doritos, and in the transhuman environment I prefer to eat simulated wafers that trigger artificial bliss. That’s fine; just make sure to check what environment I’m in before feeding me, and then select the correct food based on my environment. What do you do if you have control over my environment? No big deal, just put me in my preferred environment, which is the transhuman environment.
What happens if my preferred environment depends on the environment I’m currently inhabiting, e.g., modern me wants to migrate to the transhumanist environment, but ancestral me thinks you’re scary and just wants you to go away and leave me alone? Well, that’s an inconsistency in my preferences—but it’s no more or less problematic than any other inconsistency. If I prefer oranges when I’m holding an apple, but I prefer apples when I’m holding an orange, that’s just as annoying as the environment problem. We do need a technique for resolving problems of utility that are sensitive to initial conditions when those initial conditions appear arbitrary, but we need that technique anyway—it’s not some special feature of humans that makes that technique necessary; any beings with any type of varying preferences would need that technique in order to have their utility fully optimized.
It’s certainly worth noting that standard solutions to Goodhart’s law won’t work without modification, because human preferences vary with their environments—but at the moment such modifications seem extremely feasible to me. I don’t understand why your objections are meant to be fatal to the utility of the overall framework of Goodhart’s Law, and I hope you’ll explain that in the next post.

Mass_Driver Jul 3, 2021, 3:02 PM
2 points
0
in reply to: jsalvatier’s comment on: Being a teacher
Hmm. Nobody’s ever asked me to try to teach them that before, but here’s my advice:
1. Think about what dimensions or components success at the task will include. E.g., if you’re trying to play a song on the guitar, you might decide that a well-played song will have the correct chords played with the correct fingering and the correct rhythm.
2. Think about what steps are involved in each of the components of success, with an eye toward ordering those steps in terms of which steps are easiest to learn and which steps are logical prerequisites for the others. E.g., in order to learn how to play a rhythm, you first need an understanding of rhythmic concepts like beats and meters. Then, once you have a language that you can use to describe a rhythm, you need some concrete examples of rhythms, e.g., a half note followed by two quarter-notes. Then you need to translate that into the physical motions taken on the guitar, e.g., downstrokes and upstrokes with greater or lesser emphasis. Those are two different steps; first you teach the difference between a downstroke and an upstroke, and then you teach the difference between a stressed beat and an unstressed beat. You might change the order of those steps if you are working with a student who’s more comfortable with physical techniques than with language, e.g., demonstrate some rhythms first, and then only after that explain what they mean in words. In general, most values will have a vocabulary that lets you describe them, a series of examples that help you understand them, and a set of elements that constitute them; using each new word in the vocabulary and recognizing each type of example and recognizing each element and using each element is a separate step in learning the technique.
3. Leave some room at the end for integration, e.g., if you’ve learned rhythm and fingering and chords, you still need some time to practice using all three of those correctly at once. This may include learning how to make trade-offs among the various components, e.g., if you’ve got some very tricky fingering in one measure, maybe you simplify the chord to make that easier.

Mass_Driver Jul 3, 2021, 1:12 PM
4 points
0
on: Impossible moral problems and moral authority
I’m curious about the source of your intuition that we are obligated to make an optimal selection. You mention that the utility difference between two plausibly best meals could be large, which is true, especially when we drop the metaphor and reflect on the utility difference between two plausibly best FAI value schemes. And I suppose that, taken literally, the utilitarian code urges us to maximize utility, so leaving any utility on the table would technically violate utilitarianism.
On a practical level, though, I’m usually not in the habit of nitpicking people who do things for me that are sublimely wonderful yet still marginally short of perfect, and I try not to criticize people who made a decision that was plausibly the best available decision simply because some other decision was also plausibly the best available decision. If neither of us can tell for sure which of two options is the best, and our uncertainty isn’t of the kind that seems likely to be resolvable by further research, then my intuition is that the morally correct thing to do is just pick one and enjoy it, especially if there are other worse options that might fall upon us by default if we dither for too long.
I agree with you that a trusted moral authority figure can make it easier for us to pick one of several plausibly best options...but I disagree with you that such a figure is morally necessary; instead, I see them as useful moral support for an action that can be difficult due to a lack of willpower or self-confidence. Ideally, I would just always pick a plausibly best decision by myself; since that’s hard and I am a human being who sometimes experiences angst, it’s nice when my friends and my mom help me make hard decisions. So the role of the moral authority, in my view, isn’t that they justify a hard decision, causing it to become correct where it was not correct prior to their blessing; it’s that the moral authority eases the psychological difficulty of making a decision that was hard to accept but that was nevertheless correct even without the authority’s blessing.

Mass_Driver Jul 1, 2021, 3:34 PM
3 points
0
on: What am I fighting for?
Thank you for sharing this; there are several useful conceptual tools in here. I like the way you’ve found crisply different adjectives to describe different kinds of freedom, and I like the way you’re thinking about the computational costs of surplus choices.
Building on that last point a bit, I might say that a savvy agent who has already evaluated N choices could try to keep a running estimate of their expected gains from choosing the best option available after considering X more choices and then compare that gain to their cost of computing the optimal choice out of X + N options. Right, like if the utility of an arbitrary choice follows anything like a normal distribution, then as N increases, we expect U(N+X) to have tinier and tinier advantages over U(N), because N choices already cover most of the distribution, so it’s unlikely that an even better choice is available within the X additional choices you look at, and even if you do find a better choice, it’s probably only slightly better. Yet for most humans, computing the best choice out of N+X options is more costly than computing the best choice for only N options, because you start to lose track of the details of the various options you’re considering as you add more and more possibilities to the list, and the list starts to feel boring or overwhelming, so it gets harder to focus. So there’s sort of a natural stopping point where the cost of considering X additional options can be confidently predicted to outweigh the expected benefit of considering X additional options, and when you reach that point, you should stop and pick the best choice you’ve already researched.
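To make the diminishing-returns argument above concrete, here is a minimal sketch, assuming each option's utility is an independent draw from a standard normal distribution. The function names and the cost model (each additional option costs slightly more to evaluate than the last) are hypothetical, chosen only for illustration and not taken from the original post.

```python
import math


def expected_best(n, lo=-10.0, hi=10.0, steps=20000):
    """Expected maximum of n independent standard-normal utilities.

    Integrates x * n * pdf(x) * cdf(x)**(n - 1) with the trapezoid rule.
    """
    def integrand(x):
        pdf = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
        cdf = 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
        return x * n * pdf * cdf ** (n - 1)

    h = (hi - lo) / steps
    total = 0.5 * (integrand(lo) + integrand(hi))
    for i in range(1, steps):
        total += integrand(lo + i * h)
    return total * h


def marginal_gain(n, x):
    """Expected improvement from looking at x more options after n."""
    return expected_best(n + x) - expected_best(n)


def evaluation_cost(n, x, base=0.02, growth=0.01):
    """Hypothetical cost of evaluating options n+1 .. n+x.

    Assumption: each extra option costs a bit more than the last,
    standing in for the loss of focus as the list grows.
    """
    return sum(base + growth * k for k in range(n + 1, n + x + 1))


def stopping_point(x=1, max_n=1000):
    """Smallest n at which x more options cost more than they are expected to gain."""
    for n in range(1, max_n + 1):
        if marginal_gain(n, x) <= evaluation_cost(n, x):
            return n
    return max_n
```

Calling `marginal_gain(2, 5)` and then `marginal_gain(10, 5)` shows the shrinking advantage of U(N+X) over U(N) as N grows, and `stopping_point()` returns the N at which, under these assumed costs, it stops being worth researching further options. A different utility distribution or cost curve would move the stopping point, though the qualitative shape should be similar whenever gains shrink and costs grow.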
I like having access to at least some higher-order freedoms because I enjoy the sensation of planning and working toward long-term goals, but I don’t understand why the order of a freedom is important enough to justify orienting our entire system of ethics around it. Right, like, I can imagine some extremely happy futures where everyone has stable access to dozens of high-quality choices in all areas of their lives, but, sadly, none of those choices exceed order 4, and none of them ever will. I think I’d take that future over our present and be quite grateful for the exchange. On the other hand, I can imagine some extremely dark futures where the order of choices is usually increasing for most people, because, e.g., they’re becoming steadily smarter and/or more resilient and they live in a complicated world, but they’re trapped in a kind of grindy hellscape where they have to constantly engage in that sort of long-term planning in order to purchase moderately effective relief from their otherwise constant suffering.
So I’d question whether the order of freedoms is (a) one interesting heuristic that is good to look at when considering possible futures, or (b) actually the definition of what it would mean to win. If it’s (b), I think you have some more explaining to do.

Mass_Driver May 12, 2020, 9:38 PM
8 points
0
on: The Technique Taboo
I agree with this post. I’d add that from what I’ve seen of medical school (and other high-status vocational programs like law school, business school, etc.), there is still a disproportionate emphasis on talking about the theory of the subject matter vs. building skill at the ultimate task. Is it helpful to memorize the names of thousands of arteries and syndromes and drugs in order to be a doctor? Of course. Is that *more* helpful than doing mock patient interviews and mock chart reviews and live exercises where you try to diagnose a tumor or a fracture or a particular kind of pus? Is it *so* much more helpful that it makes sense to spend 40x more hours on biochemistry than on clinical practice? Because my impression of medical school is that you do go on clinical rounds and do internships and things, but that the practical side of things is mostly a trial-by-fire where you are expected to improvise many of your techniques, often after seeing them demonstrated only once or twice, often with minimal supervision, and usually with little or no coaching or after-the-fact feedback. The point of the internships and residencies seems to be primarily to accomplish low-prestige medical labor, not primarily to help medical students improve their skills.
I’d be curious to hear from anyone who disagrees with me about medical school. I’m not super-confident about this assessment of medical school; I’m much more confident that an analogous critique applies well to law school and business school. Lawyers learn the theory of appellate decision-making, not how to prepare a case for trial or negotiate a settlement or draft a contract. MBAs learn economics and financial theory, not how to motivate or recruit or evaluate their employees.
As far as *why* we don’t see more discussion about how to improve technique, I think part of it is just honest ignorance. Most people aren’t very self-reflective and don’t think very much about whether they’re good at their jobs or what it means to be good at their jobs or how they could become better. Even when people do take time to reflect on what makes a good [profession], they may not have the relevant background to draw useful conclusions. Academic authorities often have little or no professional work experience; the median law professor has tried zero lawsuits; the median dean of a business school has never launched a startup; the median medical school lecturer has never worked as a primary care physician in the suburbs.
Some of it may be, as Isnasene points out, a desire to avoid unwanted competition. If people are lazy and want to enjoy high status that they earned a long time ago without putting in further effort, they might not want to encourage comparisons of skill levels.
Finally, as lsusr suggests, some of the taboo probably comes from an effort to preserve a fragile social hierarchy, but I don’t think the threat is “awareness of internal contradictions;” I think the threat is simply a common-sense idea of fairness or equity. If authorities or elites are no more objectively skillful than a typical member of their profession, then there is little reason for them to have more power, more money, or easier work. Keeping the conversation firmly fixed on discussion *about* the profession (rather than discussion about *how to do* the profession) helps obscure the fact that the status of elites is unwarranted.

Mass_Driver Mar 10, 2018, 5:40 AM
5 points
0
on: The abruptness of nuclear weapons
I like the style of your analysis. I think your conclusion is wrong because of wonky details about World War 2. 4 years of technical progress at anything important, delivered for free on a silver platter, would have flipped the outcome of the war. 4 years of progress in fighter airplanes means you have total air superiority and can use enemy tanks for target practice. 4 years of progress in tanks means your tanks are effectively invulnerable against their opponents, and slice through enemy divisions with ease. 4 years of progress in manufacturing means you outproduce your opponent 2:1 at the front lines and overwhelm them with numbers. 4 years of progress in cryptography means you know your opponent’s every move and they are blind to your strategy.
Meanwhile, the kiloton bombs were only able to cripple cities “in a single mission” because nobody was watching out for them. Early nukes were so heavy that it’s doubtful whether the slow clumsy planes that carried them could have arrived at their targets against determined opposition.
There is an important sense in which fission energy is discontinuously better than chemical energy, but it’s not obvious that this translates into a discontinuity in strategic value per year of technological progress.

Mass_Driver May 26, 2017, 7:51 AM
40 points
0
on: Dragon Army: Theory & Charter (30min read)
1) I agree with the very high-level point that there are lots of rationalist group houses with flat / egalitarian structures, and so it might make sense to try one that’s more authoritarian to see how that works. Sincere kudos to you for forming a concrete experimental plan and discussing it in public.

2) I don’t think I’ve met you or heard of you before, and my first impression of you from your blog post is that you are very hungry for power. Like, you sound like you would really, really enjoy being the chief of a tribe, bossing people around, having people look up to you as their leader, feeling like an alpha male, etc. The main reason this makes me uncomfortable is that I don’t see you owning this desire anywhere in your long post. Like, if you had said, just once, “I think I would enjoy being a leader, and I think you might enjoy being led by me,” I would feel calmer. Instead I’m worried that you have convinced yourself that you are grudgingly stepping up as a leader because it’s necessary and no one else will. If you’re not being fully honest about your motivations for nominating yourself to be an authoritarian leader, what else are you hiding?

3) Your post has a very high ratio of detailed proposals to literature review. I would have liked to see you discuss other group houses in more detail, make reference to articles or books or blog posts about the theory of cohousing and of utopian communities more generally, or otherwise demonstrate that you have done your homework to find out what has worked, what has not worked, and why. None of your proposals sound obviously bad to me, and you’ve clearly put some thought and care into articulating them, but it’s not clear whether your proposals are backed up by research, or whether you’re just reasoning from your armchair.

4) Why should anyone follow you on an epic journey to improve their time management skills if you’re sleep-deprived and behind schedule on writing a blog post? Don’t you need to be more or less in control of your own lifestyle before you can lead others to improve theirs?

Mass_Driver Dec 23, 2016, 10:08 AM
1 point
0
on: Expecting Short Inferential Distances

And if you think you can explain the concept of “systematically underestimated inferential distances” briefly, in just a few words, I’ve got some sad news for you...

“I know [evolution] sounds crazy—it didn’t make sense to me at first either. I can explain how it works if you’re curious, but it will take me a long time, because it’s a complicated idea with lots of moving parts that you probably haven’t seen before. Sometimes even simple questions like ‘where did the first humans come from?’ turn out to have complicated answers.”

Mass_Driver Dec 14, 2016, 1:28 AM
2 points
0
in reply to: Qiaochu_Yuan’s comment on: Further discussion of CFAR’s focus on AI safety, and the good things folks wanted from “cause neutrality”
I am always trying to cultivate a little more sympathy for people who work hard and have good intentions! CFAR staff definitely fit in that basket. If your heart’s calling is reducing AI risk, then work on that! Despite my disappointment, I would not urge anyone who’s longing to work on reducing AI risk to put that dream aside and teach general-purpose rationality classes.

That said, I honestly believe that there is an anti-synergy between (a) cultivating rationality and (b) teaching AI researchers. I think each of those worthy goals is best pursued separately.

Mass_Driver Dec 13, 2016, 8:25 AM
4 points
0
in reply to: Qiaochu_Yuan’s comment on: Further discussion of CFAR’s focus on AI safety, and the good things folks wanted from “cause neutrality”
Yeah, that pretty much sums it up: do you think it’s more important for rationalists to focus even more heavily on AI research so that their example will sway others to prioritize FAI, or do you think it’s more important for rationalists to broaden their network so that rationalists have more examples to learn from?

Shockingly, as a lawyer who’s working on homelessness and donating to universal income experiments, I prefer a more general focus. Just as shockingly, the mathematicians and engineers who have been focusing on AI for the last several years prefer a more specialized focus. I don’t see a good way for us to resolve our disagreement, because the disagreement is rooted primarily in differences in personal identity.

I think the evidence is undeniable that rationality memes can help young, awkward engineers build a satisfying social life and increase their productivity by 10% to 20%. As an alum of one of CFAR’s first minicamps back in 2011, I’d hoped that rationality would amount to much more than that. I was looking forward to seeing rationalist tycoons, rationalist Olympians, rationalist professors, rationalist mayors, rationalist DJs. I assumed that learning how to think clearly and act accordingly would fuel a wave of conspicuous success, which would in turn attract more resources for the project of learning how to think clearly, in a rapidly expanding virtuous cycle.

Instead, five years later, we’ve got a handful of reasonably happy rationalist families, an annual holiday party, and a couple of research institutes dedicated to pursuing problems that, by definition, will provide no reliable indicia of their success until it is too late. I feel very disappointed.

Mass_Driver Dec 13, 2016, 5:13 AM
1 point
0
in reply to: Qiaochu_Yuan’s comment on: Further discussion of CFAR’s focus on AI safety, and the good things folks wanted from “cause neutrality”
Well, like I said, AI risk is a very important cause, and working on a specific problem can help focus the mind, so running a series of AI-researcher-specific rationality seminars would offer the benefit of (a) reducing AI risk, (b) improving morale, and (c) encouraging rationality researchers to test their theories using a real-world example. That’s why I think it’s a good idea for CFAR to run a series of AI-specific seminars.

What is the marginal benefit gained by moving further along the road to specialization, from “roughly half our efforts these days happen to go to running an AI research seminar series” to “our mission is to enlighten AI researchers?” The only marginal benefit I would expect is the potential for an even more rapid reduction in AI risk, caused by being able to run, e.g., 4 seminars a quarter for AI researchers, instead of 2 for AI researchers and 2 for the general public. I would expect any such potential to be seriously outweighed by the costs I describe in my main post (e.g., losing out on rationality techniques that would be invented by people who are interested in other issues), such that the marginal effect of moving from 50% specialization to 100% specialization would be to increase AI risk. That’s why I don’t want CFAR to specialize in educating AI researchers to the exclusion of all other groups.

Mass_Driver Dec 12, 2016, 4:23 PM
9 points
0
on: Further discussion of CFAR’s focus on AI safety, and the good things folks wanted from “cause neutrality”
I dislike CFAR’s new focus, and I will probably stop my modest annual donations as a result.

In my opinion, the most important benefit of cause-neutrality is that it safeguards the integrity of the young and still-evolving methods of rationality. If it is official CFAR policy that reducing AI risk is the most important cause, and CFAR staff do almost all of their work with people who are actively involved with AI risk, and then go and do almost all of their socializing with rationalists (most of whom also place a high value on reducing AI risk), then there will be an enormous temptation to discover, promote, and discuss only those methods of reasoning that support the viewpoint that reducing AI risk is the most important value. This is bad partly because it might stop CFAR from changing its mind in the face of new evidence, but mostly because the methods that CFAR will discover (and share with the world) will be stunted—students will not receive the best-available cognitive tools; they will only receive the best-available cognitive tools that encourage people to reduce AI risk. You might also lose out on discovering methods of (teaching) rationality that would only be found by people with different sorts of brains—it might turn out that the sort of people who strongly prioritize friendly AI think in certain similar ways, and if you surround yourself with only those people, then you limit yourself to learning only what those people have to teach, even if you somehow maintain perfect intellectual honesty.

Another problem with focusing exclusively on AI risk is that it is such a Black Swan-type problem that it is extremely difficult to measure progress, which in turn makes it difficult to assess the value or success of any new cognitive tools. If you work on reducing global warming, you can check the global average temperature. More importantly, so can any layperson, and you can all evaluate your success together. If you work on reducing nuclear proliferation for ten years, and you haven’t secured or prevented a single nuclear warhead, then you know you’re not doing a good job. But how do you know if you’re failing to reduce AI risk? Even if you think you have good evidence that you’re making progress, how could anyone who’s not already a technical expert possibly assess that progress? And if you propose to train all of the best experts in your methods, so that they learn to see you as a source of wisdom, then how many of them will retain the capacity to accuse you of failure?

I would not object to CFAR rolling out a new line of seminars that are specifically intended for people working on AI risk; it is a very important cause, and there’s something to be gained in working on a specific problem, and as you say, CFAR is small enough that CFAR can’t do it all. But what I hear you saying is that the mission is now going to focus exclusively on reducing AI risk. I hear you saying that if all of CFAR’s top leadership is obsessed with AI risk, then the solution is not to aggressively recruit some leaders who care about other topics, but rather to just be honest about that obsession and redirect the institution’s policies accordingly. That sounds bad. I appreciate your transparency, but transparency alone won’t be enough to save the CFAR/MIRI community from the consequences of deliberately retreating into a bubble of AI researchers.

LINK: Quora brainstorms strategies for containing AI risk

Mass_Driver May 26, 2016, 4:32 PM
10 points
1 comment, 1 min read

Mass_Driver Feb 19, 2016, 5:03 PM
2 points
0
in reply to: RomeoStevens’s comment on: Rationality Quotes Thread February 2016
Does anyone know what happened to TC Chamberlin’s proposal? In other words, shortly after 1897, did he in fact manage to spread better intellectual habits to other people? Why or why not?

Mass_Driver Aug 14, 2015, 10:04 PM
0 points
0
in reply to: EngineerofScience’s comment on: Help Build a Landing Page for Existential Risk?
Thank you! I see that some people voted you down without explaining why. If you don’t like someone’s blurb, please either contribute a better one or leave a comment to specifically explain how the blurb could be improved.

Mass_Driver Aug 9, 2015, 4:44 PM
0 points
0
in reply to: EngineerofScience’s comment on: Help Build a Landing Page for Existential Risk?
Sure!

Mass_Driver Aug 7, 2015, 6:38 PM
1 point
0
in reply to: Dorikka’s comment on: Help Build a Landing Page for Existential Risk?
Again, fair point—if you are reading this, and you have experience designing websites, and you are willing to donate a couple of hours to build a very basic website, let us know!

Mass_Driver Aug 7, 2015, 6:36 PM
4 points
0
in reply to: CCC’s comment on: Help Build a Landing Page for Existential Risk?
Sounds good to me. I’ll keep an eye out for public domain images of the Earth exploding. If the starry background takes up enough of the image, then the overall effect will probably still hit the right balance between alarm and calm.

A really fun graphic would be an asteroid bouncing off a shield and not hitting Earth, but that might be too specific.

Mass_Driver Aug 7, 2015, 6:34 PM
0 points
0
in reply to: EngineerofScience’s comment on: Help Build a Landing Page for Existential Risk?
Great! Pick one and get started, please. If you can’t decide which one to do, please do asteroids.

Mass_Driver Aug 7, 2015, 6:33 PM
1 point
0
in reply to: EngineerofScience’s comment on: Help Build a Landing Page for Existential Risk?
It would go to the best available charity that is working to fight that particular existential risk. For example, the ‘donate’ button for hostile AI might go to MIRI. The donate button for pandemics might go to the Center for Disease Control, and the donate button for nuclear holocaust might go to the Global Threat Reduction Initiative. If we can’t agree on which agency is best for a particular risk, we can pick one at random from the front-runners.

If you have ideas for which charities are the best for a particular risk, please share them here! That is part of the work that needs to get done.
