My name is Alex Turner. I’m a research scientist at Google DeepMind on the Scalable Alignment team. My views are strictly my own; I do not represent Google. Reach me at alex[at]turntrout.com
TurnTrout
I agree that HPMOR may be the best way to get someone to want to read the initially opaque-seeming Sequences: “what if my thought processes were as clear as Rational!Harry’s?”. But the issue then becomes how to send a credible signal that HPMOR is more than just a fun read for people with time to spare, especially to individuals who don’t already read regularly (which was me at the time; luckily, I have a slightly addictive personality and got sucked in).
My little brother will be entering college soon, so I gave him the gift I wish I had received at that age: a set of custom-printed HPMOR tomes. I think this is a stronger signal, but it’s too costly (and probably strange) to do this for people with whom we aren’t as close.
I agree, and I realized this a bit after leaving my keyboard. The problem, in my opinion, is that we don’t have enough people doing this kind of outreach. It might be better to have more people doing pretty good outreach than just a few doing great outreach.
The other question is how hard it is to find people like me—constant effort for a very low probability outcome could be suboptimal compared to just spending more time on the problem ourselves. I don’t think we’re there yet, but it’s something to consider.
can confirm, that’s what I had in mind (at least in my case).
Interpersonal Approaches for X-Risk Education
Nitpick: I’m not yet working on alignment. Also, if someone had given me Superintelligence a year ago, I probably would have fixated on all the words I didn’t know instead of taking the problem seriously. They might become aware of the problem, and maybe even work on it—but as habryka pointed out, they wouldn’t be using rationalist methodology.
Edit: a lot of the value of reading Superintelligence came from having to seriously consider the problem for an extended period of time. I had already read CEV, WaitButWhy, IntelligenceExplosion, and various LW posts about malevolent genies, but it still hadn’t reached the level of “I, personally, want and need to take serious action on this”. I find it hard to imagine that someone could simply pick up Superintelligence and skip right to this state of mind, but maybe I’m generalizing too much from my situation.
I think that we could increase the proportion of LWers actually doing something about this via positive social expectation: peer-centric goal-setting and feedback. Positive social expectation (as I’ve taken to calling it) is what happens when you agree to meet a friend at the gym at 5: you’re much more likely to honor a commitment to a friend than one to yourself. I founded a student group to this effect at my undergrad and am currently collaborating with my university (I highly recommend reading the writeup) to implement it on a larger scale.
Basically, we could have small groups of people checking in once a week for half an hour. Each person briefly summarizes their last week and what they want to do in the next week; others can share their suggestions. Everyone sets at least one habit goal (stop checking email more than once a day) and one performance goal (read x chapters of set theory, perhaps made possible by improved methodology suggested by more-/differently-experienced group members).
I believe that the approach has many advantages over having people self-start:
lowered psychological barrier to getting started on x-risk (all they have to do is join a group; they see other people who aren’t (already) supergeniuses like Eliezer doing work, so they feel better about their own influence)
higher likelihood of avoiding time / understanding sinks (bad / unnecessary textbooks)
increased instrumental rationality
lower likelihood of burnout / negative affect spirals / unsustainable actions being taken
a good way to form friendships in the LW community
a robust way to get important advice (not found in the Sequences) to newer people, since that advice may not be indexed under the keywords they initially think to search for
The downside is the small weekly time commitment.
I’ll probably make a post on this soon, and perhaps even a sequence on Actually Getting Started (as I’ve been reorganizing my life with great success).
I think the social commitment device can be useful to get started, but I think you should very rapidly try to evolve such that you don’t need it
I agree. At uni, the idea is that it gets people into a framework where they’re able to get started, even if they aren’t self-starters. Here, one of the main benefits would be that people at various stages of the pipeline could share what worked and what didn’t. For example, knowing that understanding one textbook is way easier if you’ve already learned a prereq is valuable information, and that doesn’t seem to always be trivially-knowable ex ante. The onus is less on the social commitment and more on the “team of people working to learn AI Safety fundamentals”.
I think x-risk really desperately needs people who already have the “I can self-start on my own” and “I can think usefully for myself” properties.
Agreed. I’m not looking to make the filter way easier to pass, but rather to encourage people to keep working. “I can self-start” is necessary, but I don’t think we can expect everyone to be able to self-motivate indefinitely in the face of a large corpus of unfamiliar technical material. Sure, a good self-starter will reboot eventually, but it’s better to have lightweight support structures that maintain a smooth rate of progress.
Additionally, my system 1 intuition is that there are people close to the self-starter threshold who are willing to work on safe AI, and that these people can be transformed into grittier, harder workers with the right structure. Maybe that’s not even worth our time, but it’s important to keep in mind the multiplicative benefits possible from actually getting more people involved. I could also be falling prey to the typical mind fallacy, as I only got serious when my worry overcame my doubts of being able to do anything.
And if it then turns out that the person they’re basically mentoring doesn’t have the “I can self start, self motivate, and think for myself” properties, then the mentor hasn’t gained an ally—they’ve gained a new obligation to take care of, or spend energy checking in on, or they just wasted their time.
Perhaps a more beneficial structure than “one experienced person receives lots of obligations” could be “three pairs of people (all working on learning different areas of the syllabus at any given time) share insights they picked up in previous iterations”. Working in pairs could boost efficiency relative to working alone, since each person makes different mistakes; together, they smooth over the rough spots in their learning. I remember a post from a few years back in which most of the poster’s autodidact problems turned out to be trivial errors that couldn’t easily be fixed without someone familiar with that specific material.
One more thought:
The AI safety field keeps having people show up who say “I want to help”, and then it turns out not to be that easy to help, so those people sort of shrug and go back to their lives.
I think this can be nearly completely solved with a method detailed in Decisive: expectation-setting. I remember reading that when employers warned potential employees about the difficulty and frustration involved in the job, retention skyrocketed. People (mostly) weren’t discouraged from the process; having their expectations set properly actually made them not mind the experience.
I haven’t reached out to anyone yet, primarily because I imagined that they (Luke, Eliezer, etc) receive many of these kinds of “I’m super excited to help, what can I do?” emails and pattern-matched that onto “annoying person who didn’t read the syllabus”. What has your experience been?
(I live in Oregon)
I think that could be a good idea, too. The concern is whether there is substantial meaningful (and non-technical) discussion left to be had. I haven’t been on LW very long, but in my time here it has seemed like most people agree on FAI being super important. This topic seems (to me) to have already been well-discussed, but perhaps that was in part because I was searching out that content.
For many people, the importance hasn’t sunk in at a gut level: unfriendly AI plausibly can, and probably will (on longer timescales, at the very minimum), annihilate everything we care about, to a greater extent than even nuclear war would. That is, if we don’t act. That’s a hell of a claim to believe, and it takes even more to be willing to do something about it.
I don’t have a bug fix story yet, but I don’t think I just speak for myself when I say my biggest (non-x-risk) fear is becoming and achieving nothing compared to my potential.
I also generally place great trust in my future self (as my rationality gradient has been positive for quite some time) and firm trust in my immediate past self (allowing me to precommit to emotionally fraught decisions, like a difficult breakup, wherein my in-the-moment regret might otherwise overwhelm sound judgment).
I’m sorry I couldn’t answer the main question, but I’m definitely appreciating this sequence so far.
One thing to be aware of: you probably need to make reading the notes intentional, because once they stop being new, you’ll stop noticing them.
meta-TAP: wake up and read post-its
I can’t make it in person due to school commitments, but I’m extremely interested.
I’d probably be able to make that! It depends on how long I can get off work and whether I’m able to make a CFAR workshop like I planned.
I wouldn’t mind having that kind of group. I’d love to have LW friends.
Will just email about this.
How should we connect? Do you get notifications for private messages? I don’t know how to see conversations.
I think this post is great, but I personally already have most of this implemented. My room is clean and well-decorated, my computer cabling (both internal and peripheral) is well-maintained, and I even keep my Desktop totally clear. My programs update automatically via Chocolatey, backups run automatically, and antivirus takes care of itself via scheduled tasks.
My space feels pretty well optimized, but perhaps I’m missing yet further applications.
Chocolatey is a package manager for Windows. If you install your software via its command line, you can set a daily scheduled task to run a script that updates everything (‘choco upgrade all -y’, or ‘cup all -y’ for short).
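For anyone who wants to replicate that, here’s a minimal sketch (not my exact setup) of registering the daily task from an elevated prompt; the task name and start time are just placeholders:

```
REM Register a daily task that runs Chocolatey's upgrade-all command at 3 AM.
REM "ChocoDailyUpgrade" and the time are arbitrary; adjust to taste.
schtasks /Create /TN "ChocoDailyUpgrade" /TR "choco upgrade all -y" /SC DAILY /ST 03:00 /RU SYSTEM
```

Running it under the SYSTEM account means the upgrades don’t prompt for elevation; you could instead point /TR at a small script if you want logging.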
Beyond that, I think you covered it!
Regarding 3), both of the AI-minded professors I spoke to at my university dismissed AI alignment work due to this epistemic issue.
Nothing related to AI safety taught here, but I’ll spend my free time at my PhD program going through MIRI’s reading list.