Things will basically be fine regarding job loss and unemployment due to AI in the next several years and those worries are overstated
Quadratic Reciprocity
[Question] How do AI timelines affect how you live your life?
As someone with limited knowledge of AI or alignment, I found this post accessible. There were times when I thought I knew vaguely what Nate meant but would not be able to explain it, so I’m recording my confusions here to come back to when I’ve read up more. (If anyone wants to answer any of these r/NoStupidQuestions-style questions, that would be very helpful too.)
“Your first problem is that the recent capabilities gains made by the AGI might not have come from gradient descent”. This is something that comes up in response to a few of the plans. Is the idea that, for advanced enough AIs, capabilities gains during training come both from gradient descent and from processing input / interacting with the world? Or does the second part only happen after training has finished? What does that concretely look like in ML?
Is a lot of the disagreement about these plans just because others find the idea of a “sharp left turn” less likely than Nate does, or is there more agreement about that idea and the disagreement is about which proposals might give us a shot at solving it?
What might an ambitious interpretability agenda focused on the sharp left turn and the generalization problem look like besides just trying harder at interpretability?
Another explanation of the “sharp left turn” would also be really helpful to me. At the moment, it feels like I can only explain why that would happen by using analogies to humans/apes, rather than being able to give a clear explanation, in ML/alignment language, for why we should expect it by default.
EAs and rationalists should strongly consider having lots more children than they currently do
Is your “alignment research experiments I wish someone would run” list shareable? :)
I attended an AI pause protest recently and thought I’d write up what my experience was like for people considering going to future ones.
I hadn’t been to a protest ever before and didn’t know what to expect. I will probably attend more in the future.
Some things that happened:
There were about 20 people protesting. I arrived a bit after the protest had begun and it was very easy and quick to get oriented. It wasn’t awkward at all (and I’m normally pretty socially anxious and awkward). The organisers had flyers printed out to give away and there were some extra signs I could hold up.
I held up a sign for some of the protest and tried handing out flyers the rest of the time. I told people who passed by that we were talking about the danger from AI and asked if they’d like a flyer. Most of them declined but a substantial minority accepted the flyer.
I got the sense that a lot of people who picked up a flyer weren’t just doing it to be polite. For example, I had multiple people walking by mention to me that they agreed with the protest. A person in a group of friends who walked by looked at the flyer and mentioned to their friends that they thought it was cool someone was talking about this.
There were also people who got flyers who misunderstood or didn’t really care for what we were talking about. For example, a mother pointed at the flyer and told her child “see, this is why you should spend less time on your phone.”
I think giving out the flyers was a good thing overall. Some people seemed genuinely interested. Others, even those who rejected it, were pretty polite. Felt like a wholesome experience. If I had planned more for the protest, I think I would have liked to print my own flyers. I also considered adding contact details to the flyers in case people wanted to talk about the content. It would have been interesting to get a better sense of what people actually thought.
During the protest, a person was using a megaphone to talk about AI risk, and there were chants and a bit of singing at the end. I really liked the bit at the end; it felt a bit emotional for me in a good way, and I gave away a large fraction of the flyers near the end when more people stopped by to see what was going on.
I overheard some people talk about wanting to debate us. I was sad I didn’t get the chance to properly talk to them (plausibly I could have started a conversation while they were waiting for the pedestrian crossing lights to turn green). I think at a future protest, I would like to have a “debate me” or “ask me questions” sign to be able to talk to people in more depth rather than just superficially.
It’s hard to give people a pitch for AI risk in a minute
I feel more positive about AI pause advocacy after the protest, though I do feel uneasy because of not having total control over the PauseAI website and the flyers. It still feels roughly close to my views though.
I liked that there were a variety of signs at the protest, representing a wider spectrum of views than just the most doomy ones. Something about there being multiple people there with whom I would probably disagree a lot made it feel nicer.
Lots more people are worried about job loss than extinction and want to hear about that. The economist in me will not stop giving them an optimistic picture of AI and employment before telling them about extinction. This is hard to do when you only have a couple of minutes but it feels good being honest about my actual views.
Things I wish I’d known in advance:
It’s pretty fun talking to strangers! A person who was there briefly asked about AI risk, I suggested podcast episodes to him, and he invited me to a Halloween party. It was cool!
I did have some control over when I was photographed and could choose to not be in photos that might be on Twitter if I didn’t feel comfortable with that yet.
I could make my own signs or flyers that represented my views accurately (though it’s still good for the signs to not have many words).
Reflections on Bay Area visit
GPT-4 generated TL;DR (mostly endorsed but eh):
The beliefs of prominent AI safety researchers may not be as well-founded as expected, and people should be cautious about taking their beliefs too seriously.
There is a tendency for people to overestimate their own knowledge and confidence in their expertise.
Social status plays a significant role in the community, with some individuals treated like “popular kids.”
Important decisions are often made in casual social settings, such as lunches and parties.
Geographical separation of communities can be helpful for idea spread and independent thought.
The community has a tendency to engage in off-the-cuff technical discussions, which can be both enjoyable and miscalibrated.
Shared influences, such as Eliezer’s Sequences and HPMOR, foster unique and enjoyable conversations.
The community is more socially awkward and tolerant of weirdness than other settings, leading to more direct communication.
I was recently in Berkeley and interacted a bunch with the longtermist EA / AI safety community there. Some thoughts on that:
I changed my mind about how much I should trust the beliefs of prominent AI safety researchers. It seems like they have thought less deeply about things to arrive at their current beliefs and are less intimidatingly intelligent and wise than I would have expected. The problem isn’t that they’re overestimating their capabilities and how much they know but that some newer people take the more senior people’s beliefs and intuitions more seriously than they should.
I noticed that many people knew a lot about their own specific area and not as much about others’ work as I would have expected. This observation makes me more likely to point out when I think someone is missing something instead of assuming they’ve read the same things I have and so already accounted for the thing I was going to say.
It seemed like more people were overconfident about the things they knew. I’m not sure if that is necessarily bad in general for the community; I suspect pursuing fruitful research directions often means looking overconfident to others because you trust your intuitions and illegible models over others’ reasoning. However, from the outside, it did look like people made confident claims about technical topics that weren’t very rigorous and that I suspect would fall apart if they were asked to actually clarify things further. I sometimes heard claims like “I’m the only person who understands X”, where X was some hot topic related to AI safety, followed by a vague description of X that wasn’t very compelling on its own.
What position or status someone has in the community doesn’t track their actual competence or expertise as much as I would have expected and is very affected by how and when they got involved in the community.
Social status is a big thing, though more noticeable in settings where there are many very junior people and some senior researchers. I also got the impression that senior people were underestimating how seriously people took the things they said, such as off-the-cuff casual remarks about someone’s abilities, criticism of someone’s ideas, and random hot takes they hadn’t thought about for too long. (It feels weird to call them “senior” people when everyone’s basically roughly the same age.)
In some ways, it felt like a mild throwback to high school, with there being “popular kids” that people wanted to be around, and also because of how prevalent gossip about those people’s personal lives is.
Important decisions are made in very casual social settings like over lunch or at random parties. Multiple people mentioned they primarily go to parties or social events for professional reasons. Things just seem more serious/“impactful”. It sometimes felt like I was being constantly evaluated, especially on intelligence, even while trying to just have enjoyable social interactions, though I did manage to find social environments in the end that did not feel this way, or possibly I just stopped being as anxious about that.
It possibly made it more difficult for me to switch off the part of my brain that thinks constantly about AI existential risk.
I think it is probably quite helpful to have multiple communities separated geographically to allow ideas to spread. I think my being a clueless outsider with limited knowledge of what various people thought of various other people’s work made it easier for me to form my own independent impressions.
Good parts
The good parts were that it was easier to have more technical conversations that assumed lots of context, even at random parties, which is sometimes enjoyable for me and something I now miss. Though I wish a greater proportion of them had been about fun mathy things in general rather than just things directly relevant to AI safety.
It also felt like people stated their off-the-cuff takes on technical topics (eg: random areas of biology) a lot more than usual. This was a bit weird for me in the beginning when I was experiencing deep imposter syndrome because I felt like they knew a lot about the thing they were talking about. Once I realised they did not, this was a fun social activity to participate in. Though I think some people take it too far and are miscalibrated about how correct their armchair thinking is on topics they don’t have actual expertise in.
I also really enjoyed hanging out with people who had been influenced by some of the same things I had been influenced by such as Eliezer’s Sequences and HPMOR. It felt like there were some fun conversations that happened there as a result that I wouldn’t be able to have with most people.
There was also noticeably slightly more social awkwardness in general, which was great for me as someone who doesn’t have the most elite social skills in normal settings. It felt like people were more tolerant of some forms of weirdness. It also felt like once I got back home, I was noticeably more direct in the way I communicated (a friend mentioned this) as a result of the Bay Area culture. I also previously thought some Bay Area people were a bit rude and unapproachable, having only read their interactions on the internet, but I think this was largely just caused by it being difficult to convey tone via text, especially when you’re arguing with someone. People were more friendly, approachable, and empathetic in real life than I assumed, and now I view the interactions I have with them online somewhat differently.
Cullen O’Keefe also no longer at OpenAI (as of last month)
This was a somewhat emotional read for me.
When I was between the ages of 11 and 14, I remember being pretty intensely curious about lots of stuff. I learned a bunch of programming and took online courses on special relativity, songwriting, computer science, and lots of other things. I liked thinking about maths puzzles that were a bit too difficult for me to solve. I had weird and wild takes on things I learned in history class that I wanted to share with others. I liked looking at ants and doing experiments on their behaviour.
And then I started to feel like all my learning and doing had to be directed at particular goals, and this sapped my motivation and curiosity. I am regaining some of it, but it does feel like my ability to think in interesting and fun directions has been damaged. It’s not just the feeling of “I have to be productive” that was very bad for me but also other things, like wanting to have legible achievements that I could talk about (trying to learn more maths topics off a checklist instead of exploring and having fun with the maths I wanted to think about) and some anxiety around not knowing or being able to do the same things as others (not trying my hand at thinking about puzzles/questions I think I’ll fail at, and instead trying to learn “important” things I felt bored/frustrated by because I wanted to feel more secure about my knowledge/intelligence when around others who knew lots of things).
In my early attempts to fix this, I tried to force playful thinking, and this frame made things worse. Because, like you said, my mind already wants to play. I just have to notice and let it do that freely, without judgment.
Was having an EA conversation with some uni group organisers recently, and it was terrifying to me that a substantial portion of them, in response to FTX, wanted to do PR for EA (implied in, for example, supporting putting out messages of the form “EA doesn’t condone fraud” on their uni group’s social media accounts), and also that a couple of them seemed to be running a naive version of consequentialism that endorsed committing fraud/breaking promises if the calculations worked out in favour of doing that for the greater good. Most interesting was that one group organiser was in both camps at once.
I think it is bad vibes that these uni students feel so emotionally compelled to defend EA, the ideology and community, from attack, and this seems plausibly really harmful for their own thinking.
I had this idea in my head of university group organisers modifying what they say to be more positive about EA ideas to newcomers, but I thought this was a scary concern I was mostly making up. After some interactions with uni group organisers outside my bubble, it feels more important to me. People explicitly mentioned policing what they said to newcomers in order to not turn them off or give them reasons to doubt EA, and tips like “don’t criticise new people’s ideas in your first interactions with them as an EA community builder in order to be welcoming” were mentioned.
All this to say: I think some rationality ideas I consider pretty crucial for people trying to do EA uni group organising to be exposed to are not having the reach they should.
It would also be interesting to see examples of what terrible ops looks like. As one of the “kids”, here are some examples of things that were bad in previous projects I worked on (some mistakes for which I was responsible):
- getting obsessed with some bad metric (eg: number of people who come to an event) and spending lots of hours getting that number up instead of thinking about why I was doing that
- so many meetings, calling a meeting if there’s any uncertainty about what to do next
- there being uncertainty about what to do next because team members lack context, don’t know who’s responsible for what, who is working on what, etc.
- some people taking on too much responsibility and not being able to pass it on because having to explain how to do a task to someone else would itself take up too much time
- very disorganised meetings where everyone wanted to have a say and it wasn’t clear at the end of it what the action steps were
- an unwillingness for the person with the most context to take the role of explicitly telling others what concrete tasks to do (because the other team members were their friends and they didn’t want to be too bossy)
- there not being enough structure for team members to give feedback if they thought an idea or project someone else was very excited about would be useless. As a result, some mini-projects got incubated that people weren’t excited about, or projects predictably failed because the person who wanted to do them did not have enough information to figure out how to avoid the failure modes other team members would have been concerned about (intuitions not being shared well). Also, people picking projects and tasks to do for bad reasons rather than via structured thinking about priorities
- including too many people in meetings because “we’d love to get person X’s thoughts on our plans as well (and person Y and person Z...)”
- it being hard to trust that other team members would actually get things assigned to them done on time, partly because it was difficult to see partial progress without having to ask the person how things were going
- changing platforms and processes based on whims rather than figuring things out early and sticking with them
I think some of the things mentioned in the post are pretty helpful for avoiding some of these problems.
Current AI safety university groups are overall a good idea and helpful, in expectation, for reducing AI existential risk
Here are some Twitter accounts I’ve found useful to follow (in no particular order): Quintin Pope, Janus @repligate, Neel Nanda, Chris Olah, Jack Clark, Yo Shavit @yonashav, Oliver Habryka, Eliezer Yudkowsky, alex lawsen, David Krueger, Stella Rose Biderman, Michael Nielsen, Ajeya Cotra, Joshua Achiam, Séb Krier, Ian Hogarth, Alex Turner, Nora Belrose, Dan Hendrycks, Daniel Paleka, Lauro Langosco, Epoch AI Research, davidad, Zvi Mowshowitz, Rob Miles
Hopefully this isn’t too rude to say, but: I am indeed confused how you could be confused
Fwiw, I was also confused and your comment makes a lot more sense now. I think it’s just difficult to convert text into meaning sometimes.
Interesting bet on AI progress (with actual money) made in 1968:
1968 – Scottish chess champion David Levy makes a 500 pound bet with AI pioneers John McCarthy and Donald Michie that no computer program would win a chess match against him within 10 years.
1978 – David Levy wins the bet made 10 years earlier, defeating Chess 4.7 in a six-game match by a score of 4½–1½. The computer’s victory in game four is the first defeat of a human master in a tournament.
In 1973, Levy wrote:
“Clearly, I shall win my … bet in 1978, and I would still win if the period were to be extended for another ten years. Prompted by the lack of conceptual progress over more than two decades, I am tempted to speculate that a computer program will not gain the title of International Master before the turn of the century and that the idea of an electronic world champion belongs only in the pages of a science fiction book.”
After winning the bet:
“I had proved that my 1968 assessment had been correct, but on the other hand my opponent in this match was very, very much stronger than I had thought possible when I started the bet.” He observed that, “Now nothing would surprise me (very much).”
In 1996, Popular Science asked Levy about Garry Kasparov’s impending match against Deep Blue. Levy confidently stated that “...Kasparov can take the match 6 to 0 if he wants to. ‘I’m positive, I’d stake my life on it.’” In fact, Kasparov lost the first game and won the match by a score of only 4–2. The following year, he lost their historic rematch 2½–3½.
So it seems like he very much underestimated progress in chess despite winning the original bet.
https://en.wikipedia.org/wiki/David_Levy_(chess_player)
I thought I didn’t get angry much in response to people making specific claims. I did some introspection about times in the recent past when I got angry, defensive, or withdrew from a conversation in response to claims that the other person made.
After some introspection, I think these are the mechanisms that made me feel that way:
They were very confident about their claim. Partly I felt annoyance because I didn’t feel like there was anything that would change their mind, and partly I felt annoyance because it felt like they didn’t have enough status to make very confident claims like that. This is more linked to the confidence in their body language and tone than to their confidence in their own claims, though both matter.
Credentialism: them being unwilling to explain things and taking it as a given that they were correct because I didn’t have the specific experiences or credentials that they had, without mentioning what specifically about gaining that experience would help me understand their argument.
Not letting me speak and interrupting quickly to take down the fuzzy strawman version of what I meant rather than letting me take my time to explain my argument.
Morality: I felt like one of my cherished values was being threatened.
The other person was relatively smart and powerful, at least within the specific situation. If they were dumb or not powerful, I would have just found the conversation amusing instead.
The other person assumed I was dumb or naive, perhaps because they had met other people with the same position as me and those people came across as not knowledgeable.
The other person getting worked up, for example, raising their voice or showing other signs of being irritated, offended, or angry while acting as if I was the emotional/offended one. This one particularly stings because of gender stereotypes. I think I’m more calm and reasonable and less easily offended than most people. I’ve had a few conversations with men where it felt like they were just really bad at noticing when they were getting angry or emotional themselves and kept pointing out that I was being emotional despite me remaining pretty calm (and perhaps even a little indifferent to the actual content of the conversation before the conversation moved to them being annoyed at me for being emotional).
The other person’s thinking being very black-and-white, seeing things in terms of a very clear good and evil and not being open to nuance. Sort of a similar mechanism to the first thing.
Some examples of claims that recently triggered me. They’re not so important themselves so I’ll just point at the rough thing rather than list out actual claims.
AI killing all humans would be good because thermodynamics god/laws of physics good
Animals feel pain but this doesn’t mean we should care about them
We are quite far from getting AGI
Women as a whole are less rational than men are
Palestine/Israel stuff
Doing the above exercise was helpful because it helped me generate ideas for things to try if I’m in situations like that in the future. But it feels like the most important thing is to just get better at noticing what I’m feeling in the conversation and, if I’m feeling bad and uncomfortable, to think about whether the conversation is useful to me at all and, if so, for what reason. And if not, to make a conscious decision to leave the conversation.
Reasons the conversation could be useful to me:
I change their mind
I figure out what is true
I get a greater understanding of why they believe what they believe
Enjoyment of the social interaction itself
I want to impress the other person with my intelligence or knowledge
Things to try will differ depending on why I feel like having the conversation.
If we’re thinking about the same “very, very, very high-level person at OpenAI”, it does seem like this person now buys that inner alignment is a thing and is concerned about it (or says he’s concerned). It is scary because people at these AI labs don’t know all that much about AI alignment but also hopeful because they don’t seem to disagree with it and maybe just need to be given the arguments in a good way by someone they would listen to?
From the comment thread:
I’m not a fan of *generic* regulation-boosting. Like, if I just had a megaphone to shout to the world, “More regulation of AI!” I would not use it. I want to do more targeted advocacy of regulation that I think is more likely to be good and less likely to result in regulatory capture.

What are specific regulations / existing proposals that you think are likely to be good? When people are protesting to pause AI, what do you want them to be speaking into a megaphone (if you think those kinds of protests could be helpful at all right now)?
I am constantly flipping back and forth between “I have terrible social skills” and “People only think I am smart and competent because I have charmed them with my awesome social skills”.
It is very unlikely that AI causes an existential catastrophe (Bostrom or Ord definition) but doesn’t result in human extinction. (That is, non-extinction AI x-risk scenarios are unlikely.)