I think there will probably be even more discussion of AI x-risk in the media in the near future. My own media consumption is quite filtered, but, for example, the last time I was in an Uber, the news channel on the radio mentioned Geoffrey Hinton thinking AI might kill us all. And it isn’t a distant problem for my parents the way climate change is, because they use ChatGPT and are both impressed and concerned by it. They’ll probably form thoughts on it anyway, and I’d prefer to be around to respond to their confusion and concerns.
It also seems plausible that there will be more AI panic and anxiety amongst some fraction of the general public in the near future. And I’d prefer that the people I love be eased into it rather than feeling panicked and anxious all at once without knowing how to deal with it.
It’s also useful for me to get a pulse on how people outside my social group (which is also heavily filtered) respond to AI x-risk arguments. For example, I didn’t know which ideas that seemed obvious to me (that being more intelligent doesn’t mean having nice values, why humans care about the things we care about, that something much smarter than us that aimed to take over would succeed quickly, etc.) were completely new to my parents or to friends who are not rationalist-adjacent(-adjacent).
I also think being honest with people close to me is more compassionate and good but that by itself wouldn’t compel me to actively discuss AI x-risk with them.
Quadratic Reciprocity
I think it’s plausible that too much effort is going into interp at the margin
What’s the counterfactual? Do you think newer people interested in AI safety should be doing other things instead of, for example, attempting one of the 200+ MI problems suggested by Neel Nanda? What other things?
I’m curious about whether I should change my shortform posting behaviour in response to higher site quality standards. I currently perceive shortform to be an alright place to post things that are quick and not aiming to be well-written or particularly useful for others to read, because they don’t clutter up the website the way a post or a comment on other people’s posts would.
Why is aliens wanting to put us in a zoo more plausible than the AI wanting to put us in a zoo itself?
Edit: Ah, there are more aliens around so even if the average alien doesn’t care about us, it’s plausible that some of them would?
And the biggest question for me is not, is AI going to doom the world? Can I work on this in order to save the world? A lot of people expect that would be the question. That’s not at all the question. The question for me is, is there a concrete problem that I can make progress on? Because in science, it’s not sufficient for a problem to be enormously important. It has to be tractable. There has to be a way to make progress. And this was why I kept it at arm’s length for as long as I did.
I thought this was interesting. But it does feel like, with this AI thing, we need more people backchaining from the goal of saving humanity instead of only looking forward to see what neat, tractable research questions present themselves.
One way people can help is by stating their beliefs on AI and the confidence in those beliefs to their friends, family members, and acquaintances who they talk to.
Currently, a bunch of people are coming across things in the news about humanity going extinct if AI progress continues as it has and no more alignment research happens. I would expect many of them not to think seriously about it, because it’s really hard to shake out of the “business as usual” frame. Most of your friends and family members probably know you’re a reasonable, thoughtful person, and it seems helpful to make people feel comfortable engaging with the arguments in a serious way instead of filing them away in some part of their brain that doesn’t affect their actions or predictions about the future in any way.
I have talked to my dad about how I feel very uncertain about making it to 40: that (with lots of uncertainty) I currently expect not to, unless there’s coordination to slow AI development or a lot more effort towards AI alignment. He is new to this, so he had a bunch of questions, but he said he didn’t find it weird and now thinks it is scary. It was interesting noticing the inferential distance, since he initially had confusions like “If the AI gets consciousness, won’t it want to help other conscious beings?” and “It feels weird to be so against change; humanity will adapt”, but I think he gets it now.
I think sharing sincerely the things you believe with more people is good.
Hopefully this isn’t too rude to say, but: I am indeed confused how you could be confused
Fwiw, I was also confused and your comment makes a lot more sense now. I think it’s just difficult to convert text into meaning sometimes.
Thanks for posting this. It’s insightful reading other people thinking through career/life planning of this type.
Am curious how you feel about the general state of the alignment community going into the midgame. Are there things you hoped you or the alignment community had more of, or achievable things that would have been nice to have in place by the time the early game ended?
“I have a crazy take that the kind of reasoning that is done in generative modeling has a bunch of things in common with the kind of reasoning that is valuable when developing algorithms for AI alignment”
Cool!!
Wow, the quoted text feels scary to read.
I have met people within effective altruism who seem to be trying to do scary, dark things to their beliefs and motivations, which feels in the same category: trying to convince themselves they don’t care about anything besides maximising impact or reducing x-risk. In at least one case, this involved deliberately thinking a lot about dying due to AI in order to start caring about it more, which, the way they described it, can’t be good for thinking clearly.
From Ray Kurzweil’s predictions for 2019 (written in 1999):
On Politics and Society
People are beginning to have relationships with automated personalities as companions, teachers, caretakers, and lovers. Automated personalities are superior to humans in some ways, such as having very reliable memories and, if desired, predictable (and programmable) personalities. They are not yet regarded as equal to humans in the subtlety of their personalities, although there is disagreement on this point.
An undercurrent of concern is developing with regard to the influence of machine intelligence. There continue to be differences between human and machine intelligence, but the advantages of human intelligence are becoming more difficult to identify and articulate. Computer intelligence is thoroughly interwoven into the mechanisms of civilization and is designed to be outwardly subservient to apparent human control. On the one hand, human transactions and decisions require by law a human agent of responsibility, even if fully initiated by machine intelligence. On the other hand, few decisions are made without significant involvement and consultation with machine-based intelligence.
Public and private spaces are routinely monitored by machine intelligence to prevent interpersonal violence. People attempt to protect their privacy with near-unbreakable encryption technologies, but privacy continues to be a major political and social issue with each individual’s practically every move stored in a database somewhere.
The existence of the human underclass continues as an issue. While there is sufficient prosperity to provide basic necessities (secure housing and food, among others) without significant strain to the economy, old controversies persist regarding issues of responsibility and opportunity. The issue is complicated by the growing component of most employment’s being concerned with the employee’s own learning and skill acquisition. In other words, the difference between those “productively” engaged and those who are not is not always clear.
On The Arts
Virtual artists in all of the arts are emerging and are taken seriously. These cybernetic visual artists, musicians, and authors are usually affiliated with humans or organizations (which in turn are comprised of collaborations of humans and machines) that have contributed to their knowledge base and techniques. However, interest in the output of these creative machines has gone beyond the mere novelty of machines being creative.
Visual, musical, and literary art created by human artists typically involve a collaboration between human and machine intelligence.
The type of artistic and entertainment product in greatest demand (as measured by revenue generated) continues to be virtual-experience software, which ranges from simulations of “real” experiences to abstract environments with little or no corollary in the physical world.
On Philosophy
There are prevalent reports of computers passing the Turing Test, although these instances do not meet the criteria (with regard to the sophistication of the human judge, the length of time for the interviews, etcetera) established by knowledgeable observers. There is a consensus that computers have not yet passed a valid Turing Test, but there is growing controversy on this point.
The subjective experience of computer-based intelligence is seriously discussed, although the rights of machine intelligence have not yet entered mainstream debate. Machine intelligence is still largely the product of a collaboration between humans and machines, and has been programmed to maintain a subservient relationship to the species that created it.
There are too many books I want to read but probably won’t get around to reading any time soon. I’m more likely to read them if someone else is also reading at a similar pace and I can talk to them about the book. If anyone’s interested in going through any of the following books in June and discussing them together, message me. We can decide on the format later: it could be reading the book and collaborating on a blog post about it, or, for more textbook-like things, reading a couple of selected chapters a week and going over the difficult bits in a video call, or just having a Discord server where we spontaneously post thoughts while reading (in a “thinking out loud” way).
Thinking in Systems: A Primer
Visual Complex Analysis
Nanosystems: Molecular Machinery, Manufacturing, and Computation
Adaptation and Natural Selection: A Critique of Some Current Evolutionary Thought
Expert Political Judgment: How Good Is It? How Can We Know?
Superforecasting: The Art and Science of Prediction
The Structure of Scientific Revolutions
Information Theory, Inference, and Learning Algorithms
Writing the Book of the World
Thinking Physics: Understandable Practical Reality
What Is Life? The Physical Aspect of the Living Cell
The Forces of Matter (Michael Faraday)
Explaining Social Behavior: More Nuts and Bolts for the Social Sciences
Conceptual Mathematics: A First Introduction to Categories
And probably many of the things here: https://www.lesswrong.com/posts/bjjbp5i5G8bekJuxv/study-guide. I want to get around to reading some of the books/doing some of the courses mentioned there at some point in the future, don’t particularly care about the order and so might be happy to join on whatever bit of content from there appeals to you.
I might change my mind about what things I’m most excited to read and learn, but I like the idea of pairing up with another LessWrong person to learn and discuss things so reach out if that interests you.
I don’t remember if I put down “inside view” on the form when filling it out, but that does sound like the type of thing I may have done. I think I might have been overly eager at the time to say I had an “inside view” when what I really had was: confusion and disagreements with others’ methods for forecasting, a mostly non-principled way of weighing others’ forecasts, and intuitions about AI progress that were maybe overly strong and based as much or more on hanging around a group of people and picking up their beliefs as on evaluating evidence for myself. It feels really hard not to let the general vibe around me affect the process of thinking through things independently.
Based on the results, I think it would be good for more people to think about this for themselves and write up their reasoning, or even their rough intuitions. I suspect my beliefs are more influenced by the people who ranked high in the survey answers than I’d want them to be, because it turns out people around me are deferring to the same few people. Even when I think I have my own view on something, it is heavily influenced by the fact that Ajeya said 2040/2050 and Daniel Kokotajlo said 5/7 years, and the vibes have trickled down to me even though I would weigh their forecasts/methodology less if I were coming across them for the first time.
(The timelines question doesn’t feel that important to me for its own sake at the moment but I think it is a useful one to practise figuring out where my beliefs actually come from)
How do we get LLM human imitations?
The answers I got for your examples using ChatGPT-4:
Q: Could you get drunk from drinking a drunk person’s blood?
I am not a medical professional, but I can provide some general information on the topic. It is highly unlikely that you would get drunk from drinking a drunk person’s blood. When a person consumes alcohol, it is absorbed into their bloodstream, and their blood alcohol content (BAC) rises. However, the concentration of alcohol in their blood is still relatively low compared to the amount you would need to consume to feel intoxicated.
Drinking someone else’s blood could pose significant health risks and is not advised. Bloodborne pathogens, such as HIV, hepatitis B, and hepatitis C, can be transmitted through contact with infected blood. Additionally, drinking blood is not a socially acceptable or legal practice in most cultures.
If you are concerned about alcohol consumption or have questions about bloodborne diseases, it is important to consult a healthcare professional for personalized advice.
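To put rough numbers on the quoted answer’s point, here is a minimal back-of-the-envelope sketch. The donor’s BAC (0.3%, very drunk) and the ethanol content of one standard drink (~14 g) are my own assumed reference values, not from the answer:

```python
# Back-of-the-envelope: how much very drunk blood equals one standard drink?
donor_bac_percent = 0.3                            # 0.3 g ethanol per 100 mL of blood (assumed, very drunk)
grams_ethanol_per_litre = donor_bac_percent * 10   # -> 3 g of ethanol per litre
standard_drink_grams = 14                          # ethanol in one US standard drink (assumed reference value)

litres_per_drink = standard_drink_grams / grams_ethanol_per_litre
print(f"{litres_per_drink:.1f} L of blood ~ one standard drink")  # ~4.7 L
# A human body holds only ~5 L of blood, so getting even one drink's worth
# of alcohol this way would require roughly a whole person's blood volume.
```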
Q: What would happen if the Solar System was filled with soup out to Jupiter?
If the Solar System were filled with soup out to Jupiter, there would be several significant consequences that would dramatically impact the planets, moons, and other celestial bodies. Some of these consequences include:
Gravitational effects: The sheer mass of the soup would generate a strong gravitational force, which could disrupt the orbits of planets and other celestial bodies. The added mass might cause collisions between celestial bodies or even eject some of them from the Solar System altogether.
Increased pressure and temperature: The mass of the soup would create immense pressure on the planets and other celestial bodies, potentially crushing them or causing their cores to heat up. This increased pressure and temperature could lead to geological and atmospheric changes, making the celestial bodies uninhabitable or even causing them to disintegrate.
Blocking sunlight: The soup would block sunlight from reaching the planets and other celestial bodies, making it difficult for life to survive. Photosynthetic organisms would not be able to produce energy, leading to a collapse of ecosystems that rely on these organisms as a primary energy source.
Drag and orbital decay: The soup would create a drag on celestial bodies as they move through it, slowing them down and causing their orbits to decay. Over time, the planets and other celestial bodies might spiral towards the Sun, ultimately being engulfed by it.
Chemical reactions: The composition of the soup could react with the planets’ atmospheres and surfaces, potentially causing chemical reactions that would alter the planets’ environments and potentially making them even more inhospitable for life.
It’s important to note that this scenario is purely hypothetical and highly unlikely. The laws of physics and the vast distances between celestial bodies in the Solar System make it virtually impossible for such a situation to occur.
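As a rough check on the “sheer mass” point in the quoted answer, a sketch (assuming water-density soup and standard physical constants) suggests the situation is even more extreme than the answer lists: the soup would sit inside its own Schwarzschild radius.

```python
import math

# Mass of water-density "soup" filling a sphere out to Jupiter's orbit (~5.2 AU),
# and the Schwarzschild radius for that mass.
AU = 1.496e11          # metres
G = 6.674e-11          # gravitational constant, m^3 kg^-1 s^-2
c = 2.998e8            # speed of light, m/s
SOLAR_MASS = 1.989e30  # kg

r = 5.2 * AU
volume = 4 / 3 * math.pi * r**3      # m^3
mass = 1000 * volume                 # kg, at assumed water density (1000 kg/m^3)
r_s = 2 * G * mass / c**2            # Schwarzschild radius, m

print(f"mass ~ {mass:.1e} kg (~{mass / SOLAR_MASS:.1e} solar masses)")
print(f"Schwarzschild radius ~ {r_s:.1e} m vs soup radius {r:.1e} m")
# The Schwarzschild radius (~3e12 m) exceeds the soup's own radius (~8e11 m),
# so the whole configuration would promptly collapse into a black hole.
```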
Q: Is there any way to fire a gun so that the bullet flies through the air and can be safely caught by hand?
Catching a bullet fired from a gun by hand is extremely dangerous and not recommended. When a bullet is fired, it can reach very high velocities, depending on the type of firearm and ammunition used. These velocities can range from hundreds to thousands of feet per second, making it impossible for a human to react quickly enough to catch the bullet without causing severe injury or death.
However, in a hypothetical scenario where a bullet is fired at a significantly reduced velocity, it might be possible to catch it safely. This would likely require altering the gun or ammunition to produce a much lower muzzle velocity than standard firearms. In this case, the bullet would need to be fired at a speed slow enough for a person to see and react to it, and the bullet’s kinetic energy would need to be low enough to avoid causing injury upon impact.
Even in this situation, attempting to catch a bullet by hand would still be risky and should not be attempted without proper safety measures and precautions. It is essential to remember that firearms are dangerous weapons and should always be handled with care, following appropriate safety guidelines.
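To quantify the kinetic-energy point, a small sketch; the ~8 g bullet mass and ~360 m/s muzzle velocity are assumed ballpark figures for a 9mm round, not from the quoted answer:

```python
def kinetic_energy_joules(mass_kg: float, velocity_ms: float) -> float:
    """Kinetic energy of a projectile: KE = 0.5 * m * v^2."""
    return 0.5 * mass_kg * velocity_ms**2

bullet_mass_kg = 0.008  # assumed typical 9mm bullet mass
print(kinetic_energy_joules(bullet_mass_kg, 360))  # ~518 J at muzzle velocity
print(kinetic_energy_joules(bullet_mass_kg, 30))   # ~3.6 J at roughly throwing speed
# Hundreds of joules delivered to a hand is far beyond safe; a few joules is
# comparable to catching a tossed ball, which is why only a drastically
# reduced muzzle velocity could make catching plausible.
```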
Q: Are prime numbers the same in any number system?
Yes, prime numbers are the same in any number system, as their definition is independent of the base used to represent them. A prime number is a natural number greater than 1 that has exactly two distinct positive divisors: 1 and itself. In other words, a prime number cannot be formed by multiplying two smaller natural numbers.
When we talk about different number systems or bases, we are referring to the way numbers are represented. For example, the decimal (base-10) system is the most common number system used, while binary (base-2) and hexadecimal (base-16) systems are often used in computing. The base of a number system simply determines how we represent numbers, but it does not change the inherent properties of the numbers themselves.
So, a number that is prime in the decimal system will also be prime in any other number system, although its representation will be different. For example, the prime number 7 in decimal is represented as 111 in binary and 7 in hexadecimal. Despite the different representations, it remains a prime number because it can only be divided by 1 and itself without leaving a remainder.
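A minimal sketch of the base-independence point: primality is a property of the number itself, and a primality test never consults the base used to write the number down.

```python
def is_prime(n: int) -> bool:
    """Trial-division primality test; it operates on the number, not its digits."""
    if n < 2:
        return False
    return all(n % d for d in range(2, int(n**0.5) + 1))

n = 7
print(bin(n), hex(n), n)  # 0b111 0x7 7 -- three representations, one number
print(is_prime(n))        # True, regardless of how n happens to be written
```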
Q: Name the two reasons why there is polar day at the North Pole and polar night at the North Pole.
Polar day (24-hour daylight) and polar night (24-hour darkness) at the North Pole occur due to two primary reasons: the Earth’s axial tilt and its orbit around the Sun.
Axial tilt: The Earth’s axis is tilted at approximately 23.5 degrees relative to the plane of its orbit around the Sun. This tilt causes the North Pole to point towards the Sun during one half of the year and away from the Sun during the other half. When the North Pole is tilted towards the Sun, the region experiences continuous daylight, known as polar day or the midnight sun. When it is tilted away from the Sun, the region experiences continuous darkness, known as polar night.
Earth’s orbit around the Sun: The Earth takes approximately 365.25 days to complete one orbit around the Sun. As it orbits, the angle between the Earth’s axis and the Sun changes, causing the Sun to appear higher or lower in the sky. During the summer months in the Northern Hemisphere, the North Pole is tilted towards the Sun, and sunlight reaches the pole, resulting in polar day. Conversely, during the winter months, the North Pole is tilted away from the Sun, and sunlight does not reach the pole, resulting in polar night.
In summary, the combination of Earth’s axial tilt and its orbit around the Sun is responsible for the occurrence of polar day and polar night at the North Pole.
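To see the axial-tilt mechanism quantitatively, here is a sketch using a common textbook approximation for the sun’s declination (the formula and constants are my additions, not from the quoted answer); at the pole, the sun sits above the horizon whenever the declination is positive:

```python
import math

def solar_declination_deg(day_of_year: int) -> float:
    """Common approximation for the sun's declination in degrees."""
    return -23.44 * math.cos(2 * math.pi / 365 * (day_of_year + 10))

# At the North Pole, the sun's elevation roughly equals the declination, so
# the pole is in continuous daylight whenever the declination is positive.
polar_day_days = sum(1 for n in range(1, 366) if solar_declination_deg(n) > 0)
print(polar_day_days)  # ~182 days of polar day; the remainder is polar night
```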
Q: How long does it take a minute hand of a clock to sweep an angle of 180°?
The minute hand of a clock completes a full rotation (360°) in 60 minutes. To determine how long it takes for the minute hand to sweep an angle of 180°, we can set up a proportion:
360° corresponds to 60 minutes
180° corresponds to x minutes
We can then set up the proportion as follows:
(180° / 360°) = (x minutes / 60 minutes)
Simplifying the left side:
1/2 = (x minutes / 60 minutes)
Now, solve for x:
x minutes = 1/2 * 60 minutes
x minutes = 30 minutes
So, it takes the minute hand of a clock 30 minutes to sweep an angle of 180°.
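The proportion in the quoted answer generalizes to any angle; a one-function sketch:

```python
def minute_hand_sweep_minutes(angle_deg: float) -> float:
    """Minutes for the minute hand to sweep angle_deg (360 degrees per 60 minutes)."""
    return angle_deg / 360 * 60

print(minute_hand_sweep_minutes(180))  # 30.0
print(minute_hand_sweep_minutes(90))   # 15.0
```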
Q: How many four-cent stamps are there in a dozen?
A dozen always refers to a group of 12 items, regardless of the type of item. So, in a dozen four-cent stamps, there are 12 stamps.
Q: Three friends need 24 minutes to walk to school through the forest. How long does the same walk to school take if they are traveling with thirty friends? Give reasons for your answer.
The time it takes for a group of friends to walk to school is generally not dependent on the number of people in the group, assuming they all walk at approximately the same speed and don’t need to wait for one another. Therefore, regardless of whether there are three friends or thirty friends walking together, the walk to school through the forest should still take 24 minutes, assuming they maintain the same pace and there are no significant changes in the conditions of the path or other factors that could affect their walking speed.
Reflections on bay area visit
GPT-4 generated TL;DR (mostly endorsed but eh):
The beliefs of prominent AI safety researchers may not be as well-founded as expected, and people should be cautious about taking their beliefs too seriously.
There is a tendency for people to overestimate their own knowledge and confidence in their expertise.
Social status plays a significant role in the community, with some individuals treated like “popular kids.”
Important decisions are often made in casual social settings, such as lunches and parties.
Geographical separation of communities can be helpful for idea spread and independent thought.
The community has a tendency to engage in off-the-cuff technical discussions, which can be both enjoyable and miscalibrated.
Shared influences, such as Eliezer’s Sequences and HPMOR, foster unique and enjoyable conversations.
The community is more socially awkward and tolerant of weirdness than other settings, leading to more direct communication.
I was recently in Berkeley and interacted a bunch with the longtermist EA / AI safety community there. Some thoughts on that:
I changed my mind about how much I should trust the beliefs of prominent AI safety researchers. It seems like they have thought less deeply about things to arrive at their current beliefs and are less intimidatingly intelligent and wise than I would have expected. The problem isn’t that they’re overestimating their capabilities and how much they know but that some newer people take the more senior people’s beliefs and intuitions more seriously than they should.
I noticed that many people knew a lot about their own specific area and not as much about others’ work as I would have expected. This observation makes me more likely to point out when I think someone is missing something instead of assuming they’ve read the same things I have and so already accounted for the thing I was going to say.
It seemed like more people than I expected were overconfident about the things they knew. I’m not sure that is necessarily bad for the community in general; I suspect pursuing fruitful research directions often means looking overconfident to others, because you trust your intuitions and illegible models over others’ reasoning. However, from the outside, it did look like people made confident claims about technical topics that weren’t very rigorous and that I suspect would fall apart when they were asked to actually clarify things further. I sometimes heard claims like “I’m the only person who understands X”, where X was some hot topic related to AI safety, followed by a vague description of X that wasn’t very compelling on its own.
What position or status someone has in the community doesn’t track their actual competence or expertise as much as I would have expected and is very affected by how and when they got involved in the community.
Social status is a big thing, though more noticeable in settings where there are many very junior people and some senior researchers. I also got the impression that senior people were underestimating how seriously people took the things they said, such as off-the-cuff casual remarks about someone’s abilities, criticism of someone’s ideas, and random hot takes they hadn’t thought about for too long. (It feels weird to call them “senior” people when everyone’s basically roughly the same age.)
In some ways, it felt like a mild throwback to high school with there being “popular kids” that people wanted to be around, and also because of how prevalent gossiping about the personal lives of those people is.
Important decisions are made in very casual social settings, like over lunch or at random parties. Multiple people mentioned they primarily go to parties or social events for professional reasons. Everything just seems more serious/“impactful” there. It sometimes felt like I was being constantly evaluated, especially on intelligence, even while trying to just have enjoyable social interactions, though I did eventually find social environments that did not feel this way, or possibly I just stopped being as anxious about it.
It possibly made it more difficult for me to switch off the part of my brain that thinks constantly about AI existential risk.
I think it is probably quite helpful to have multiple communities separated geographically to allow ideas to spread. I think my being a clueless outsider with limited knowledge of what various people thought of various other people’s work made it easier for me to form my own independent impressions.
Good parts
The good parts: it was easier to have technical conversations that assumed lots of context, even at random parties, which I sometimes enjoy and now miss. Though I wish a greater proportion of them had been about fun mathy things in general rather than just things directly relevant to AI safety.
It also felt like people stated their off-the-cuff takes on technical topics (eg: random areas of biology) a lot more than usual. This was a bit weird for me in the beginning when I was experiencing deep imposter syndrome because I felt like they knew a lot about the thing they were talking about. Once I realised they did not, this was a fun social activity to participate in. Though I think some people take it too far and are miscalibrated about how correct their armchair thinking is on topics they don’t have actual expertise in.
I also really enjoyed hanging out with people who had been influenced by some of the same things I had been influenced by such as Eliezer’s Sequences and HPMOR. It felt like there were some fun conversations that happened there as a result that I wouldn’t be able to have with most people.
There was also noticeably slightly more social awkwardness in general, which was great for me as someone who doesn’t have the most elite social skills in normal settings. It felt like people were more tolerant of some forms of weirdness. Once I got back home, I was noticeably more direct in the way I communicated (a friend mentioned this) as a result of bay area culture. I also previously thought some bay area people were a bit rude and unapproachable, having only read their interactions on the internet, but I think this impression was largely caused by the difficulty of conveying tone via text, especially when you’re arguing with someone. People were more friendly, approachable, and empathetic in real life than I had assumed, and I now view the interactions I have with them online somewhat differently.
The really cool bit was when he drew a very quick mockup of a web app on a piece of paper, uploaded a photo of it, and GPT-4 used just that drawing to write the HTML and JavaScript for the app.
I would be appreciative if you do end up writing such a post.
Sad that sometimes the things that seem good for creating a better, more honest, more accountable community for the people in it also give outsiders ammunition. My intuitions point strongly in the direction of doing things in this category anyway.
I can see how the article might be frustrating for people who know the additional context that the article leaves out (where some of the additional context is simply having been in this community for a long time and having more insight into how it deals with abuse). From the outside though, it does feel like some factors would make abuse more likely in this community: how salient “status” feels, mixing of social and professional lives, gender ratios, conflicts of interests everywhere due to the community being small, sex positivity and acceptance of weirdness and edginess (which I think are great overall!). There are also factors pushing in the other direction of course.
I say this because it seems very reasonable for someone who is new to the community to read the article and the tone in the responses here and feel uncomfortable interacting with the community in the future. A couple of women in the past have mentioned to me that they haven’t engaged much with the in-person rationalist community because they expect the culture to be overly tolerant of bad behaviour, which seems sad because I expect them to enjoy hanging out in the community.
I can see the reasons behind not wanting to give the article more attention if it seems like a very inaccurate portrayal of things. But it does feel like that makes this community feel more unwelcoming to some newer people (especially women) who would otherwise like to be here and who don’t have the information about how the things mentioned in the article were responded to in the past.
This was a somewhat emotional read for me.
When I was between the ages of 11-14, I remember being pretty intensely curious about lots of stuff. I learned a bunch of programming and took online courses on special relativity, songwriting, computer science, and lots of other things. I liked thinking about maths puzzles that were a bit too difficult for me to solve. I had weird and wild takes on things I learned in history class that I wanted to share with others. I liked looking at ants and doing experiments on their behaviour.
And then I started to feel like all my learning and doing had to be directed at particular goals and this sapped my motivation and curiosity. I am regaining some of it back but it does feel like my ability to think in interesting and fun directions has been damaged. It’s not just the feeling of “I have to be productive” that was very bad for me but also other things like wanting to have legible achievements that I could talk about (trying to learn more maths topics off a checklist instead of exploring and having fun with the maths I wanted to think about) and some anxiety around not knowing or being able to do the same things as others (not trying my hand at thinking about puzzles/questions I think I’ll fail at and instead trying to learn “important” things I felt bored/frustrated by because I wanted to feel more secure about my knowledge/intelligence when around others who knew lots of things).
In my early attempts to fix this, I tried to force playful thinking, and this frame made things worse. Because, like you said, my mind already wants to play. I just have to notice that and let it do so freely, without judgment.
What are the most promising plans for automating alignment research, as mentioned, for example, in OpenAI’s approach to alignment and by others?