I would give my dog many treats to stop eating deer poop, since this behavior can lead to expensive veterinary visits. But I can’t communicate with my dog well enough to set up this trade.
Why isn’t this an example of “we would trade with animals if we could communicate better”?
GPT-4 solves Gary Marcus-induced flubs
List of technical AI safety exercises and projects
[Question] Can we get full audio for Eliezer’s conversation with Sam Harris?
[Question] Best introductory overviews of AGI safety?
Notes on “the hot mess theory of AI misalignment”
6-paragraph AI risk intro for MAISI
Big list of AI safety videos
To be clear, I haven’t seen many designs that people I respect believed to have a chance of actually working. If you work on the alignment problem or at an AI lab and haven’t read Nate Soares’ On how various plans miss the hard bits of the alignment challenge, I’d suggest reading it.
Can you explain your definition of the sharp left turn and why it will cause many plans to fail?
Next steps after AGISF at UMich
Averting Catastrophe: Decision Theory for COVID-19, Climate Change, and Potential Disasters of All Kinds
I asked GPT-4 to summarize the article and then come up with some alternative terms; here are a few I like:
One-way summary
Insider mnemonic
Contextual shorthand
Familiarity trigger
Conceptual hint
Clue for the familiar
Knowledge spark
Abbreviated insight
Expert’s echo
Breadcrumb for the well-versed
Whisper of the well-acquainted
Insider’s underexplained aphorism
I also asked for some idioms. “Seeing the forest but not the trees” seems apt.
“My guess is that people who are concluding P(Doom) is high will each need to figure out how to live with it for themselves.”
The following perspective helps me feel better.
First, it’s not news that AGI poses a significant threat to humanity. I felt seriously worried when I first encountered this idea in 2018, listening to Eliezer on the Sam Harris podcast. The “Death With Dignity” post revived these old fears, but it didn’t reveal any dangers that were previously unknown to me.
Second, throughout history, many humans have had to live with believing “P(impending Doom) = high.” COVID, Ukraine, famine in Yemen, WWI, WWII, the Holocaust, 9/11, incarceration in the US, Mongol conquests, the Congo Free State’s atrocities, the Great Purge, the Cambodian genocide, the Great Depression, the Black Death, the Putumayo genocide, the Holodomor, the Trail of Tears, the Syrian civil war, the Irish Potato Famine, the Vietnam War, slavery, colonialism, more wars, more terrorist attacks, more plagues, more famines, more genocides, etc.
It’s easy to read these events without simulating how the people involved felt. In most of these cases, “Doom” didn’t mean “everyone on the planet will die” but rather “I will lose everything” or “everything that matters will be destroyed” or “everyone I know will die” or “my culture will die” or “my family will die” or simply “I will die.” The thoughts these people had probably felt considerably more devastating than the thoughts I’m having these days. Heck, some people get terrified just reading the news. I’m not alone in worrying about the future.
Again: “I’m not alone in worrying about the future.” I find this immensely comforting. People are not at all oblivious to the world having problems, even if they disagree with me on which problems are the most important. Everyone has fears.
I didn’t read it as an argument so much as an emotionally compelling anecdote that excellently conveys this realization:
I had had the upper hand for so long that it became second nature, and then suddenly, I went to losing every game.
Summary of 80k’s AI problem profile
Relevant tweet/quote from Mustafa Suleyman, the co-founder and CEO of Inflection AI:
Powerful AI systems are inevitable. Strict licensing and regulation is also inevitable. The key thing from here is getting the safest and most widely beneficial versions of both.
An AI Policy Tool for Today: Ambitiously Invest in NIST (Anthropic 2023)
National Security Addition to the NIST AI RMF (Special Competitive Studies Project 2023)
Existential risk and rapid technological change—a thematic study for UNDRR (Stauffer et al. 2023), especially section 4.3 (“30 actions to reduce existential risk”)
Crafting Legislation to Prevent AI-Based Extinction: Submission of Evidence to the Science and Technology Select Committee’s Inquiry on the Governance of AI (Cohen and Osborne 2023)
Why we need a new agency to regulate advanced artificial intelligence: Lessons on AI control from the Facebook Files (Korinek 2021)
Is this still happening? The website has stopped working for me.
I greatly appreciate this post. I feel like “argh yeah it’s really hard to guarantee that actions won’t have huge negative consequences, and plenty of popular actions might actually be really bad, and the road to hell is paved with good intentions.” With that being said, I have some comments to consider.
The offices cost $70k/month on rent [1], and around $35k/month on food and drink, and ~$5k/month on contractor time for the office. It also costs core Lightcone staff time which I’d guess at around $75k/year.
That is $110k/month in direct costs, plus ~$75k/year of staff time: roughly $116k/month, or ~$1.4m/year. I wonder if the cost has anything to do with the decision? There may be a tendency to say “an action is either extremely good or extremely bad because it either reduces x-risk or increases x-risk, so if I think it’s net positive I should be willing to spend huge amounts of money.” I think this framing neglects a middle ground of “an action could be somewhere in between extremely good and extremely bad.” Perhaps the net effects of the offices were “somewhat good, but not enough to justify the monetary cost.” I guess Ben sort of covers this point later (“Having two locations comes with a large cost”).
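To make the units explicit, here is a minimal sketch of the arithmetic (all figures come from the quoted breakdown; the variable names are mine):

```python
# Cost figures quoted above (Lightcone's stated estimates).
rent = 70_000               # $/month
food_and_drink = 35_000     # $/month
contractors = 5_000         # $/month
staff_time_annual = 75_000  # $/year of core staff time

# Convert everything to a common monthly basis before summing.
monthly_total = rent + food_and_drink + contractors + staff_time_annual / 12
annual_total = monthly_total * 12

print(f"~${monthly_total:,.0f}/month")  # ~$116,250/month
print(f"~${annual_total:,.0f}/year")    # ~$1,395,000/year
```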
its value was substantially dependent on the existing EA/AI Alignment/Rationality ecosystem being roughly on track to solve the world’s most important problems, and that while there are issues, pouring gas into this existing engine, and ironing out its bugs and problems, is one of the most valuable things to do in the world.
Huh, it might be misleading to view the offices as “pouring gas into the engine of the entire EA/AI Alignment/Rationality ecosystem.” They contribute to some areas much more than others. Even if one thinks that the overall ecosystem is net harmful, there could still be ecosystem-building projects that are net helpful. It seems highly unlikely to me that all ecosystem-building projects are bad.
The Lighthouse system is going away when the leases end. Lighthouse 1 has closed, and Lighthouse 2 will continue to be open for a few more months.
These are group houses for members of the EA/AI Alignment/Rationality ecosystem, correct? Related to the last point, I expect the effects of these to be quite different from the effects of the offices.
FTX is the obvious way in which current community-building can be bad, though in my model of the world FTX, while somewhat of an outlier in scope, doesn’t feel like a particularly huge outlier in terms of the underlying generators.
I’m very unsure about this, because it seems plausible that SBF would have done something terrible without EA encouragement. Also, I’m confused about the detailed cause-and-effect analysis of how the offices will contribute to SBF-style catastrophes—is the idea that “people will talk in the offices and then get stupid ideas, and they won’t get equally stupid ideas without the offices?”
My guess is RLHF research has been pushing on a commercialization bottleneck and had a pretty large counterfactual effect on AI investment, causing a huge uptick in investment into AI and potentially an arms race between Microsoft and Google towards AGI: https://www.lesswrong.com/posts/vwu4kegAEZTBtpT6p/thoughts-on-the-impact-of-rlhf-research?commentId=HHBFYow2gCB3qjk2i
Worth noting that there is plenty of room for debate on the impacts of RLHF, including the discussion in the linked post.
Tendencies towards pretty mindkilly PR-stuff in the EA community: https://forum.effectivealtruism.org/posts/ALzE9JixLLEexTKSq/cea-statement-on-nick-bostrom-s-email?commentId=vYbburTEchHZv7mn4
Overall I’m getting a sense of “look, there are bad things happening so the whole system must be bad.” Additionally, I think the negative impact of “mindkilly PR-stuff” is pretty insubstantial. On a related note, I somewhat agree with the idea that “most successful human ventures look—from up close—like dumpster fires.” It’s worth being wary of inferences resembling “X evokes a sense of disgust, so X is probably really harmful.”
I genuinely only have marginally better ability to distinguish the moral character of Anthropic’s leadership from the moral character of FTX’s leadership
Yeah this makes sense. I would really love to gain a clear understanding of who has power at the top AGI labs and what their views are on AGI risk. AFAIK nobody has done a detailed analysis of this?
Also, as in the case of RLHF, it’s worth noting that there are some reasonable arguments for Anthropic being helpful.
I think AI Alignment ideas/the EA community/the rationality community played a pretty substantial role in the founding of the three leading AGI labs (Deepmind, OpenAI, Anthropic)
Definitely true for Anthropic. For OpenAI I’m less sure. IIRC the argument is that lots of EA-related conferences contributed to the formation of OpenAI, but I’d like to see more details than this: “there were EA events where key players talked” feels quite different from “without EA, OpenAI would not exist.” I feel similarly about DeepMind; IIRC Eliezer accidentally convinced one of the founders to work on AGI. Are there other arguments?
And again, how do the Lightcone offices specifically contribute to the founding of more leading AGI labs? My impression is that the offices’ vibe conveyed a strong sense of “it’s bad to shorten timelines.”
It’s a bad idea to train models directly on the internet
I’m confused how the offices contribute to this.
The EA and AI Alignment community should probably try to delay AI development somehow, and this will likely include getting into conflict with a bunch of AI capabilities organizations, but it’s worth the cost
Again, I’m confused how the offices have a negative impact from this perspective. I feel this way about quite a few of the points in the list.
I do sure feel like a lot of AI alignment research is very suspiciously indistinguishable from capabilities research
...
It also appears that people who are concerned about AGI risk have been responsible for a very substantial fraction of progress towards AGI
...
A lot of people in AI Alignment I’ve talked to have found it pretty hard to have clear thoughts in the current social environment
To me these seem like some of the best reasons (among those in the list; I think Ben provides some more) to shut down the offices. A disadvantage of the list format is that it makes all the points seem equally important; it might be good to bold the points you see as most important, or to give a numerical estimate of what percentage of the negative expected impact comes from each point.
The moral maze nature of the EA/longtermist ecosystem has increased substantially over the last two years, and the simulacra level of its discourse has notably risen too.
As with the “mindkilly PR-stuff,” I don’t think the negative impact here is very high in magnitude.
the primary person taking orders of magnitude more funding and staff talent (Dario Amodei) has barely explicated his views on the topic and appears (from a distance) to have disastrously optimistic views about how easy alignment will be and how important it is to stay competitive with state of the art models
Agreed. I’m confused about Dario’s views.
I recall at EAG in Oxford a year or two ago, people were encouraged to “list their areas of expertise” on their profile, and one person who works in this ecosystem listed (amongst many things) “Biorisk” even though I knew the person had only been part of this ecosystem for <1 year and their background was in a different field.
This seems very trivial to me. IIRC the Swapcard app just says “list your areas of expertise” or something, with very little detail about what qualifies as expertise. Some people might interpret this as “list the things you’re currently working on.”
It also seems to me like people who show any intelligent thought or get any respect in the alignment field quickly get elevated to “great researchers that new people should learn from” even though I think that there’s less than a dozen people who’ve produced really great work
Could you please list the people who’ve produced really great work?
I similarly feel pretty worried by how (quite earnest) EAs describe people or projects as “high impact” when I’m pretty sure that if they reflected on their beliefs, they honestly wouldn’t know the sign of the person or project they were talking about, or estimate it as close-to-zero.
Strongly agree. Relatedly, I’m concerned that people might be exhibiting a lot of action bias.
Last point, unrelated to the quote: it feels like this post is entirely focused on the possible negative impacts of the offices, and that kind of analysis seems very likely to arrive at incorrect conclusions since it fails to consider the possible positive impacts. Granted, this post was a scattered collection of Slack messages, so I assume the Lightcone team has done more formal analyses internally.
At the time of writing, this comment is still the most recommended, with 910 recommendations. 2nd place has 877 recommendations, 3rd place has 790, and 4th place has 682. After that, 5th place has 529, 6th place has 390, and the rest have 350 or fewer.
My thoughts:
2nd place reminds me of Let’s think about slowing down AI. But I somewhat disagree with the comment, because I do sense that many people have a desire for cool new AI tech.
3rd place sounds silly, since advanced AI could help reduce climate change, poverty, and genetic disorders. I also wonder if this commenter knows about AlphaFold.
4th place seems important. But I think that even if AGI jobs offered lower compensation, there would still be a considerable number of workers interested in pursuing them.