Running Lightcone Infrastructure, which runs LessWrong. You can reach me at habryka@lesswrong.com
habryka (Oliver Habryka)
Hmm, most of the ordering should be the same. Here is the ordering on Youtube Music:
The Road To Wisdom
Moloch
Thought That Faster (feat. Eliezer Yudkowsky)
The Litany of Tarrrrrski (feat. Eliezer Yudkowsky)
The Litany of Gendlin
Dath Ilan’s Song (feat. Eliezer Yudkowsky)
Half An Hour Before Dawn In San Francisco (feat. Scott Alexander)
AGI and the EMH (feat. Basil Halperin, J. Zachary Mazlish & Trevor Chow)
First they came for the epistemology (feat. Michael Vassar)
Prime Factorization (feat. Scott Alexander)
We Do Not Wish to Advance (feat. Anthropic)
Nihil Supernum (feat. Godric Gryffindor)
More Dakka (feat. Zvi Mowshowitz)
FHI at Oxford (feat. Nick Bostrom)
Answer to Job (feat. Scott Alexander)
Which is pretty similar to the order here. The folk album is in a slightly different order (which I do think is worse and we sadly can’t change), but otherwise things are the same.
My current best guess is that actually cashing out the vested equity is tied to an NDA, but I am really not confident. OpenAI has a bunch of really weird equity arrangements.
Hmm, I have sympathy for this tag, but also I do feel like the tagging system probably shouldn’t implicitly carry judgement. Seems valuable to keep your map separate from your incentives and all that.
Happy to discuss here what to do. I do think allowing people to somehow tag stuff that seems like it increases capabilities in some dangerous way seems good, but I do think it should come with less judgement in the site’s voice (judgement in a user’s voice is totally fine, but the tagging system speaks more with the voice of the site than any individual user).
Oh, yeah, admins currently have access to a purely recommended view, and I prefer it. I would be in favor of making that accessible to users (maybe behind a beta flag, or maybe not, depending on uptake).
I think the priors here are very low, so while I agree it looks suspicious, I don’t think it’s remotely suspicious enough to have the correct posterior be “about zero chance that wasn’t murder”. Corporations, at least in the U.S., really do very rarely murder people.
My understanding is that the extent of NDAs can differ a lot between different implementations, so it might be hard to speak in generalities here. From the revealed behavior of people I poked here who have worked at OpenAI full-time, the OpenAI NDAs seem very comprehensive and limiting. My guess is also the NDAs for contractors and for events are a very different beast and much less limiting.
Also, just the de-facto result of signing non-disclosure agreements is that people don’t feel comfortable navigating the legal ambiguity and default very strongly to not sharing approximately any information about the organization at all.
Maybe people would do better things here with more legal guidance, and I agree that you don’t generally seem super constrained in what you feel comfortable saying, but like I sure now have run into lots of people who seem constrained by NDAs they signed (even without any non-disparagement component). Also, if the NDA has a gag clause that covers the existence of the agreement, there is no way to verify the extent of the NDA, and that makes navigating this kind of stuff super hard and also majorly contributes to people avoiding the topic completely.
I think having signed an NDA (and especially a non-disparagement agreement) from a major capabilities company should probably rule you out of any kind of leadership position in AI Safety, and especially any kind of policy position. Given that I think Daniel has a pretty decent chance of doing either or both of these things, and that work is very valuable and constrained on the kind of person that Daniel is, I would be very surprised if this wasn’t worth it on altruistic grounds.
Edit: As Buck points out, different non-disclosure agreements can differ hugely in scope. To be clear, I think non-disclosure agreements that cover specific data or information you were given seem fine, but non-disclosure agreements that cover their own existence, or that are very broadly worded and prevent you from talking about basically anything related to an organization, are pretty bad. My sense is the stuff that OpenAI employees are asked to sign when they leave is very constraining, but my guess is the kind of stuff that people have to sign for a small amount of contract work or for events is not very constraining, though I would definitely read any contract in this space carefully.
Oh, hmm, I sure wasn’t tracking a 1000 character limit. If you can submit it, I wouldn’t be worried about it (and feel free to put that into your references section). I certainly have never paid attention to whether anyone stayed within the character limit.
I haven’t engaged with this in enough detail, but some people who engaged with Scott’s sequence who I can imagine being interested in this: @Scott Garrabrant, @James Payor, @Quinn, @Nathan Helm-Burger, @paulfchristiano.
Promoted to curated: I sure tend to have a lot of conversations about honesty and integrity, and this specific post was useful in 2-3 conversations I’ve had since it came out. I like having a concept handle for “trying to actively act with an intent to inform”, I like the list of concrete examples of the above, and I like how the post situates this as something with benefits and drawbacks (while also not shying away too much from making concrete recommendations on what would be better on the margin).
Despite my general interest in open inquiry, I will avoid talking about my detailed hypothesis of how to construct such a virus. I am not confident this is worth the tradeoff, but the costs of speculating about the details here in public do seem non-trivial.
@Daniel Kokotajlo If you indeed avoided signing an NDA, would you be able to share how much you passed up as a result of that? I might indeed want to create a precedent here and maybe try to fundraise for some substantial fraction of it.
I have a lot of uncertainty about the difficulty of robotics, and the difficulty of e.g. designing superviruses or other ways to kill a lot of people. I do agree that in most worlds robotics will be solved to a human level before AI will be capable of killing everyone, but I am generally really averse to unnecessarily constraining my hypothesis space when thinking about this kind of stuff.
>90% seems quite doable with a well-engineered virus (especially one with a long infectious incubation period). I think 99%+ is much harder and probably out of reach until after robotics is thoroughly solved, but like, my current guess is a motivated team of humans could design a virus that kills 90%–95% of humanity.
The infrastructure necessary to run a datacenter or two is not that complicated. See these Gwern comments for some similar takes:
In the world without us, electrical infrastructure would last quite a while, especially with no humans and their needs or wants to address. Most obviously, RTGs and solar panels will last indefinitely with no intervention, and nuclear power plants and hydroelectric plants can run for weeks or months autonomously. (If you believe otherwise, please provide sources for why you are sure about “soon after”—in fact, so sure about your power grid claims that you think this claim alone guarantees the AI failure story must be “pretty different”—and be more specific about how soon is “soon”.)
And think a little bit harder about options available to superintelligent civilizations of AIs*, instead of assuming they do the maximally dumb thing of crashing the grid and immediately dying… (I assure you any such AIs implementing that strategy will have spent a lot longer thinking about how to do it well than you have for your comment.)
Add in the capability to take over the Internet of Things and the shambolic state of embedded computers which mean that the billions of AI instances & robots/drones can run the grid to a considerable degree and also do a more controlled shutdown than the maximally self-sabotaging approach of ‘simply let it all crash without lifting a finger to do anything’, and the ability to stockpile energy in advance or build one’s own facilities due to the economic value of AGI (how would that look much different than, say, Amazon’s new multi-billion-dollar datacenter hooked up directly to a gigawatt nuclear power plant...? why would an AGI in that datacenter care about the rest of the American grid, never mind world power?), and the ‘mutually assured destruction’ thesis is on very shaky grounds.
And every day that passes right now, the more we succeed in various kinds of decentralization or decarbonization initiatives and the more we automate pre-AGI, the less true the thesis gets. The AGIs only need one working place to bootstrap from, and it’s a big world, and there’s a lot of solar panels and other stuff out there and more and more every day… (And also, of course, there are many scenarios where it is not ‘kill all humans immediately’, but they end in the same place.)
Would such a strategy be the AGIs’ first best choice? Almost certainly not, any more than chemotherapy is your ideal option for dealing with cancer (as opposed to “don’t get cancer in the first place”). But the option is definitely there.
If there is an AI that is making decent software progress, even if it doesn’t have the ability to maintain all infrastructure, it would probably be able to develop new technologies and better robot controls over the course of a few months or years without needing to have any humans around.
Uploaded them both!
Of course, at some point, we’ll eventually make sufficient progress in robotics that we can’t rely on this safety guarantee.
Why would “robotics” be the blocker? I think AIs can do a lot of stuff without needing much advancement in robotics. Convincing humans to do things is a totally sufficient API to have very large effects (e.g. it seems totally plausible to me you can have AI run country-sized companies without needing any progress in robotics).
There are two images provided for a sequence, the banner image and the card image. The card image is required for it to show up in the Library.
It’s not the most obvious place, but the content lives here: https://www.lesswrong.com/posts/zXJfH7oZ62Xojnrqs/lesswrong-moderation-messaging-container?commentId=sLay9Tv65zeXaQzR4
At least Eliezer has been extremely clear that he is in favor of a stop not a pause (indeed, that was like the headline of his article “Pausing AI Developments Isn’t Enough. We Need to Shut it All Down”), so I am confused why you list him with anything related to “pause”.
My guess is me and Eliezer are both in favor of a pause, but mostly because a pause seems like it would slow down AGI progress, not because the next 6 months in particular will be the most risky period.
We are experimenting with bolding the date on posts that are new and leaving it thinner on posts that are old, though feedback so far hasn’t been super great.