Running Lightcone Infrastructure, which runs LessWrong. You can reach me at habryka@lesswrong.com. I have signed no contracts or agreements whose existence I cannot mention.
habryka (Oliver Habryka)
Do you want to make an actual sequence for this so that the sequence navigation UI shows up at the top of the post?
We work with T3Audio on the narration, and I think they don’t really update it after initial publication. Narrating a post costs us a non-trivial amount of money (on the order of $1), which means we can’t just re-narrate on every edit without opening ourselves up to burning a bunch of money for no reason. Not sure what the ideal thing here is.
I would be somewhat surprised if Eliezer and Nate disagree very much here, though you might know better. So I would mostly see Nate’s post as clarification of both Eliezer’s and Nate’s views.
None of the actors who currently seem likely to me to deploy highly capable systems seem like they will do anything except scale approximately as fast as they can. I do agree that proliferation is still bad, simply because you get more samples from the distribution, but I don’t think that changes the probabilities that drastically for me (I am still in favor of work on securing model weights, especially in the long run).
Separately, I think it’s currently pretty plausible that model-weight leaks will substantially reduce the profits of AI companies by reducing their moat, and that effect seems plausibly larger than the benefits of non-proliferation.
Karma is just the sum of votes from other users on your posts, comments, and wiki-edit contributions.
What’s the source of this? Will also DM you.
> just some actual consensus among established researchers to sift mathematical facts from conjecture.
“Scientific consensus” is a much, much higher bar than peer review. Almost no topic of relevance has a scientific consensus (for example, there exists basically no trustworthy scientific consensus on urban planning decisions, the effects of minimum-wage laws, pandemic prevention strategies, cybersecurity risks, or intelligence enhancement). Many scientific peers do think there is an extinction risk.
I think demanding scientific consensus is an unreasonably high bar that would approximately never be met in almost any policy discussion.
(I didn’t get anything out of it; it seems kind of aggressive in a non-sequitur-ish way, and I’m also pretty sure it mischaracterizes people. I didn’t downvote it, but I have disagree-voted on it.)
> Thankfully, most of this is now moot as the company has retracted the contract.
I don’t think any of this is moot, since the thing that is IMO most concerning is people signing these contracts, then going into policy or leadership positions and not disclosing that they signed those contracts. Those things happened in the past and are real breaches of trust.
Promoted to curated: I’ve really appreciated a lot of the sequence you’ve been writing about various epistemic issues around the EA (and to some degree the rationality) community. This post feels like an appropriate capstone to that work and I quite like it as a positive pointer to a culture that I wish had more adherents.
One reason I am interested in liability is that it opens up a way to do legal investigations. The legal system grants a huge number of privileges once there is reasonable suspicion that someone has committed a crime or been negligent. I think it’s quite likely that, without direct liability, even if Microsoft or OpenAI caused some huge catastrophe, we would never get a proper postmortem or analysis of the facts, and would never reach high confidence about the actual root causes.
So while I agree that OpenAI and Microsoft of course already want to avoid being seen as responsible for a large catastrophe, legal liability makes it much more likely there will be an actual investigation in which e.g. the legal system gets to confiscate servers and messages to analyze what happened, which in turn makes it more likely that if OpenAI and Microsoft are responsible, they will be found out to be responsible.
Not sure what you mean by “underrated”. The fact that they have $300MM from Vitalik but haven’t really done much anyway was a downgrade in my books.
I am not that confident about this. Or like, I don’t know, I do notice my psychological relationship to “all the stars explode” and “earth explodes” is very different, and I am not good enough at morality to be confident about dismissing that difference.
I disagree; I think it matters a good amount whether the risk scenario is indeed “humans will probably get a solar system or two, because it’s cheap from the perspective of the AI”. I also think there is a risk of the AI torturing the uploads it has, and I agree that if that were the reason humans are still alive, then I would feel comfortable bracketing it, but I think Ryan is arguing more for something like “humans will get a solar system or two and basically get to have decent lives”.
(I missed “this was in the works for a while” on my first read of your comment.)
No, I just gaslit you. When I saw your reaction, I edited it as a clarification. Sorry about that; I should have left a note that I edited it.
The timing makes me think it didn’t happen on schedule and they are announcing it now, in response to this post, to save face and pre-empt bad PR (though I am only about 75% confident that something like that is going on, and my guess is the appointment itself has been in the works for a while). It seems like a bad sign, IMO, to do that without being clear about the timing and the degree to which a past commitment was violated.
(Also, importantly, they said they would appoint a fifth board member, but instead this board member seems to have replaced Luke, so they actually stuck with four.)
FWIW I still stand behind the arguments that I made in that old thread with Paul. I do think the game-theoretical considerations for AI maybe allowing some humans to survive are stronger, but they also feel loopy and like they depend on how good of a job we do on alignment, so I usually like to bracket them in conversations like this (though I agree it’s relevant for the prediction of whether AI will kill literally everyone).
I also added one to my profile!
Welcome! I hope you have a good time here!
You can do it. Just go to https://www.lesswrong.com/library and scroll down until you reach the “Community Sequences” section and press the “Create New Sequence” button.