AI strategy & governance. ailabwatch.org. Looking for new projects.
Zach Stein-Perlman (Zachary Stein-Perlman)
Introducing AI Lab Watch
Ilya Sutskever and Jan Leike resign from OpenAI
FLI open letter: Pause giant AI experiments
The public supports regulating AI for safety
DeepMind: Model evaluation for extreme risks
Kat, Emerson, and Drew’s reputation is not your concern insofar as you’re basically certain that your post is basically true. If you thought there was a decent chance that your post was basically wrong and Nonlinear would find proof in the next week, publishing now would be inappropriate.
When destroying someone’s reputation you have an extra obligation to make sure what you’re saying is true. I think you did that in this case—just clarifying norms.
The commitment—“20% of the compute we’ve secured to date” (in July 2023), to be used “over the next four years”—may be quite little in 2027, with compute use increasing exponentially. I’m confused about why people think it’s a big commitment.
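To make the scale concrete, here is a minimal back-of-the-envelope sketch; the 3x-per-year growth factor is purely an assumed illustrative number, not a claim about OpenAI’s actual compute trajectory:

```python
# Back-of-the-envelope sketch (all numbers are illustrative assumptions):
# if total compute grows ~3x per year, then 20% of the mid-2023 stock,
# spread over four years, is a small share of the compute available over 2023-2027.
growth = 3.0                     # assumed annual growth factor (hypothetical)
compute_2023 = 1.0               # normalize compute secured as of mid-2023 to 1
commitment = 0.2 * compute_2023  # "20% of the compute we've secured to date"

total_2023_to_2027 = sum(compute_2023 * growth**t for t in range(5))  # 2023..2027
print(f"commitment / four-year compute: {commitment / total_2023_to_2027:.2%}")
# ~0.17% under these assumed numbers
```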
Safety-wise, they claim to have run it through their Preparedness Framework and red-teaming by external experts.
I’m disappointed and I think they shouldn’t get much credit PF-wise: they haven’t published their evals, published a report on results, or even published a high-level “scorecard.” They are not yet meeting the commitments in their beta Preparedness Framework — some stuff is unclear but at the least publishing the scorecard is an explicit commitment.
(It’s now been six months since they published the beta PF!)
[Edit: not to say that we should feel much better if OpenAI were successfully implementing its PF—the thresholds are way too high and it says nothing about internal deployment.]
OpenAI: Preparedness framework
Questions for labs
Update: Greg Brockman quit.
Update: Sam and Greg say:
Sam and I are shocked and saddened by what the board did today.
Let us first say thank you to all the incredible people who we have worked with at OpenAI, our customers, our investors, and all of those who have been reaching out.
We too are still trying to figure out exactly what happened. Here is what we know:
- Last night, Sam got a text from Ilya asking to talk at noon Friday. Sam joined a Google Meet and the whole board, except Greg, was there. Ilya told Sam he was being fired and that the news was going out very soon.
- At 12:19pm, Greg got a text from Ilya asking for a quick call. At 12:23pm, Ilya sent a Google Meet link. Greg was told that he was being removed from the board (but was vital to the company and would retain his role) and that Sam had been fired. Around the same time, OpenAI published a blog post.
- As far as we know, the management team was made aware of this shortly after, other than Mira who found out the night prior.
The outpouring of support has been really nice; thank you, but please don’t spend any time being concerned. We will be fine. Greater things coming soon.
Update: three more resignations, including Jakub Pachocki.
Sam Altman’s firing as OpenAI CEO was not the result of “malfeasance or anything related to our financial, business, safety, or security/privacy practices” but rather a “breakdown in communications between Sam Altman and the board,” per an internal memo from chief operating officer Brad Lightcap seen by Axios.
Update: Sam is planning to launch something (no details yet).
Update: Sam may return as OpenAI CEO.
Update: Tigris.
Update: talks with Sam and the board.
Update: Mira wants to hire Sam and Greg in some capacity; board still looking for a permanent CEO.
Update: Emmett Shear is interim CEO; Sam won’t return.
Update: lots more resignations (according to an insider).
Update: Sam and Greg leading a new lab in Microsoft.
Update: total chaos.
Ben has also been quietly fixing errors in the post, which I appreciate, but people are going around right now attacking us for things that Ben got wrong, because how would they know he quietly changed the post?
This is why every time newspapers get caught making a mistake they issue a public retraction the next day to let everyone know. I believe Ben should make these retractions more visible.
I used a diff checker to find the differences between the current post and the original post. There seem to be two:
“Alice worked there from November 2021 to June 2022” became “Alice travelled with Nonlinear from November 2021 to June 2022 and started working for the org from around February”
“using Lightcone funds” became “using personal funds”
Possibly I made a mistake, or Ben made edits and you saw them and then Ben reverted them—if so, I encourage you/anyone to point to another specific edit, possibly on other archive.org versions.
Update: Kat guesses she was thinking of changes from a near-final draft rather than changes from the first published version.
DeepMind: Evaluating Frontier Models for Dangerous Capabilities
I largely agree. But I think not-stacking is only slightly bad because I think the “crappy toy model [where] every alignment-visionary’s vision would ultimately succeed, but only after 30 years of study along their particular path” is importantly wrong; I think many new visions have a decent chance of succeeding more quickly, and if we pursue enough different visions we get a good chance of at least one paying off quickly.
Edit: even if alignment researchers could stack into just a couple paths, I think we might well still choose to go wide.
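A minimal numeric sketch of the “go wide” argument; the per-vision success probability and independence are assumptions for illustration, not estimates:

```python
# Toy model of "going wide" (p is an assumed illustrative number): if each
# research vision independently has probability p of paying off quickly,
# pursuing n visions gives probability 1 - (1 - p)**n of at least one quick payoff.
p = 0.1
for n in (1, 5, 10, 20):
    print(f"{n:2d} visions -> P(at least one quick payoff) = {1 - (1 - p)**n:.0%}")
# 1 -> 10%, 5 -> 41%, 10 -> 65%, 20 -> 88%
```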
OpenAI-Microsoft partnership
Please tell us what you think! Love it/hate it/think it should be different? Let us know.
I think it’s a fine experiment but… right now I’m closest to “hate it,” at least if it was used for all posts (I’d be much happier if it was only for question-posts, or only if the author requested it or a moderator thought it would be particularly useful, or something).
It makes voting take longer (with not much value added).
It makes reading comments take longer (with not much value added). You learn very little from these votes beyond what you learn from reading the comment.
It’s liable to make the more OCD among us go crazy. Worrying about how other people vote on your writing is bad enough. I, for one, would write worse comments in expectation if I was always thinking about making everyone else believe that my comments were true and well-aimed and clear and truth-seeking &c.
If this system was implemented in general, I would almost always prefer not to interact with it, so I would strongly request a setting to hide all non-karma voting from my view.
Edit in response to Rafael: for me at least the downside isn’t anxiety but mental effort to optimize for comment quality rather than votes and mental effort to ignore votes on my own comments. I’m not sure if the distinction matters; regardless, I’d be satisfied with the ability to hide non-karma votes.
Slowing AI: Reading list
Slowing AI: Foundations
Harry let himself be pulled, but as Hermione dragged him away, he said, raising his voice even louder, “It is entirely possible that in a thousand years, the fact that FHI was at Oxford will be the only reason anyone remembers Oxford!”
We already have a Schelling point for “infohazard”: Bostrom’s paper. Redefining “infohazard” now is needlessly confusing. (And most of the time I hear “infohazard” it’s in the collectively-destructive smallpox-y sense, and as Buck notes this is more important and common.)