Adam Scholl

Karma: 3,663

Adam Scholl 31 May 2024 3:09 UTC
6 points
2
in reply to: Orpheus16’s comment on: Non-Disparagement Canaries for OpenAI
Thanks! Edited to fix.

Non-Disparagement Canaries for OpenAI

aysja and Adam Scholl

30 May 2024 19:20 UTC

288 points

51 comments, 2 min read

Adam Scholl 10 May 2024 2:11 UTC
3 points
−2
in reply to: Matthew Barnett’s comment on: We might be missing some key feature of AI takeoff; it’ll probably seem like “we could’ve seen this coming”
Do you expect AI labs would actually run extensive experimental tests in this world? I would be surprised if they did, even if such a window does arise.

(To roughly operationalize: I would be surprised to hear a major lab spent more than 5 FTE-years conducting such tests, or that the tests decreased the p(doom) of the average reasonably-calibrated external observer by more than 10%).

Adam Scholl 26 Apr 2024 1:53 UTC
2 points
0
in reply to: Davidmanheim’s comment on: Paul Christiano named as US AI Safety Institute Head of AI Safety
This thread isn’t seeming very productive to me, so I’m going to bow out after this. But yes, it is a primary concern—at least in the case of Open Philanthropy, it’s easy to check what their primary concerns are because they write them up. And accidental release from dual use research is one of them.

Adam Scholl 25 Apr 2024 3:10 UTC
4 points
2
in reply to: Davidmanheim’s comment on: Paul Christiano named as US AI Safety Institute Head of AI Safety
the idea that we should have “BSL-5” is the kind of silly thing that novice EAs propose that doesn’t make sense because there literally isn’t something significantly more restrictive
I mean, I’m sure something more restrictive is possible. But my issue with BSL levels isn’t that they include too few BSL-type restrictions, it’s that “lists of restrictions” are a poor way of managing risk when the attack surface is enormous. I’m sure someday we’ll figure out how to gain this information in a safer way—e.g., by running simulations of GoF experiments instead of literally building the dangerous thing—but at present, the best available safeguards aren’t sufficient.
I also think that “nearly all EA’s focused on biorisk think gain of function research should be banned” is obviously underspecified, and wrong because of the details.
I’m confused why you find this underspecified. I just meant “gain of function” in the standard, common-use sense—e.g., that used in the 2014 ban on federal funding for such research.

Adam Scholl 25 Apr 2024 0:49 UTC
6 points
−1
in reply to: Davidmanheim’s comment on: Paul Christiano named as US AI Safety Institute Head of AI Safety
I think we must still be missing each other somehow. To reiterate, I’m aware that there is non-accidental biorisk, for which one can hardly blame the safety measures. But there is also accident risk, since labs often fail to contain pathogens even when they’re trying to.

Adam Scholl 23 Apr 2024 5:15 UTC
6 points
2
in reply to: Ben Pace’s comment on: Paul Christiano named as US AI Safety Institute Head of AI Safety
My guess is more that we were talking past each other than that his intended claim was false/unrepresentative. I do think it’s true that EAs mostly talk about people doing gain of function research as the problem, rather than about the insufficiency of the safeguards; I just think the latter is why the former is a problem.

Adam Scholl 23 Apr 2024 2:43 UTC
2 points
−3
in reply to: ChristianKl’s comment on: Paul Christiano named as US AI Safety Institute Head of AI Safety
There have been frequent and severe biosafety accidents for decades, many of which occurred at labs which were attempting to follow BSL protocol.

Adam Scholl 23 Apr 2024 2:36 UTC
10 points
4
in reply to: Davidmanheim’s comment on: Paul Christiano named as US AI Safety Institute Head of AI Safety
The EA cause area around biorisk is mostly happy to rely on those levels
I disagree; I think nearly all EAs focused on biorisk think gain of function research should be banned, since the risk management framework doesn’t work well enough to drive the expected risk below that of the expected benefit. If our framework for preventing lab accidents worked as well as e.g. our framework for preventing plane accidents, I think few EAs would worry much about GoF.
(Obviously there are non-accidental sources of biorisk too, for which we can hardly blame the safety measures; but I do think the measures work sufficiently poorly that even accident risk alone would justify a major EA cause area).

Adam Scholl 19 Apr 2024 15:53 UTC
24 points
13
on: Express interest in an “FHI of the West”
Man, I can’t believe there are no straightforwardly excited comments so far!
Personally, I think an institution like this is sorely needed, and I’d be thrilled if Lightcone built one. There are remarkably few people in the world who are trying to think carefully about the future, and fewer still who are trying to solve alignment; institutions like this seem like one of the most obvious ways to help them.

Adam Scholl 19 Apr 2024 14:50 UTC
8 points
−14
in reply to: owencb’s comment on: Express interest in an “FHI of the West”
Your answer might also be “I, Oliver, will play this role”. My gut take would be excited for you to be like one of three people in this role (with strong co-leads, who are maybe complementary in the sense that they’re strong at some styles of thinking you don’t know exactly how to replicate), and kind of weakly pessimistic about you doing it alone. (It certainly might be that that pessimism is misplaced.)
For what it’s worth, my guess is that your pessimism is misplaced. Oliver certainly isn’t as famous as Bostrom, so I doubt he’d be a similar “beacon.” But I’m not sure a beacon is needed: historically, plenty of successful research institutions (e.g. Bell Labs, IAS, the Royal Society in most eras) weren’t led by their star researchers, and the track record of those that were strikes me as pretty mixed.
Oliver spends most of his time building infrastructure for researchers, and I think he’s become quite good at it. For example, you are reading this comment on (what strikes me as) rather obviously the best-designed forum on the internet; I think the review books LessWrong made are probably the second-best designed books I’ve seen, after those from Stripe Press; and the Lighthaven campus is an exceedingly nice place to work.
Personally, I think Oliver would probably be my literal top choice to head an institution like this.

Adam Scholl 19 Apr 2024 13:43 UTC
9 points
3
in reply to: Orpheus16’s comment on: Express interest in an “FHI of the West”
I ask partly because I personally would be more excited of a version of this that wasn’t ignoring AGI timelines, but I think a version of this that’s not ignoring AGI timelines would probably be quite different from the intellectual spirit/tradition of FHI.
This frame feels a bit off to me. Partly because I don’t think FHI was ignoring timelines, and because I think their work has proved quite useful already—mostly by substantially improving our concepts for reasoning about existential risk.
But also, the portfolio of alignment research with maximal expected value need not necessarily perform well in the most likely particular world. One might imagine, for example—and indeed this is my own bet—that the most valuable actions we can take will only actually save us in the subset of worlds in which we have enough time to develop a proper science of alignment.

Adam Scholl 17 Apr 2024 8:08 UTC
20 points
5
in reply to: cdwhite’s comment on: Paul Christiano named as US AI Safety Institute Head of AI Safety
I agree metrology is cool! But I think units are mostly helpful for engineering insofar as they reflect fundamental laws of nature—see e.g. the metric units—and we don’t have those yet for AI. Until we do, I expect attempts to define them will be vague, high-level descriptions more than deep scientific understanding.
(And I think the former approach has a terrible track record, at least when used to define units of risk or controllability—e.g. BSL levels, which have failed so consistently and catastrophically they’ve induced an EA cause area, and which for some reason AI labs are starting to emulate).

OMMC Announces RIP

Adam Scholl and aysja

1 Apr 2024 23:20 UTC

189 points

5 comments, 2 min read

Adam Scholl 6 Mar 2024 23:37 UTC
10 points
6
in reply to: evhub’s comment on: Anthropic release Claude 3, claims >GPT-4 Performance
I assumed “anyone” was meant to include OpenAI—do you interpret it as just describing novel entrants? If so I agree that wouldn’t be contradictory, but it seems like a strange interpretation to me in the context of a pitch deck asking investors for a billion dollars.

Adam Scholl 6 Mar 2024 22:25 UTC
10 points
4
in reply to: Jacob Pfau’s comment on: Anthropic release Claude 3, claims >GPT-4 Performance
I agree it’s common for startups to somewhat oversell their products to investors, but I think it goes far beyond “somewhat”—maybe even beyond the bar for criminal fraud, though I’m not sure—to tell investors you’re aiming to soon get “too far ahead for anyone to catch up in subsequent cycles,” if your actual plan is to avoid getting meaningfully ahead at all.

Adam Scholl 6 Mar 2024 21:05 UTC
10 points
−4
in reply to: Jacob Pfau’s comment on: Anthropic release Claude 3, claims >GPT-4 Performance
“Diverting money” strikes me as the wrong frame here. Partly because I doubt this actually was the consequence—i.e., I doubt OpenAI etc. had a meaningfully harder time raising capital because of Anthropic’s raise—but also because it leaves out the part where this purported desirable consequence was achieved via (what seems to me like) straightforward deception!
If indeed Dario told investors he hoped to obtain an insurmountable lead soon, while telling Dustin and others that he was committed to avoid gaining any meaningful lead, then it sure seems like one of those claims was a lie. And by my ethical lights, this seems like a horribly unethical thing to lie about, regardless of whether it somehow caused OpenAI to have less money.

Adam Scholl 6 Mar 2024 13:55 UTC
5 points
0
in reply to: jimrandomh’s comment on: Jimrandomh’s Shortform Posts
Huh, I’ve also noticed a larger effect from indoors/outdoors than seems reflected by CO2 monitors, and that I seem smarter when it’s windy, but I never thought of this hypothesis; it’s interesting, thanks.

Adam Scholl 5 Mar 2024 19:06 UTC
28 points
23
in reply to: LawrenceC’s comment on: Anthropic release Claude 3, claims >GPT-4 Performance
Yeah, seems plausible; but either way it seems worth noting that Dario left Dustin, Evan and Anthropic’s investors with quite different impressions here.

Adam Scholl 5 Mar 2024 12:41 UTC
50 points
21
in reply to: evhub’s comment on: Anthropic release Claude 3, claims >GPT-4 Performance
It seems Dario left Dustin Moskovitz with a different impression—that Anthropic had a policy/commitment to not meaningfully advance the frontier: