Responsible Scaling Policies
Tag
Last edit: 27 Oct 2023 19:43 UTC by elifland
As proposed by ARC Evals, and with a version implemented by Anthropic.
Anthropic: Reflections on our Responsible Scaling Policy
by Zac Hatfield-Dodds, 20 May 2024 4:14 UTC, 29 points, 21 comments, 10 min read (www.anthropic.com)

What’s up with “Responsible Scaling Policies”?
by habryka and ryan_greenblatt, 29 Oct 2023 4:17 UTC, 99 points, 8 comments, 20 min read

Responsible Scaling Policies Are Risk Management Done Wrong
by simeon_c, 25 Oct 2023 23:46 UTC, 120 points, 34 comments, 22 min read (www.navigatingrisks.ai)

Thoughts on responsible scaling policies and regulation
by paulfchristiano, 24 Oct 2023 22:21 UTC, 219 points, 33 comments, 6 min read

We’re Not Ready: thoughts on “pausing” and responsible scaling policies
by HoldenKarnofsky, 27 Oct 2023 15:19 UTC, 200 points, 33 comments, 8 min read

On ‘Responsible Scaling Policies’ (RSPs)
by Zvi, 5 Dec 2023 16:10 UTC, 48 points, 3 comments, 37 min read (thezvi.wordpress.com)

ARC Evals: Responsible Scaling Policies
by Zach Stein-Perlman, 28 Sep 2023 4:30 UTC, 40 points, 9 comments, 2 min read (evals.alignment.org)

Vaniver’s thoughts on Anthropic’s RSP
by Vaniver, 28 Oct 2023 21:06 UTC, 46 points, 4 comments, 3 min read

RSPs are pauses done right
by evhub, 14 Oct 2023 4:06 UTC, 164 points, 70 comments, 7 min read

AI #35: Responsible Scaling Policies
by Zvi, 26 Oct 2023 13:30 UTC, 66 points, 10 comments, 55 min read (thezvi.wordpress.com)

OMMC Announces RIP
by Adam Scholl and aysja, 1 Apr 2024 23:20 UTC, 181 points, 5 comments, 2 min read

OpenAI: Preparedness framework
by Zach Stein-Perlman, 18 Dec 2023 18:30 UTC, 70 points, 23 comments, 4 min read (openai.com)

On OpenAI’s Preparedness Framework
by Zvi, 21 Dec 2023 14:00 UTC, 51 points, 4 comments, 21 min read (thezvi.wordpress.com)

OpenAI’s Preparedness Framework: Praise & Recommendations
by Akash, 2 Jan 2024 16:20 UTC, 66 points, 1 comment, 7 min read

Anthropic’s Responsible Scaling Policy & Long-Term Benefit Trust
by Zac Hatfield-Dodds, 19 Sep 2023 15:09 UTC, 83 points, 23 comments, 3 min read (www.anthropic.com)

Dario Amodei’s prepared remarks from the UK AI Safety Summit, on Anthropic’s Responsible Scaling Policy
by Zac Hatfield-Dodds, 1 Nov 2023 18:10 UTC, 85 points, 1 comment, 4 min read (www.anthropic.com)

Paul Christiano on Dwarkesh Podcast
by ESRogs, 3 Nov 2023 22:13 UTC, 17 points, 0 comments, 1 min read (www.dwarkeshpatel.com)

How are voluntary commitments on vulnerability reporting going?
by Adam Jones, 22 Feb 2024 8:43 UTC, 23 points, 1 comment, 1 min read (adamjones.me)

A call for a quantitative report card for AI bioterrorism threat models
by Juno, 4 Dec 2023 6:35 UTC, 12 points, 0 comments, 10 min read