DeepMind’s “Frontier Safety Framework” is weak and unambitious

Zach Stein-Perlman, 18 May 2024 3:00 UTC
46 points
2 comments, 4 min read, LW link

International Scientific Report on the Safety of Advanced AI: Key Information

Aryeh Englander, 18 May 2024 1:45 UTC
17 points
0 comments, 13 min read, LW link

Goodhart in RL with KL: Appendix

Thomas Kwa, 18 May 2024 0:40 UTC
9 points
0 comments, 6 min read, LW link

AI 2030 – AI Policy Roadmap

LTM, 17 May 2024 23:29 UTC
2 points
0 comments, 1 min read, LW link

Language Models Model Us

eggsyntax, 17 May 2024 21:00 UTC
57 points
4 comments, 7 min read, LW link

Towards Guaranteed Safe AI: A Framework for Ensuring Robust and Reliable AI Systems

Joar Skalse, 17 May 2024 19:13 UTC
39 points
1 comment, 2 min read, LW link

DeepMind: Frontier Safety Framework

Zach Stein-Perlman, 17 May 2024 17:30 UTC
59 points
0 comments, 3 min read, LW link
(deepmind.google)

Identifying Functionally Important Features with End-to-End Sparse Dictionary Learning

17 May 2024 16:25 UTC
38 points
1 comment, 4 min read, LW link
(publications.apolloresearch.ai)

AISafety.com – Resources for AI Safety

17 May 2024 15:57 UTC
57 points
0 comments, 1 min read, LW link

Is There Really a Child Penalty in the Long Run?

Maxwell Tabarrok, 17 May 2024 11:56 UTC
29 points
5 comments, 5 min read, LW link
(www.maximum-progress.com)

My Hammer Time Final Exam

adios, 17 May 2024 9:28 UTC
9 points
1 comment, 3 min read, LW link

D&D.Sci (Easy Mode): On The Construction Of Impossible Structures

abstractapplic, 17 May 2024 0:25 UTC
22 points
9 comments, 2 min read, LW link

To an LLM, everything looks like a logic puzzle

Jesse Richardson, 16 May 2024 22:21 UTC
10 points
0 comments, 2 min read, LW link

AI Safety Institute’s Inspect hello world example for AI evals

TheManxLoiner, 16 May 2024 20:47 UTC
3 points
0 comments, 1 min read, LW link
(lovkush.medium.com)

Feeling (instrumentally) Rational

Pi Rogers, 16 May 2024 18:56 UTC
14 points
5 comments, 1 min read, LW link

Advice for Activists from the History of Environmentalism

Jeffrey Heninger, 16 May 2024 18:40 UTC
68 points
3 comments, 6 min read, LW link
(blog.aiimpacts.org)

Ninety-five theses on AI

hamandcheese, 16 May 2024 17:51 UTC
13 points
0 comments, 7 min read, LW link

FMT: a great opportunity for soon-to-be parents

Anton Rodenhauser, 16 May 2024 13:24 UTC
8 points
1 comment, 15 min read, LW link

Towards Guaranteed Safe AI: A Framework for Ensuring Robust and Reliable AI Systems

Gunnar_Zarncke, 16 May 2024 13:09 UTC
50 points
4 comments, 1 min read, LW link
(arxiv.org)

The Dunning-Kruger of disproving Dunning-Kruger

kromem, 16 May 2024 10:11 UTC
31 points
0 comments, 5 min read, LW link