Towards Guaranteed Safe AI: A Framework for Ensuring Robust and Reliable AI Systems

Joar Skalse
17 May 2024 19:13 UTC
5 points
0 comments
2 min read
LW link

DeepMind: Frontier Safety Framework

Zach Stein-Perlman
17 May 2024 17:30 UTC
23 points
0 comments
3 min read
LW link
(deepmind.google)

Identifying Functionally Important Features with End-to-End Sparse Dictionary Learning

17 May 2024 16:25 UTC
23 points
0 comments
4 min read
LW link
(publications.apolloresearch.ai)

AISafety.com – Resources for AI Safety

17 May 2024 15:57 UTC
39 points
0 comments
1 min read
LW link

Is There Really a Child Penalty in the Long Run?

Maxwell Tabarrok
17 May 2024 11:56 UTC
20 points
3 comments
5 min read
LW link
(www.maximum-progress.com)

My Hammer Time Final Exam

adios
17 May 2024 9:28 UTC
7 points
1 comment
3 min read
LW link

D&D.Sci (Easy Mode): On The Construction Of Impossible Structures

abstractapplic
17 May 2024 0:25 UTC
19 points
7 comments
2 min read
LW link

To an LLM, everything looks like a logic puzzle

Jesse Richardson
16 May 2024 22:21 UTC
10 points
0 comments
2 min read
LW link

AI Safety Institute’s Inspect hello world example for AI evals

TheManxLoiner
16 May 2024 20:47 UTC
3 points
0 comments
1 min read
LW link
(lovkush.medium.com)

Feeling (instrumentally) Rational

Pi Rogers
16 May 2024 18:56 UTC
14 points
5 comments
1 min read
LW link

Advice for Activists from the History of Environmentalism

Jeffrey Heninger
16 May 2024 18:40 UTC
61 points
3 comments
6 min read
LW link
(blog.aiimpacts.org)

Ninety-five theses on AI

hamandcheese
16 May 2024 17:51 UTC
12 points
0 comments
7 min read
LW link

FMT: a great opportunity for soon-to-be parents

Anton Rodenhauser
16 May 2024 13:24 UTC
8 points
1 comment
15 min read
LW link

Towards Guaranteed Safe AI: A Framework for Ensuring Robust and Reliable AI Systems

Gunnar_Zarncke
16 May 2024 13:09 UTC
47 points
4 comments
1 min read
LW link
(arxiv.org)

The Dunning-Kruger of disproving Dunning-Kruger

kromem
16 May 2024 10:11 UTC
27 points
0 comments
5 min read
LW link

A case for fairness-enforcing irrational behavior

cousin_it
16 May 2024 9:41 UTC
9 points
3 comments
2 min read
LW link

Podcast: Eye4AI on 2023 Survey

KatjaGrace
16 May 2024 7:40 UTC
8 points
0 comments
1 min read
LW link
(worldspiritsockpuppet.com)

Against “argument from overhang risk”

RobertM
16 May 2024 4:44 UTC
28 points
9 comments
5 min read
LW link

Do you believe in hundred dollar bills lying on the ground? Consider humming

Elizabeth
16 May 2024 0:00 UTC
103 points
10 comments
6 min read
LW link
(acesounderglass.com)

Introducing Statistical Utility Mechanics: A Framework for Utility Maximizers

J Bostock
15 May 2024 21:56 UTC
9 points
0 comments
7 min read
LW link