
AI Safety Camp

Last edit: 10 May 2022 6:23 UTC by Remmelt

AI Safety Camp (AISC) is a non-profit initiative that runs programs for diversely skilled researchers who want to try collaborating on an open problem for reducing AI existential risk.

Official Website: aisafety.camp

How teams went about their research at AI Safety Camp edition 5

Remmelt, 28 Jun 2021 15:15 UTC
24 points
0 comments, 6 min read, LW link

The first AI Safety Camp & onwards

Remmelt, 7 Jun 2018 20:13 UTC
45 points
0 comments, 8 min read, LW link

Thoughts on AI Safety Camp

Charlie Steiner, 13 May 2022 7:16 UTC
24 points
7 comments, 7 min read, LW link

Applications for AI Safety Camp 2022 Now Open!

adamShimi, 17 Nov 2021 21:42 UTC
47 points
3 comments, 1 min read, LW link

Trust-maximizing AGI

25 Feb 2022 15:13 UTC
7 points
26 comments, 9 min read, LW link
(universalprior.substack.com)

Extraction of human preferences 👨→🤖

arunraja-hub, 24 Aug 2021 16:34 UTC
18 points
2 comments, 5 min read, LW link

A brief review of the reasons multi-objective RL could be important in AI Safety Research

Ben Smith, 29 Sep 2021 17:09 UTC
30 points
8 comments, 10 min read, LW link

Announcing the second AI Safety Camp

Lachouette, 11 Jun 2018 18:59 UTC
34 points
0 comments, 1 min read, LW link

AI Safety Research Camp—Project Proposal

David_Kristoffersson, 2 Feb 2018 4:25 UTC
29 points
11 comments, 8 min read, LW link

Theories of Modularity in the Biological Literature

4 Apr 2022 12:48 UTC
48 points
13 comments, 7 min read, LW link

Project Intro: Selection Theorems for Modularity

4 Apr 2022 12:59 UTC
69 points
20 comments, 16 min read, LW link

Open Problems in Negative Side Effect Minimization

6 May 2022 9:37 UTC
12 points
7 comments, 17 min read, LW link

Machines vs. Memes 2: Memetically-Motivated Model Extensions

naterush, 31 May 2022 22:03 UTC
4 points
0 comments, 4 min read, LW link

Machines vs Memes Part 3: Imitation and Memes

ceru23, 1 Jun 2022 13:36 UTC
5 points
0 comments, 7 min read, LW link

Steganography and the CycleGAN—alignment failure case study

Jan Czechowski, 11 Jun 2022 9:41 UTC
28 points
0 comments, 4 min read, LW link

Reflection Mechanisms as an Alignment target: A survey

22 Jun 2022 15:05 UTC
30 points
1 comment, 14 min read, LW link

AISC5 Retrospective: Mechanisms for Avoiding Tragedy of the Commons in Common Pool Resource Problems

27 Sep 2021 16:46 UTC
8 points
3 comments, 7 min read, LW link

Survey on AI existential risk scenarios

8 Jun 2021 17:12 UTC
63 points
11 comments, 7 min read, LW link

Acknowledging Human Preference Types to Support Value Learning

Nandi Sabrina Erin, 13 Nov 2018 18:57 UTC
34 points
4 comments, 9 min read, LW link

Empirical Observations of Objective Robustness Failures

23 Jun 2021 23:23 UTC
63 points
5 comments, 9 min read, LW link

Discussion: Objective Robustness and Inner Alignment Terminology

23 Jun 2021 23:25 UTC
70 points
7 comments, 9 min read, LW link

A survey of tool use and workflows in alignment research

23 Mar 2022 23:44 UTC
44 points
5 comments, 1 min read, LW link

Machines vs Memes Part 1: AI Alignment and Memetics

Harriet Farlow, 31 May 2022 22:03 UTC
16 points
0 comments, 6 min read, LW link

AI takeover tabletop RPG: “The Treacherous Turn”

Daniel Kokotajlo, 30 Nov 2022 7:16 UTC
52 points
5 comments, 1 min read, LW link

Results from a survey on tool use and workflows in alignment research

19 Dec 2022 15:19 UTC
71 points
2 comments, 19 min read, LW link

A descriptive, not prescriptive, overview of current AI Alignment Research

6 Jun 2022 21:59 UTC
130 points
21 comments, 7 min read, LW link

AI Safety Camp, Virtual Edition 2023

Linda Linsefors, 6 Jan 2023 11:09 UTC
39 points
10 comments, 3 min read, LW link
(aisafety.camp)

AI Safety Camp: Machine Learning for Scientific Discovery

Eleni Angelou, 6 Jan 2023 3:21 UTC
2 points
0 comments, 1 min read, LW link