
AI Safety Camp

Last edit: 10 May 2022 6:23 UTC by Remmelt

AI Safety Camp (AISC) is a non-profit initiative that runs programs for diversely skilled researchers who want to try collaborating on an open problem for reducing AI existential risk.

Official Website

How teams went about their research at AI Safety Camp edition 5

Remmelt · 28 Jun 2021 15:15 UTC
24 points
0 comments · 6 min read · LW link

The first AI Safety Camp & onwards

Remmelt · 7 Jun 2018 20:13 UTC
45 points
0 comments · 8 min read · LW link

Thoughts on AI Safety Camp

Charlie Steiner · 13 May 2022 7:16 UTC
23 points
7 comments · 7 min read · LW link

Applications for AI Safety Camp 2022 Now Open!

adamShimi · 17 Nov 2021 21:42 UTC
47 points
3 comments · 1 min read · LW link

Trust-maximizing AGI

25 Feb 2022 15:13 UTC
7 points
26 comments · 9 min read · LW link
(universalprior.substack.com)

Extraction of human preferences 👨→🤖

arunraja-hub · 24 Aug 2021 16:34 UTC
18 points
2 comments · 5 min read · LW link

A brief review of the reasons multi-objective RL could be important in AI Safety Research

Ben Smith · 29 Sep 2021 17:09 UTC
27 points
7 comments · 10 min read · LW link

Announcing the second AI Safety Camp

Lachouette · 11 Jun 2018 18:59 UTC
34 points
0 comments · 1 min read · LW link

AI Safety Research Camp - Project Proposal

David_Kristoffersson · 2 Feb 2018 4:25 UTC
29 points
11 comments · 8 min read · LW link

Theories of Modularity in the Biological Literature

4 Apr 2022 12:48 UTC
45 points
13 comments · 7 min read · LW link

Project Intro: Selection Theorems for Modularity

4 Apr 2022 12:59 UTC
68 points
19 comments · 16 min read · LW link

Open Problems in Negative Side Effect Minimization

6 May 2022 9:37 UTC
12 points
3 comments · 17 min read · LW link

Machines vs. Memes 2: Memetically-Motivated Model Extensions

naterush · 31 May 2022 22:03 UTC
4 points
0 comments · 4 min read · LW link

Machines vs Memes Part 3: Imitation and Memes

ceru23 · 1 Jun 2022 13:36 UTC
5 points
0 comments · 7 min read · LW link

Steganography and the CycleGAN - alignment failure case study

Jan Czechowski · 11 Jun 2022 9:41 UTC
24 points
0 comments · 4 min read · LW link

Reflection Mechanisms as an Alignment target: A survey

22 Jun 2022 15:05 UTC
28 points
1 comment · 14 min read · LW link

AISC5 Retrospective: Mechanisms for Avoiding Tragedy of the Commons in Common Pool Resource Problems

27 Sep 2021 16:46 UTC
8 points
3 comments · 7 min read · LW link

Survey on AI existential risk scenarios

8 Jun 2021 17:12 UTC
60 points
11 comments · 7 min read · LW link

Acknowledging Human Preference Types to Support Value Learning

Nandi Sabrina Erin · 13 Nov 2018 18:57 UTC
34 points
4 comments · 9 min read · LW link

Empirical Observations of Objective Robustness Failures

23 Jun 2021 23:23 UTC
63 points
5 comments · 9 min read · LW link

Discussion: Objective Robustness and Inner Alignment Terminology

23 Jun 2021 23:25 UTC
67 points
6 comments · 9 min read · LW link

A survey of tool use and workflows in alignment research

23 Mar 2022 23:44 UTC
37 points
4 comments · 1 min read · LW link

Machines vs Memes Part 1: AI Alignment and Memetics

Harriet Farlow · 31 May 2022 22:03 UTC
16 points
0 comments · 6 min read · LW link