AI Safety Mentors and Mentees Program

Tag. Last edit: 10 May 2023 10:55 UTC by Magdalena Wache

The AI Safety Mentors and Mentees program aims to facilitate mentoring in AI safety. It does so by helping mentees find mentors (matchmaking) and by giving some structure to the mentor-mentee relationship.

AISC Project: Modelling Trajectories of Language Models

NickyP, 13 Nov 2023 14:33 UTC

25 points

0 comments, 12 min read, LW link

How Do Induction Heads Actually Work in Transformers With Finite Capacity?

Fabien Roger, 23 Mar 2023 9:09 UTC

27 points

0 comments, 5 min read, LW link

What Discovering Latent Knowledge Did and Did Not Find

Fabien Roger, 13 Mar 2023 19:29 UTC

164 points

16 comments, 11 min read, LW link

The Translucent Thoughts Hypotheses and Their Implications

Fabien Roger, 9 Mar 2023 16:30 UTC

133 points

7 comments, 19 min read, LW link

[Hebbian Natural Abstractions] Mathematical Foundations

Samuel Nellessen and Jan

25 Dec 2022 20:58 UTC

15 points

2 comments, 6 min read, LW link

(www.snellessen.com)

If Wentworth is right about natural abstractions, it would be bad for alignment

Wuschel Schulz, 8 Dec 2022 15:19 UTC

28 points

5 comments, 4 min read, LW link

Announcing AI safety Mentors and Mentees

Marius Hobbhahn, 23 Nov 2022 15:21 UTC

62 points

7 comments, 10 min read, LW link

[Hebbian Natural Abstractions] Introduction

Samuel Nellessen and Jan

21 Nov 2022 20:34 UTC

34 points

3 comments, 4 min read, LW link

(www.snellessen.com)

The Inter-Agent Facet of AI Alignment

Michael Oesterle, 18 Sep 2022 20:39 UTC

12 points

1 comment, 5 min read, LW link
