AI Safety Mentors and Mentees Program

Last edit: 10 May 2023 10:55 UTC by Magdalena Wache

The AI Safety Mentors and Mentees program aims to facilitate mentoring in AI safety. It does so by helping mentees find mentors (matchmaking) and by giving some structure to the mentor-mentee relationship.

Announcing AI safety Mentors and Mentees

Marius Hobbhahn, 23 Nov 2022 15:21 UTC

62 points

7 comments, 10 min read

AISC Project: Modelling Trajectories of Language Models

NickyP, 13 Nov 2023 14:33 UTC

26 points

0 comments, 12 min read

What Discovering Latent Knowledge Did and Did Not Find

Fabien Roger, 13 Mar 2023 19:29 UTC

166 points

17 comments, 11 min read

The Translucent Thoughts Hypotheses and Their Implications

Fabien Roger, 9 Mar 2023 16:30 UTC

135 points

7 comments, 19 min read

If Wentworth is right about natural abstractions, it would be bad for alignment

Wuschel Schulz, 8 Dec 2022 15:19 UTC

29 points

5 comments, 4 min read

[Hebbian Natural Abstractions] Introduction

Samuel Nellessen and Jan

21 Nov 2022 20:34 UTC

34 points

3 comments, 4 min read

(www.snellessen.com)

[Hebbian Natural Abstractions] Mathematical Foundations

Samuel Nellessen and Jan

25 Dec 2022 20:58 UTC

15 points

2 comments, 6 min read

(www.snellessen.com)

How Do Induction Heads Actually Work in Transformers With Finite Capacity?

Fabien Roger, 23 Mar 2023 9:09 UTC

27 points

0 comments, 5 min read

I made an AI safety fellowship. What I wish I knew.

Ruben Castaing, 8 Jun 2024 15:23 UTC

11 points

0 comments, 2 min read

The Inter-Agent Facet of AI Alignment

Michael Oesterle, 18 Sep 2022 20:39 UTC

12 points

1 comment, 5 min read
