Towards_Keeperhood

Karma: 1,003

Simon Skade

I did (mostly non-prosaic) alignment research between Feb 2022 and Aug 2025. I won $10k in the ELK contest, participated in MLAB and SERI MATS 3.0 & 3.1, and then did independent research. I mostly worked on an ambitious attempt to better understand minds in order to figure out how to create more understandable and pointable AIs. I started with agent foundations but then developed a more sciency agenda in which I also studied concrete observations from language/linguistics, psychology, and neuroscience (though I didn’t study much there yet), as well as from tracking my thoughts on problems I solved (a good kind of introspection).

I’m now exploring advocacy to make it more likely that we get something like the MIRI treaty (ideally with a good exit plan like human intelligence augmentation, or possibly an alignment project with actually competent leadership).

Currently based in Germany.

[Advanced Intro to AI Alignment] 2. What Values May an AI Learn? — 4 Key Problems

Towards_Keeperhood, 2 Jan 2026 14:51 UTC
33 points
10 comments, 19 min read

[Advanced Intro to AI Alignment] 1. Goal-Directed Reasoning and Why It Matters

Towards_Keeperhood, 30 Dec 2025 15:48 UTC
12 points
0 comments, 10 min read

[Advanced Intro to AI Alignment] 0. Overview and Foundations

Towards_Keeperhood, 22 Dec 2025 21:20 UTC
15 points
0 comments, 5 min read

Plan 1 and Plan 2

Towards_Keeperhood, 24 Oct 2025 8:18 UTC
50 points
22 comments, 3 min read

Dark Lord’s Answer: Review and Economics Excerpts

Towards_Keeperhood, 23 Jul 2025 17:45 UTC
16 points
6 comments, 17 min read

Keltham on Becoming more Truth-Oriented

Towards_Keeperhood, 28 Apr 2025 12:58 UTC
22 points
2 comments, 19 min read

What alignment-relevant abilities might Terence Tao lack?

Towards_Keeperhood, 7 Apr 2025 19:44 UTC
13 points
2 comments, 3 min read

Thoughts on Creating a Good Language

Towards_Keeperhood, 6 Apr 2025 15:57 UTC
1 point
2 comments, 7 min read

Introduction to Representing Sentences as Logical Statements

Towards_Keeperhood, 5 Apr 2025 20:35 UTC
33 points
10 comments, 16 min read

I changed my mind about orca intelligence

Towards_Keeperhood, 18 Mar 2025 10:15 UTC
54 points
24 comments, 5 min read

Help make the orca language experiment happen

Towards_Keeperhood, 15 Mar 2025 21:39 UTC
9 points
12 comments, 5 min read

Optimizing Feedback to Learn Faster

Towards_Keeperhood, 26 Feb 2025 14:24 UTC
12 points
0 comments, 2 min read

Considerations on orca intelligence

Towards_Keeperhood, 29 Dec 2024 14:35 UTC
53 points
14 comments, 9 min read

Orca communication project—seeking feedback (and collaborators)

Towards_Keeperhood, 3 Dec 2024 17:29 UTC
38 points
16 comments, 2 min read

[Question] What are the primary drivers that caused selection pressure for intelligence in humans?

Towards_Keeperhood, 7 Nov 2024 9:40 UTC
8 points
15 comments, 1 min read

An alternative approach to superbabies

Towards_Keeperhood, 5 Nov 2024 22:56 UTC
48 points
19 comments, 3 min read

[Question] Could orcas be (trained to be) smarter than humans?

Towards_Keeperhood, 4 Nov 2024 23:29 UTC
59 points
23 comments, 1 min read

Rapid capability gain around supergenius level seems probable even without intelligence needing to improve intelligence

6 May 2024 17:09 UTC
48 points
17 comments, 4 min read

Towards_Keeperhood’s Shortform

Towards_Keeperhood, 25 Nov 2022 11:50 UTC
2 points
28 comments, 1 min read

Clarifying what ELK is trying to achieve

Towards_Keeperhood, 21 May 2022 7:34 UTC
22 points
1 comment, 5 min read