jacquesthibs

Karma: 1,931

I work primarily on AI Alignment. Scroll down to my pinned Shortform for an idea of my current work and who I’d like to collaborate with.

Website: https://jacquesthibodeau.com

Twitter: https://twitter.com/JacquesThibs

GitHub: https://github.com/JayThibs

[Question] Shane Legg’s necessary properties for every AGI Safety plan

jacquesthibs · 1 May 2024 17:15 UTC
56 points
12 comments · 1 min read · LW link

AISC Project: Benchmarks for Stable Reflectivity

jacquesthibs · 13 Nov 2023 14:51 UTC
17 points
0 comments · 8 min read · LW link

Research agenda: Supervising AIs improving AIs

29 Apr 2023 17:09 UTC
76 points
5 comments · 19 min read · LW link

Pausing AI Developments Isn’t Enough. We Need to Shut it All Down by Eliezer Yudkowsky

jacquesthibs · 29 Mar 2023 23:16 UTC
298 points
296 comments · 3 min read · LW link
(time.com)

Practical Pitfalls of Causal Scrubbing

27 Mar 2023 7:47 UTC
87 points
17 comments · 13 min read · LW link

[Question] Can independent researchers get a sponsored visa for the US or UK?

jacquesthibs · 24 Mar 2023 6:10 UTC
20 points
1 comment · 1 min read · LW link

[Question] What’s in your list of unsolved problems in AI alignment?

jacquesthibs · 7 Mar 2023 18:58 UTC
60 points
9 comments · 1 min read · LW link

[Simulators seminar sequence] #2 Semiotic physics—revamped

27 Feb 2023 0:25 UTC
23 points
23 comments · 13 min read · LW link

Kolb’s: an approach to consciously get better at anything

jacquesthibs · 3 Jan 2023 18:16 UTC
11 points
1 comment · 6 min read · LW link

[Simulators seminar sequence] #1 Background & shared assumptions

2 Jan 2023 23:48 UTC
49 points
4 comments · 3 min read · LW link

But is it really in Rome? An investigation of the ROME model editing technique

jacquesthibs · 30 Dec 2022 2:40 UTC
103 points
1 comment · 18 min read · LW link

Results from a survey on tool use and workflows in alignment research

19 Dec 2022 15:19 UTC
79 points
2 comments · 19 min read · LW link

[Question] How is ARC planning to use ELK?

jacquesthibs · 15 Dec 2022 20:11 UTC
24 points
5 comments · 1 min read · LW link

Foresight for AGI Safety Strategy: Mitigating Risks and Identifying Golden Opportunities

jacquesthibs · 5 Dec 2022 16:09 UTC
28 points
6 comments · 8 min read · LW link

Is the “Valley of Confused Abstractions” real?

jacquesthibs · 5 Dec 2022 13:36 UTC
19 points
11 comments · 2 min read · LW link

jacquesthibs’s Shortform

jacquesthibs · 21 Nov 2022 12:04 UTC
2 points
190 comments · 1 min read · LW link

A descriptive, not prescriptive, overview of current AI Alignment Research

6 Jun 2022 21:59 UTC
138 points
21 comments · 7 min read · LW link

AI Alignment YouTube Playlists

9 May 2022 21:33 UTC
30 points
4 comments · 1 min read · LW link

A survey of tool use and workflows in alignment research

23 Mar 2022 23:44 UTC
45 points
4 comments · 1 min read · LW link