RSS

jacquesthibs

Karma: 1,678

I work primarily on AI Alignment. Scroll down to my pinned Shortform for an idea of my current work and who I’d like to collaborate with.

Website: https://​​jacquesthibodeau.com

Twitter: https://​​twitter.com/​​JacquesThibs

GitHub: https://​​github.com/​​JayThibs

Paus­ing AI Devel­op­ments Isn’t Enough. We Need to Shut it All Down by Eliezer Yudkowsky

jacquesthibs29 Mar 2023 23:16 UTC
298 points
296 comments3 min readLW link
(time.com)

But is it re­ally in Rome? An in­ves­ti­ga­tion of the ROME model edit­ing technique

jacquesthibs30 Dec 2022 2:40 UTC
102 points
1 comment18 min readLW link

Re­sults from a sur­vey on tool use and work­flows in al­ign­ment research

19 Dec 2022 15:19 UTC
79 points
2 comments19 min readLW link

[Question] What‘s in your list of un­solved prob­lems in AI al­ign­ment?

jacquesthibs7 Mar 2023 18:58 UTC
60 points
9 comments1 min readLW link

AI Align­ment YouTube Playlists

9 May 2022 21:33 UTC
30 points
4 comments1 min readLW link

Fore­sight for AGI Safety Strat­egy: Miti­gat­ing Risks and Iden­ti­fy­ing Golden Opportunities

jacquesthibs5 Dec 2022 16:09 UTC
28 points
6 comments8 min readLW link

[Question] How is ARC plan­ning to use ELK?

jacquesthibs15 Dec 2022 20:11 UTC
24 points
5 comments1 min readLW link

[Question] Can in­de­pen­dent re­searchers get a spon­sored visa for the US or UK?

jacquesthibs24 Mar 2023 6:10 UTC
20 points
1 comment1 min readLW link

Is the “Valley of Con­fused Ab­strac­tions” real?

jacquesthibs5 Dec 2022 13:36 UTC
19 points
11 comments2 min readLW link

AISC Pro­ject: Bench­marks for Stable Reflectivity

jacquesthibs13 Nov 2023 14:51 UTC
17 points
0 comments8 min readLW link