jacquesthibs

Karma: 1,931

I work primarily on AI Alignment. Scroll down to my pinned Shortform for an idea of my current work and who I’d like to collaborate with.

Website: https://jacquesthibodeau.com

Twitter: https://twitter.com/JacquesThibs

GitHub: https://github.com/JayThibs

[Question] Shane Legg’s necessary properties for every AGI Safety plan

jacquesthibs · 1 May 2024 17:15 UTC
56 points
12 comments · 1 min read · LW link

AISC Project: Benchmarks for Stable Reflectivity

jacquesthibs · 13 Nov 2023 14:51 UTC
17 points
0 comments · 8 min read · LW link

Research agenda: Supervising AIs improving AIs

29 Apr 2023 17:09 UTC
76 points
5 comments · 19 min read · LW link

Pausing AI Developments Isn’t Enough. We Need to Shut it All Down by Eliezer Yudkowsky

jacquesthibs · 29 Mar 2023 23:16 UTC
298 points
296 comments · 3 min read · LW link
(time.com)

Practical Pitfalls of Causal Scrubbing

27 Mar 2023 7:47 UTC
87 points
17 comments · 13 min read · LW link

[Question] Can independent researchers get a sponsored visa for the US or UK?

jacquesthibs · 24 Mar 2023 6:10 UTC
20 points
1 comment · 1 min read · LW link

[Question] What’s in your list of unsolved problems in AI alignment?

jacquesthibs · 7 Mar 2023 18:58 UTC
60 points
9 comments · 1 min read · LW link

[Simulators seminar sequence] #2 Semiotic physics—revamped

27 Feb 2023 0:25 UTC
23 points
23 comments · 13 min read · LW link

Kolb’s: an approach to consciously get better at anything

jacquesthibs · 3 Jan 2023 18:16 UTC
11 points
1 comment · 6 min read · LW link

[Simulators seminar sequence] #1 Background & shared assumptions

2 Jan 2023 23:48 UTC
49 points
4 comments · 3 min read · LW link

But is it really in Rome? An investigation of the ROME model editing technique

jacquesthibs · 30 Dec 2022 2:40 UTC
103 points
1 comment · 18 min read · LW link

Results from a survey on tool use and workflows in alignment research

19 Dec 2022 15:19 UTC
79 points
2 comments · 19 min read · LW link

[Question] How is ARC planning to use ELK?

jacquesthibs · 15 Dec 2022 20:11 UTC
24 points
5 comments · 1 min read · LW link

Foresight for AGI Safety Strategy: Mitigating Risks and Identifying Golden Opportunities

jacquesthibs · 5 Dec 2022 16:09 UTC
28 points
6 comments · 8 min read · LW link

Is the “Valley of Confused Abstractions” real?

jacquesthibs · 5 Dec 2022 13:36 UTC
19 points
11 comments · 2 min read · LW link

jacquesthibs’s Shortform

jacquesthibs · 21 Nov 2022 12:04 UTC
2 points
190 comments · 1 min read · LW link

A descriptive, not prescriptive, overview of current AI Alignment Research

6 Jun 2022 21:59 UTC
138 points
21 comments · 7 min read · LW link

AI Alignment YouTube Playlists

9 May 2022 21:33 UTC
30 points
4 comments · 1 min read · LW link

A survey of tool use and workflows in alignment research

23 Mar 2022 23:44 UTC
45 points
4 comments · 1 min read · LW link