Jeffrey Ladish

Karma: 2,250

Help keep AI under human control: Palisade Research 2026 fundraiser

18 Dec 2025 23:41 UTC
102 points
58 comments · 6 min read

Shutdown Resistance in Reasoning Models

6 Jul 2025 0:01 UTC
138 points
14 comments · 9 min read
(palisaderesearch.org)

Bounty for Evidence on Some of Palisade Research’s Beliefs

23 Sep 2024 20:01 UTC
46 points
4 comments · 2 min read

Take SCIFs, it’s dangerous to go alone

1 May 2024 8:02 UTC
43 points
1 comment · 3 min read

Palisade is hiring Research Engineers

11 Nov 2023 3:09 UTC
23 points
0 comments · 3 min read

unRLHF - Efficiently undoing LLM safeguards

12 Oct 2023 19:58 UTC
117 points
15 comments · 20 min read

LoRA Fine-tuning Efficiently Undoes Safety Training from Llama 2-Chat 70B

12 Oct 2023 19:58 UTC
151 points
29 comments · 14 min read

The Agency Overhang

21 Apr 2023 7:47 UTC
85 points
6 comments · 6 min read

Donation offsets for ChatGPT Plus subscriptions

16 Mar 2023 23:29 UTC
53 points
3 comments · 3 min read