Florian_Dietz

Karma: 334

Deliberative Credit Assignment (DCA): Making Faithful Reasoning Profitable

Florian_Dietz 29 Jul 2025 16:23 UTC
9 points
0 comments, 17 min read

Deliberative Credit Assignment: Making Faithful Reasoning Profitable

Florian_Dietz 14 Jul 2025 9:26 UTC
9 points
3 comments, 17 min read

Edge Cases in AI Alignment

Florian_Dietz 24 Mar 2025 9:27 UTC
19 points
3 comments, 4 min read

Split Personality Training: Revealing Latent Knowledge Through Personality-Shift Tokens

Florian_Dietz 10 Mar 2025 16:07 UTC
42 points
7 comments, 9 min read

Do we want alignment faking?

Florian_Dietz 28 Feb 2025 21:50 UTC
7 points
4 comments, 1 min read

Revealing alignment faking with a single prompt

Florian_Dietz 29 Jan 2025 21:01 UTC
9 points
5 comments, 4 min read

Florian_Dietz’s Shortform

Florian_Dietz 1 Jan 2025 14:27 UTC
3 points
34 comments, 1 min read

Achieving AI Alignment through Deliberate Uncertainty in Multiagent Systems

Florian_Dietz 17 Feb 2024 8:45 UTC
4 points
0 comments, 13 min read

Understanding differences between humans and intelligence-in-general to build safe AGI

Florian_Dietz 16 Aug 2022 8:27 UTC
7 points
8 comments, 1 min read

logic puzzles and loophole abuse

Florian_Dietz 30 Sep 2017 15:45 UTC
3 points
4 comments, 3 min read

a different perspecive on physics

Florian_Dietz 26 Jun 2017 22:47 UTC
0 points
15 comments, 3 min read

Teaching an AI not to cheat?

Florian_Dietz 20 Dec 2016 14:37 UTC
5 points
12 comments, 1 min read

controlling AI behavior through unusual axiomatic probabilities

Florian_Dietz 8 Jan 2015 17:00 UTC
5 points
11 comments, 1 min read

question: the 40 hour work week vs Silicon Valley?

Florian_Dietz 24 Oct 2014 12:09 UTC
18 points
108 comments, 1 min read

LessWrong’s attitude towards AI research

Florian_Dietz 20 Sep 2014 15:02 UTC
11 points
50 comments, 1 min read