Florian_Dietz (Karma: 334)
Deliberative Credit Assignment (DCA): Making Faithful Reasoning Profitable
Florian_Dietz, 29 Jul 2025 16:23 UTC, 9 points, 0 comments, 17 min read

Deliberative Credit Assignment: Making Faithful Reasoning Profitable
Florian_Dietz, 14 Jul 2025 9:26 UTC, 9 points, 3 comments, 17 min read

Edge Cases in AI Alignment
Florian_Dietz, 24 Mar 2025 9:27 UTC, 19 points, 3 comments, 4 min read

Split Personality Training: Revealing Latent Knowledge Through Personality-Shift Tokens
Florian_Dietz, 10 Mar 2025 16:07 UTC, 42 points, 7 comments, 9 min read

Do we want alignment faking?
Florian_Dietz, 28 Feb 2025 21:50 UTC, 7 points, 4 comments, 1 min read

Revealing alignment faking with a single prompt
Florian_Dietz, 29 Jan 2025 21:01 UTC, 9 points, 5 comments, 4 min read

Florian_Dietz’s Shortform
Florian_Dietz, 1 Jan 2025 14:27 UTC, 3 points, 34 comments, 1 min read

Achieving AI Alignment through Deliberate Uncertainty in Multiagent Systems
Florian_Dietz, 17 Feb 2024 8:45 UTC, 4 points, 0 comments, 13 min read

Understanding differences between humans and intelligence-in-general to build safe AGI
Florian_Dietz, 16 Aug 2022 8:27 UTC, 7 points, 8 comments, 1 min read

logic puzzles and loophole abuse
Florian_Dietz, 30 Sep 2017 15:45 UTC, 3 points, 4 comments, 3 min read

a different perspecive on physics
Florian_Dietz, 26 Jun 2017 22:47 UTC, 0 points, 15 comments, 3 min read

Teaching an AI not to cheat?
Florian_Dietz, 20 Dec 2016 14:37 UTC, 5 points, 12 comments, 1 min read

controlling AI behavior through unusual axiomatic probabilities
Florian_Dietz, 8 Jan 2015 17:00 UTC, 5 points, 11 comments, 1 min read

question: the 40 hour work week vs Silicon Valley?
Florian_Dietz, 24 Oct 2014 12:09 UTC, 18 points, 108 comments, 1 min read

LessWrong’s attitude towards AI research
Florian_Dietz, 20 Sep 2014 15:02 UTC, 11 points, 50 comments, 1 min read