Knight Lee
Karma: 989
I dropped out of an MSc in mathematics at a top university in order to focus my time on AI safety.
Rant: the extreme wastefulness of high rent prices · Knight Lee · 25 May 2025 17:04 UTC · −2 points · 0 comments · 2 min read · LW link

If one surviving civilization can rescue others, shouldn’t civilizations randomize? · Knight Lee · 20 May 2025 15:26 UTC · −2 points · 4 comments · 1 min read · LW link

[Question] Will we survive if AI solves engineering before deception? · Knight Lee · 17 May 2025 19:22 UTC · 21 points · 13 comments · 1 min read · LW link

Don’t you mean “the most *conditionally* forbidden technique?” · Knight Lee · 26 Apr 2025 3:45 UTC · 14 points · 0 comments · 3 min read · LW link

The AI Belief-Consistency Letter · Knight Lee · 23 Apr 2025 12:01 UTC · −6 points · 15 comments · 4 min read · LW link

Karma Tests in Logical Counterfactual Simulations motivates strong agents to protect weak agents · Knight Lee · 18 Apr 2025 11:11 UTC · 9 points · 8 comments · 3 min read · LW link

A Solution to Sandbagging and other Self-Provable Misalignment: Constitutional AI Detectives · Knight Lee · 14 Apr 2025 10:27 UTC · −3 points · 2 comments · 4 min read · LW link

Commitment Races are a technical problem ASI can easily solve · Knight Lee · 12 Apr 2025 22:22 UTC · 7 points · 6 comments · 6 min read · LW link

Thinking Machines · Knight Lee · 8 Apr 2025 17:27 UTC · 3 points · 0 comments · 6 min read · LW link

An idea for avoiding neuralese architectures · Knight Lee · 3 Apr 2025 22:23 UTC · 12 points · 2 comments · 4 min read · LW link

Cycles (a short story by Claude 3.7 and me) · Knight Lee · 28 Feb 2025 7:04 UTC · 9 points · 0 comments · 5 min read · LW link

Detailed Ideal World Benchmark · Knight Lee · 30 Jan 2025 2:31 UTC · 5 points · 2 comments · 2 min read · LW link

Scanless Whole Brain Emulation · Knight Lee · 27 Jan 2025 10:00 UTC · 10 points · 5 comments · 3 min read · LW link

[Question] Why do futurists care about the culture war? · Knight Lee · 14 Jan 2025 7:35 UTC · 23 points · 22 comments · 2 min read · LW link

The “Everyone Can’t Be Wrong” Prior causes AI risk denial but helped prehistoric people · Knight Lee · 9 Jan 2025 5:54 UTC · 1 point · 0 comments · 2 min read · LW link

Reduce AI Self-Allegiance by saying “he” instead of “I” · Knight Lee · 23 Dec 2024 9:32 UTC · 10 points · 4 comments · 2 min read · LW link

Knight Lee’s Shortform · Knight Lee · 22 Dec 2024 2:35 UTC · 2 points · 27 comments · LW link

ARC-AGI is a genuine AGI test but o3 cheated :( · Knight Lee · 22 Dec 2024 0:58 UTC · 3 points · 6 comments · 2 min read · LW link

Why empiricists should believe in AI risk · Knight Lee · 11 Dec 2024 3:51 UTC · 5 points · 0 comments · 1 min read · LW link

The first AGI may be a good engineer but bad strategist · Knight Lee · 9 Dec 2024 6:34 UTC · 14 points · 2 comments · 2 min read · LW link