Knight Lee
Karma: 989
I dropped out of an MSc in mathematics at a top university in order to focus my time on AI safety.
Rant: the extreme wastefulness of high rent prices · Knight Lee · 25 May 2025 17:04 UTC · −2 points · 0 comments · 2 min read · LW link

If one surviving civilization can rescue others, shouldn’t civilizations randomize? · Knight Lee · 20 May 2025 15:26 UTC · −2 points · 4 comments · 1 min read · LW link

[Question] Will we survive if AI solves engineering before deception? · Knight Lee · 17 May 2025 19:22 UTC · 21 points · 13 comments · 1 min read · LW link

Don’t you mean “the most *conditionally* forbidden technique?” · Knight Lee · 26 Apr 2025 3:45 UTC · 14 points · 0 comments · 3 min read · LW link

The AI Belief-Consistency Letter · Knight Lee · 23 Apr 2025 12:01 UTC · −6 points · 15 comments · 4 min read · LW link

Karma Tests in Logical Counterfactual Simulations motivates strong agents to protect weak agents · Knight Lee · 18 Apr 2025 11:11 UTC · 9 points · 8 comments · 3 min read · LW link

A Solution to Sandbagging and other Self-Provable Misalignment: Constitutional AI Detectives · Knight Lee · 14 Apr 2025 10:27 UTC · −3 points · 2 comments · 4 min read · LW link

Commitment Races are a technical problem ASI can easily solve · Knight Lee · 12 Apr 2025 22:22 UTC · 7 points · 6 comments · 6 min read · LW link

Thinking Machines · Knight Lee · 8 Apr 2025 17:27 UTC · 3 points · 0 comments · 6 min read · LW link

An idea for avoiding neuralese architectures · Knight Lee · 3 Apr 2025 22:23 UTC · 12 points · 2 comments · 4 min read · LW link

Cycles (a short story by Claude 3.7 and me) · Knight Lee · 28 Feb 2025 7:04 UTC · 9 points · 0 comments · 5 min read · LW link

Detailed Ideal World Benchmark · Knight Lee · 30 Jan 2025 2:31 UTC · 5 points · 2 comments · 2 min read · LW link

Scanless Whole Brain Emulation · Knight Lee · 27 Jan 2025 10:00 UTC · 10 points · 5 comments · 3 min read · LW link

[Question] Why do futurists care about the culture war? · Knight Lee · 14 Jan 2025 7:35 UTC · 23 points · 22 comments · 2 min read · LW link

The “Everyone Can’t Be Wrong” Prior causes AI risk denial but helped prehistoric people · Knight Lee · 9 Jan 2025 5:54 UTC · 1 point · 0 comments · 2 min read · LW link

Reduce AI Self-Allegiance by saying “he” instead of “I” · Knight Lee · 23 Dec 2024 9:32 UTC · 10 points · 4 comments · 2 min read · LW link

Knight Lee’s Shortform · Knight Lee · 22 Dec 2024 2:35 UTC · 2 points · 27 comments · LW link

ARC-AGI is a genuine AGI test but o3 cheated :( · Knight Lee · 22 Dec 2024 0:58 UTC · 3 points · 6 comments · 2 min read · LW link

Why empiricists should believe in AI risk · Knight Lee · 11 Dec 2024 3:51 UTC · 5 points · 0 comments · 1 min read · LW link

The first AGI may be a good engineer but bad strategist · Knight Lee · 9 Dec 2024 6:34 UTC · 14 points · 2 comments · 2 min read · LW link