
Chris_Leong

Karma: 7,605

Reverse Regulatory Capture

Chris_Leong · Apr 11, 2024, 2:40 AM
12 points
3 comments · 1 min read · LW link

On the Confusion between Inner and Outer Misalignment

Chris_Leong · Mar 25, 2024, 11:59 AM
17 points
10 comments · 1 min read · LW link

The Best Essay (Paul Graham)

Chris_Leong · Mar 11, 2024, 7:25 PM
25 points
2 comments · 1 min read · LW link
(paulgraham.com)

[Question] Can we get an AI to “do our alignment homework for us”?

Chris_Leong · Feb 26, 2024, 7:56 AM
53 points
33 comments · 1 min read · LW link

[Question] What’s the theory of impact for activation vectors?

Chris_Leong · Feb 11, 2024, 7:34 AM
61 points
12 comments · 1 min read · LW link

Notice When People Are Directionally Correct

Chris_Leong · Jan 14, 2024, 2:12 PM
136 points
8 comments · 2 min read · LW link

Are Metaculus AI Timelines Inconsistent?

Chris_Leong · Jan 2, 2024, 6:47 AM
17 points
7 comments · 2 min read · LW link

Random Musings on Theory of Impact for Activation Vectors

Chris_Leong · Dec 7, 2023, 1:07 PM
8 points
0 comments · 1 min read · LW link

Goodhart’s Law Example: Training Verifiers to Solve Math Word Problems

Chris_Leong · Nov 25, 2023, 12:53 AM
27 points
2 comments · 1 min read · LW link
(arxiv.org)

Upcoming Feedback Opportunity on Dual-Use Foundation Models

Chris_Leong · Nov 2, 2023, 4:28 AM
3 points
0 comments · 1 min read · LW link

On Having No Clue

Chris_Leong · Nov 1, 2023, 1:36 AM
20 points
11 comments · 1 min read · LW link

Is Yann LeCun strawmanning AI x-risks?

Chris_Leong · Oct 19, 2023, 11:35 AM
26 points
4 comments · 1 min read · LW link

Don’t Dismiss Simple Alignment Approaches

Chris_Leong · Oct 7, 2023, 12:35 AM
137 points
9 comments · 4 min read · LW link

[Question] What evidence is there of LLMs containing world models?

Chris_Leong · Oct 4, 2023, 2:33 PM
17 points
17 comments · 1 min read · LW link

The Role of Groups in the Progression of Human Understanding

Chris_Leong · Sep 27, 2023, 3:09 PM
11 points
0 comments · 2 min read · LW link

The Flow-Through Fallacy

Chris_Leong · Sep 13, 2023, 4:28 AM
21 points
7 comments · 1 min read · LW link

Chariots of Philosophical Fire

Chris_Leong · Aug 26, 2023, 12:52 AM
12 points
0 comments · 1 min read · LW link
(l.facebook.com)

Call for Papers on Global AI Governance from the UN

Chris_Leong · Aug 20, 2023, 8:56 AM
19 points
0 comments · LW link
(www.linkedin.com)

Yann LeCun on AGI and AI Safety

Chris_Leong · Aug 6, 2023, 9:56 PM
37 points
13 comments · 1 min read · LW link
(drive.google.com)

A Naive Proposal for Constructing Interpretable AI

Chris_Leong · Aug 5, 2023, 10:32 AM
18 points
6 comments · 2 min read · LW link