Chris_Leong

Karma: 7,716

Link: Let’s Think Dot by Dot: Hidden Computation in Transformer Language Models by Jacob Pfau, William Merrill & Samuel R. Bowman

Chris_Leong · Apr 27, 2024, 1:22 PM
12 points

10 votes

Overall karma indicates overall quality.

0 comments · 1 min read · LW link
(twitter.com)

“You’re the most beautiful girl in the world” and Wittgensteinian Language Games

Chris_Leong · Apr 20, 2024, 2:54 PM
5 points

22 votes

18 comments · 1 min read · LW link

The argument for near-term human disempowerment through AI

Chris_Leong · Apr 16, 2024, 4:50 AM
22 points

7 votes

2 comments · 1 min read · LW link
(link.springer.com)

Reverse Regulatory Capture

Chris_Leong · Apr 11, 2024, 2:40 AM
12 points

10 votes

3 comments · 1 min read · LW link

On the Confusion between Inner and Outer Misalignment

Chris_Leong · Mar 25, 2024, 11:59 AM
18 points

8 votes

10 comments · 1 min read · LW link

The Best Essay (Paul Graham)

Chris_Leong · Mar 11, 2024, 7:25 PM
25 points

8 votes

2 comments · 1 min read · LW link
(paulgraham.com)

[Question] Can we get an AI to “do our alignment homework for us”?

Chris_Leong · Feb 26, 2024, 7:56 AM
55 points

27 votes

33 comments · 1 min read · LW link

[Question] What’s the theory of impact for activation vectors?

Chris_Leong · Feb 11, 2024, 7:34 AM
61 points

20 votes

12 comments · 1 min read · LW link

Notice When People Are Directionally Correct

Chris_Leong · Jan 14, 2024, 2:12 PM
137 points

69 votes

8 comments · 2 min read · LW link

Are Metaculus AI Timelines Inconsistent?

Chris_Leong · Jan 2, 2024, 6:47 AM
17 points

9 votes

7 comments · 2 min read · LW link

Random Musings on Theory of Impact for Activation Vectors

Chris_Leong · Dec 7, 2023, 1:07 PM
8 points

3 votes

0 comments · 1 min read · LW link

Goodhart’s Law Example: Training Verifiers to Solve Math Word Problems

Chris_Leong · Nov 25, 2023, 12:53 AM
27 points

10 votes

2 comments · 1 min read · LW link
(arxiv.org)

Upcoming Feedback Opportunity on Dual-Use Foundation Models

Chris_Leong · Nov 2, 2023, 4:28 AM
3 points

3 votes

0 comments · 1 min read · LW link

On Having No Clue

Chris_Leong · Nov 1, 2023, 1:36 AM
20 points

14 votes

11 comments · 1 min read · LW link

Is Yann LeCun strawmanning AI x-risks?

Chris_Leong · Oct 19, 2023, 11:35 AM
26 points

18 votes

4 comments · 1 min read · LW link

Don’t Dismiss Simple Alignment Approaches

Chris_Leong · Oct 7, 2023, 12:35 AM
138 points

63 votes

9 comments · 4 min read · LW link

[Question] What evidence is there of LLM’s containing world models?

Chris_Leong · Oct 4, 2023, 2:33 PM
17 points

10 votes

17 comments · 1 min read · LW link

The Role of Groups in the Progression of Human Understanding

Chris_Leong · Sep 27, 2023, 3:09 PM
11 points

3 votes

0 comments · 2 min read · LW link

The Flow-Through Fallacy

Chris_Leong · Sep 13, 2023, 4:28 AM
21 points

12 votes

7 comments · 1 min read · LW link

Chariots of Philosophical Fire

Chris_Leong · Aug 26, 2023, 12:52 AM
12 points

4 votes

0 comments · 1 min read · LW link
(l.facebook.com)