Ransomware Payments Should Require a Sin Tax

Brian Bien · Jul 22, 2024, 9:16 PM
20 points

9 votes

Overall karma indicates overall quality.

10 comments · 2 min read · LW link

The Elusive Root Cause of Schizophrenia—Thesis Introduction Only

kareempforbes · Jul 22, 2024, 8:24 PM
−9 points

5 votes

0 comments · 2 min read · LW link

Is Chinese AGI a valid concern for the USA?

sammyboiz · Jul 22, 2024, 8:21 PM
0 points

7 votes

2 comments · 9 min read · LW link

Trying to understand Hanson’s Cultural Drift argument

Kemp · Jul 22, 2024, 8:20 PM
18 points

8 votes

3 comments · 2 min read · LW link

Efficient Dictionary Learning with Switch Sparse Autoencoders

Anish Mudide · Jul 22, 2024, 6:45 PM
118 points

67 votes

20 comments · 12 min read · LW link

Analyzing DeepMind’s Probabilistic Methods for Evaluating Agent Capabilities

Jul 22, 2024, 4:17 PM
69 points

32 votes

0 comments · 16 min read · LW link

The Garden of Eden

Alexander Turok · Jul 22, 2024, 4:07 PM
23 points

11 votes

2 comments · 9 min read · LW link

Caring about excellence

owencb · Jul 22, 2024, 2:24 PM
47 points

19 votes

4 comments · 6 min read · LW link

Tim Dillon’s fake business altered my perspective more significantly than any other video I have watched in the last 24 months

Stuart Johnson · Jul 22, 2024, 12:54 PM
6 points

7 votes

0 comments · 1 min read · LW link
(youtu.be)

On the CrowdStrike Incident

Zvi · Jul 22, 2024, 12:40 PM
75 points

33 votes

14 comments · 17 min read · LW link
(thezvi.wordpress.com)

Auto-Enhance: Developing a meta-benchmark to measure LLM agents’ ability to improve other agents

Jul 22, 2024, 12:33 PM
20 points

19 votes

0 comments · 14 min read · LW link

What does “the universe is quantum” actually mean?

Tahp · Jul 22, 2024, 11:52 AM
2 points

1 vote

0 comments · 14 min read · LW link

Initial Experiments Using SAEs to Help Detect AI Generated Text

Aaron_Scher · Jul 22, 2024, 5:16 AM
18 points

10 votes

1 comment · 14 min read · LW link

Categories of leadership on technical teams

benkuhn · Jul 22, 2024, 4:50 AM
38 points

16 votes

0 comments · 8 min read · LW link
(www.benkuhn.net)

An experiment on hidden cognition

Olli Järviniemi · Jul 22, 2024, 3:26 AM
25 points

9 votes

2 comments · 7 min read · LW link

OpenAI Boycott Revisit

Jake Dennie · Jul 22, 2024, 1:44 AM
17 points

15 votes

2 comments · 2 min read · LW link

Coalitional agency

Richard_Ngo · Jul 22, 2024, 12:09 AM
61 points

17 votes

6 comments · 6 min read · LW link

The AI Driver’s Licence—A Policy Proposal

Jul 21, 2024, 8:38 PM
0 points

10 votes

1 comment · 19 min read · LW link

Demography and Destiny

Zero Contradictions · Jul 21, 2024, 8:34 PM
6 points

7 votes

11 comments · 1 min read · LW link
(thewaywardaxolotl.blogspot.com)

The $100B plan with “70% risk of killing us all” w Stephen Fry [video]

Oleg Trott · Jul 21, 2024, 8:06 PM
35 points

16 votes

8 comments · 1 min read · LW link
(www.youtube.com)

Raising Welfare for Lab Rodents

xanderbalwit · Jul 21, 2024, 7:18 PM
−2 points

6 votes

0 comments · 1 min read · LW link
(press.asimov.com)

A simple model of math skill

Alex_Altair · Jul 21, 2024, 6:57 PM
101 points

55 votes

16 comments · 8 min read · LW link

Using an LLM perplexity filter to detect weight exfiltration

Adam Karvonen · Jul 21, 2024, 6:18 PM
25 points

12 votes

11 comments · 2 min read · LW link

[Question] Would a scope-insensitive AGI be less likely to incapacitate humanity?

Jim Buhler · Jul 21, 2024, 2:15 PM
2 points

1 vote

3 comments · 1 min read · LW link

Holomorphic surjection theorem (Picard’s little theorem)

dkl9 · Jul 21, 2024, 1:24 PM
15 points

6 votes

0 comments · 2 min read · LW link
(dkl9.net)

aimless ace analyzes active amateur: a micro-aaaaalignment proposal

lemonhope · Jul 21, 2024, 12:37 PM
12 points

7 votes

0 comments · 1 min read · LW link

Pivotal Acts are easier than Alignment?

Michael Soareverix · Jul 21, 2024, 12:15 PM
2 points

5 votes

4 comments · 1 min read · LW link

Ball Sq Pathways

jefftk · Jul 21, 2024, 2:20 AM
13 points

4 votes

1 comment · 1 min read · LW link
(www.jefftk.com)

Freedom and Privacy of Thought Architectures

SebastianG · Jul 20, 2024, 9:43 PM
5 points

1 vote

2 comments · 1 min read · LW link

Why Georgism Lost Its Popularity

Zero Contradictions · Jul 20, 2024, 3:08 PM
47 points

33 votes

55 comments · 1 min read · LW link
(zerocontradictions.net)

Only Fools Avoid Hindsight Bias

Kevin Dorst · Jul 20, 2024, 1:42 PM
−11 points

7 votes

5 comments · 6 min read · LW link
(kevindorst.substack.com)

A more systematic case for inner misalignment

Richard_Ngo · Jul 20, 2024, 5:03 AM
31 points

11 votes

4 comments · 5 min read · LW link

BatchTopK: A Simple Improvement for TopK-SAEs

Jul 20, 2024, 2:20 AM
61 points

21 votes

0 comments · 4 min read · LW link

Krona Compare

jefftk · Jul 20, 2024, 1:10 AM
10 points

2 votes

0 comments · 2 min read · LW link
(www.jefftk.com)

(Approximately) Deterministic Natural Latents

Jul 19, 2024, 11:02 PM
45 points

16 votes

1 comment · 4 min read · LW link

Feature Targeted LLC Estimation Distinguishes SAE Features from Random Directions

Jul 19, 2024, 8:32 PM
59 points

21 votes

6 comments · 16 min read · LW link

JumpReLU SAEs + Early Access to Gemma 2 SAEs

Jul 19, 2024, 4:10 PM
55 points

21 votes

10 comments · 1 min read · LW link
(storage.googleapis.com)

Truth is Universal: Robust Detection of Lies in LLMs

Lennart Buerger · Jul 19, 2024, 2:07 PM
24 points

13 votes

3 comments · 2 min read · LW link
(arxiv.org)

Sustainability of Digital Life Form Societies

Hiroshi Yamakawa · Jul 19, 2024, 1:59 PM
19 points

5 votes

1 comment · 20 min read · LW link

Romae Industriae

Maxwell Tabarrok · Jul 19, 2024, 1:03 PM
34 points

19 votes

2 comments · 7 min read · LW link
(www.maximum-progress.com)

[Question] Have people given up on iterated distillation and amplification?

Chris_Leong · Jul 19, 2024, 12:23 PM
20 points

9 votes

1 comment · 1 min read · LW link

How do we know that “good research” is good? (aka “direct evaluation” vs “eigen-evaluation”)

Ruby · Jul 19, 2024, 12:31 AM
49 points

17 votes

21 comments · 6 min read · LW link

Linkpost: Surely you can be serious

kave · Jul 18, 2024, 10:18 PM
62 points

34 votes

8 comments · 1 min read · LW link
(www.experimental-history.com)

My experience applying to MATS 6.0

mic · Jul 18, 2024, 7:02 PM
18 points

9 votes

3 comments · 5 min read · LW link

[Question] What are the actual arguments in favor of computationalism as a theory of identity?

sunwillrise · Jul 18, 2024, 6:44 PM
15 points

12 votes

27 comments · 5 min read · LW link

Yet Another Critique of “Luxury Beliefs”

ymeskhout · Jul 18, 2024, 6:37 PM
6 points

17 votes

10 comments · 9 min read · LW link
(www.ymeskhout.com)

[Interim research report] Evaluating the Goal-Directedness of Language Models

Jul 18, 2024, 6:19 PM
40 points

12 votes

4 comments · 11 min read · LW link

Interpretability in Action: Exploratory Analysis of VPT, a Minecraft Agent

Jul 18, 2024, 5:02 PM
9 points

8 votes

0 comments · 1 min read · LW link
(arxiv.org)

Activation Engineering Theories of Impact

kubanetics · Jul 18, 2024, 4:44 PM
6 points

5 votes

1 comment · 2 min read · LW link

[Question] Me & My Clone

SimonBaars · Jul 18, 2024, 4:25 PM
27 points

18 votes

22 comments · 1 min read · LW link