Attention-Feature Tables in Gemma 2 Residual Streams

J Bostock · Aug 6, 2024, 10:56 PM
2 points
0 comments · 14 min read

[Question] What are the strategic implications if aliens and Earth civilizations produce similar utilities?

Maxime Riché · Aug 6, 2024, 9:16 PM
4 points
1 comment · 1 min read

WTH is Cerebrolysin, actually?

Aug 6, 2024, 8:40 PM
175 points
23 comments · 17 min read

The Pragmatic Side of Cryptographically Boxing AI

Bart Jaworski · Aug 6, 2024, 5:46 PM
6 points
0 comments · 9 min read

Inference-Only Debate Experiments Using Math Problems

Aug 6, 2024, 5:44 PM
31 points
0 comments · 2 min read

[Question] Is an AI religion justified?

p4rziv4l · Aug 6, 2024, 3:42 PM
−35 points
11 comments · 1 min read

Startup Roundup #2

Zvi · Aug 6, 2024, 1:30 PM
45 points
0 comments · 32 min read
(thezvi.wordpress.com)

Mechanistic Anomaly Detection Research Update

Aug 6, 2024, 10:33 AM
11 points
0 comments · 1 min read
(blog.eleuther.ai)

Reasoning is not search—a chess example

p.b. · Aug 6, 2024, 9:29 AM
4 points
3 comments · 2 min read

Broadly human level, cognitively complete AGI

p.b. · Aug 6, 2024, 9:26 AM
9 points
0 comments · 1 min read

Does Evolutionary Theory Imply Genetic Tribalism?

Zero Contradictions · Aug 6, 2024, 5:43 AM
0 points
1 comment · 1 min read
(thewaywardaxolotl.blogspot.com)

How I Learned To Stop Trusting Prediction Markets and Love the Arbitrage

orthonormal · Aug 6, 2024, 2:32 AM
200 points
30 comments · 3 min read

John Schulman leaves OpenAI for Anthropic [and then left Anthropic again for Thinking Machines]

Sodium · Aug 6, 2024, 1:23 AM
57 points
0 comments · 1 min read

Self-explaining SAE features

Aug 5, 2024, 10:20 PM
60 points
13 comments · 10 min read

Value fragility and AI takeover

Joe Carlsmith · Aug 5, 2024, 9:28 PM
76 points
5 comments · 30 min read

Excursions into Sparse Autoencoders: What is monosemanticity?

Jakub Smékal · Aug 5, 2024, 7:22 PM
2 points
0 comments · 10 min read

Madrid—ACX Meetups Everywhere Fall 2024

Pablo Villalobos · Aug 5, 2024, 6:36 PM
4 points
0 comments · 1 min read

LLMs stifle creativity, eliminate opportunities for serendipitous discovery and disrupt intergenerational transfer of wisdom

Ghdz · Aug 5, 2024, 6:27 PM
6 points
2 comments · 7 min read

Circular Reasoning

abramdemski · Aug 5, 2024, 6:10 PM
91 points
37 comments · 8 min read

Fear of centralized power vs. fear of misaligned AGI: Vitalik Buterin on 80,000 Hours

Seth Herd · Aug 5, 2024, 3:38 PM
66 points
22 comments · 5 min read

Four Phases of AGI

Gabe M · Aug 5, 2024, 1:15 PM
13 points
3 comments · 13 min read

AI Safety at the Frontier: Paper Highlights, July ’24

gasteigerjo · Aug 5, 2024, 1:00 PM
8 points
0 comments · 7 min read
(aisafetyfrontier.substack.com)

Game Theory and Society

Zero Contradictions · Aug 5, 2024, 4:27 AM
4 points
0 comments · 1 min read
(thewaywardaxolotl.blogspot.com)

Near-mode thinking on AI

Olli Järviniemi · Aug 4, 2024, 8:47 PM
128 points
9 comments · 5 min read

Watermarks: Signing, Branding, and Boobytrapping

Shankar Sivarajan · Aug 4, 2024, 8:41 PM
4 points
0 comments · 1 min read

Modelling Social Exchange: A Systematised Method to Judge Friendship Quality

Wynn Walker · Aug 4, 2024, 6:49 PM
6 points
0 comments · 5 min read

We’re not as 3-Dimensional as We Think

silentbob · Aug 4, 2024, 2:39 PM
46 points
17 comments · 5 min read

You don’t know how bad most things are nor precisely how they’re bad.

Solenoid_Entity · Aug 4, 2024, 2:12 PM
329 points
49 comments · 5 min read

Can We Predict Persuasiveness Better Than Anthropic?

Lennart Finke · Aug 4, 2024, 2:05 PM
22 points
5 comments · 4 min read

[Question] What should we do about COVID in 2024?

ChristianKl · Aug 4, 2024, 10:57 AM
20 points
2 comments · 1 min read

Tokenized SAEs: Infusing per-token biases.

Aug 4, 2024, 9:17 AM
20 points
20 comments · 15 min read

Thoughts On Democracy

Zero Contradictions · Aug 4, 2024, 6:02 AM
2 points
0 comments · 1 min read
(zerocontradictions.net)

AI Alignment through Comparative Advantage

artemiocobb · Aug 4, 2024, 12:32 AM
−2 points
4 comments · 3 min read

Labelling, Variables, and In-Context Learning in Llama2

Joshua Penman · Aug 3, 2024, 7:36 PM
6 points
0 comments · 1 min read
(colab.research.google.com)

[Question] Dan Hendrycks and EA

jeffreycaruso · Aug 3, 2024, 1:33 PM
−4 points
4 comments · 1 min read

[Question] Why do Minimal Bayes Nets often correspond to Causal Models of Reality?

Dalcy · Aug 3, 2024, 12:39 PM
27 points
1 comment · 1 min read

Why did ChatGPT say that? Prompt engineering and more, with PIZZA.

Jessica Rumbelow · Aug 3, 2024, 12:07 PM
41 points
2 comments · 4 min read

Cooperation and Alignment in Delegation Games: You Need Both!

Aug 3, 2024, 10:16 AM
8 points
0 comments · 14 min read
(www.oliversourbut.net)

SRE’s review of Democracy

Martin Sustrik · Aug 3, 2024, 7:20 AM
48 points
2 comments · 3 min read
(250bpm.substack.com)

The Case Against Libertarianism

Zero Contradictions · Aug 3, 2024, 5:05 AM
−4 points
1 comment · 1 min read
(zerocontradictions.net)

We Don’t Just Let People Die—So What Next?

James Stephen Brown · Aug 3, 2024, 1:04 AM
11 points
8 comments · 10 min read

The EA case for Trump

Judd Rosenblatt · Aug 3, 2024, 1:00 AM
14 points
1 comment · 1 min read
(www.secondbest.ca)

I didn’t think I’d take the time to build this calibration training game, but with websim it took roughly 30 seconds, so here it is!

mako yass · Aug 2, 2024, 10:35 PM
24 points
2 comments · 5 min read

Evaluating Sparse Autoencoders with Board Game Models

Aug 2, 2024, 7:50 PM
38 points
1 comment · 9 min read

The Bitter Lesson for AI Safety Research

Aug 2, 2024, 6:39 PM
57 points
5 comments · 3 min read

Ethical Deception: Should AI Ever Lie?

Jason Reid · Aug 2, 2024, 5:53 PM
5 points
2 comments · 7 min read

[Question] Request for AI risk quotes, especially around speed, large impacts and black boxes

Nathan Young · Aug 2, 2024, 5:49 PM
6 points
0 comments · 1 min read

A Simple Toy Coherence Theorem

Aug 2, 2024, 5:47 PM
74 points
22 comments · 7 min read

All the Following are Distinct

Gianluca Calcagni · Aug 2, 2024, 4:35 PM
16 points
3 comments · 9 min read

The ‘strong’ feature hypothesis could be wrong

lewis smith · Aug 2, 2024, 2:33 PM
231 points
19 comments · 17 min read