Absolute Zero: Alpha Zero for LLM

alapmi · 11 May 2025 20:42 UTC
23 points
16 comments · 1 min read · LW link

AGI will result from an ecosystem not a single firm

hamish_low · 11 May 2025 20:06 UTC
6 points
1 comment · 6 min read · LW link
(cambrianr.substack.com)

Thou shalt not command an aligned AI

Martin Vlach · 11 May 2025 20:02 UTC
0 points
4 comments · 1 min read · LW link

[Question] How do I design long prompts for thinking zero-shot systems with distinct, equally distributed prompt sections (mission, goals, memories, how-to-respond,… etc) and how to maintain LLM coherence?

ollie_ · 11 May 2025 19:32 UTC
2 points
5 comments · 1 min read · LW link

a confusion about preference orderings

nostalgebraist · 11 May 2025 19:30 UTC
93 points
39 comments · 11 min read · LW link

[Book Translation] Three Days in Dwarfland

Viliam · 11 May 2025 17:54 UTC
27 points
6 comments · 1 min read · LW link

Better Air Purifiers

jefftk · 11 May 2025 16:50 UTC
81 points
21 comments · 3 min read · LW link
(www.jefftk.com)

Aligning Agents, Tools, and Simulators

11 May 2025 7:59 UTC
24 points
2 comments · 6 min read · LW link

Consider not donating under $100 to political candidates

DanielFilan · 11 May 2025 3:20 UTC
141 points
33 comments · 1 min read · LW link
(danielfilan.com)

Somerville Porchfest 2025

jefftk · 11 May 2025 2:00 UTC
15 points
1 comment · 2 min read · LW link
(www.jefftk.com)

It’s Okay to Feel Bad for a Bit

moridinamael · 10 May 2025 23:24 UTC
149 points
34 comments · 3 min read · LW link

G.D. as Capitalist Evolution, and the claim for humanity’s (temporary) upper hand

Martin Vlach · 10 May 2025 21:18 UTC
8 points
3 comments · 1 min read · LW link

Book Review: “Encounters with Einstein” by Heisenberg

Baram Sosis · 10 May 2025 20:55 UTC
31 points
6 comments · 7 min read · LW link

Where is the YIMBY movement for healthcare?

jasoncrawford · 10 May 2025 20:36 UTC
20 points
10 comments · 2 min read · LW link
(newsletter.rootsofprogress.org)

Become a Superintelligence Yourself

Yaroslav Granowski · 10 May 2025 20:20 UTC
2 points
1 comment · 5 min read · LW link

A Look Inside a Frequentist

Eggs · 10 May 2025 15:18 UTC
5 points
10 comments · 3 min read · LW link

Open-source weaponry

samuelshadrach · 10 May 2025 13:11 UTC
3 points
0 comments · 3 min read · LW link
(samuelshadrach.com)

Glass box learners want to be black box

Cole Wyeth · 10 May 2025 11:05 UTC
49 points
10 comments · 4 min read · LW link

Takes and loose predictions on AI progress and some key problems

zef · 10 May 2025 10:11 UTC
5 points
0 comments · 5 min read · LW link
(halcyoncyborg.substack.com)

Corbent – A Master Plan for Next-Generation Direct Air Capture

Rudaiba · 10 May 2025 4:09 UTC
11 points
15 comments · 19 min read · LW link

What if we just…didn’t build AGI? An Argument Against Inevitability

Nate Sharpe · 10 May 2025 3:37 UTC
9 points
7 comments · 14 min read · LW link
(natezsharpe.substack.com)

Mind the Coherence Gap: Lessons from Steering Llama with Goodfire

eitan sprejer · 9 May 2025 21:29 UTC
4 points
1 comment · 6 min read · LW link

My Experience With EMDR

Sable · 9 May 2025 21:25 UTC
22 points
0 comments · 11 min read · LW link
(affablyevil.substack.com)

AI’s Hidden Game: Understanding Strategic Deception in AI and Why It Matters for Our Future

EmilyinAI · 9 May 2025 20:01 UTC
4 points
0 comments · 6 min read · LW link

Muddling Through Some Thoughts on the Nature of Historiography

E.G. Blee-Goldman · 9 May 2025 19:04 UTC
2 points
0 comments · 4 min read · LW link

A Guide to AI 2027

koenrane · 9 May 2025 17:14 UTC
0 points
1 comment · 28 min read · LW link

Let’s stop making “Intelligence scale” graphs with humans and AI

Expertium · 9 May 2025 16:01 UTC
3 points
15 comments · 1 min read · LW link

Slow corporations as an intuition pump for AI R&D automation

9 May 2025 14:49 UTC
91 points
25 comments · 9 min read · LW link

Cheaters Gonna Cheat Cheat Cheat Cheat Cheat

Zvi · 9 May 2025 14:30 UTC
55 points
4 comments · 22 min read · LW link
(thezvi.wordpress.com)

Humans vs LLM, memes as theorems

Yaroslav Granowski · 9 May 2025 13:26 UTC
1 point
0 comments · 1 min read · LW link

Moving towards a question-based planning framework, instead of task lists

casualphysicsenjoyer · 9 May 2025 12:18 UTC
4 points
1 comment · 8 min read · LW link
(substack.com)

Jim Babcock’s Mainline Doom Scenario: Human-Level AI Can’t Control Its Successor

9 May 2025 5:20 UTC
30 points
4 comments · 62 min read · LW link
(www.youtube.com)

Attend the 2025 Reproductive Frontiers Summit, June 10-12

9 May 2025 5:17 UTC
59 points
0 comments · 3 min read · LW link

Interest In Conflict Is Instrumentally Convergent

Screwtape · 9 May 2025 2:16 UTC
66 points
58 comments · 10 min read · LW link

Is ChatGPT actually fixed now?

sjadler · 8 May 2025 23:34 UTC
17 points
0 comments · 1 min read · LW link
(stevenadler.substack.com)

Post EAG London AI x-Safety Co-working Retreat

plex · 8 May 2025 23:00 UTC
10 points
0 comments · 1 min read · LW link

a brief critique of reduction

Vadim Golub · 8 May 2025 22:43 UTC
−17 points
4 comments · 2 min read · LW link

Video & transcript: Challenges for Safe & Beneficial Brain-Like AGI

Steven Byrnes · 8 May 2025 21:11 UTC
27 points
0 comments · 18 min read · LW link

Appendix: Interpretable by Design—Constraint Sets with Disjoint Limit Points

Ronak_Mehta · 8 May 2025 21:09 UTC
2 points
0 comments · 2 min read · LW link

Interpretable by Design—Constraint Sets with Disjoint Limit Points

Ronak_Mehta · 8 May 2025 21:08 UTC
24 points
2 comments · 9 min read · LW link
(ronakrm.github.io)

Is there a Half-Life for the Success Rates of AI Agents?

Matrice Jacobine · 8 May 2025 20:10 UTC
8 points
0 comments · 1 min read · LW link
(www.tobyord.com)

Misalignment and Strategic Underperformance: An Analysis of Sandbagging and Exploration Hacking

8 May 2025 19:06 UTC
80 points
3 comments · 15 min read · LW link

Behold the Pale Child (escaping Moloch’s Mad Maze)

rogersbacon · 8 May 2025 16:36 UTC
8 points
16 comments · 11 min read · LW link
(www.secretorum.life)

An alignment safety case sketch based on debate

8 May 2025 15:02 UTC
59 points
21 comments · 25 min read · LW link
(arxiv.org)

Mechanistic Interpretability Via Learning Differential Equations: AI Safety Camp Project Intermediate Report

8 May 2025 14:45 UTC
8 points
0 comments · 7 min read · LW link

AI #115: The Evil Applications Division

Zvi · 8 May 2025 13:40 UTC
32 points
3 comments · 62 min read · LW link
(thezvi.wordpress.com)

The Steganographic Potentials of Language Models

8 May 2025 11:23 UTC
9 points
0 comments · 1 min read · LW link

Our bet on whether the AI market will crash

8 May 2025 9:56 UTC
25 points
2 comments · 1 min read · LW link

Concept-anchored representation engineering for alignment

Sandy Fraser · 8 May 2025 8:59 UTC
5 points
0 comments · 3 min read · LW link

Orthogonality Thesis in layman’s terms

Michael (@lethal_ai) · 8 May 2025 8:31 UTC
1 point
0 comments · 2 min read · LW link