Sam Altman’s sister claims Sam sexually abused her—Part 4: Timeline, continued

pythagoras5015 · 13 Apr 2025 23:41 UTC
1 point
0 comments · 51 min read · LW link

The Structure of the Pain of Change

ReverendBayes · 13 Apr 2025 21:51 UTC
7 points
0 comments · 10 min read · LW link

Luna Lovegood and the Chamber of Secrets, Part 4

13 Apr 2025 20:55 UTC
3 points
0 comments · 4 min read · LW link

Thoughts on the Double Impact Project

Mati_Roy · 13 Apr 2025 19:07 UTC
27 points
14 comments · 2 min read · LW link

Intro to Multi-Agent Safety

james__p · 13 Apr 2025 17:40 UTC
12 points
0 comments · 5 min read · LW link

Vestigial reasoning in RL

Caleb Biddulph · 13 Apr 2025 15:40 UTC
54 points
8 comments · 9 min read · LW link

Four Types of Disagreement

silentbob · 13 Apr 2025 11:22 UTC
50 points
4 comments · 5 min read · LW link

How I switched careers from software engineer to AI policy operations

Lucie Philippon · 13 Apr 2025 6:37 UTC
58 points
1 comment · 5 min read · LW link

Steelmanning heuristic arguments

Dmitry Vaintrob · 13 Apr 2025 1:09 UTC
78 points
0 comments · 17 min read · LW link

MONA: Three Months Later—Updates and Steganography Without Optimization Pressure

12 Apr 2025 23:15 UTC
31 points
0 comments · 5 min read · LW link

The Era of the Dividual—are we falling apart?

James Stephen Brown · 12 Apr 2025 22:35 UTC
3 points
2 comments · 4 min read · LW link

Commitment Races are a technical problem ASI can easily solve

Knight Lee · 12 Apr 2025 22:22 UTC
7 points
6 comments · 6 min read · LW link

The King’s Gift: How Institutions Rebrand Responsibility into Illusion

Hu Yichao · 12 Apr 2025 19:38 UTC
1 point
0 comments · 1 min read · LW link

Experts have it easy

beyarkay · 12 Apr 2025 19:32 UTC
23 points
3 comments · 9 min read · LW link

find_purpose.exe

heatdeathandtaxes · 12 Apr 2025 19:31 UTC
−1 points
0 comments · 5 min read · LW link
(heatdeathandtaxes.substack.com)

The Cynic Wasps in the Beehive

mempko · 12 Apr 2025 19:30 UTC
−3 points
0 comments · 1 min read · LW link
(blog.mempko.com)

Luna Lovegood and the Chamber of Secrets, Part 3

12 Apr 2025 19:20 UTC
3 points
0 comments · 2 min read · LW link

[Question] What is autism?

Adam Zerner · 12 Apr 2025 18:12 UTC
18 points
7 comments · 1 min read · LW link

College Advice For People Like Me

henryj · 12 Apr 2025 14:36 UTC
50 points
5 comments · 17 min read · LW link
(www.henryjosephson.com)

Why does LW not put much more focus on AI governance and outreach?

12 Apr 2025 14:24 UTC
78 points
31 comments · 2 min read · LW link

[Question] Is Local Order a Clue to Universal Entropy? How a Failed Professor Searches for a ‘Sacred Motivational Order’

P. João · 12 Apr 2025 13:39 UTC
2 points
2 comments · 2 min read · LW link

What are good safety standards for open source AIs from China?

ChristianKl · 12 Apr 2025 13:06 UTC
10 points
2 comments · 1 min read · LW link

Will US tariffs push data centers for large model training offshore?

ChristianKl · 12 Apr 2025 12:47 UTC
20 points
3 comments · 1 min read · LW link

Self propagating story.

Canaletto · 12 Apr 2025 12:32 UTC
3 points
0 comments · 8 min read · LW link

Calling Bullshit—the Cheatsheet

Niklas Lehmann · 12 Apr 2025 11:43 UTC
13 points
4 comments · 2 min read · LW link

The Internal Model Principle: A Straightforward Explanation

Alfred Harwood · 12 Apr 2025 10:58 UTC
23 points
6 comments · 19 min read · LW link

ACX Spring Meetup 2025 @ Klang Valley, Malaysia

Yi-Yang · 12 Apr 2025 7:31 UTC
2 points
0 comments · 1 min read · LW link

Distributed whistleblowing

samuelshadrach · 12 Apr 2025 6:36 UTC
5 points
5 comments · 4 min read · LW link
(samuelshadrach.com)

[Question] How likely is the USA to decay, and how will it influence AI development?

StanislavKrym · 12 Apr 2025 4:42 UTC
10 points
0 comments · 1 min read · LW link

[Question] Does this game have a name?

Mis-Understandings · 12 Apr 2025 1:52 UTC
4 points
4 comments · 1 min read · LW link

Bias Mitigation in Language Models by Steering Features

akankshanc · 12 Apr 2025 0:10 UTC
1 point
0 comments · 9 min read · LW link
(akankshanc.io)

Do we want too much from a potentially godlike AGI?

StanislavKrym · 11 Apr 2025 23:33 UTC
−1 points
0 comments · 2 min read · LW link

How training-gamers might function (and win)

Vivek Hebbar · 11 Apr 2025 21:26 UTC
110 points
5 comments · 13 min read · LW link

The limits of black-box evaluations: two hypotheticals

TFD · 11 Apr 2025 20:45 UTC
1 point
0 comments · 4 min read · LW link
(www.thefloatingdroid.com)

Comments on “AI 2027”

Randaly · 11 Apr 2025 20:32 UTC
19 points
14 comments · 7 min read · LW link

Debunk the myth - Testing the generalized reasoning ability of LLMs

Defender7762 · 11 Apr 2025 20:17 UTC
1 point
5 comments · 4 min read · LW link

Theories of Impact for Causality in AI Safety

alexisbellot · 11 Apr 2025 20:16 UTC
11 points
1 comment · 6 min read · LW link

Why Bigger Models Generalize Better

PapersToAGI · 11 Apr 2025 19:54 UTC
1 point
0 comments · 2 min read · LW link

Can LLMs learn Steganographic Reasoning via RL?

11 Apr 2025 16:33 UTC
29 points
3 comments · 6 min read · LW link

My day in 2035

Tenoke · 11 Apr 2025 16:31 UTC
19 points
2 comments · 7 min read · LW link
(svilentodorov.xyz)

Youth Lockout

Xavi CF · 11 Apr 2025 15:05 UTC
47 points
6 comments · 5 min read · LW link

[Question] Is the ethics of interaction with primitive peoples already solved?

StanislavKrym · 11 Apr 2025 14:56 UTC
−4 points
0 comments · 1 min read · LW link

OpenAI Responses API changes models’ behavior

11 Apr 2025 13:27 UTC
53 points
6 comments · 2 min read · LW link

Weird Random Newcomb Problem

Tapatakt · 11 Apr 2025 13:09 UTC
21 points
16 comments · 4 min read · LW link

On Google’s Safety Plan

Zvi · 11 Apr 2025 12:51 UTC
57 points
6 comments · 33 min read · LW link
(thezvi.wordpress.com)

Luna Lovegood and the Chamber of Secrets, Part 2

11 Apr 2025 12:42 UTC
2 points
1 comment · 3 min read · LW link

Paper

dynomight · 11 Apr 2025 12:20 UTC
43 points
12 comments · 3 min read · LW link

Why are neuro-symbolic systems not considered when it comes to AI Safety?

Edy Nastase · 11 Apr 2025 9:41 UTC
3 points
6 comments · 1 min read · LW link

Crash scenario 1: Rapidly mobilise for a 2025 AI crash

Remmelt · 11 Apr 2025 6:54 UTC
12 points
4 comments · 1 min read · LW link

Currency Collapse

prue · 11 Apr 2025 3:48 UTC
27 points
3 comments · 9 min read · LW link
(www.prue0.com)