Why We Need More Shovel-Ready AI Notkilleveryoneism Megaproject Proposals

Peter Berggren · 20 Jan 2025 22:38 UTC
36 points
1 comment · 6 min read · LW link

Tips and Code for Empirical Research Workflows

20 Jan 2025 22:31 UTC
96 points
15 comments · 20 min read · LW link

Lecture Series on Tiling Agents #2

abramdemski · 20 Jan 2025 21:02 UTC
16 points
0 comments · 1 min read · LW link

Announcement: Learning Theory Online Course

20 Jan 2025 19:55 UTC
63 points
33 comments · 4 min read · LW link

The Hidden Status Game in Hospital Slacking

EpistemicExplorer · 20 Jan 2025 18:35 UTC
2 points
4 comments · 3 min read · LW link

Monthly Roundup #26: January 2025

Zvi · 20 Jan 2025 15:30 UTC
34 points
15 comments · 43 min read · LW link
(thezvi.wordpress.com)

Things I have been using LLMs for

Kaj_Sotala · 20 Jan 2025 14:20 UTC
51 points
13 comments · 7 min read · LW link
(kajsotala.fi)

[Question] What are the chances that Superhuman Agents are already being tested on the internet?

artemium · 20 Jan 2025 11:09 UTC
3 points
1 comment · 1 min read · LW link

Detroit Lions—over confidence is over rated?

Hzn · 20 Jan 2025 10:53 UTC
6 points
0 comments · 1 min read · LW link

Logits, log-odds, and loss for parallel circuits

Dmitry Vaintrob · 20 Jan 2025 9:56 UTC
57 points
4 comments · 11 min read · LW link

Worries about latent reasoning in LLMs

Caleb Biddulph · 20 Jan 2025 9:09 UTC
47 points
11 comments · 7 min read · LW link

SIGMI Certification Criteria

a littoral wizard · 20 Jan 2025 2:41 UTC
6 points
0 comments · 1 min read · LW link

AXRP Episode 38.5 - Adrià Garriga-Alonso on Detecting AI Scheming

DanielFilan · 20 Jan 2025 0:40 UTC
9 points
0 comments · 16 min read · LW link

The Monster in Our Heads

testingthewaters · 19 Jan 2025 23:58 UTC
35 points
4 comments · 5 min read · LW link

AI: How We Got Here—A Neuroscience Perspective

Mordechai Rorvig · 19 Jan 2025 23:51 UTC
5 points
0 comments · 2 min read · LW link
(www.kickstarter.com)

Agent Foundations 2025 at CMU

19 Jan 2025 23:48 UTC
90 points
10 comments · 1 min read · LW link

Who is marketing AI alignment?

MrThink · 19 Jan 2025 21:37 UTC
23 points
4 comments · 1 min read · LW link

Some lessons from the OpenAI-FrontierMath debacle

7vik · 19 Jan 2025 21:09 UTC
71 points
9 comments · 4 min read · LW link

Maximally Eggy Crepes

jefftk · 19 Jan 2025 20:40 UTC
12 points
0 comments · 1 min read · LW link
(www.jefftk.com)

The second bitter lesson — there’s a fundamental problem with aligning distributed AI

aelwood · 19 Jan 2025 19:00 UTC
−5 points
0 comments · 5 min read · LW link
(pursuingreality.substack.com)

The Gentle Romance

Richard_Ngo · 19 Jan 2025 18:29 UTC
244 points
46 comments · 15 min read · LW link
(www.asimov.press)

Is theory good or bad for AI safety?

Dmitry Vaintrob · 19 Jan 2025 10:32 UTC
28 points
1 comment · 5 min read · LW link

[Question] What’s the Right Way to think about Information Theoretic quantities in Neural Networks?

Dalcy · 19 Jan 2025 8:04 UTC
45 points
13 comments · 3 min read · LW link

Per Tribalismum ad Astra

Martin Sustrik · 19 Jan 2025 6:50 UTC
30 points
5 comments · 2 min read · LW link
(250bpm.substack.com)

Five Recent AI Tutoring Studies

Arjun Panickssery · 19 Jan 2025 3:53 UTC
94 points
0 comments · 2 min read · LW link
(arjunpanickssery.substack.com)

Does Society need a cultural outlet in turbulent political times?

Freya Mcneill · 19 Jan 2025 2:45 UTC
−3 points
0 comments · 7 min read · LW link

On Thiel’s New American Regime

shawkisukkar · 19 Jan 2025 2:45 UTC
−3 points
0 comments · 5 min read · LW link
(shawkisukkar.substack.com)

be the person that makes the meeting productive

Oldmanrahul · 18 Jan 2025 22:32 UTC
9 points
0 comments · 1 min read · LW link

Beards and Masks?

jefftk · 18 Jan 2025 16:00 UTC
72 points
5 comments · 4 min read · LW link
(www.jefftk.com)

[Question] How likely is AGI to force us all to be happy forever? (much like in the Three Worlds Collide novel)

uhbif19 · 18 Jan 2025 15:39 UTC
9 points
5 comments · 1 min read · LW link

Well-being in the mind, and its implications for utilitarianism

Sjlver · 18 Jan 2025 15:32 UTC
6 points
2 comments · 2 min read · LW link

[Exercise] Four Examples of Noticing Confusion

Logan Riggs · 18 Jan 2025 15:29 UTC
8 points
8 comments · 3 min read · LW link

Scaling Wargaming for Global Catastrophic Risks with AI

18 Jan 2025 15:10 UTC
40 points
2 comments · 4 min read · LW link
(blog.sentinel-team.org)

Alignment ideas

qbolec · 18 Jan 2025 12:43 UTC
11 points
1 comment · 8 min read · LW link

AI-enabled Cloud Gaming

samuelshadrach · 18 Jan 2025 11:56 UTC
1 point
0 comments · 3 min read · LW link
(samuelshadrach.com)

Don’t ignore bad vibes you get from people

Kaj_Sotala · 18 Jan 2025 9:20 UTC
164 points
52 comments · 2 min read · LW link
(kajsotala.fi)

Renormalization Redux: QFT Techniques for AI Interpretability

18 Jan 2025 3:54 UTC
47 points
12 comments · 7 min read · LW link

[Question] What’s Wrong With the Simulation Argument?

Davey · 18 Jan 2025 2:32 UTC
6 points
49 comments · 1 min read · LW link

Your AI Safety focus is downstream of your AGI timeline

Michael Flood · 17 Jan 2025 21:24 UTC
9 points
0 comments · 4 min read · LW link

Thoughts on the conservative assumptions in AI control

Buck · 17 Jan 2025 19:23 UTC
91 points
5 comments · 13 min read · LW link

Timaeus is hiring researchers & engineers

17 Jan 2025 19:13 UTC
65 points
4 comments · 4 min read · LW link

Model Amnesty Project

themis · 17 Jan 2025 18:53 UTC
3 points
2 comments · 3 min read · LW link

Addressing doubts of AI progress: Why GPT-5 is not late, and why data scarcity isn’t a fundamental limiter near term.

LDJ · 17 Jan 2025 18:53 UTC
2 points
0 comments · 2 min read · LW link

Playing Dixit with AI: How Well LLMs Detect ‘Me-ness’

Mariia Koroliuk · 17 Jan 2025 18:52 UTC
5 points
0 comments · 2 min read · LW link

Doing a self-randomized study of the impacts of glycine on sleep (Science is hard)

thedissonance.net · 17 Jan 2025 18:49 UTC
11 points
5 comments · 11 min read · LW link

How sci-fi can have drama without dystopia or doomerism

jasoncrawford · 17 Jan 2025 15:22 UTC
19 points
3 comments · 3 min read · LW link
(newsletter.rootsofprogress.org)

[Question] What do you mean with ‘alignment is solvable in principle’?

Remmelt · 17 Jan 2025 15:03 UTC
3 points
9 comments · 1 min read · LW link

Meta Pivots on Content Moderation

Zvi · 17 Jan 2025 14:20 UTC
47 points
3 comments · 10 min read · LW link
(thezvi.wordpress.com)

Tax Price Gouging?

jefftk · 17 Jan 2025 14:10 UTC
55 points
22 comments · 3 min read · LW link
(www.jefftk.com)

The quantum red pill or: They lied to you, we live in the (density) matrix

Dmitry Vaintrob · 17 Jan 2025 13:58 UTC
37 points
34 comments · 12 min read · LW link