Aether is hiring technical AI safety researchers

5 Jan 2026 22:27 UTC
22 points
0 comments · 2 min read · LW link

[Question] Continual Learning Achieved?

PeterMcCluskey 5 Jan 2026 22:22 UTC
−7 points
11 comments · 1 min read · LW link

AGI will not be one specific system, it’ll be the unity of all systems

henophilia 5 Jan 2026 18:21 UTC
−4 points
0 comments · 11 min read · LW link

How to tame a complex system

jasoncrawford 5 Jan 2026 18:20 UTC
27 points
0 comments · 2 min read · LW link
(newsletter.rootsofprogress.org)

Broadening the training set for alignment

Seth Herd 5 Jan 2026 17:30 UTC
40 points
11 comments · 9 min read · LW link

Dos Capital

Zvi 5 Jan 2026 16:40 UTC
71 points
10 comments · 17 min read · LW link
(thezvi.wordpress.com)

Announcing the CLR Fundamentals Program

Tristan Cook 5 Jan 2026 15:16 UTC
12 points
0 comments · 2 min read · LW link

AI Risk timelines: 10% chance (by year X) should be the headline (and deadline), not 50%. And 10% is _this year_!

Greg C 5 Jan 2026 11:57 UTC
61 points
18 comments · 1 min read · LW link

Transformers, Intuitively

atharva 5 Jan 2026 11:34 UTC
5 points
0 comments · 4 min read · LW link

The Technology of Liberalism

L Rudolf L 5 Jan 2026 11:04 UTC
41 points
7 comments · 29 min read · LW link
(www.nosetgauge.com)

Axiological Stopsigns

JenniferRM 5 Jan 2026 7:30 UTC
34 points
6 comments · 16 min read · LW link

Artifical Expert/Expanded Narrow Intelligence, and Proto-AGI

Yuli_Ban 5 Jan 2026 3:40 UTC
15 points
0 comments · 7 min read · LW link

An Aphoristic Overview of Technical AI Alignment proposals

wassname 5 Jan 2026 3:01 UTC
11 points
3 comments · 2 min read · LW link

Claude Wrote Me a 400-Commit RSS Reader App

Brendan Long 5 Jan 2026 2:52 UTC
35 points
11 comments · 3 min read · LW link
(www.brendanlong.com)

The inaugural Redwood Research podcast

4 Jan 2026 22:11 UTC
146 points
10 comments · 142 min read · LW link

LessOnline 2026 Improvement Ideas

nomagicpill 4 Jan 2026 21:56 UTC
16 points
0 comments · 1 min read · LW link

The economy is a graph, not a pipeline

anithite 4 Jan 2026 21:48 UTC
33 points
10 comments · 4 min read · LW link

Calling all college students (and new readers)

neo 4 Jan 2026 21:20 UTC
15 points
0 comments · 1 min read · LW link

Rock bottom terminal value

ihatenumbersinusernames7 4 Jan 2026 20:43 UTC
4 points
9 comments · 2 min read · LW link

In My Misanthropy Era

jenn 4 Jan 2026 18:34 UTC
352 points
153 comments · 8 min read · LW link
(jenn.site)

The Thinking Machine

PeterMcCluskey 4 Jan 2026 18:24 UTC
36 points
0 comments · 2 min read · LW link
(bayesianinvestor.com)

The Maduro Polymarket bet is not “obviously insider trading”

ceselder 4 Jan 2026 10:53 UTC
22 points
18 comments · 3 min read · LW link

The Problem with Democracy

RandStrauss 4 Jan 2026 7:11 UTC
−3 points
3 comments · 2 min read · LW link

Examples of Subtle Alignment Failures from Claude and Gemini

Tachikoma 4 Jan 2026 4:29 UTC
−9 points
1 comment · 5 min read · LW link

Four Downsides of Training Policies Online

4 Jan 2026 3:17 UTC
29 points
4 comments · 3 min read · LW link

Humanity’s Gambit

Ben Ihrig 4 Jan 2026 3:08 UTC
5 points
5 comments · 3 min read · LW link

Semantic Topological Spaces

TristanTrim 4 Jan 2026 0:58 UTC
11 points
16 comments · 5 min read · LW link

The surprising adequacy of the Roblox game marketplace

Esteban Restrepo 3 Jan 2026 14:15 UTC
26 points
3 comments · 8 min read · LW link
(papabos.substack.com)

Re: Anthropic Chinese Cyber-Attack. How Do We Protect Open-source Models?

Mayowa Osibodu 3 Jan 2026 9:45 UTC
−1 points
2 comments · 6 min read · LW link

Give Skepticism a Try

Ape in the coat 3 Jan 2026 8:57 UTC
12 points
17 comments · 3 min read · LW link
(apeinthecoat102771.substack.com)

Why We Should Talk Specifically Amid Uncertainty

sbaumohl 3 Jan 2026 3:04 UTC
11 points
1 comment · 7 min read · LW link

Companies as “proto-ASI”

beyarkay (Boyd Kane) 3 Jan 2026 0:24 UTC
15 points
3 comments · 1 min read · LW link
(boydkane.com)

AXRP Episode 47 - David Rein on METR Time Horizons

DanielFilan 3 Jan 2026 0:10 UTC
21 points
0 comments · 46 min read · LW link

The Weirdness of Dating/Mating: Deep Nonconsent Preference

johnswentworth 2 Jan 2026 23:05 UTC
12 points
61 comments · 6 min read · LW link

Can AI learn human societal norms from social feedback (without recapitulating all the ways this has failed in human history?)

foodforthought 2 Jan 2026 22:11 UTC
7 points
3 comments · 4 min read · LW link

Fertility Roundup #5: Causation

Zvi 2 Jan 2026 22:00 UTC
19 points
5 comments · 25 min read · LW link
(thezvi.wordpress.com)

Scale-Free Goodness

testingthewaters 2 Jan 2026 21:00 UTC
10 points
3 comments · 5 min read · LW link
(aclevername.substack.com)

Does developmental cognitive psychology provide any hints for making model alignment more robust?

foodforthought 2 Jan 2026 20:31 UTC
7 points
0 comments · 3 min read · LW link

Does evolution provide any hints for making model alignment more robust?

foodforthought 2 Jan 2026 19:06 UTC
5 points
0 comments · 4 min read · LW link

Where do AI Safety Fellows go? Analyzing a dataset of 600+ alumni

Christopher_Clay 2 Jan 2026 18:14 UTC
20 points
2 comments · 5 min read · LW link
(forum.effectivealtruism.org)

In­struct Vec­tors—Base mod­els can be in­struct with ac­ti­va­tion vectors

Eriskii 2 Jan 2026 18:14 UTC
21 points
0 comments · 8 min read · LW link

[Ad­vanced In­tro to AI Align­ment] 2. What Values May an AI Learn? — 4 Key Problems

Towards_Keeperhood 2 Jan 2026 14:51 UTC
33 points
10 comments · 19 min read · LW link

2025 Letter

zef 2 Jan 2026 13:57 UTC
10 points
0 comments · 14 min read · LW link
(zephyyr.substack.com)

2025 in AI predictions

jessicata 2 Jan 2026 4:29 UTC
245 points
19 comments · 11 min read · LW link

Debunking claims about subquadratic attention

Vladimir Ivanov 2 Jan 2026 4:23 UTC
32 points
5 comments · 3 min read · LW link

The bio-pirate’s guide to GLP-1 agonists

quiet_NaN 2 Jan 2026 3:32 UTC
40 points
3 comments · 5 min read · LW link

College Was Not That Terrible Now That I’m Not That Crazy

Zack_M_Davis 1 Jan 2026 23:14 UTC
90 points
9 comments · 44 min read · LW link
(zackmdavis.net)

Taiwan war timelines might be shorter than AI timelines

Baram Sosis 1 Jan 2026 22:30 UTC
108 points
21 comments · 5 min read · LW link

Split (Part 1)

Shoshannah Tekofsky 1 Jan 2026 22:29 UTC
27 points
2 comments · 4 min read · LW link
(shoshanigans.substack.com)

[Question] Who is responsible for shutting down rogue AI?

Cole Wyeth 1 Jan 2026 21:36 UTC
45 points
2 comments · 1 min read · LW link