Trying the Obvious Thing

16 Jul 2025 22:24 UTC
35 points
2 comments
3 min read
LW link
(cognition.cafe)

Emergence vs Entropy—a universal paradox

James Stephen Brown
16 Jul 2025 21:31 UTC
4 points
0 comments
4 min read
LW link

Selective Generalization: Improving Capabilities While Maintaining Alignment

16 Jul 2025 21:25 UTC
66 points
4 comments
7 min read
LW link

Bodydouble / Thinking Assistant matchmaking

Raemon
16 Jul 2025 19:54 UTC
51 points
10 comments
2 min read
LW link

Zero sum expectations as an explanation of omnicide-indifference

asasz
16 Jul 2025 19:25 UTC
2 points
6 comments
2 min read
LW link

On the geometrical Nature of Insight

Giuseppe Birardi
16 Jul 2025 19:12 UTC
3 points
0 comments
41 min read
LW link

Vancouver Rationalists/Transhumanists/Futurists Beach Meetup

apocalypticc
16 Jul 2025 19:09 UTC
2 points
0 comments
1 min read
LW link

What is the probability that future AI development will be seriously delayed or ended due to energy decline ?

AdamLacerdo
16 Jul 2025 19:08 UTC
−1 points
12 comments
1 min read
LW link

Rebooting the Singularity

cdkg
16 Jul 2025 18:26 UTC
8 points
0 comments
1 min read
LW link
(philpapers.org)

Being and Existence

Gordon Seidoh Worley
16 Jul 2025 18:10 UTC
7 points
0 comments
3 min read
LW link
(uncertainupdates.substack.com)

Kimi K2

Zvi
16 Jul 2025 16:20 UTC
52 points
5 comments
12 min read
LW link
(thezvi.wordpress.com)

[Question] How should Canada Negotiate with Trump on Tariffs?

Davey
16 Jul 2025 15:56 UTC
1 point
2 comments
1 min read
LW link

[Question] Why haven’t we auto-translated all AI alignment content?

Algon
16 Jul 2025 15:33 UTC
22 points
10 comments
1 min read
LW link

A Hallucination Filter Idea That Might Not Scale—Yet

8harath
16 Jul 2025 14:40 UTC
−5 points
0 comments
2 min read
LW link

Artificial Life Research Agenda

dmac_93
16 Jul 2025 13:23 UTC
−11 points
0 comments
1 min read
LW link

On being sort of back and sort of new here

Loki zen
16 Jul 2025 12:55 UTC
32 points
13 comments
3 min read
LW link

Conway’s Game of Life—complexity emerges from simplicity

James Stephen Brown
16 Jul 2025 4:42 UTC
3 points
0 comments
2 min read
LW link
(nonzerosum.games)

Emergent Price-Fixing by LLM Auction Agents

Lech Mazur
16 Jul 2025 2:45 UTC
13 points
0 comments
9 min read
LW link

Mapping Mental Moves

Jordan Rubin
16 Jul 2025 2:28 UTC
3 points
0 comments
2 min read
LW link
(jordanmrubin.substack.com)

Defining Monitorable and Useful Goals

Rubi J. Hudson
15 Jul 2025 23:06 UTC
11 points
0 comments
16 min read
LW link

[Question] Do you have any recommendations for readings on global risk forecasting and analysis applied to public policy design on a slightly smaller scale, or for more specific objectives?

Ana Lopez
15 Jul 2025 22:00 UTC
1 point
0 comments
1 min read
LW link

1 week fast on livestream for AI xrisk

samuelshadrach
15 Jul 2025 21:36 UTC
1 point
2 comments
1 min read
LW link

AISN #59: EU Publishes General-Purpose AI Code of Practice

15 Jul 2025 18:59 UTC
10 points
0 comments
4 min read
LW link
(aisafety.substack.com)

Principles for Picking Practical Interpretability Projects

Sam Marks
15 Jul 2025 17:38 UTC
27 points
0 comments
13 min read
LW link

Chain of Thought Monitorability: A New and Fragile Opportunity for AI Safety

15 Jul 2025 16:23 UTC
166 points
32 comments
1 min read
LW link
(bit.ly)

The Virtue of Fear and the Myth of “Fearlessness”

David_Veksler
15 Jul 2025 16:10 UTC
7 points
3 comments
1 min read
LW link

Grok 4 Various Things

Zvi
15 Jul 2025 15:50 UTC
50 points
4 comments
32 min read
LW link
(thezvi.wordpress.com)

Value systems of the frontier AIs, reduced to slogans

Mitchell_Porter
15 Jul 2025 15:10 UTC
4 points
0 comments
1 min read
LW link

What is David Chapman talking about when he talks about “meaning” in his book “Meaningness”?

SpectrumDT
15 Jul 2025 14:29 UTC
22 points
15 comments
2 min read
LW link

Why Eliminating Deception Won’t Align AI

Priyanka Bharadwaj
15 Jul 2025 9:21 UTC
19 points
6 comments
4 min read
LW link

Generalizing zombie arguments

jessicata
15 Jul 2025 5:09 UTC
23 points
9 comments
7 min read
LW link
(unstableontology.com)

Do confident short timelines make sense?

15 Jul 2025 3:37 UTC
138 points
76 comments
69 min read
LW link

Critic Contributions Are Logically Irrelevant

Zack_M_Davis
15 Jul 2025 1:03 UTC
27 points
74 comments
6 min read
LW link

AISafety.com Hackathon 2025

Bryce Robertson
15 Jul 2025 0:04 UTC
12 points
0 comments
1 min read
LW link

Don’t Say “I Want to Work In AI Policy”

henryj
14 Jul 2025 23:19 UTC
5 points
0 comments
2 min read
LW link
(www.henryjosephson.com)

Recent Redwood Research project proposals

14 Jul 2025 22:27 UTC
91 points
0 comments
3 min read
LW link

The Role of Respect: Why we inevitably appeal to authority

jimmy
14 Jul 2025 21:28 UTC
18 points
2 comments
12 min read
LW link

Making Sense of Consciousness Part 3: The Pulvinar Nucleus

sarahconstantin
14 Jul 2025 21:20 UTC
14 points
0 comments
10 min read
LW link
(sarahconstantin.substack.com)

LLM-induced craziness and base rates

Kaj_Sotala
14 Jul 2025 21:16 UTC
70 points
2 comments
2 min read
LW link
(andymasley.substack.com)

Narrow Misalignment is Hard, Emergent Misalignment is Easy

14 Jul 2025 21:05 UTC
130 points
23 comments
5 min read
LW link

What do you Want out of Literature Reviews?

Elizabeth
14 Jul 2025 20:20 UTC
25 points
4 comments
4 min read
LW link
(acesounderglass.com)

The Three Ideological Stances

14 Jul 2025 20:14 UTC
2 points
0 comments
3 min read
LW link
(cognition.cafe)

Visualizing AI Alignment – CFP for AGI-2025 Workshop (Aug 10, Live + Virtual)

CC4CI
14 Jul 2025 20:12 UTC
9 points
0 comments
4 min read
LW link

[Question] Is the political right becoming actively, explicitly antisemitic?

lc
14 Jul 2025 18:57 UTC
28 points
16 comments
1 min read
LW link

Weird Features in Protein LLMs: The Gram Lens

Jude Stiel
14 Jul 2025 17:32 UTC
8 points
0 comments
9 min read
LW link

METR: How Does Time Horizon Vary Across Domains?

14 Jul 2025 16:13 UTC
84 points
8 comments
14 min read
LW link
(metr.org)

Worse Than MechaHitler

Zvi
14 Jul 2025 16:00 UTC
53 points
1 comment
22 min read
LW link
(thezvi.wordpress.com)

How To Cause Less Suffering While Eating Animals

Bentham's Bulldog
14 Jul 2025 15:59 UTC
11 points
3 comments
4 min read
LW link

Self-preservation or Instruction Ambiguity? Examining the Causes of Shutdown Resistance

14 Jul 2025 14:52 UTC
67 points
18 comments
11 min read
LW link

Bernie Sanders (I-VT) mentions AI loss of control risk in Gizmodo interview

Matrice Jacobine
14 Jul 2025 14:47 UTC
42 points
2 comments
1 min read
LW link
(gizmodo.com)