MIRI Newsletter #123

3 Jul 2025 22:56 UTC
54 points
0 comments · 2 min read · LW link
(intelligence.org)

Making Sense of Consciousness Part 2: Attention

sarahconstantin · 3 Jul 2025 21:20 UTC
16 points
1 comment · 6 min read · LW link
(sarahconstantin.substack.com)

Battle of the Sexes—how to solve any (solvable) dispute

James Stephen Brown · 3 Jul 2025 19:21 UTC
7 points
0 comments · 3 min read · LW link
(nonzerosum.games)

How worker co-ops can help restore social trust

B Jacobs · 3 Jul 2025 19:13 UTC
12 points
7 comments · 6 min read · LW link
(bobjacobs.substack.com)

The Ultimatum Game—take it or leave it

James Stephen Brown · 3 Jul 2025 19:05 UTC
11 points
1 comment · 2 min read · LW link
(nonzerosum.games)

A comment on Bayesian vs. frequentist statistical practice

fryolysis · 3 Jul 2025 17:47 UTC
0 points
0 comments · 1 min read · LW link

AISN #58: Senate Removes State AI Regulation Moratorium

3 Jul 2025 17:26 UTC
6 points
0 comments · 4 min read · LW link
(newsletter.safe.ai)

Contest for Better AGI Safety Plans

peterr · 3 Jul 2025 17:02 UTC
29 points
1 comment · 8 min read · LW link
(manifund.org)

Research Note: Our scheming precursor evals had limited predictive power for our in-context scheming evals

Marius Hobbhahn · 3 Jul 2025 15:57 UTC
75 points
0 comments · 1 min read · LW link
(www.apolloresearch.ai)

AI #123: Moratorium Moratorium

Zvi · 3 Jul 2025 15:40 UTC
33 points
1 comment · 49 min read · LW link
(thezvi.wordpress.com)

Call for suggestions—AI safety course

boazbarak · 3 Jul 2025 14:30 UTC
53 points
23 comments · 1 min read · LW link

Why I am not a polygenic score nihilist

David Hugh-Jones · 3 Jul 2025 13:38 UTC
6 points
0 comments · 2 min read · LW link
(wyclif.substack.com)

Hunch: minimalism is correct

Adam Zerner · 3 Jul 2025 5:03 UTC
18 points
12 comments · 2 min read · LW link

If Anyone Builds It, Everyone Dies: Advertisement design competition

yams · 2 Jul 2025 23:14 UTC
85 points
37 comments · 1 min read · LW link
(intelligence.org)

Dialects for Humans: Sounding Distinct from LLMs

nebrelbug · 2 Jul 2025 23:03 UTC
9 points
2 comments · 2 min read · LW link

Congress Asks Better Questions

Zvi · 2 Jul 2025 22:10 UTC
48 points
1 comment · 17 min read · LW link
(thezvi.wordpress.com)

Eating Honey is (Probably) Fine, Actually

Linch · 2 Jul 2025 22:09 UTC
35 points
0 comments · 3 min read · LW link
(linch.substack.com)

On Paying Attention

Alex Darby · 2 Jul 2025 21:52 UTC
4 points
0 comments · 7 min read · LW link

Curing PMDD with Hair Loss Pills

David Lorell · 2 Jul 2025 21:35 UTC
102 points
3 comments · 8 min read · LW link

[Question] RSS feed for 1 LW user?

Commander Zander · 2 Jul 2025 20:19 UTC
9 points
1 comment · 1 min read · LW link

Thought Anchors: Which LLM Reasoning Steps Matter?

2 Jul 2025 20:16 UTC
35 points
6 comments · 6 min read · LW link
(www.thought-anchors.com)

Cyberpunk Yoga

Commander Zander · 2 Jul 2025 20:04 UTC
6 points
0 comments · 3 min read · LW link

The influence conjecture and its implications

Bastian Gronager · 2 Jul 2025 19:36 UTC
−1 points
0 comments · 5 min read · LW link

Idea on Bayes’ Theorem

BJ3383 · 2 Jul 2025 19:27 UTC
3 points
3 comments · 1 min read · LW link

The Prisoner’s Dilemma—A Problematic Poster-Child

James Stephen Brown · 2 Jul 2025 19:10 UTC
9 points
0 comments · 3 min read · LW link

AI Task Length Horizons in Offensive Cybersecurity

Sean Peters · 2 Jul 2025 18:36 UTC
70 points
10 comments · 12 min read · LW link

Slicing the (Kosher) Hate Salami

ymeskhout · 2 Jul 2025 18:11 UTC
21 points
4 comments · 11 min read · LW link
(www.ymeskhout.com)

Race and Gender Bias As An Example of Unfaithful Chain of Thought in the Wild

2 Jul 2025 16:35 UTC
181 points
25 comments · 4 min read · LW link

Executive Belocracy: Review of Organization Types

belos · 2 Jul 2025 15:56 UTC
−1 points
0 comments · 11 min read · LW link
(bestofagreatlot.substack.com)

There are two fundamentally different constraints on schemers

Buck · 2 Jul 2025 15:51 UTC
62 points
0 comments · 4 min read · LW link

Mythbusting the supposed “1,000+ AI state bills that would hobble innovation”

sjadler · 2 Jul 2025 4:49 UTC
6 points
0 comments · 1 min read · LW link
(stevenadler.substack.com)

[Question] Are LLMs being trained using LessWrong text?

Cedar · 2 Jul 2025 3:00 UTC
7 points
4 comments · 1 min read · LW link

“What’s my goal?”

Raemon · 2 Jul 2025 2:58 UTC
122 points
9 comments · 2 min read · LW link

Use AI to Dimensionalize

Jordan Rubin · 2 Jul 2025 2:43 UTC
10 points
1 comment · 3 min read · LW link
(jordanmrubin.substack.com)

Why Engaging with Global Majority AI Policy Matters

Heramb · 2 Jul 2025 1:46 UTC
9 points
0 comments · 2 min read · LW link

Lessons from Building Secular Ritual: A Winter Solstice Experiment

joshuamerriam · 2 Jul 2025 0:55 UTC
9 points
0 comments · 4 min read · LW link

On The Formal Definition of Alignment

Davey · 2 Jul 2025 0:05 UTC
4 points
3 comments · 1 min read · LW link

AI-202X: a game between humans and AGIs aligned to different futures?

StanislavKrym · 1 Jul 2025 23:37 UTC
5 points
0 comments · 16 min read · LW link

Aether July 2025 Update

1 Jul 2025 21:08 UTC
24 points
7 comments · 3 min read · LW link

AI Moratorium Stripped From BBB

Zvi · 1 Jul 2025 18:50 UTC
70 points
4 comments · 6 min read · LW link
(thezvi.wordpress.com)

Jackson Hole ACX/LW Thursday Social – 07/03/25 | 6:30 PM @ Miller Park

Diego A. Pena · 1 Jul 2025 18:47 UTC
1 point
0 comments · 1 min read · LW link

Manipulating Self-Preference In LLMs

1 Jul 2025 18:03 UTC
11 points
0 comments · 7 min read · LW link

A Simple Explanation of AGI Risk

TurnTrout · 1 Jul 2025 16:18 UTC
66 points
4 comments · 5 min read · LW link
(turntrout.com)

Authors Have a Responsibility to Communicate Clearly

TurnTrout · 1 Jul 2025 15:41 UTC
125 points
29 comments · 6 min read · LW link
(turntrout.com)

Road to AnimalHarmBench

1 Jul 2025 13:38 UTC
−1 points
0 comments · 1 min read · LW link
(forum.effectivealtruism.org)

Embedded Altruism [slides]

owencb · 1 Jul 2025 13:02 UTC
20 points
3 comments · 1 min read · LW link

Senate Strikes Potential AI Moratorium

Tristan Williams · 1 Jul 2025 11:49 UTC
16 points
0 comments · 1 min read · LW link
(www.reuters.com)

[Question] Can AIs be shown their messages aren’t tampered with?

mruwnik · 1 Jul 2025 9:39 UTC
4 points
10 comments · 1 min read · LW link

SLT for AI Safety

Jesse Hoogland · 1 Jul 2025 4:52 UTC
63 points
0 comments · 3 min read · LW link

Problematic Professors

Eggs · 1 Jul 2025 2:54 UTC
16 points
5 comments · 2 min read · LW link