AI-202X: a game between humans and AGIs aligned to different futures?

StanislavKrym 1 Jul 2025 23:37 UTC
5 points
0 comments 16 min read LW link

Aether July 2025 Update

1 Jul 2025 21:08 UTC
24 points
7 comments 3 min read LW link

AI Moratorium Stripped From BBB

Zvi 1 Jul 2025 18:50 UTC
70 points
4 comments 6 min read LW link
(thezvi.wordpress.com)

Jackson Hole ACX/LW Thursday Social – 07/03/25 | 6:30 PM @ Miller Park

Diego A. Pena 1 Jul 2025 18:47 UTC
1 point
0 comments 1 min read LW link

Manipulating Self-Preference In LLMs

1 Jul 2025 18:03 UTC
11 points
0 comments 7 min read LW link

A Simple Explanation of AGI Risk

TurnTrout 1 Jul 2025 16:18 UTC
66 points
4 comments 5 min read LW link
(turntrout.com)

Authors Have a Responsibility to Communicate Clearly

TurnTrout 1 Jul 2025 15:41 UTC
125 points
29 comments 6 min read LW link
(turntrout.com)

Road to AnimalHarmBench

1 Jul 2025 13:38 UTC
−1 points
0 comments 1 min read LW link
(forum.effectivealtruism.org)

Embedded Altruism [slides]

owencb 1 Jul 2025 13:02 UTC
20 points
3 comments 1 min read LW link

Senate Strikes Potential AI Moratorium

Tristan Williams 1 Jul 2025 11:49 UTC
16 points
0 comments 1 min read LW link
(www.reuters.com)

[Question] Can AIs be shown their messages aren’t tampered with?

mruwnik 1 Jul 2025 9:39 UTC
4 points
10 comments 1 min read LW link

SLT for AI Safety

Jesse Hoogland 1 Jul 2025 4:52 UTC
63 points
0 comments 3 min read LW link

Problematic Professors

Eggs 1 Jul 2025 2:54 UTC
16 points
5 comments 2 min read LW link

I can’t tell if my ideas are good anymore because I talked to robots too much

Tyson 30 Jun 2025 21:21 UTC
12 points
10 comments 1 min read LW link

Q1 AI Benchmark Results: Pro Forecasters Crush Bots

Ben Wilson 30 Jun 2025 21:12 UTC
14 points
0 comments 22 min read LW link
(www.metaculus.com)

ACX Meetup Cape Town

tegan 30 Jun 2025 21:11 UTC
1 point
0 comments 1 min read LW link

The best simple argument for Pausing AI?

Gary Marcus 30 Jun 2025 20:38 UTC
154 points
22 comments 1 min read LW link

Hiring* an AI** Artist for LessWrong/Lightcone

Raemon 30 Jun 2025 19:01 UTC
30 points
6 comments 1 min read LW link

SAE on activation differences

30 Jun 2025 17:50 UTC
44 points
3 comments 5 min read LW link

The Spectrum of Attention: From Empathy to Hypnosis

jimmy 30 Jun 2025 17:42 UTC
13 points
2 comments 14 min read LW link

Substack and Other Blog Recommendations

Zvi 30 Jun 2025 17:20 UTC
30 points
7 comments 16 min read LW link
(thezvi.wordpress.com)

What We Learned Trying to Diff Base and Chat Models (And Why It Matters)

30 Jun 2025 17:17 UTC
105 points
2 comments 7 min read LW link

Don’t Eat Honey

Bentham's Bulldog 30 Jun 2025 15:57 UTC
−15 points
70 comments 6 min read LW link

Primary-budget voting registration

eg 30 Jun 2025 15:39 UTC
1 point
4 comments 2 min read LW link

Project Vend: Can Claude run a small shop?

Gunnar_Zarncke 30 Jun 2025 15:22 UTC
53 points
8 comments 1 min read LW link
(www.anthropic.com)

If you want to be vegan but you worry about health effects of no meat, consider being vegan except for mussels/oysters

KatWoods 30 Jun 2025 13:28 UTC
68 points
15 comments 1 min read LW link

How dangerous is encoded reasoning?

artkpv 30 Jun 2025 11:54 UTC
17 points
0 comments 10 min read LW link

Circuits in Superposition 2: Now with Less Wrong Math

30 Jun 2025 10:25 UTC
72 points
0 comments 20 min read LW link

life lessons from poker

thiccythot 30 Jun 2025 4:20 UTC
55 points
14 comments 4 min read LW link

From Diamond Mining to Open-World Survival: Alignment and Emergent Behavior in Minecraft Agents

30 Jun 2025 3:17 UTC
15 points
0 comments 16 min read LW link

When Machines Do Our Jobs, Will We Remember How to Live?

Ahmed Elsayyad 30 Jun 2025 3:03 UTC
4 points
1 comment 3 min read LW link

Paradigms for computation

Cole Wyeth 30 Jun 2025 0:37 UTC
65 points
10 comments 12 min read LW link

The Internet Is Like a City (But Not in the Way You’d Think)

antonomon 29 Jun 2025 22:25 UTC
20 points
0 comments 8 min read LW link
(novum.substack.com)

Scientific Discovery in the Age of Artificial Intelligence

Jessica Rumbelow 29 Jun 2025 20:45 UTC
42 points
3 comments 10 min read LW link

An Alternative Way to Forecast AGI: Counting Down Capabilities

shash42 29 Jun 2025 19:52 UTC
3 points
0 comments 3 min read LW link
(open.substack.com)

Is Optimal Reflection Competitive with Extinction Risk Reduction? - Requesting Reviewers

Jordan Arel 29 Jun 2025 18:42 UTC
7 points
0 comments 11 min read LW link

Let’s look at another “LLMs lack true understanding” paper

Expertium 29 Jun 2025 14:00 UTC
3 points
0 comments 4 min read LW link

I underestimated safety research speedups from safe AI

Dan Braun 29 Jun 2025 13:29 UTC
38 points
2 comments 3 min read LW link

Inflight Auctions

jefftk 29 Jun 2025 12:10 UTC
12 points
1 comment 2 min read LW link
(www.jefftk.com)

Do Self-Perceived Superintelligent LLMs Exhibit Misalignment?

Dave Banerjee 29 Jun 2025 11:06 UTC
26 points
2 comments 12 min read LW link
(davebanerjee.xyz)

Conciseness Manifesto

Vasyl Dotsenko 29 Jun 2025 5:33 UTC
35 points
5 comments 1 min read LW link

Feedback wanted: Shortlist of AI safety ideas

mmKALLL 29 Jun 2025 4:28 UTC
8 points
3 comments 5 min read LW link

Build Your Exoskeleton

mrmoxon 29 Jun 2025 1:54 UTC
1 point
0 comments 9 min read LW link

Why Reasoning Isn’t Enough: How LLM Agents Struggle with Ethics and Cooperation

28 Jun 2025 20:43 UTC
6 points
0 comments 4 min read LW link

Support for bedrock liberal principles seems to be in pretty bad shape these days

Max H 28 Jun 2025 20:37 UTC
32 points
52 comments 4 min read LW link

A Depressed Shrink Tries Shrooms

AlphaAndOmega 28 Jun 2025 17:16 UTC
44 points
6 comments 1 min read LW link
(open.substack.com)

Time Machine as Existential Risk

avturchin 28 Jun 2025 15:17 UTC
15 points
7 comments 45 min read LW link

The next wave of model improvements will be due to data quality

ChristianKl 28 Jun 2025 14:34 UTC
17 points
4 comments 1 min read LW link

AXRP Episode 44 - Peter Salib on AI Rights for Human Safety

DanielFilan 28 Jun 2025 1:40 UTC
12 points
0 comments 103 min read LW link

Prediction Markets Have an Anthropic Bias to Deal With

ar-sht 28 Jun 2025 1:16 UTC
7 points
1 comment 11 min read LW link