5 Jun 2025 23:07 UTC

22 points

2 comments, 5 min read, LW link

(far.ai)

Introducing: Meridian Cambridge’s new online lecture series covering frontier AI and AI safety

Meridian Cambridge 5 Jun 2025 21:55 UTC

1 point

0 comments, 1 min read, LW link

cheaper sodium electrolysis

bhauth 5 Jun 2025 21:49 UTC

23 points

3 comments, 4 min read, LW link

(www.bhauth.com)

Histograms are to CDFs as calibration plots are to...

Optimization Process 5 Jun 2025 20:20 UTC

35 points

9 comments, 1 min read, LW link

(optimizationprocess.com)

Integration Bandwidth: The Mechanism Behind Intelligence and Puberty

Dortex 5 Jun 2025 19:37 UTC

−1 points

4 comments, 1 min read, LW link

(osf.io)

Levels of Doom: Eutopia, Disempowerment, Extinction

Vladimir_Nesov 5 Jun 2025 19:08 UTC

34 points

1 comment, 2 min read, LW link

LLM in-context learning as (approximating) Solomonoff induction

Cole Wyeth 5 Jun 2025 17:45 UTC

31 points

3 comments, 4 min read, LW link

Fundamental Uncertainty: Chapter 2 - How do words get their meaning?

Gordon Seidoh Worley 5 Jun 2025 16:32 UTC

11 points

2 comments, 11 min read, LW link

AI Might Kill Everyone

Bentham's Bulldog 5 Jun 2025 15:37 UTC

6 points

0 comments, 4 min read, LW link

AI #119: Goodbye AISI?

Zvi 5 Jun 2025 14:00 UTC

42 points

8 comments, 60 min read, LW link

(thezvi.wordpress.com)

Powerful Predictions

Alvin Ånestrand 5 Jun 2025 10:44 UTC

2 points

0 comments, 6 min read, LW link

(forecastingaifutures.substack.com)

Potentially Useful Projects in Wise AI

Chris_Leong 5 Jun 2025 8:13 UTC

12 points

0 comments, 5 min read, LW link

Building as gardening

Itay Dreyfus 5 Jun 2025 6:41 UTC

3 points

1 comment, 4 min read, LW link

(productidentity.co)

Semiconductor Fabs I: The Equipment

nomagicpill 4 Jun 2025 22:09 UTC

19 points

0 comments, 19 min read, LW link

(nomagicpill.github.io)

The Stereotype of the Stereotype

Ike 4 Jun 2025 21:06 UTC

58 points

17 comments, 9 min read, LW link

2. Why intuitive comparisons of large-scale impact are unjustified

Anthony DiGiovanni 4 Jun 2025 20:30 UTC

25 points

0 comments, 16 min read, LW link

Dating Roundup #6

Zvi 4 Jun 2025 20:00 UTC

36 points

2 comments, 55 min read, LW link

(thezvi.wordpress.com)

Rational Prime Calendar

RickHull 4 Jun 2025 19:30 UTC

−1 points

0 comments, 3 min read, LW link

A Technique of Pure Reason

Adam Newgas 4 Jun 2025 19:07 UTC

11 points

3 comments, 2 min read, LW link

“Flaky breakthroughs” pervade inner work — but almost no one tracks them

Chris Lakin 4 Jun 2025 19:02 UTC

216 points

45 comments, 2 min read, LW link

(chrislakin.blog)

[Question] LessOnline saved my life. Now how do I let go of this house?

RedMan 4 Jun 2025 18:47 UTC

24 points

7 comments, 1 min read, LW link

Linkpost: Predicting Empirical AI Research Outcomes with Language Models

quetzal_rainbow 4 Jun 2025 18:14 UTC

10 points

1 comment, 1 min read, LW link

(arxiv.org)

Self-Coordinated Deception in Current AI Models

Avi Brach-Neufeld 4 Jun 2025 17:59 UTC

8 points

5 comments, 4 min read, LW link

To MAIM or Not to MAIM. Introducing MARS: The Nuclear Deterrent case for Hardened Datacenters

kinsman 4 Jun 2025 17:56 UTC

1 point

0 comments, 7 min read, LW link

The Belocrat: a servant leader

belos 4 Jun 2025 17:25 UTC

1 point

0 comments, 10 min read, LW link

(bestofagreatlot.substack.com)

A list of books which are adjacent to EA

marco moldo 4 Jun 2025 12:31 UTC

−1 points

0 comments, 3 min read, LW link

Philosophical Jailbreaks: Demo of LLM Nihilism

Artem Karpov 4 Jun 2025 12:03 UTC

3 points

0 comments, 5 min read, LW link

Notes from a mini-replication of the alignment faking paper

Ben_Snodin 4 Jun 2025 11:01 UTC

13 points

5 comments, 9 min read, LW link

(www.bensnodin.com)

ARENA 6.0 - Call for Applicants

JamesH, JScriven, David Quarel, CallumMcDougall and James Fox

4 Jun 2025 10:19 UTC

26 points

3 comments, 6 min read, LW link

Quickly Assessing Reward Hacking-like Behavior in LLMs and its Sensitivity to Prompt Variations

AndresCampero 4 Jun 2025 7:22 UTC

26 points

1 comment, 17 min read, LW link

Draft: A concise theory of agentic consciousness

Martin Vlach 4 Jun 2025 5:00 UTC

2 points

4 comments, 1 min read, LW link

Individual AI representatives don’t solve Gradual Disempowerement

Jan_Kulveit 4 Jun 2025 1:26 UTC

62 points

4 comments, 3 min read, LW link

Lectures on AI for high school students (and others)

Radford Neal 3 Jun 2025 23:54 UTC

6 points

0 comments, 1 min read, LW link

(radfordneal.wordpress.com)

Does the Taiwan invasion prevent mankind from obtaining the aligned ASI?

StanislavKrym 3 Jun 2025 23:35 UTC

−14 points

1 comment, 5 min read, LW link

Self-inquiry

Vadim Golub 3 Jun 2025 22:15 UTC

−3 points

0 comments, 5 min read, LW link

Question to LW devs: does LessWrong tries to be facebooky?

Roman Malov 3 Jun 2025 22:08 UTC

5 points

1 comment, 1 min read, LW link

Your Strategy Roadmap: Expert Tips + Live Training

Deena Englander 3 Jun 2025 21:10 UTC

−4 points

0 comments, 4 min read, LW link

Steering Vectors Can Help LLM Judges Detect Subtle Dishonesty

Leon Eshuijs, mcbeth, Etha and Archie Chaudhury

3 Jun 2025 20:33 UTC

12 points

1 comment, 5 min read, LW link

Schelling Coordination via Agentic Loops

Callum-Luis Kindred 3 Jun 2025 20:13 UTC

10 points

1 comment, 9 min read, LW link

Visual Prompt Injections: Results on testing AI spam-defense and AI vulnerability to deceptive web ads.

Seon Gunness 3 Jun 2025 20:10 UTC

4 points

0 comments, 12 min read, LW link

Broad-Spectrum Cancer Treatments

sarahconstantin 3 Jun 2025 19:40 UTC

150 points

10 comments, 7 min read, LW link

(sarahconstantin.substack.com)

How to work through the ARENA program on your own

Leon Lang 3 Jun 2025 17:38 UTC

38 points

5 comments, 6 min read, LW link

How the veil of ignorance grounds sentientism

HoVY 3 Jun 2025 17:29 UTC

−3 points

23 comments, 6 min read, LW link

(forum.effectivealtruism.org)

In Which I Make the Mistake of Fully Covering an Episode of the All-In Podcast

Zvi 3 Jun 2025 15:50 UTC

42 points

2 comments, 28 min read, LW link

(thezvi.wordpress.com)

Transformer Modular Addition Through A Signal Processing Lens

Benjamin Kelley 3 Jun 2025 15:32 UTC

1 point

0 comments, 1 min read, LW link

AXRP Episode 41 - Lee Sharkey on Attribution-based Parameter Decomposition

DanielFilan 3 Jun 2025 3:40 UTC

28 points

1 comment, 61 min read, LW link

Notes on dynamism, power, & virtue

Lizka 3 Jun 2025 1:40 UTC

19 points

0 comments, 12 min read, LW link

Trends – Artificial Intelligence

Archimedes 3 Jun 2025 0:48 UTC

1 point

1 comment, 1 min read, LW link

(www.bondcap.com)

LLMs might have subjective experiences, but no concepts for them

No77e 2 Jun 2025 21:18 UTC

17 points

5 comments, 2 min read, LW link

In defense of memes (and thought-terminating clichés)

Harjas 2 Jun 2025 20:18 UTC

11 points

4 comments, 10 min read, LW link