Places of Loving Grace [Story]

ank 18 Feb 2025 23:49 UTC

−1 points

0 comments 4 min read LW link

Are SAE features from the Base Model still meaningful to LLaVA?

Shan23Chen 18 Feb 2025 22:16 UTC

8 points

2 comments 10 min read LW link

(www.lesswrong.com)

Sparse Autoencoder Features for Classifications and Transferability

Shan23Chen 18 Feb 2025 22:14 UTC

5 points

0 comments 1 min read LW link

(arxiv.org)

A fable on AI x-risk

bgaesop 18 Feb 2025 20:15 UTC

8 points

4 comments 1 min read LW link

The Unearned Privilege We Rarely Discuss: Cognitive Capability

DiegoRojas 18 Feb 2025 20:06 UTC

−21 points

7 comments 3 min read LW link

Call for Applications: XLab Summer Research Fellowship

Jo Jiao 18 Feb 2025 19:19 UTC

12 points

0 comments 1 min read LW link

AISN #48: Utility Engineering and EnigmaEval

Corin Katzke and Dan H

18 Feb 2025 19:15 UTC

4 points

0 comments 4 min read LW link

(newsletter.safe.ai)

Abstract Mathematical Concepts vs. Abstractions Over Real-World Systems

Thane Ruthenis 18 Feb 2025 18:04 UTC

40 points

11 comments 4 min read LW link

How accurate was my “Altered Traits” book review?

lsusr 18 Feb 2025 17:00 UTC

43 points

6 comments 3 min read LW link

Medical Roundup #4

Zvi 18 Feb 2025 13:40 UTC

24 points

3 comments 10 min read LW link

(thezvi.wordpress.com)

Dear AGI,

Nathan Young 18 Feb 2025 10:48 UTC

90 points

11 comments 3 min read LW link

There are a lot of upcoming retreats/conferences between March and July (2025)

gergogaspar and ENAIS

18 Feb 2025 9:30 UTC

6 points

0 comments 1 min read LW link

Sea Change

Charlie Sanders 18 Feb 2025 6:03 UTC

−2 points

2 comments 5 min read LW link

(www.dailymicrofiction.com)

Born on Third Base: The Case for Inheriting Nothing and Building Everything

charlieoneill 18 Feb 2025 0:47 UTC

−24 points

16 comments 2 min read LW link

Do models know when they are being evaluated?

fidgetsinner, Giles, Joe Needham and Marius Hobbhahn

17 Feb 2025 23:13 UTC

57 points

9 comments 12 min read LW link

AGI Safety & Alignment @ Google DeepMind is hiring

Rohin Shah 17 Feb 2025 21:11 UTC

103 points

19 comments 10 min read LW link

The Peeperi (unfinished) - By Katja Grace

Nathan Young 17 Feb 2025 19:33 UTC

22 points

0 comments 3 min read LW link

(docs.google.com)

Progress links and short notes, 2025-02-17

jasoncrawford 17 Feb 2025 19:18 UTC

8 points

0 comments 7 min read LW link

(newsletter.rootsofprogress.org)

Claude 3.5 Sonnet (New)’s AGI scenario

Nathan Young 17 Feb 2025 18:47 UTC

5 points

2 comments 5 min read LW link

Talking to laymen about AI development

David Steel 17 Feb 2025 18:42 UTC

8 points

0 comments 1 min read LW link

On the Rebirth of Aristocracy in the American Regime

shawkisukkar 17 Feb 2025 16:18 UTC

−16 points

3 comments 9 min read LW link

(shawkisukkar.substack.com)

Ascetic hedonism

dkl9 17 Feb 2025 15:56 UTC

15 points

9 comments 2 min read LW link

(dkl9.net)

AIS Berlin, events, opportunities and the flipped gameboard—Fieldbuilders Newsletter, February 2025

gergogaspar and ENAIS

17 Feb 2025 14:16 UTC

6 points

0 comments 3 min read LW link

Monthly Roundup #27: February 2025

Zvi 17 Feb 2025 14:10 UTC

27 points

3 comments 44 min read LW link

(thezvi.wordpress.com)

What new x- or s-risk fieldbuilding organisations would you like to see? An EOI form. (FBB #3)

gergogaspar 17 Feb 2025 12:39 UTC

6 points

0 comments 2 min read LW link

A History of the Future, 2025-2040

L Rudolf L 17 Feb 2025 12:03 UTC

253 points

42 comments 75 min read LW link

(nosetgauge.substack.com)

Thermodynamic entropy = Kolmogorov complexity

Aram Ebtekar 17 Feb 2025 5:56 UTC

77 points

14 comments 1 min read LW link

(doi.org)

THE ARCHIVE

Jason Reid 17 Feb 2025 1:12 UTC

7 points

0 comments 6 min read LW link

[Question] What are the surviving worlds like?

KvmanThinking 17 Feb 2025 0:41 UTC

21 points

2 comments 1 min read LW link

CyberEconomy. The Limits to Growth

Timur Sadekov and Aleksei Vostriakov

16 Feb 2025 21:02 UTC

−3 points

0 comments 23 min read LW link

Cooperation for AI safety must transcend geopolitical interference

Matrice Jacobine 16 Feb 2025 18:18 UTC

7 points

6 comments 1 min read LW link

(www.scmp.com)

[Question] Programming Language Early Funding?

J Thomas Moros 16 Feb 2025 17:34 UTC

2 points

6 comments 3 min read LW link

[Closed] Gauging Interest for a Learning-Theoretic Agenda Mentorship Programme

Vanessa Kosoy 16 Feb 2025 16:24 UTC

54 points

5 comments 2 min read LW link

Celtic Knots on Einstein Lattice

Ben 16 Feb 2025 15:56 UTC

47 points

11 comments 2 min read LW link

It’s been ten years. I propose HPMOR Anniversary Parties.

Screwtape 16 Feb 2025 1:43 UTC

154 points

3 comments 1 min read LW link

Come join Dovetail’s agent foundations fellowship talks & discussion

Alex_Altair 15 Feb 2025 22:10 UTC

25 points

0 comments 1 min read LW link

Quantifying the Qualitative: Towards a Bayesian Approach to Personal Insight

Pruthvi Kumar 15 Feb 2025 19:50 UTC

1 point

0 comments 6 min read LW link

Knitting a Sweater in a Burning House

CrimsonChin 15 Feb 2025 19:50 UTC

27 points

2 comments 2 min read LW link

Microplastics: Much Less Than You Wanted To Know

jenn, kaleb and Brent

15 Feb 2025 19:08 UTC

94 points

11 comments 13 min read LW link

Preference for uncertainty and impact overestimation bias in altruistic systems.

Luck 15 Feb 2025 12:27 UTC

1 point

0 comments 1 min read LW link

Artificial Static Place Intelligence: Guaranteed Alignment

ank 15 Feb 2025 11:08 UTC

2 points

2 comments 2 min read LW link

The current AI strategic landscape: one bear’s perspective

Matrice Jacobine 15 Feb 2025 9:49 UTC

11 points

0 comments 2 min read LW link

(philosophybear.substack.com)

6 (Potential) Misconceptions about AI Intellectuals

ozziegooen 14 Feb 2025 23:51 UTC

18 points

11 comments 12 min read LW link

[Question] Should Open Philanthropy Make an Offer to Buy OpenAI?

peterr 14 Feb 2025 23:18 UTC

25 points

1 comment 1 min read LW link

A computational no-coincidence principle

Eric Neyman 14 Feb 2025 21:39 UTC

152 points

40 comments 6 min read LW link

(www.alignment.org)

Hopeful hypothesis, the Persona Jukebox.

Donald Hobson 14 Feb 2025 19:24 UTC

11 points

4 comments 3 min read LW link

Introduction to Expected Value Fanaticism

Petra Kosonen 14 Feb 2025 19:05 UTC

9 points

8 comments 1 min read LW link

(utilitarianism.net)

Intrinsic Dimension of Prompts in LLMs

Karthik Viswanathan 14 Feb 2025 19:02 UTC

3 points

0 comments 4 min read LW link

Objective Realism: A Perspective Beyond Human Constructs

Apatheos 14 Feb 2025 19:02 UTC

−12 points

1 comment 2 min read LW link

A short course on AGI safety from the GDM Alignment team

Vika and Rohin Shah

14 Feb 2025 15:43 UTC

105 points

2 comments 1 min read LW link

(deepmindsafetyresearch.medium.com)