Could this be an unusually good time to Earn To Give?

TomGardiner · Mar 4, 2025, 9:51 PM
−1 points
0 comments · 3 min read · LW link
(forum.effectivealtruism.org)

What is the best / most proper definition of “Feeling the AGI” there is?

Annapurna · Mar 4, 2025, 8:13 PM
8 points
5 comments · 1 min read · LW link

Energy Markets Temporal Arbitrage with Batteries

NickyP · Mar 4, 2025, 5:37 PM
21 points
3 comments · 16 min read · LW link

Distillation of Meta’s Large Concept Models Paper

NickyP · Mar 4, 2025, 5:33 PM
19 points
3 comments · 4 min read · LW link

Top AI safety newsletters, books, podcasts, etc – new AISafety.com resource

Mar 4, 2025, 5:01 PM
32 points
2 comments · 1 min read · LW link

2028 Should Not Be AI Safety’s First Foray Into Politics

Jesse Richardson · Mar 4, 2025, 4:46 PM
5 points
0 comments · 2 min read · LW link

[Question] How Much Are LLMs Actually Boosting Real-World Programmer Productivity?

Thane Ruthenis · Mar 4, 2025, 4:23 PM
137 points
52 comments · 3 min read · LW link

Validating against a misalignment detector is very different to training against one

mattmacdermott · Mar 4, 2025, 3:41 PM
33 points
4 comments · 4 min read · LW link

For scheming, we should first focus on detection and then on prevention

Marius Hobbhahn · Mar 4, 2025, 3:22 PM
47 points
7 comments · 5 min read · LW link

Progress links and short notes, 2025-03-03

jasoncrawford · Mar 4, 2025, 3:20 PM
8 points
0 comments · 6 min read · LW link
(newsletter.rootsofprogress.org)

Formation Research: Organisation Overview

alamerton · Mar 4, 2025, 3:03 PM
5 points
0 comments · 11 min read · LW link

On Writing #1

Zvi · Mar 4, 2025, 1:30 PM
37 points
2 comments · 15 min read · LW link
(thezvi.wordpress.com)

The Semi-Rational Militar Firefighter

P. João · Mar 4, 2025, 12:23 PM
72 points
10 comments · 2 min read · LW link

Observations About LLM Inference Pricing

Aaron_Scher · Mar 4, 2025, 3:03 AM
28 points
2 comments · 9 min read · LW link
(techgov.intelligence.org)

[Question] How much should I worry about the Atlanta Fed’s GDP estimates?

Brendan Long · Mar 4, 2025, 2:03 AM
16 points
2 comments · 1 min read · LW link

[Question] shouldn’t we try to get media attention?

KvmanThinking · Mar 4, 2025, 1:39 AM
6 points
1 comment · 1 min read · LW link

The Milton Friedman Model of Policy Change

JohnofCharleston · Mar 4, 2025, 12:38 AM
136 points
17 comments · 4 min read · LW link

The Compliment Sandwich 🥪 aka: How to criticize a normie without making them upset.

keltan · Mar 3, 2025, 11:15 PM
13 points
10 comments · 1 min read · LW link

AI Safety at the Frontier: Paper Highlights, February ’25

gasteigerjo · Mar 3, 2025, 10:09 PM
7 points
0 comments · 7 min read · LW link
(aisafetyfrontier.substack.com)

What goals will AIs have? A list of hypotheses

Daniel Kokotajlo · Mar 3, 2025, 8:08 PM
87 points
19 comments · 18 min read · LW link

Takeaways From Our Recent Work on SAE Probing

Mar 3, 2025, 7:50 PM
30 points
0 comments · 5 min read · LW link

Why People Commit White Collar Fraud (Ozy linkpost)

sapphire · Mar 3, 2025, 7:33 PM
22 points
1 comment · 1 min read · LW link
(thingofthings.substack.com)

[Question] Ask Me Anything—Samuel

samuelshadrach · Mar 3, 2025, 7:24 PM
0 points
0 comments · 1 min read · LW link

Expanding HarmBench: Investigating Gaps & Extending Adversarial LLM Testing

racinkc1 · Mar 3, 2025, 7:23 PM
1 point
0 comments · 1 min read · LW link

Could Advanced AI Accelerate the Pace of AI Progress? Interviews with AI Researchers

Mar 3, 2025, 7:05 PM
43 points
1 comment · 1 min read · LW link
(papers.ssrn.com)

Middle School Choice

jefftk · Mar 3, 2025, 4:10 PM
27 points
10 comments · 4 min read · LW link
(www.jefftk.com)

On GPT-4.5

Zvi · Mar 3, 2025, 1:40 PM
44 points
12 comments · 22 min read · LW link
(thezvi.wordpress.com)

Coalescence—Determinism In Ways We Care About

vitaliya · Mar 3, 2025, 1:20 PM
12 points
0 comments · 11 min read · LW link

Methods for strong human germline engineering

TsviBT · Mar 3, 2025, 8:13 AM
149 points
28 comments · 108 min read · LW link

[Question] Examples of self-fulfilling prophecies in AI alignment?

Chris Lakin · Mar 3, 2025, 2:45 AM
22 points
6 comments · 1 min read · LW link

[Question] Request for Comments on AI-related Prediction Market Ideas

PeterMcCluskey · Mar 2, 2025, 8:52 PM
17 points
1 comment · 3 min read · LW link

Statistical Challenges with Making Super IQ babies

Jan Christian Refsgaard · Mar 2, 2025, 8:26 PM
154 points
26 comments · 9 min read · LW link

Cautions about LLMs in Human Cognitive Loops

Alice Blair · Mar 2, 2025, 7:53 PM
39 points
11 comments · 7 min read · LW link

Self-fulfilling misalignment data might be poisoning our AI models

TurnTrout · Mar 2, 2025, 7:51 PM
153 points
28 comments · 1 min read · LW link
(turntrout.com)

Spencer Greenberg hiring a personal/professional/research remote assistant for 5-10 hours per week

spencerg · Mar 2, 2025, 6:01 PM
13 points
0 comments · LW link

[Question] Will LLM agents become the first takeover-capable AGIs?

Seth Herd · Mar 2, 2025, 5:15 PM
36 points
10 comments · 1 min read · LW link

Not-yet-falsifiable beliefs?

Benjamin Hendricks · Mar 2, 2025, 2:11 PM
6 points
4 comments · 1 min read · LW link

Saving Zest

jefftk · Mar 2, 2025, 12:00 PM
24 points
1 comment · 1 min read · LW link
(www.jefftk.com)

Open Thread Spring 2025

Ben Pace · Mar 2, 2025, 2:33 AM
19 points
51 comments · 1 min read · LW link

[Question] help, my self image as rational is affecting my ability to empathize with others

KvmanThinking · Mar 2, 2025, 2:06 AM
1 point
13 comments · 1 min read · LW link

Maintaining Alignment during RSI as a Feedback Control Problem

beren · Mar 2, 2025, 12:21 AM
66 points
6 comments · 11 min read · LW link

AI Safety Policy Won’t Go On Like This – AI Safety Advocacy Is Failing Because Nobody Cares.

henophilia · Mar 1, 2025, 8:15 PM
1 point
1 comment · 1 min read · LW link
(blog.hermesloom.org)

Meaning Machines

appromoximate · Mar 1, 2025, 7:16 PM
0 points
0 comments · 13 min read · LW link

[Question] Share AI Safety Ideas: Both Crazy and Not

ank · Mar 1, 2025, 7:08 PM
16 points
28 comments · 1 min read · LW link

Historiographical Compressions: Renaissance as An Example

adamShimi · Mar 1, 2025, 6:21 PM
17 points
4 comments · 7 min read · LW link
(formethods.substack.com)

Real-Time Gigstats

jefftk · Mar 1, 2025, 2:10 PM
9 points
0 comments · 1 min read · LW link
(www.jefftk.com)

Open problems in emergent misalignment

Mar 1, 2025, 9:47 AM
82 points
13 comments · 7 min read · LW link

Estimating the Probability of Sampling a Trained Neural Network at Random

Mar 1, 2025, 2:11 AM
32 points
10 comments · 1 min read · LW link
(arxiv.org)

[Question] What nation did Trump prevent from going to war (Feb. 2025)?

James Camacho · Mar 1, 2025, 1:46 AM
3 points
3 comments · 1 min read · LW link

AXRP Episode 38.8 - David Duvenaud on Sabotage Evaluations and the Post-AGI Future

DanielFilan · Mar 1, 2025, 1:20 AM
13 points
0 comments · 13 min read · LW link