Metacompilation

Donald Hobson · 24 Feb 2025 22:58 UTC
11 points
1 comment · 4 min read · LW link

The manifest manifesto

dkl9 · 24 Feb 2025 22:13 UTC
6 points
2 comments · 2 min read · LW link
(dkl9.net)

Credit Suisse collapse obfuscated Parreaux, Thiébaud & Partners scandal

pocock · 24 Feb 2025 21:28 UTC
3 points
0 comments · 1 min read · LW link
(juristgate.com)

Topological Data Analysis and Mechanistic Interpretability

Gunnar Carlsson · 24 Feb 2025 19:56 UTC
16 points
4 comments · 7 min read · LW link

Zizian comparisons / connections in the open source & Linux communities

pocock · 24 Feb 2025 19:55 UTC
−15 points
0 comments · 1 min read · LW link

Local Trust

24 Feb 2025 19:53 UTC
21 points
4 comments · 5 min read · LW link

Nationwide Action Workshop: Contact Congress about AI safety!

Felix De Simone · 24 Feb 2025 19:36 UTC
7 points
0 comments · 1 min read · LW link

Anthropic releases Claude 3.7 Sonnet with extended thinking mode

LawrenceC · 24 Feb 2025 19:32 UTC
88 points
8 comments · 4 min read · LW link
(www.anthropic.com)

Training AI to do alignment research we don’t already know how to do

joshc · 24 Feb 2025 19:19 UTC
45 points
24 comments · 7 min read · LW link

Conference Report: Threshold 2030 - Modeling AI Economic Futures

24 Feb 2025 18:56 UTC
52 points
0 comments · 10 min read · LW link
(www.convergenceanalysis.org)

Evaluating “What 2026 Looks Like” So Far

Jonny Spicer · 24 Feb 2025 18:55 UTC
78 points
6 comments · 7 min read · LW link

Superintelligent Agents Pose Catastrophic Risks: Can Scientist AI Offer a Safer Path?

24 Feb 2025 18:31 UTC
44 points
15 comments · 11 min read · LW link

Understanding Agent Preferences

martinkunev · 24 Feb 2025 17:46 UTC
6 points
2 comments · 14 min read · LW link

What We Can Do to Prevent Extinction by AI

Joe Rogero · 24 Feb 2025 17:15 UTC
12 points
0 comments · 11 min read · LW link

Dream, Truth, & Good

abramdemski · 24 Feb 2025 16:59 UTC
50 points
11 comments · 4 min read · LW link

Forecasting Frontier Language Model Agent Capabilities

24 Feb 2025 16:51 UTC
35 points
0 comments · 5 min read · LW link
(www.apolloresearch.ai)

A City Within a City

Declan Molony · 24 Feb 2025 15:51 UTC
50 points
1 comment · 7 min read · LW link

Grok Grok

Zvi · 24 Feb 2025 14:20 UTC
36 points
2 comments · 19 min read · LW link
(thezvi.wordpress.com)

if you’re not happy single, you won’t be happy immortal

daijin · 24 Feb 2025 13:23 UTC
2 points
1 comment · 1 min read · LW link

[NSFW] The Fuzzy Handcuffs of Liberation

lsusr · 24 Feb 2025 13:05 UTC
28 points
11 comments · 2 min read · LW link

Dayton, Ohio, HPMOR 10 year Anniversary meetup

Lunawarrior · 24 Feb 2025 12:55 UTC
1 point
0 comments · 1 min read · LW link

An Alternate History of the Future, 2025-2040

Mr Beastly · 24 Feb 2025 5:53 UTC
5 points
5 comments · 10 min read · LW link

Export Surplusses

lsusr · 24 Feb 2025 5:53 UTC
24 points
21 comments · 3 min read · LW link

AI alignment for mental health supports

hiki_t · 24 Feb 2025 4:21 UTC
1 point
1 comment · 1 min read · LW link

The GDM AGI Safety+Alignment Team is Hiring for Applied Interpretability Research

24 Feb 2025 2:17 UTC
48 points
1 comment · 7 min read · LW link

Poll on AI opinions.

Niclas Kupper · 23 Feb 2025 22:39 UTC
1 point
2 comments · 1 min read · LW link

The Geometry of Linear Regression versus PCA

criticalpoints · 23 Feb 2025 21:01 UTC
20 points
7 comments · 6 min read · LW link
(eregis.github.io)

Judgements: Merging Prediction & Evidence

abramdemski · 23 Feb 2025 19:35 UTC
107 points
7 comments · 6 min read · LW link

Intelligence as Privilege Escalation

Cole Wyeth · 23 Feb 2025 19:31 UTC
28 points
2 comments · 5 min read · LW link

[Question] Have LLMs Generated Novel Insights?

23 Feb 2025 18:22 UTC
166 points
41 comments · 2 min read · LW link

The case for corporal punishment

Yair Halberstadt · 23 Feb 2025 15:05 UTC
28 points
5 comments · 2 min read · LW link

Reflections on the state of the race to superintelligence, February 2025

Mitchell_Porter · 23 Feb 2025 13:58 UTC
21 points
7 comments · 4 min read · LW link

List of most interesting ideas I encountered in my life, ranked

Lucien · 23 Feb 2025 12:36 UTC
21 points
6 comments · 1 min read · LW link

Test of the Bene Gesserit

lsusr · 23 Feb 2025 11:51 UTC
19 points
3 comments · 3 min read · LW link

Moral gauge theory: A speculative suggestion for AI alignment

James Diacoumis · 23 Feb 2025 11:42 UTC
6 points
2 comments · 8 min read · LW link

[Question] Does human (mis)alignment pose a significant and imminent existential threat?

jr · 23 Feb 2025 10:03 UTC
6 points
3 comments · 1 min read · LW link

Deep sparse autoencoders yield interpretable features too

Armaan A. Abraham · 23 Feb 2025 5:46 UTC
30 points
8 comments · 8 min read · LW link

New Report: Multi-Agent Risks from Advanced AI

Lewis Hammond · 23 Feb 2025 0:32 UTC
24 points
0 comments · 2 min read · LW link
(www.cooperativeai.com)

Power Lies Trembling: a three-book review

Richard_Ngo · 22 Feb 2025 22:57 UTC
214 points
29 comments · 15 min read · LW link
(www.mindthefuture.info)

Transformer Dynamics: a neuro-inspired approach to MechInterp

22 Feb 2025 21:33 UTC
11 points
0 comments · 5 min read · LW link

Recursive Cognitive Refinement (RCR): A Self-Correcting Approach for LLM Hallucinations

mxTheo · 22 Feb 2025 21:32 UTC
0 points
0 comments · 2 min read · LW link

Gradual Disempowerment: Simplified

Annapurna · 22 Feb 2025 16:59 UTC
10 points
1 comment · 1 min read · LW link
(jorgevelez.substack.com)

AI Apocalypse and the Buddha

pchvykov · 22 Feb 2025 16:33 UTC
−17 points
6 comments · 9 min read · LW link

Unaligned AGI & Brief History of Inequality

ank · 22 Feb 2025 16:26 UTC
−20 points
4 comments · 7 min read · LW link

HPMOR Anniversary Guide

Screwtape · 22 Feb 2025 16:17 UTC
63 points
7 comments · 3 min read · LW link

Forecasting Uncontrolled Spread of AI

Alvin Ånestrand · 22 Feb 2025 13:05 UTC
2 points
0 comments · 10 min read · LW link
(forecastingaifutures.substack.com)

Seeing Through the Eyes of the Algorithm

silentbob · 22 Feb 2025 11:54 UTC
18 points
3 comments · 10 min read · LW link

Proselytizing

lsusr · 22 Feb 2025 11:54 UTC
50 points
3 comments · 2 min read · LW link

Workshop: Interpretability in LLMs using Geometric and Statistical Methods

Karthik Viswanathan · 22 Feb 2025 9:39 UTC
17 points
0 comments · 8 min read · LW link

Information throughput of biological humans and frontier LLMs

benwr · 22 Feb 2025 7:15 UTC
12 points
0 comments · 1 min read · LW link