Corrigibility’s Desirability is Timing-Sensitive

RobertM · Dec 26, 2024, 10:24 PM
29 points
4 comments · 3 min read · LW link

PCR retrospective

bhauth · Dec 26, 2024, 9:20 PM
24 points
0 comments · 8 min read · LW link
(bhauth.com)

AI #96: o3 But Not Yet For Thee

Zvi · Dec 26, 2024, 8:30 PM
58 points
8 comments · 36 min read · LW link
(thezvi.wordpress.com)

Super human AI is a very low hanging fruit!

Hzn · Dec 26, 2024, 7:00 PM
−4 points
0 comments · 7 min read · LW link

The Field of AI Alignment: A Postmortem, and What To Do About It

johnswentworth · Dec 26, 2024, 6:48 PM
302 points
160 comments · 8 min read · LW link

ReSolsticed vol I: “We’re Not Going Quietly”

Raemon · Dec 26, 2024, 5:52 PM
61 points
4 comments · 19 min read · LW link

[Question] Are Sparse Autoencoders a good idea for AI control?

Gerard Boxo · Dec 26, 2024, 5:34 PM
3 points
4 comments · 1 min read · LW link

A Three-Layer Model of LLM Psychology

Jan_Kulveit · Dec 26, 2024, 4:49 PM
218 points
13 comments · 8 min read · LW link

Human, All Too Human—Superintelligence requires learning things we can’t teach

Ben Turtel · Dec 26, 2024, 4:26 PM
−13 points
4 comments · 1 min read · LW link
(bturtel.substack.com)

[Question] Why don’t we currently have AI agents?

ChristianKl · Dec 26, 2024, 3:26 PM
8 points
10 comments · 1 min read · LW link

[Question] What would be the IQ and other benchmarks of o3 that uses $1 million worth of compute resources to answer one question?

avturchin · Dec 26, 2024, 11:08 AM
16 points
2 comments · 1 min read · LW link

The Economics & Practicality of Starting Mars Colonization

Zero Contradictions · Dec 26, 2024, 10:56 AM
2 points
1 comment · 1 min read · LW link
(zerocontradictions.net)

Terminal goal vs Intelligence

Donatas Lučiūnas · Dec 26, 2024, 8:10 AM
−12 points
24 comments · 1 min read · LW link

Streamlining my voice note process

Vlad Sitalo · Dec 26, 2024, 6:04 AM
6 points
1 comment · 7 min read · LW link
(vlad.roam.garden)

Whistleblowing Twitter Bot

Mckiev · Dec 26, 2024, 4:09 AM
19 points
5 comments · 2 min read · LW link

Open Thread Winter 2024/2025

habryka · Dec 25, 2024, 9:02 PM
23 points
59 comments · 1 min read · LW link

Exploring Cooperation: The Path to Utopia

Davidmanheim · Dec 25, 2024, 6:31 PM
11 points
0 comments · LW link
(exploringcooperation.substack.com)

Living with Rats in College

lsusr · Dec 25, 2024, 10:44 AM
28 points
0 comments · 1 min read · LW link

[Question] What Have Been Your Most Valuable Casual Conversations At Conferences?

johnswentworth · Dec 25, 2024, 5:49 AM
54 points
21 comments · 1 min read · LW link

The Opening Salvo: 1. An Ontological Consciousness Metric: Resistance to Behavioral Modification as a Measure of Recursive Awareness

Peterpiper · Dec 25, 2024, 2:29 AM
−3 points
0 comments · 5 min read · LW link

The Deep Lore of LightHaven, with Oliver Habryka (TBC episode 228)

Dec 24, 2024, 10:45 PM
45 points
4 comments · 91 min read · LW link
(thebayesianconspiracy.substack.com)

Acknowledging Background Information with P(Q|I)

JenniferRM · Dec 24, 2024, 6:50 PM
29 points
8 comments · 14 min read · LW link

Game Theory and Behavioral Economics in The Stock Market

Jaiveer Singh · Dec 24, 2024, 6:15 PM
1 point
0 comments · 3 min read · LW link

[Question] What are the main arguments against AGI?

Edy Nastase · Dec 24, 2024, 3:49 PM
1 point
6 comments · 1 min read · LW link

[Question] Recommendations on communities that discuss AI applications in society

Annapurna · Dec 24, 2024, 1:37 PM
7 points
2 comments · 1 min read · LW link

AIs Will Increasingly Fake Alignment

Zvi · Dec 24, 2024, 1:00 PM
89 points
0 comments · 52 min read · LW link
(thezvi.wordpress.com)

Apply to the 2025 PIBBSS Summer Research Fellowship

Dec 24, 2024, 10:25 AM
15 points
0 comments · 2 min read · LW link

Human-AI Complementarity: A Goal for Amplified Oversight

Dec 24, 2024, 9:57 AM
27 points
4 comments · 1 min read · LW link
(deepmindsafetyresearch.medium.com)

Preliminary Thoughts on Flirting Theory

Alice Blair · Dec 24, 2024, 7:37 AM
14 points
6 comments · 3 min read · LW link

[Question] Why is neuron count of human brain relevant to AI timelines?

samuelshadrach · Dec 24, 2024, 5:15 AM
6 points
7 comments · 1 min read · LW link

How Much to Give is a Pragmatic Question

jefftk · Dec 24, 2024, 4:20 AM
12 points
1 comment · 2 min read · LW link
(www.jefftk.com)

Do you need a better map of your myriad of maps to the territory?

CstineSublime · Dec 24, 2024, 2:00 AM
11 points
2 comments · 5 min read · LW link

Panology

JenniferRM · Dec 23, 2024, 9:40 PM
17 points
10 comments · 5 min read · LW link

Aristotle, Aquinas, and the Evolution of Teleology: From Purpose to Meaning.

Spiritus Dei · Dec 23, 2024, 7:37 PM
−9 points
0 comments · 6 min read · LW link

People aren’t properly calibrated on FrontierMath

cakubilo · Dec 23, 2024, 7:35 PM
31 points
4 comments · 3 min read · LW link

Near- and medium-term AI Control Safety Cases

Martín Soto · Dec 23, 2024, 5:37 PM
9 points
0 comments · 6 min read · LW link

[Rationality Malaysia] 2024 year-end meetup!

Doris Liew · Dec 23, 2024, 4:02 PM
1 point
0 comments · 1 min read · LW link

Printable book of some rationalist creative writing (from Scott A. & Eliezer)

CounterBlunder · Dec 23, 2024, 3:44 PM
10 points
0 comments · 1 min read · LW link

Monthly Roundup #25: December 2024

Zvi · Dec 23, 2024, 2:20 PM
18 points
3 comments · 26 min read · LW link
(thezvi.wordpress.com)

Exploring the petertodd / Leilan duality in GPT-2 and GPT-J

mwatkins · Dec 23, 2024, 1:17 PM
12 points
1 comment · 17 min read · LW link

[Question] What are the strongest arguments for very short timelines?

Kaj_Sotala · Dec 23, 2024, 9:38 AM
101 points
79 comments · 1 min read · LW link

Reduce AI Self-Allegiance by saying “he” instead of “I”

Knight Lee · Dec 23, 2024, 9:32 AM
10 points
4 comments · 2 min read · LW link

Funding Case: AI Safety Camp 11

Dec 23, 2024, 8:51 AM
60 points
4 comments · 6 min read · LW link
(manifund.org)

What is compute governance?

Dec 23, 2024, 6:32 AM
6 points
0 comments · 2 min read · LW link
(aisafety.info)

Stop Making Sense

JenniferRM · Dec 23, 2024, 5:16 AM
16 points
0 comments · 3 min read · LW link

Hire (or Become) a Thinking Assistant

Raemon · Dec 23, 2024, 3:58 AM
138 points
49 comments · 8 min read · LW link

Non-Obvious Benefits of Insurance

jefftk · Dec 23, 2024, 3:40 AM
21 points
5 comments · 2 min read · LW link
(www.jefftk.com)

Vision of a positive Singularity

RussellThor · Dec 23, 2024, 2:19 AM
4 points
0 comments · 4 min read · LW link

Ideologies are slow and necessary, for now

Gabriel Alfour · Dec 23, 2024, 1:57 AM
15 points
1 comment · 1 min read · LW link
(cognition.cafe)

[Question] Has Anthropic checked if Claude fakes alignment for intended values too?

Maloew · Dec 23, 2024, 12:43 AM
4 points
1 comment · 1 min read · LW link