The problem with proportional extrapolation

pathos_bot 30 Jan 2024 23:40 UTC
6 points
0 comments, 1 min read

Counterfactual Mechanism Networks

StrivingForLegibility 30 Jan 2024 20:30 UTC
4 points
0 comments, 5 min read

Control vs Selection: Civilisation is best at control, but navigating AGI requires selection

VojtaKovarik 30 Jan 2024 19:06 UTC
7 points
1 comment, 1 min read

AI governance frames

NathanBarnard 30 Jan 2024 18:18 UTC
3 points
0 comments, 3 min read

Deciding What Project/Org to Start: A Guide to Prioritization Research

Alexandra Bos 30 Jan 2024 18:13 UTC
8 points
0 comments, 1 min read

on neodymium magnets

bhauth 30 Jan 2024 15:58 UTC
47 points
6 comments, 4 min read
(www.bhauth.com)

[Question] Can we create self-improving AIs that perfect their own ethics?

Gabi QUENE 30 Jan 2024 14:45 UTC
1 point
10 comments, 1 min read

Childhood and Education Roundup #4

Zvi 30 Jan 2024 13:50 UTC
43 points
10 comments, 24 min read
(thezvi.wordpress.com)

Last call for submissions for TAIS 2024!

Blaine 30 Jan 2024 12:08 UTC
4 points
0 comments, 1 min read
(tais2024.cc)

[Question] Has anyone actually changed their mind regarding Sleeping Beauty problem?

Ape in the coat 30 Jan 2024 8:34 UTC
14 points
50 comments, 1 min read

San Fernando Valley Rationality: February 15, 2024

Thomas Broadley 30 Jan 2024 4:40 UTC
3 points
0 comments, 1 min read

The case for more ambitious language model evals

Jozdien 30 Jan 2024 0:01 UTC
109 points
27 comments, 5 min read

A short ‘derivation’ of Watanabe’s Free Energy Formula

Wuschel Schulz 29 Jan 2024 23:41 UTC
13 points
6 comments, 7 min read

How important is AI hacking as LLMs advance?

Artyom Karpov 29 Jan 2024 18:41 UTC
1 point
0 comments, 6 min read

LLM Psychometrics: A Speculative Approach to AI Safety

pskl 29 Jan 2024 18:38 UTC
3 points
4 comments, 1 min read
(pascal.cc)

[Question] How to write better?

TeaTieAndHat 29 Jan 2024 17:02 UTC
7 points
24 comments, 1 min read

Processor clock speeds are not how fast AIs think

Ege Erdil 29 Jan 2024 14:39 UTC
130 points
55 comments, 2 min read

Natural selection for ingame character build optimisation

Kongo Landwalker 29 Jan 2024 11:34 UTC
8 points
5 comments, 2 min read

Analogy Bank for AI Safety

Rocket 29 Jan 2024 2:35 UTC
23 points
0 comments, 7 min read

Minneapolis-St Paul ACX Article Club: Meditation and LSD

25Hour 29 Jan 2024 1:24 UTC
3 points
0 comments, 1 min read

Simple distribution approximation: When sampled 100 times, can language models yield 80% A and 20% B?

29 Jan 2024 0:24 UTC
39 points
5 comments, 4 min read

Why I take short timelines seriously

NicholasKees 28 Jan 2024 22:27 UTC
121 points
29 comments, 4 min read

Win Friends and Influence People Ch. 2: The Bombshell

gull 28 Jan 2024 21:40 UTC
38 points
13 comments, 17 min read
(www.google.com)

Riga ACX February 2024 Meetup: 2023 in Review

Anastasia 28 Jan 2024 21:36 UTC
4 points
0 comments, 1 min read

San Francisco ACX Meetup “First Saturday”

28 Jan 2024 18:39 UTC
8 points
1 comment, 1 min read

Things You’re Allowed to Do: At the Dentist

rbinnn 28 Jan 2024 18:39 UTC
38 points
16 comments, 1 min read
(metavee.github.io)

[Question] What exactly did that great AI future involve again?

lukehmiles 28 Jan 2024 10:10 UTC
10 points
27 comments, 1 min read

Palworld development blog post

bhauth 28 Jan 2024 5:56 UTC
79 points
12 comments, 1 min read
(note.com)

Virtually Rational - VRChat Meetup

28 Jan 2024 5:52 UTC
25 points
3 comments, 1 min read

[Stanford Daily] Table Talk

sudo 28 Jan 2024 3:15 UTC
6 points
1 comment, 9 min read
(stanforddaily.com)

AI Law-a-Thon

Iknownothing 28 Jan 2024 2:30 UTC
5 points
3 comments, 1 min read

Chapter 1 of How to Win Friends and Influence People

gull 28 Jan 2024 0:32 UTC
48 points
5 comments, 17 min read
(www.google.com)

Don’t sleep on Coordination Takeoffs

trevor 27 Jan 2024 19:55 UTC
62 points
24 comments, 5 min read

Epistemic Hell

rogersbacon 27 Jan 2024 17:13 UTC
69 points
20 comments, 14 min read

David Burns Thinks Psychotherapy Is a Learnable Skill. Git Gud.

Morpheus 27 Jan 2024 13:21 UTC
27 points
20 comments, 11 min read
(podcast.clearerthinking.org)

Aligned AI is dual use technology

lc 27 Jan 2024 6:50 UTC
52 points
31 comments, 2 min read

Questions I’d Want to Ask an AGI+ to Test Its Understanding of Ethics

sweenesm 26 Jan 2024 23:40 UTC
14 points
6 comments, 4 min read

An Invitation to Refrain from Downvoting Posts into Net-Negative Karma

MikkW 26 Jan 2024 20:13 UTC
2 points
12 comments, 1 min read

The Good Balsamic Vinegar

jenn 26 Jan 2024 19:30 UTC
51 points
4 comments, 2 min read
(jenn.site)

The Perspective-based Explanation to the Reflective Inconsistency Paradox

dadadarren 26 Jan 2024 19:00 UTC
10 points
16 comments, 8 min read

To Boldly Code

StrivingForLegibility 26 Jan 2024 18:25 UTC
25 points
4 comments, 3 min read

Incorporating Mechanism Design Into Decision Theory

StrivingForLegibility 26 Jan 2024 18:25 UTC
17 points
4 comments, 4 min read

Making every researcher seek grants is a broken model

jasoncrawford 26 Jan 2024 16:06 UTC
152 points
41 comments, 4 min read
(rootsofprogress.org)

Notes on Innocence

David Gross 26 Jan 2024 14:45 UTC
12 points
21 comments, 19 min read

Stacked Laptop Monitor

jefftk 26 Jan 2024 14:10 UTC
22 points
5 comments, 1 min read
(www.jefftk.com)

Surgery Works Well Without The FDA

Maxwell Tabarrok 26 Jan 2024 13:31 UTC
42 points
28 comments, 4 min read
(maximumprogress.substack.com)

[Question] Workshop (hackathon, residence program, etc.) about for-profit AI Safety projects?

Roman Leventov 26 Jan 2024 9:49 UTC
21 points
5 comments, 1 min read

Without fundamental advances, misalignment and catastrophe are the default outcomes of training powerful AI

26 Jan 2024 7:22 UTC
161 points
60 comments, 57 min read

Approximately Bayesian Reasoning: Knightian Uncertainty, Goodhart, and the Look-Elsewhere Effect

RogerDearnaley 26 Jan 2024 3:58 UTC
13 points
0 comments, 11 min read

Musings on Cargo Cult Consciousness

Gareth Davidson 25 Jan 2024 23:00 UTC
−13 points
11 comments, 17 min read