Out of the Box

jesseduffield 13 Nov 2023 23:43 UTC
5 points
1 comment 7 min read LW link

Loudly Give Up, Don’t Quietly Fade

Screwtape 13 Nov 2023 23:30 UTC
138 points
11 comments 6 min read LW link

Great Empathy and Great Response Ability

positivesum 13 Nov 2023 23:04 UTC
16 points
0 comments 3 min read LW link
(tryingtruly.substack.com)

Theories of Change for AI Auditing

13 Nov 2023 19:33 UTC
53 points
0 comments 18 min read LW link
(www.apolloresearch.ai)

They are made of repeating patterns

quetzal_rainbow 13 Nov 2023 18:17 UTC
49 points
4 comments 2 min read LW link

How to Upload a Mind (In Three Not-So-Easy Steps)

13 Nov 2023 18:13 UTC
26 points
0 comments 7 min read LW link
(youtu.be)

Non-myopia stories

lberglund 13 Nov 2023 17:52 UTC
28 points
10 comments 7 min read LW link

It’s OK to eat shrimp: EAs Make Invalid Inferences About Fish Qualia and Moral Patienthood

Mikhail Samin 13 Nov 2023 16:51 UTC
2 points
17 comments 1 min read LW link

Suggestions for chess puzzles

Zane 13 Nov 2023 15:39 UTC
13 points
1 comment 1 min read LW link

Why small phenomenons are relevant to morality

Ryo 13 Nov 2023 15:25 UTC
1 point
0 comments 3 min read LW link

Optionality approach to ethics

Ryo 13 Nov 2023 15:23 UTC
7 points
2 comments 3 min read LW link

Redirecting one’s own taxes as an effective altruism method

David Gross 13 Nov 2023 15:17 UTC
1 point
34 comments 16 min read LW link

AISC Project: Benchmarks for Stable Reflectivity

jacquesthibs 13 Nov 2023 14:51 UTC
17 points
0 comments 8 min read LW link

AISC Project: Modelling Trajectories of Language Models

NickyP 13 Nov 2023 14:33 UTC
26 points
0 comments 12 min read LW link

Bostrom Goes Unheard

Zvi 13 Nov 2023 14:11 UTC
81 points
9 comments 18 min read LW link

November hangout in Warsaw

ntoxeg 13 Nov 2023 13:20 UTC
1 point
1 comment 1 min read LW link

The Science Algorithm AISC Project

Johannes C. Mayer 13 Nov 2023 12:52 UTC
12 points
0 comments 1 min read LW link
(docs.google.com)

You can just spontaneously call people you haven’t met in years

lc 13 Nov 2023 5:21 UTC
154 points
19 comments 1 min read LW link

Zvi’s Manifold Markets House Rules

Zvi 13 Nov 2023 0:28 UTC
39 points
4 comments 3 min read LW link

[Question] What’s your best utilitarian model for risking your best kidneys?

Ilio 12 Nov 2023 23:01 UTC
−3 points
4 comments 1 min read LW link

Helpful examples to get a sense of modern automated manipulation

trevor 12 Nov 2023 20:49 UTC
33 points
3 comments 9 min read LW link

The Snuggle/Date/Slap Protocol

MadHatter 12 Nov 2023 20:44 UTC
−21 points
4 comments 2 min read LW link

Two children’s stories

Optimization Process 12 Nov 2023 20:29 UTC
11 points
1 comment 7 min read LW link

The Fundamental Theorem for measurable factor spaces

Matthias G. Mayer 12 Nov 2023 19:25 UTC
38 points
2 comments 2 min read LW link

How accurate are standard Dark Triad personality scales?

jamesbill 12 Nov 2023 8:21 UTC
0 points
2 comments 2 min read LW link

[Question] What ML gears do you like?

Ulisse Mini 11 Nov 2023 19:10 UTC
25 points
4 comments 1 min read LW link

Smart Sessions—Finally a (kinda) window-centric session manager

Eli Tyre 11 Nov 2023 18:54 UTC
13 points
3 comments 5 min read LW link

AISC project: SatisfIA – AI that satisfies without overdoing it

Jobst Heitzig 11 Nov 2023 18:22 UTC
11 points
0 comments 1 min read LW link
(docs.google.com)

Control Symmetry: why we might want to start investigating asymmetric alignment interventions

domenicrosati 11 Nov 2023 17:27 UTC
23 points
1 comment 2 min read LW link

Game Theory without Argmax [Part 2]

Cleo Nardo 11 Nov 2023 16:02 UTC
31 points
14 comments 13 min read LW link

Game Theory without Argmax [Part 1]

Cleo Nardo 11 Nov 2023 15:59 UTC
53 points
16 comments 19 min read LW link

It’s OK to be biased towards humans

dr_s 11 Nov 2023 11:59 UTC
55 points
69 comments 6 min read LW link

The Top AI Safety Bets for 2023: GiveWiki’s Latest Recommendations

Dawn Drescher 11 Nov 2023 9:04 UTC
2 points
2 comments 1 min read LW link

Artificial General Horsiness

robotelvis 11 Nov 2023 5:15 UTC
4 points
0 comments 5 min read LW link
(messyprogress.substack.com)

Palisade is hiring Research Engineers

11 Nov 2023 3:09 UTC
22 points
0 comments 3 min read LW link

Open Phil releases RFPs on LLM Benchmarks and Forecasting

LawrenceC 11 Nov 2023 3:01 UTC
53 points
0 comments 2 min read LW link
(www.openphilanthropy.org)

Memo on some neglected topics

Lukas Finnveden 11 Nov 2023 2:01 UTC
28 points
2 comments 1 min read LW link
(open.substack.com)

Who is Sam Bankman-Fried (SBF) really, and how could he have done what he did? - three theories and a lot of evidence

spencerg 11 Nov 2023 1:04 UTC
36 points
28 comments 1 min read LW link
(www.spencergreenberg.com)

Survey on the acceleration risks of our new RFPs to study LLM capabilities

Ajeya Cotra 10 Nov 2023 23:59 UTC
27 points
1 comment 1 min read LW link

Rat Fest 2024

LoganChipkin 10 Nov 2023 23:25 UTC
1 point
0 comments 1 min read LW link

How I Think, Part Three: Weighing Cryonics

Richard Henage 10 Nov 2023 22:21 UTC
4 points
1 comment 2 min read LW link

Linear encoding of character-level information in GPT-J token embeddings

10 Nov 2023 22:19 UTC
34 points
4 comments 28 min read LW link

Follow-up survey: inositol

Elizabeth 10 Nov 2023 19:30 UTC
13 points
1 comment 1 min read LW link
(acesounderglass.com)

We have promising alignment plans with low taxes

Seth Herd 10 Nov 2023 18:51 UTC
31 points
9 comments 5 min read LW link

[Question] Vector search on a large dataset?

camsdixon 10 Nov 2023 18:43 UTC
−1 points
2 comments 1 min read LW link

About Me

Abe Dillon 10 Nov 2023 18:32 UTC
3 points
0 comments 1 min read LW link

Metaculus Introduces AI-Powered Community Insights to Reveal Factors Driving User Forecasts

ChristianWilliams 10 Nov 2023 17:57 UTC
6 points
0 comments 1 min read LW link
(www.metaculus.com)

Joy in the Here and Real

Screwtape 10 Nov 2023 17:22 UTC
18 points
0 comments 2 min read LW link

Artefacts generated by mode collapse in GPT-4 Turbo serve as adversarial attacks.

Sohaib Imran 10 Nov 2023 15:23 UTC
11 points
0 comments 2 min read LW link

Wastewater RNA Read Lengths

jefftk 10 Nov 2023 15:20 UTC
13 points
0 comments 4 min read LW link
(www.jefftk.com)