[Question] What ML gears do you like?

Ulisse Mini 11 Nov 2023 19:10 UTC

25 points

4 comments 1 min read LW link

Smart Sessions—Finally a (kinda) window-centric session manager

Eli Tyre 11 Nov 2023 18:54 UTC

14 points

4 comments 5 min read LW link

AISC project: SatisfIA – AI that satisfies without overdoing it

Jobst Heitzig 11 Nov 2023 18:22 UTC

12 points

0 comments 1 min read LW link

(docs.google.com)

Control Symmetry: why we might want to start investigating asymmetric alignment interventions

domenicrosati 11 Nov 2023 17:27 UTC

25 points

1 comment 2 min read LW link

Game Theory without Argmax [Part 2]

Cleo Nardo 11 Nov 2023 16:02 UTC

31 points

14 comments 13 min read LW link

Game Theory without Argmax [Part 1]

Cleo Nardo 11 Nov 2023 15:59 UTC

78 points

18 comments 19 min read LW link

It’s OK to be biased towards humans

dr_s 11 Nov 2023 11:59 UTC

53 points

69 comments 6 min read LW link

The Top AI Safety Bets for 2023: GiveWiki’s Latest Recommendations

Dawn Drescher 11 Nov 2023 9:04 UTC

3 points

2 comments 8 min read LW link

Artificial General Horsiness

robotelvis 11 Nov 2023 5:15 UTC

4 points

0 comments 5 min read LW link

(messyprogress.substack.com)

Palisade is hiring Research Engineers

Charlie Rogers-Smith and Jeffrey Ladish

11 Nov 2023 3:09 UTC

23 points

0 comments 3 min read LW link

Open Phil releases RFPs on LLM Benchmarks and Forecasting

LawrenceC 11 Nov 2023 3:01 UTC

53 points

0 comments 2 min read LW link

(www.openphilanthropy.org)

Memo on some neglected topics

Lukas Finnveden 11 Nov 2023 2:01 UTC

28 points

2 comments 7 min read LW link

(open.substack.com)

Who is Sam Bankman-Fried (SBF) really, and how could he have done what he did? - three theories and a lot of evidence

spencerg 11 Nov 2023 1:04 UTC

36 points

28 comments 9 min read LW link

(www.spencergreenberg.com)

Survey on the acceleration risks of our new RFPs to study LLM capabilities

Ajeya Cotra 10 Nov 2023 23:59 UTC

29 points

1 comment 8 min read LW link

Rat Fest 2024

LoganChipkin 10 Nov 2023 23:25 UTC

7 points

6 comments 1 min read LW link

How I Think, Part Three: Weighing Cryonics

Richard Henage 10 Nov 2023 22:21 UTC

4 points

1 comment 2 min read LW link

Linear encoding of character-level information in GPT-J token embeddings

mwatkins and Joseph Bloom

10 Nov 2023 22:19 UTC

35 points

4 comments 28 min read LW link

Follow-up survey: inositol

Elizabeth 10 Nov 2023 19:30 UTC

13 points

1 comment 1 min read LW link

(acesounderglass.com)

We have promising alignment plans with low taxes

Seth Herd 10 Nov 2023 18:51 UTC

46 points

9 comments 5 min read LW link

[Question] Vector search on a large dataset?

camsdixon 10 Nov 2023 18:43 UTC

−1 points

2 comments 1 min read LW link

Metaculus Introduces AI-Powered Community Insights to Reveal Factors Driving User Forecasts

ChristianWilliams 10 Nov 2023 17:57 UTC

6 points

0 comments 1 min read LW link

(www.metaculus.com)

Joy in the Here and Real

Screwtape 10 Nov 2023 17:22 UTC

19 points

0 comments 2 min read LW link

Artefacts generated by mode collapse in GPT-4 Turbo serve as adversarial attacks.

Sohaib Imran 10 Nov 2023 15:23 UTC

11 points

0 comments 2 min read LW link

Wastewater RNA Read Lengths

jefftk 10 Nov 2023 15:20 UTC

13 points

0 comments 4 min read LW link

(www.jefftk.com)

Update on the UK AI Summit and the UK’s Plans

Elliot Mckernon 10 Nov 2023 14:47 UTC

11 points

0 comments 8 min read LW link

Liv Boeree Ted Talk Moloch & AI

Neil 10 Nov 2023 14:04 UTC

10 points

2 comments 1 min read LW link

(m.youtube.com)

Picking Mentors For Research Programmes

Raymond Douglas 10 Nov 2023 13:01 UTC

105 points

8 comments 4 min read LW link

GPT-2030 and Catastrophic Drives: Four Vignettes

jsteinhardt 10 Nov 2023 7:30 UTC

50 points

5 comments 10 min read LW link

(bounded-regret.ghost.io)

Crock, Crocker, Crockiest

Screwtape 10 Nov 2023 6:14 UTC

21 points

5 comments 6 min read LW link

AI Timelines

habryka, Daniel Kokotajlo, Ajeya Cotra and Ege Erdil

10 Nov 2023 5:28 UTC

302 points

144 comments 51 min read LW link 2 reviews

ACI#6: A Non-Dualistic ACI Model

Akira Pyinya 9 Nov 2023 23:01 UTC

10 points

2 comments 6 min read LW link

How I got so excited about HowTruthful

Bruce Lewis 9 Nov 2023 18:49 UTC

17 points

4 comments 5 min read LW link

The case for “Generous Tit for Tat” as the ultimate game theory strategy

positivesum 9 Nov 2023 18:41 UTC

3 points

3 comments 8 min read LW link

(tryingtruly.substack.com)

Text Posts from the Kids Group: 2021

jefftk 9 Nov 2023 17:50 UTC

38 points

1 comment 8 min read LW link

(www.jefftk.com)

AI #37: Moving Too Fast

Zvi 9 Nov 2023 17:50 UTC

53 points

5 comments 76 min read LW link

(thezvi.wordpress.com)

Learning-theoretic agenda reading list

Vanessa Kosoy 9 Nov 2023 17:25 UTC

108 points

1 comment 2 min read LW link 1 review

Open-ended/Phenomenal Ethics (TLDR)

Ryo 9 Nov 2023 16:58 UTC

3 points

0 comments 1 min read LW link

Polysemantic Attention Head in a 4-Layer Transformer

Jett Janiak, cmathw and StefanHex

9 Nov 2023 16:16 UTC

51 points

0 comments 6 min read LW link

On OpenAI Dev Day

Zvi 9 Nov 2023 16:10 UTC

60 points

0 comments 15 min read LW link

(thezvi.wordpress.com)

Antropical Probabilities Are Fully Explained by Difference in Possible Outcomes

Ape in the coat 9 Nov 2023 15:34 UTC

19 points

7 comments 5 min read LW link

A free to enter, 240 character, open-source iterated prisoner’s dilemma tournament

Isaac King 9 Nov 2023 8:24 UTC

64 points

19 comments 1 min read LW link

(manifold.markets)

Into AI Safety Episodes 1 & 2

jacobhaimes 9 Nov 2023 4:36 UTC

3 points

0 comments 1 min read LW link

(into-ai-safety.github.io)

Making Bad Decisions On Purpose

Screwtape 9 Nov 2023 3:36 UTC

49 points

8 comments 5 min read LW link

Metaculus’s New Sidebar Helps You Find Forecasts Faster

ChristianWilliams 8 Nov 2023 20:56 UTC

15 points

0 comments 1 min read LW link

(www.metaculus.com)

Open-ended ethics of phenomena (a desiderata with universal morality)

Ryo 8 Nov 2023 20:10 UTC

1 point

0 comments 8 min read LW link

Open Agency model can solve the AI regulation dilemma

Roman Leventov 8 Nov 2023 20:00 UTC

22 points

1 comment 2 min read LW link

Gothenburg LW / ACX meetup

Stefan 8 Nov 2023 19:52 UTC

1 point

0 comments 1 min read LW link

[Question] Why is lesswrong blocking wget and curl (scrape)?

nik lacombe 8 Nov 2023 19:42 UTC

23 points

15 comments 1 min read LW link

[Question] Is there a lesswrong archive of all public posts?

nik lacombe 8 Nov 2023 19:26 UTC

15 points

7 comments 1 min read LW link

Five projects from AI Safety Hub Labs 2023

Charlie Griffin 8 Nov 2023 19:19 UTC

47 points

1 comment 6 min read LW link

(www.aisafetyhub.org)