Defining Optimization in a Deeper Way Part 3

J Bostock · 20 Jul 2022 22:06 UTC
8 points
0 comments · 2 min read · LW link

Cognitive Risks of Adolescent Binge Drinking

20 Jul 2022 21:10 UTC
70 points
12 comments · 10 min read · LW link
(acesounderglass.com)

Why AGI Timeline Research/Discourse Might Be Overrated

Noosphere89 · 20 Jul 2022 20:26 UTC
5 points
0 comments · 1 min read · LW link
(forum.effectivealtruism.org)

Enlightenment Values in a Vulnerable World

Maxwell Tabarrok · 20 Jul 2022 19:52 UTC
15 points
6 comments · 31 min read · LW link
(maximumprogress.substack.com)

Countering arguments against working on AI safety

Rauno Arike · 20 Jul 2022 18:23 UTC
7 points
2 comments · 7 min read · LW link

A Short Intro to Humans

Ben Amitay · 20 Jul 2022 15:28 UTC
1 point
1 comment · 7 min read · LW link

How to Diversify Conceptual Alignment: the Model Behind Refine

adamShimi · 20 Jul 2022 10:44 UTC
87 points
11 comments · 8 min read · LW link

[Question] What are the simplest questions in applied rationality where you don’t know the answer to?

ChristianKl · 20 Jul 2022 9:53 UTC
26 points
11 comments · 1 min read · LW link

AI Safety Cheatsheet / Quick Reference

Zohar Jackson · 20 Jul 2022 9:39 UTC
3 points
0 comments · 1 min read · LW link
(github.com)

Getting Unstuck on Counterfactuals

Chris_Leong · 20 Jul 2022 5:31 UTC
7 points
1 comment · 2 min read · LW link

Pitfalls with Proofs

scasper · 19 Jul 2022 22:21 UTC
19 points
21 comments · 8 min read · LW link

A daily routine I do for my AI safety research work

scasper · 19 Jul 2022 21:58 UTC
21 points
7 comments · 1 min read · LW link

Progress links and tweets, 2022-07-19

jasoncrawford · 19 Jul 2022 20:50 UTC
11 points
1 comment · 1 min read · LW link
(rootsofprogress.org)

Applications are open for CFAR workshops in Prague this fall!

John Steidley · 19 Jul 2022 18:29 UTC
64 points
3 comments · 2 min read · LW link

Sexual Abuse attitudes might be infohazardous

Pseudonymous Otter · 19 Jul 2022 18:06 UTC
254 points
71 comments · 1 min read · LW link

Spending Update 2022

jefftk · 19 Jul 2022 14:10 UTC
28 points
0 comments · 3 min read · LW link
(www.jefftk.com)

Abram Demski’s ELK thoughts and proposal—distillation

Rubi J. Hudson · 19 Jul 2022 6:57 UTC
16 points
8 comments · 16 min read · LW link

Bounded complexity of solving ELK and its implications

Rubi J. Hudson · 19 Jul 2022 6:56 UTC
11 points
4 comments · 18 min read · LW link

Help ARC evaluate capabilities of current language models (still need people)

Beth Barnes · 19 Jul 2022 4:55 UTC
95 points
6 comments · 2 min read · LW link

A Critique of AI Alignment Pessimism

ExCeph · 19 Jul 2022 2:28 UTC
9 points
1 comment · 9 min read · LW link

Ars D&D.Sci: Mysteries of Mana Evaluation & Ruleset

aphyer · 19 Jul 2022 2:06 UTC
30 points
4 comments · 5 min read · LW link

Marburg Virus Pandemic Prediction Checklist

DirectedEvolution · 18 Jul 2022 23:15 UTC
30 points
0 comments · 5 min read · LW link

At what point will we know if Eliezer’s predictions are right or wrong?

anonymous123456 · 18 Jul 2022 22:06 UTC
5 points
6 comments · 1 min read · LW link

Modelling Deception

Garrett Baker · 18 Jul 2022 21:21 UTC
15 points
0 comments · 7 min read · LW link

Are Intelligence and Generality Orthogonal?

cubefox · 18 Jul 2022 20:07 UTC
18 points
16 comments · 1 min read · LW link

Without specific countermeasures, the easiest path to transformative AI likely leads to AI takeover

Ajeya Cotra · 18 Jul 2022 19:06 UTC
364 points
94 comments · 75 min read · LW link · 1 review

Turning Some Inconsistent Preferences into Consistent Ones

niplav · 18 Jul 2022 18:40 UTC
23 points
5 comments · 12 min read · LW link

Addendum: A non-magical explanation of Jeffrey Epstein

lc · 18 Jul 2022 17:40 UTC
80 points
21 comments · 11 min read · LW link

Launching a new progress institute, seeking a CEO

jasoncrawford · 18 Jul 2022 16:58 UTC
25 points
2 comments · 3 min read · LW link
(rootsofprogress.org)

Machine Learning Model Sizes and the Parameter Gap [abridged]

Pablo Villalobos · 18 Jul 2022 16:51 UTC
20 points
0 comments · 1 min read · LW link
(epochai.org)

Quantilizers and Generative Models

Adam Jermyn · 18 Jul 2022 16:32 UTC
24 points
5 comments · 4 min read · LW link

AI Hiroshima (Does A Vivid Example Of Destruction Forestall Apocalypse?)

Sable · 18 Jul 2022 12:06 UTC
4 points
4 comments · 2 min read · LW link

How the ---- did Feynman Get Here !?

George3d6 · 18 Jul 2022 9:43 UTC
8 points
8 comments · 3 min read · LW link
(www.epistem.ink)

Conditioning Generative Models for Alignment

Jozdien · 18 Jul 2022 7:11 UTC
58 points
8 comments · 20 min read · LW link

Training goals for large language models

Johannes Treutlein · 18 Jul 2022 7:09 UTC
28 points
5 comments · 19 min read · LW link

A distillation of Evan Hubinger’s training stories (for SERI MATS)

Daphne_W · 18 Jul 2022 3:38 UTC
15 points
1 comment · 10 min read · LW link

Forecasting ML Benchmarks in 2023

jsteinhardt · 18 Jul 2022 2:50 UTC
36 points
20 comments · 12 min read · LW link
(bounded-regret.ghost.io)

What should you change in response to an “emergency”? And AI risk

AnnaSalamon · 18 Jul 2022 1:11 UTC
329 points
60 comments · 6 min read · LW link · 1 review

Deception?! I ain’t got time for that!

Paul Colognese · 18 Jul 2022 0:06 UTC
55 points
5 comments · 13 min read · LW link

How Interpretability can be Impactful

Connall Garrod · 18 Jul 2022 0:06 UTC
18 points
0 comments · 37 min read · LW link

Why you might expect homogeneous take-off: evidence from ML research

Andrei Alexandru · 17 Jul 2022 20:31 UTC
24 points
0 comments · 10 min read · LW link

Examples of AI Increasing AI Progress

ThomasW · 17 Jul 2022 20:06 UTC
107 points
14 comments · 1 min read · LW link

Four questions I ask AI safety researchers

Akash · 17 Jul 2022 17:25 UTC
17 points
0 comments · 1 min read · LW link

Why I Think Abrupt AI Takeoff

lincolnquirk · 17 Jul 2022 17:04 UTC
14 points
6 comments · 1 min read · LW link

Culture wars in riddle format

Malmesbury · 17 Jul 2022 14:51 UTC
7 points
28 comments · 3 min read · LW link

Bangalore LW/ACX Meetup in person

Vyakart · 17 Jul 2022 6:53 UTC
1 point
0 comments · 1 min read · LW link

Resolve Cycles

CFAR!Duncan · 16 Jul 2022 23:17 UTC
134 points
8 comments · 10 min read · LW link

Alignment as Game Design

Shoshannah Tekofsky · 16 Jul 2022 22:36 UTC
11 points
7 comments · 2 min read · LW link

Risk Management from a Climber’s Perspective

Annapurna · 16 Jul 2022 21:14 UTC
5 points
0 comments · 6 min read · LW link
(jorgevelez.substack.com)

Cognitive Instability, Physicalism, and Free Will

dadadarren · 16 Jul 2022 13:13 UTC
5 points
27 comments · 2 min read · LW link
(www.sleepingbeautyproblem.com)