How could we know that an AGI system will have good consequences?

So8res

7 Nov 2022 22:42 UTC

112 points

25 comments · 5 min read · LW link

A Walkthrough of Interpretability in the Wild (w/ authors Kevin Wang, Arthur Conmy & Alexandre Variengien)

Neel Nanda

7 Nov 2022 22:39 UTC

30 points

15 comments · 3 min read · LW link

(youtu.be)

Intercept article about lab accidents

ChristianKl

7 Nov 2022 21:10 UTC

23 points

9 comments · 1 min read · LW link

(theintercept.com)

The biological function of love for non-kin is to gain the trust of people we cannot deceive

chaosmage

7 Nov 2022 20:26 UTC

43 points

3 comments · 8 min read · LW link

Distillation Experiment: Chunk-Knitting

DirectedEvolution

7 Nov 2022 19:56 UTC

10 points

3 comments · 6 min read · LW link

Thinking About Mastodon

jefftk

7 Nov 2022 19:40 UTC

33 points

17 comments · 1 min read · LW link

(www.jefftk.com)

[Question] Ideas for tiny research projects related to rationality?

Frej

7 Nov 2022 18:45 UTC

3 points

1 comment · 1 min read · LW link

Loss of control of AI is not a likely source of AI x-risk

squek

7 Nov 2022 18:44 UTC

−6 points

0 comments · 5 min read · LW link

AI Safety Unconference NeurIPS 2022

Orpheus

7 Nov 2022 15:39 UTC

25 points

0 comments · 1 min read · LW link

(aisafetyevents.org)

Hacker-AI – Does it already exist?

Erland Wittkotter

7 Nov 2022 14:01 UTC

3 points

12 comments · 11 min read · LW link

What’s the Deal with Elon Musk and Twitter?

Zvi

7 Nov 2022 13:50 UTC

60 points

13 comments · 31 min read · LW link

(thezvi.wordpress.com)

How to Make Easy Decisions

lynettebye

7 Nov 2022 13:17 UTC

17 points

3 comments · 2 min read · LW link

Opportunities that surprised us during our Clearer Thinking Regrants program

spencerg

7 Nov 2022 13:09 UTC

20 points

0 comments · 9 min read · LW link

4 Key Assumptions in AI Safety

Prometheus

7 Nov 2022 10:50 UTC

20 points

5 comments · 7 min read · LW link

Google Search as a Washed Up Service Dog: “I HALP!”

Shmi

7 Nov 2022 7:02 UTC

20 points

8 comments · 1 min read · LW link

[Book Review] “Station Eleven” by Emily St. John Mandel

lsusr

7 Nov 2022 5:56 UTC

17 points

1 comment · 1 min read · LW link

Counterfactability

Scott Garrabrant

7 Nov 2022 5:39 UTC

40 points

5 comments · 11 min read · LW link

2022 LessWrong Census?

Matt He

7 Nov 2022 5:16 UTC

67 points

13 comments · 1 min read · LW link

A philosopher’s critique of RLHF

TW123

7 Nov 2022 2:42 UTC

55 points

8 comments · 2 min read · LW link

[Question] Is there any discussion on avoiding being Dutch-booked or otherwise taken advantage of one’s bounded rationality by refusing to engage?

Shmi

7 Nov 2022 2:36 UTC

38 points

29 comments · 1 min read · LW link

Exams-Only Universities

Mati_Roy

6 Nov 2022 22:05 UTC

80 points

40 comments · 2 min read · LW link

Democracy Is in Danger, but Not for the Reasons You Think

ExCeph

6 Nov 2022 21:15 UTC

−7 points

4 comments · 12 min read · LW link

(ginnungagapfoundation.wordpress.com)

Playground Game: Monster

jefftk

6 Nov 2022 16:00 UTC

14 points

4 comments · 1 min read · LW link

(www.jefftk.com)

[Question] Has Pascal’s Mugging problem been completely solved yet?

EniScien

6 Nov 2022 12:52 UTC

3 points

11 comments · 1 min read · LW link

[Question] Should I Pursue a PhD?

DragonGod

6 Nov 2022 10:58 UTC

8 points

8 comments · 2 min read · LW link

You won’t solve alignment without agent foundations

Mikhail Samin

6 Nov 2022 8:07 UTC

29 points

3 comments · 8 min read · LW link

Word-Distance vs Idea-Distance: The Case for Lanoitaring

Sable

6 Nov 2022 5:25 UTC

7 points

7 comments · 7 min read · LW link

(affablyevil.substack.com)

Apple Cider Syrup

jefftk

6 Nov 2022 2:10 UTC

11 points

6 comments · 1 min read · LW link

(www.jefftk.com)

What is epigenetics?

Metacelsus

6 Nov 2022 1:24 UTC

78 points

4 comments · 6 min read · LW link

(denovo.substack.com)

Response

Jarred Filmer

6 Nov 2022 1:03 UTC

29 points

2 comments · 12 min read · LW link

[Question] Has anyone increased their AGI timelines?

Darren McKee

6 Nov 2022 0:03 UTC

39 points

12 comments · 1 min read · LW link

Takeaways from a survey on AI alignment resources

DanielFilan

5 Nov 2022 23:40 UTC

73 points

10 comments · 6 min read · LW link · 1 review

(danielfilan.com)

Unpricable Information and Certificate Hell

eva_

5 Nov 2022 22:56 UTC

13 points

2 comments · 6 min read · LW link

Recommend HAIST resources for assessing the value of RLHF-related alignment research

Sam Marks and Xander Davies

5 Nov 2022 20:58 UTC

26 points

9 comments · 3 min read · LW link

Instead of technical research, more people should focus on buying time

Orpheus16, Olive Branch and Thomas Larsen

5 Nov 2022 20:43 UTC

101 points

45 comments · 14 min read · LW link

Provably Honest—A First Step

Srijanak De

5 Nov 2022 19:18 UTC

10 points

2 comments · 8 min read · LW link

Should AI focus on problem-solving or strategic planning? Why not both?

Oliver Siegel

5 Nov 2022 19:17 UTC

−12 points

3 comments · 1 min read · LW link

How to store human values on a computer

Oliver Siegel

5 Nov 2022 19:17 UTC

−12 points

17 comments · 1 min read · LW link

The Slippery Slope from DALLE-2 to Deepfake Anarchy

scasper

5 Nov 2022 14:53 UTC

17 points

9 comments · 11 min read · LW link

When can a mimic surprise you? Why generative models handle seemingly ill-posed problems

David Johnston

5 Nov 2022 13:19 UTC

8 points

4 comments · 16 min read · LW link

My summary of “Pragmatic AI Safety”

Eleni Angelou

5 Nov 2022 12:54 UTC

3 points

0 comments · 5 min read · LW link

Review of the Challenge

SD Marlow

5 Nov 2022 6:38 UTC

−14 points

5 comments · 2 min read · LW link

Spectrum of Independence

jefftk

5 Nov 2022 2:40 UTC

43 points

7 comments · 1 min read · LW link

(www.jefftk.com)

Metaculus is seeking Software Engineers

dschwarz

5 Nov 2022 0:42 UTC

18 points

0 comments · 1 min read · LW link

(apply.workable.com)

Should we “go against nature”?

jasoncrawford

4 Nov 2022 22:14 UTC

10 points

3 comments · 2 min read · LW link

(rootsofprogress.org)

How much should we care about non-human animals?

bokov

4 Nov 2022 21:36 UTC

17 points

8 comments · 2 min read · LW link

(www.lesswrong.com)

For ELK truth is mostly a distraction

c.trout

4 Nov 2022 21:14 UTC

44 points

0 comments · 21 min read · LW link

Toy Models and Tegum Products

Adam Jermyn

4 Nov 2022 18:51 UTC

28 points

7 comments · 5 min read · LW link

Follow up to medical miracle

Elizabeth

4 Nov 2022 18:00 UTC

78 points

5 comments · 6 min read · LW link

(acesounderglass.com)

Cross-Void Optimization

pneumynym

4 Nov 2022 17:47 UTC

1 point

1 comment · 8 min read · LW link