Shard Theory in Nine Theses: a Distillation and Critical Appraisal

LawrenceC

19 Dec 2022 22:52 UTC

150 points

30 comments 18 min read LW link

[Question] Will research in AI risk jinx it? Consequences of training AI on AI risk arguments

Yann Dubois

19 Dec 2022 22:42 UTC

5 points

6 comments 1 min read LW link

AGI Timelines in Governance: Different Strategies for Different Timeframes

simeon_c and AmberDawn

19 Dec 2022 21:31 UTC

65 points

28 comments 10 min read LW link

Towards Hodge-podge Alignment

Cleo Nardo

19 Dec 2022 20:12 UTC

95 points

30 comments 9 min read LW link

Computational signatures of psychopathy

Cameron Berg

19 Dec 2022 17:01 UTC

30 points

3 comments 20 min read LW link

Results from a survey on tool use and workflows in alignment research

jacquesthibs, Jan, janus and Logan Riggs

19 Dec 2022 15:19 UTC

79 points

2 comments 19 min read LW link

Does ChatGPT’s performance warrant working on a tutor for children? [It’s time to take it to the lab.]

Bill Benzon

19 Dec 2022 15:12 UTC

13 points

5 comments 4 min read LW link

(new-savanna.blogspot.com)

Conditions for Superrationality-motivated Cooperation in a one-shot Prisoner’s Dilemma

Jim Buhler

19 Dec 2022 15:00 UTC

24 points

4 comments 5 min read LW link

Next Level Seinfeld

Zvi

19 Dec 2022 13:30 UTC

50 points

8 comments 1 min read LW link

(thezvi.wordpress.com)

CEA Disambiguation

jefftk

19 Dec 2022 13:20 UTC

25 points

0 comments 1 min read LW link

(www.jefftk.com)

Why mechanistic interpretability does not and cannot contribute to long-term AGI safety (from messages with a friend)

Remmelt

19 Dec 2022 12:02 UTC

−3 points

9 comments 31 min read LW link

Hacker-AI and Cyberwar 2.0+

Erland Wittkotter

19 Dec 2022 11:46 UTC

2 points

0 comments 15 min read LW link

Non-Technical Preparation for Hacker-AI and Cyberwar 2.0+

Erland Wittkotter

19 Dec 2022 11:42 UTC

2 points

0 comments 25 min read LW link

An Effective Grab Bag

stavros

19 Dec 2022 10:29 UTC

30 points

3 comments 7 min read LW link

Slick hyperfinite Ramsey theory proof

Alok Singh

19 Dec 2022 8:40 UTC

8 points

3 comments 1 min read LW link

(alok.github.io)

The True Spirit of Solstice?

Raemon

19 Dec 2022 8:00 UTC

71 points

31 comments 9 min read LW link

The Risk of Orbital Debris and One (Cheap) Way to Mitigate It

clans

19 Dec 2022 3:16 UTC

13 points

1 comment 4 min read LW link

(locationtbd.home.blog)

Why I think that teaching philosophy is high impact

Eleni Angelou

19 Dec 2022 3:11 UTC

5 points

0 comments 2 min read LW link

A template for doing annual reviews

peterslattery

19 Dec 2022 3:09 UTC

2 points

0 comments 1 min read LW link

Event [Berkeley]: Alignment Collaborator Speed-Meeting

AlexMennen and Carson Jones

19 Dec 2022 2:24 UTC

18 points

2 comments 1 min read LW link

An easier(?) end to the electoral college

ejacob

19 Dec 2022 2:09 UTC

2 points

2 comments 2 min read LW link

How Death Feels

sisyphus

18 Dec 2022 23:47 UTC

−7 points

9 comments 1 min read LW link

Why Are Women Hot?

Jacob Falkovich

18 Dec 2022 23:20 UTC

17 points

19 comments 11 min read LW link

[Question] Can we, in principle, know the measure of counterfactual quantum branches?

sisyphus

18 Dec 2022 22:07 UTC

1 point

15 comments 1 min read LW link

Boston Solstice 2022 Retrospective

jefftk

18 Dec 2022 19:00 UTC

19 points

3 comments 5 min read LW link

(www.jefftk.com)

Take 11: “Aligning language models” should be weirder.

Charlie Steiner

18 Dec 2022 14:14 UTC

34 points

0 comments 2 min read LW link

Bad at Arithmetic, Promising at Math

cohenmacaulay

18 Dec 2022 5:40 UTC

102 points

19 comments 20 min read LW link 1 review

Overconfidence bubbles

kaputmi

18 Dec 2022 2:07 UTC

3 points

0 comments 2 min read LW link

Positive values seem more robust and lasting than prohibitions

TurnTrout

17 Dec 2022 21:43 UTC

52 points

13 comments 2 min read LW link

What we owe the microbiome

weverka

17 Dec 2022 19:40 UTC

2 points

0 comments 1 min read LW link

(forum.effectivealtruism.org)

Why write more: improve your epistemics, self-care, & 28 other reasons

KatWoods

17 Dec 2022 19:25 UTC

24 points

1 comment 6 min read LW link

Looking for an alignment tutor

JanB

17 Dec 2022 19:08 UTC

15 points

2 comments 1 min read LW link

[Question] How to Convince my Son that Drugs are Bad

concerned_dad

17 Dec 2022 18:47 UTC

174 points

91 comments 2 min read LW link

Ordinary human life

David Hugh-Jones

17 Dec 2022 16:46 UTC

24 points

3 comments 14 min read LW link

(wyclif.substack.com)

Predictive Processing, Heterosexuality and Delusions of Grandeur

lsusr

17 Dec 2022 7:37 UTC

39 points

14 comments 5 min read LW link

[Link] Escape the Echo Chamber (2018)

CronoDAS

17 Dec 2022 6:14 UTC

13 points

0 comments 2 min read LW link

(aeon.co)

“Starry Night” Solstice Cookies

maia

17 Dec 2022 5:31 UTC

27 points

7 comments 1 min read LW link

[Question] What about non-degree seeking?

Lao Mein

17 Dec 2022 2:22 UTC

5 points

5 comments 1 min read LW link

Using Information Theory to tackle AI Alignment: A Practical Approach

Daniel Salami

17 Dec 2022 1:37 UTC

10 points

4 comments 7 min read LW link

Paper: Constitutional AI: Harmlessness from AI Feedback (Anthropic)

LawrenceC

16 Dec 2022 22:12 UTC

68 points

11 comments 1 min read LW link

(www.anthropic.com)

Vaguely interested in Effective Altruism? Please Take the Official 2022 EA Survey

Peter Wildeford

16 Dec 2022 21:07 UTC

22 points

4 comments 1 min read LW link

(rethinkpriorities.qualtrics.com)

Abstract concepts and metalingual definition: Does ChatGPT understand justice and charity?

Bill Benzon

16 Dec 2022 21:01 UTC

2 points

0 comments 13 min read LW link

Beyond the moment of invention

jasoncrawford

16 Dec 2022 20:18 UTC

35 points

0 comments 2 min read LW link

(rootsofprogress.org)

[Question] What’s the best time-efficient alternative to the Sequences?

trevor

16 Dec 2022 20:17 UTC

7 points

7 comments 1 min read LW link

Can we efficiently explain model behaviors?

paulfchristiano

16 Dec 2022 19:40 UTC

64 points

3 comments 9 min read LW link

(ai-alignment.com)

Proper scoring rules don’t guarantee predicting fixed points

Johannes Treutlein, Rubi J. Hudson and Caspar Oesterheld

16 Dec 2022 18:22 UTC

80 points

8 comments 21 min read LW link

A learned agent is not the same as a learning agent

Ben Amitay

16 Dec 2022 17:27 UTC

4 points

5 comments 4 min read LW link

[Question] College Selection Advice for Technical Alignment

TempCollegeAsk

16 Dec 2022 17:11 UTC

11 points

8 comments 1 min read LW link

How important are accurate AI timelines for the optimal spending schedule on AI risk interventions?

Tristan Cook

16 Dec 2022 16:05 UTC

27 points

2 comments 5 min read LW link

Introducing Shrubgrazer

jefftk

16 Dec 2022 14:50 UTC

22 points

0 comments 2 min read LW link

(www.jefftk.com)