Curiosity as a Solution to AGI Alignment

Harsha G. 26 Feb 2023 23:36 UTC

7 points

7 comments 3 min read LW link

Learning How to Learn (And 20+ Studies)

maxa 26 Feb 2023 22:46 UTC

66 points

12 comments 6 min read LW link

(max2c.com)

Bayesian Scenario: Snipers & Soldiers

abstractapplic 26 Feb 2023 21:48 UTC

23 points

8 comments 1 min read LW link

(h-b-p.github.io)

[Link Post] Cyber Digital Authoritarianism (National Intelligence Council Report)

Phosphorous 26 Feb 2023 20:51 UTC

12 points

2 comments 1 min read LW link

(www.dni.gov)

Reflections on Zen and the Art of Motorcycle Maintenance

LoganStrohl 26 Feb 2023 20:46 UTC

33 points

3 comments 23 min read LW link

Taboo “human-level intelligence”

Sherrinford 26 Feb 2023 20:42 UTC

12 points

7 comments 1 min read LW link

[Link] Petition on brain preservation: Allow global access to high-quality brain preservation as an option rapidly after death

Mati_Roy 26 Feb 2023 15:56 UTC

29 points

2 comments 1 min read LW link

(www.change.org)

Some thoughts on the cults LW had

Noosphere89 26 Feb 2023 15:46 UTC

−4 points

31 comments 1 min read LW link

A library for safety research in conditioning on RLHF tasks

James Chua 26 Feb 2023 14:50 UTC

10 points

2 comments 1 min read LW link

The Preference Fulfillment Hypothesis

Kaj_Sotala 26 Feb 2023 10:55 UTC

66 points

63 comments 11 min read LW link

All of my grandparents were prodigies, I am extremely bored at Oxford University. Please let me intern/work for you!

politicalpersuasion 26 Feb 2023 7:50 UTC

−17 points

7 comments 3 min read LW link

“Rationalist Discourse” Is Like “Physicist Motors”

Zack_M_Davis 26 Feb 2023 5:58 UTC

138 points

153 comments 9 min read LW link 1 review

[Question] Ways to prepare to a vastly new world?

Annapurna 26 Feb 2023 4:56 UTC

4 points

6 comments 1 min read LW link

Incentives and Selection: A Missing Frame From AI Threat Discussions?

DragonGod 26 Feb 2023 1:18 UTC

11 points

16 comments 2 min read LW link

A mechanistic explanation for SolidGoldMagikarp-like tokens in GPT2

MadHatter 26 Feb 2023 1:10 UTC

61 points

14 comments 6 min read LW link

Politics is the Fun-Killer

Adam Zerner 25 Feb 2023 23:29 UTC

28 points

5 comments 2 min read LW link

Bayes is Out-Dated, and You’re Doing it Wrong

AnthonyRepetto 25 Feb 2023 23:18 UTC

−45 points

44 comments 4 min read LW link

[Question] Would more model evals teams be good?

Ryan Kidd 25 Feb 2023 22:01 UTC

20 points

4 comments 1 min read LW link

Nod posts

Adam Zerner 25 Feb 2023 21:53 UTC

27 points

8 comments 2 min read LW link

Prediction market: Will John Wentworth’s Gears of Aging series hold up in 2033?

tailcalled 25 Feb 2023 20:15 UTC

15 points

4 comments 1 min read LW link

(manifold.markets)

Making Implied Standards Explicit

Logan Riggs 25 Feb 2023 20:02 UTC

22 points

0 comments 4 min read LW link

Two Reasons for no Utilitarianism

False Name 25 Feb 2023 19:51 UTC

−4 points

3 comments 3 min read LW link

Cognitive Emulation: A Naive AI Safety Proposal

Connor Leahy and Gabriel Alfour

25 Feb 2023 19:35 UTC

195 points

46 comments 4 min read LW link

[Prediction] Humanity will survive the next hundred years

lsusr 25 Feb 2023 18:59 UTC

33 points

44 comments 2 min read LW link

The Caplan-Yudkowsky End-of-the-World Bet Scheme Doesn’t Actually Work

lsusr 25 Feb 2023 18:57 UTC

5 points

14 comments 2 min read LW link

The Practitioner’s Path 2.0: the Empiricist Archetype

Evenflair 25 Feb 2023 17:05 UTC

15 points

0 comments 1 min read LW link

(guildoftherose.org)

[Question] Pink Shoggoths: What does alignment look like in practice?

Yuli_Ban 25 Feb 2023 12:23 UTC

30 points

13 comments 11 min read LW link

Just How Hard a Problem is Alignment?

Roger Dearnaley's Old Profile 25 Feb 2023 9:00 UTC

3 points

1 comment 21 min read LW link

Buddhist Psychotechnology for Withstanding Apocalypse Stress

romeostevensit 25 Feb 2023 3:11 UTC

63 points

10 comments 5 min read LW link

What kind of place is this?

Jim Pivarski 25 Feb 2023 2:14 UTC

25 points

24 comments 8 min read LW link

Agents vs. Predictors: Concrete differentiating factors

evhub 24 Feb 2023 23:50 UTC

37 points

3 comments 4 min read LW link

Christiano (ARC) and GA (Conjecture) Discuss Alignment Cruxes

Andrea_Miotti, paulfchristiano, Gabriel Alfour and Olive Branch

24 Feb 2023 23:03 UTC

61 points

7 comments 47 min read LW link

Retrospective on the 2022 Conjecture AI Discussions

Andrea_Miotti 24 Feb 2023 22:41 UTC

92 points

5 comments 2 min read LW link

How popular is ChatGPT? Part 1: more popular than Taylor Swift

Harlan 24 Feb 2023 22:30 UTC

56 points

0 comments 2 min read LW link

(aiimpacts.org)

Are you stably aligned?

Seth Herd 24 Feb 2023 22:08 UTC

14 points

0 comments 2 min read LW link

Puzzle Cycles

Screwtape 24 Feb 2023 21:35 UTC

9 points

2 comments 4 min read LW link

Sam Altman: “Planning for AGI and beyond”

LawrenceC 24 Feb 2023 20:28 UTC

105 points

54 comments 6 min read LW link

(openai.com)

A Proposed Test to Determine the Extent to Which Large Language Models Understand the Real World

Bruce G 24 Feb 2023 20:20 UTC

4 points

7 comments 8 min read LW link

Meta “open sources” LMs competitive with Chinchilla, PaLM, and code-davinci-002 (Paper)

LawrenceC 24 Feb 2023 19:57 UTC

38 points

19 comments 1 min read LW link

(research.facebook.com)

Relationship Orientations

DaystarEld 24 Feb 2023 19:43 UTC

37 points

1 comment 3 min read LW link

(daystareld.com)

The alien simulation meme doesn’t make sense

FTPickle 24 Feb 2023 19:27 UTC

4 points

1 comment 1 min read LW link

Exit Duty Generator by Matti Häyry

Oldphan 24 Feb 2023 18:35 UTC

−5 points

0 comments 1 min read LW link

(www.cambridge.org)

2023 Stanford Existential Risks Conference

elizabethcooper 24 Feb 2023 18:35 UTC

7 points

0 comments 1 min read LW link

How major governments can help with the most important century

HoldenKarnofsky 24 Feb 2023 18:20 UTC

29 points

0 comments 4 min read LW link

(www.cold-takes.com)

Consent Isn’t Always Enough

jefftk 24 Feb 2023 15:40 UTC

61 points

16 comments 3 min read LW link

(www.jefftk.com)

[Question] Training for corrigability: obvious problems?

Ben Amitay 24 Feb 2023 14:02 UTC

4 points

6 comments 1 min read LW link

Death and Desperation

Ustice 24 Feb 2023 12:43 UTC

6 points

4 comments 1 min read LW link

[Question] Are there rationality techniques similar to staring at the wall for 4 hours?

trevor 24 Feb 2023 11:48 UTC

32 points

8 comments 1 min read LW link

The fast takeoff motte/bailey

lc 24 Feb 2023 7:11 UTC

−2 points

7 comments 1 min read LW link

AGI systems & humans will both need to solve the alignment problem

Jeffrey Ladish 24 Feb 2023 3:29 UTC

59 points

14 comments 4 min read LW link