What’s A “Market”?

johnswentworth 8 Aug 2023 23:29 UTC

95 points

16 comments 10 min read LW link

Podcast (+transcript): Nathan Barnard on how US financial regulation can inform AI governance

Aaron Bergman 8 Aug 2023 21:46 UTC

8 points

0 comments 23 min read LW link

(www.aaronbergman.net)

What are the flaws in this argument about p(Doom)?

William the Kiwi 8 Aug 2023 20:34 UTC

−2 points

26 comments 1 min read LW link

A Simple Theory Of Consciousness

SherlockHolmes 8 Aug 2023 18:05 UTC

2 points

5 comments 1 min read LW link

(peterholmes.medium.com)

[Linkpost] Rationally awake

jpc 8 Aug 2023 17:59 UTC

−1 points

0 comments 4 min read LW link

(jpc.dev)

Yet more UFO Betting: Put Up or Shut Up

MoreRatsWrongReUAP 8 Aug 2023 17:50 UTC

10 points

19 comments 1 min read LW link

AISN #18: Challenges of Reinforcement Learning from Human Feedback, Microsoft’s Security Breach, and Conceptual Research on AI Safety

Dan H 8 Aug 2023 15:52 UTC

13 points

0 comments 5 min read LW link

(newsletter.safe.ai)

[Question] Beginner’s question about RLHF

FTPickle 8 Aug 2023 15:48 UTC

1 point

3 comments 1 min read LW link

My Trial Period as an Independent Alignment Researcher

Bart Bussmann 8 Aug 2023 14:16 UTC

34 points

1 comment 3 min read LW link

4 types of AGI selection, and how to constrain them

Remmelt 8 Aug 2023 10:02 UTC

−4 points

3 comments 3 min read LW link

Notice your everything

metachirality 8 Aug 2023 2:38 UTC

16 points

1 comment 2 min read LW link

Model Organisms of Misalignment: The Case for a New Pillar of Alignment Research

evhub, Nicholas Schiefer, Carson Denison and Ethan Perez

8 Aug 2023 1:30 UTC

331 points

30 comments 18 min read LW link 1 review

Perpetually Declining Population?

jefftk 8 Aug 2023 1:30 UTC

48 points

29 comments 3 min read LW link

(www.jefftk.com)

[Question] How do I find all the items on LW that I’ve favorited or upvoted?

Alex K. Chen (StochasticCockatoo) 7 Aug 2023 23:51 UTC

14 points

3 comments 1 min read LW link

A plea for more funding shortfall transparency

porby 7 Aug 2023 21:33 UTC

73 points

4 comments 2 min read LW link

[Question] Tips for reducing thinking branching factor

Simon Berens 7 Aug 2023 20:21 UTC

4 points

6 comments 1 min read LW link

An interactive introduction to grokking and mechanistic interpretability

Adam Pearce and Asma Ghandeharioun

7 Aug 2023 19:09 UTC

23 points

3 comments 1 min read LW link

(pair.withgoogle.com)

Feedbackloop-first Rationality

Raemon 7 Aug 2023 17:58 UTC

211 points

69 comments 8 min read LW link 2 reviews

Growing Bonsai Networks with RNNs

ameo 7 Aug 2023 17:34 UTC

21 points

5 comments 1 min read LW link

(cprimozic.net)

[Question] Should I test myself for microplastics?

Augs 7 Aug 2023 17:31 UTC

9 points

2 comments 1 min read LW link

Optimisation Measures: Desiderata, Impossibility, Proposals

mattmacdermott and Alexander Gietelink Oldenziel

7 Aug 2023 15:52 UTC

36 points

9 comments 1 min read LW link

Announcing the Clearer Thinking micro-grants program for 2023

spencerg 7 Aug 2023 15:21 UTC

14 points

1 comment 1 min read LW link

(www.clearerthinking.org)

What I’ve been reading, July–August 2023

jasoncrawford 7 Aug 2023 14:22 UTC

23 points

0 comments 13 min read LW link

(rootsofprogress.org)

Monthly Roundup #9: August 2023

Zvi 7 Aug 2023 13:20 UTC

42 points

25 comments 57 min read LW link

(thezvi.wordpress.com)

Strengthening the Argument for Intrinsic AI Safety: The S-Curves Perspective

avturchin 7 Aug 2023 13:13 UTC

8 points

0 comments 12 min read LW link

Overview of how AI might exacerbate long-running catastrophic risks

Hauke Hillebrandt 7 Aug 2023 11:53 UTC

20 points

0 comments 11 min read LW link

(aisafetyfundamentals.com)

Drinks at a bar

yakimoff 7 Aug 2023 2:52 UTC

3 points

0 comments 1 min read LW link

Problems with Robin Hanson’s Quillette Article On AI

DaemonicSigil 6 Aug 2023 22:13 UTC

89 points

33 comments 8 min read LW link

Yann LeCun on AGI and AI Safety

Chris_Leong 6 Aug 2023 21:56 UTC

37 points

13 comments 1 min read LW link

(drive.google.com)

Computational Thread Art

CallumMcDougall 6 Aug 2023 21:42 UTC

76 points

2 comments 6 min read LW link

‘We’re changing the clouds.’ An unforeseen test of geoengineering is fueling record ocean warmth

Annapurna 6 Aug 2023 20:58 UTC

60 points

6 comments 1 min read LW link

(www.science.org)

[Linkpost] Will AI avoid exploitation?

cdkg 6 Aug 2023 14:28 UTC

22 points

1 comment 1 min read LW link

Reducing the risk of catastrophically misaligned AI by avoiding the Singleton scenario: the Manyton Variant

GravitasGradient 6 Aug 2023 14:24 UTC

−6 points

0 comments 3 min read LW link

Rebooting AI Governance: An AI-Driven Approach to AI Governance

utilon 6 Aug 2023 14:19 UTC

1 point

1 comment 29 min read LW link

(forum.effectivealtruism.org)

Model-Based Policy Analysis under Deep Uncertainty

utilon 6 Aug 2023 14:07 UTC

16 points

1 comment 23 min read LW link

(forum.effectivealtruism.org)

[Question] On being in a bad place and too stubborn to leave.

TeaTieAndHat 6 Aug 2023 11:45 UTC

12 points

14 comments 3 min read LW link

Safety-First Agents/Architectures Are a Promising Path to Safe AGI

Brendon_Wong 6 Aug 2023 8:02 UTC

13 points

2 comments 12 min read LW link

The Benevolent Ruler’s Handbook (Part 1): The Policy Problem

FCCC 6 Aug 2023 3:46 UTC

11 points

3 comments 4 min read LW link

Exploring the Multiverse of Large Language Models

franky 6 Aug 2023 2:38 UTC

1 point

0 comments 5 min read LW link

Aligning my web server with devops practices: part 2 (security)

VipulNaik 6 Aug 2023 1:30 UTC

6 points

0 comments 19 min read LW link

how 2 tell if ur input is out of distribution given only model weights

dkirmani 5 Aug 2023 22:45 UTC

49 points

10 comments 1 min read LW link

Summary of Improving Global Decision Making (around AI)

Will_Pearson 5 Aug 2023 18:46 UTC

−7 points

0 comments 1 min read LW link

Ground-Truth Label Imbalance Impairs the Performance of Contrast-Consistent Search (and Other Contrast-Pair-Based Unsupervised Methods)

Tom Angsten and Ami Hays

5 Aug 2023 17:55 UTC

6 points

2 comments 7 min read LW link

(drive.google.com)

Seattle Astral Codex Ten Monthly Social

a7x 5 Aug 2023 17:55 UTC

1 point

0 comments 1 min read LW link

AISafety.info’s Writing & Editing Hackathon

smallsilo 5 Aug 2023 17:14 UTC

2 points

0 comments 1 min read LW link

Join AISafety.info’s Writing & Editing Hackathon (Aug 25-28) (Prizes to be won!)

smallsilo 5 Aug 2023 14:08 UTC

19 points

3 comments 1 min read LW link

(forum.effectivealtruism.org)

Stomach Ulcers and Dental Cavities

Metacelsus 5 Aug 2023 14:08 UTC

57 points

7 comments 1 min read LW link

(denovo.substack.com)

video games > IQ tests

bhauth 5 Aug 2023 13:27 UTC

34 points

46 comments 3 min read LW link

[Linkpost] Applicability of scaling laws to vision encoding models

Bogdan Ionut Cirstea 5 Aug 2023 11:10 UTC

11 points

2 comments 1 min read LW link

A Naive Proposal for Constructing Interpretable AI

Chris_Leong 5 Aug 2023 10:32 UTC

18 points

6 comments 2 min read LW link