The Alignment Problems

Martín Soto 12 Jan 2023 22:29 UTC

20 points

0 comments 4 min read LW link

Proposal for Inducing Steganography in LMs

Logan Riggs 12 Jan 2023 22:15 UTC

22 points

3 comments 2 min read LW link

Announcing the 2023 PIBBSS Summer Research Fellowship

Nora_Ammann and DusanDNesic

12 Jan 2023 21:31 UTC

32 points

0 comments 1 min read LW link

Victoria Krakovna on AGI Ruin, The Sharp Left Turn and Paradigms of AI Alignment

Michaël Trazzi 12 Jan 2023 17:09 UTC

40 points

3 comments 4 min read LW link

(www.theinsideview.ai)

[Question] What is a disagreement you have around AI safety?

tailcalled 12 Jan 2023 16:58 UTC

16 points

7 comments 1 min read LW link

Reward is not Necessary: How to Create a Compositional Self-Preserving Agent for Life-Long Learning

Roman Leventov 12 Jan 2023 16:43 UTC

17 points

2 comments 2 min read LW link

(arxiv.org)

ChatGPT struggles to respond to the real world

Alex Flint 12 Jan 2023 16:02 UTC

31 points

9 comments 24 min read LW link

Covid 1/12/23: Unexpected Spike in Deaths

Zvi 12 Jan 2023 14:30 UTC

31 points

2 comments 8 min read LW link

(thezvi.wordpress.com)

[Linkpost] Scaling Laws for Generative Mixed-Modal Language Models

Amal 12 Jan 2023 14:24 UTC

15 points

2 comments 1 min read LW link

(arxiv.org)

ea.domains—Domains Free to a Good Home

plex 12 Jan 2023 13:32 UTC

24 points

0 comments 4 min read LW link

VIRTUA: a novel about AI alignment

Karl von Wendt 12 Jan 2023 9:37 UTC

46 points

12 comments 1 min read LW link

Iron deficiencies are very bad and you should treat them

Elizabeth 12 Jan 2023 9:10 UTC

108 points

34 comments 11 min read LW link 1 review

(acesounderglass.com)

Nonstandard analysis in ethics

Alok Singh 12 Jan 2023 5:58 UTC

−1 points

0 comments 78 min read LW link

(nickbostrom.com)

Example of the nameless rationalist virtue

Alok Singh 12 Jan 2023 5:45 UTC

−9 points

2 comments 1 min read LW link

FFMI Gains: A List of Vitalities

porby 12 Jan 2023 4:48 UTC

26 points

3 comments 7 min read LW link

[Linkpost] DreamerV3: A General RL Architecture

simeon_c 12 Jan 2023 3:55 UTC

23 points

3 comments 1 min read LW link

(arxiv.org)

Microsoft Plans to Invest $10B in OpenAI; $3B Invested to Date | Fortune

DragonGod 12 Jan 2023 3:55 UTC

23 points

10 comments 2 min read LW link

(fortune.com)

Progress and research disruptiveness

Eleni Angelou 12 Jan 2023 3:51 UTC

3 points

2 comments 1 min read LW link

(www.nature.com)

The Fable of the AI Coomer: Why the Social Prowess of Machines is AI’s Most Proximal Threat

Ace Delgado 12 Jan 2023 1:15 UTC

−10 points

4 comments 4 min read LW link

Write to Think

Michael Samoilov 12 Jan 2023 0:33 UTC

15 points

2 comments 2 min read LW link

Alignment is not enough

Alan Chan 12 Jan 2023 0:33 UTC

12 points

6 comments 11 min read LW link

(coordination.substack.com)

How it feels to have your mind hacked by an AI

blaked 12 Jan 2023 0:33 UTC

374 points

222 comments 17 min read LW link

Categorical-measure-theoretic approach to optimal policies tending to seek power

jacek 12 Jan 2023 0:32 UTC

31 points

3 comments 6 min read LW link

Any person/mind should have the right to suicide

askofa 12 Jan 2023 0:32 UTC

18 points

13 comments 2 min read LW link

Have we really forsaken natural selection?

KatjaGrace 12 Jan 2023 0:10 UTC

34 points

7 comments 2 min read LW link

(worldspiritsockpuppet.com)

[Question] Using Finite Factored Sets for Causal Representation Learning?

David Reber 11 Jan 2023 22:06 UTC

2 points

3 comments 1 min read LW link

GWWC’s Handling of Conflicting Funding Bars

jefftk 11 Jan 2023 20:30 UTC

19 points

0 comments 3 min read LW link

(www.jefftk.com)

How to write a big cartesian product symbol in MathJax

Matthias G. Mayer 11 Jan 2023 20:21 UTC

11 points

1 comment 1 min read LW link

What’s the deal with AI consciousness?

TW123 11 Jan 2023 16:37 UTC

6 points

13 comments 9 min read LW link

(aiwatchtower.substack.com)

[Question] Any significant updates on long covid risk analysis?

Randomized, Controlled 11 Jan 2023 14:31 UTC

23 points

11 comments 1 min read LW link

internal in nonstandard analysis

Alok Singh 11 Jan 2023 9:58 UTC

9 points

1 comment 1 min read LW link

Compounding Resource X

Raemon 11 Jan 2023 3:14 UTC

77 points

6 comments 9 min read LW link

Running With a Backpack

jefftk 11 Jan 2023 3:00 UTC

19 points

11 comments 1 min read LW link

(www.jefftk.com)

A simple thought experiment showing why recessions are an unnecessary bug in our economic system

skogsnisse 11 Jan 2023 0:43 UTC

1 point

1 comment 1 min read LW link

We don’t trade with ants

KatjaGrace 10 Jan 2023 23:50 UTC

281 points

110 comments 7 min read LW link 1 review

(worldspiritsockpuppet.com)

[Question] Who are the people who are currently profiting from inflation?

skogsnisse 10 Jan 2023 21:39 UTC

1 point

2 comments 1 min read LW link

Is Progress Real?

rogersbacon 10 Jan 2023 17:42 UTC

5 points

14 comments 14 min read LW link

(www.secretorum.life)

200 COP in MI: Interpreting Reinforcement Learning

Neel Nanda 10 Jan 2023 17:37 UTC

25 points

1 comment 10 min read LW link

AGI and the EMH: markets are not expecting aligned or unaligned AI in the next 30 years

basil.halperin, J. Zachary Mazlish and tmychow

10 Jan 2023 16:06 UTC

127 points

45 comments 26 min read LW link

The Alignment Problem from a Deep Learning Perspective (major rewrite)

SoerenMind, Richard_Ngo and LawrenceC

10 Jan 2023 16:06 UTC

84 points

9 comments 39 min read LW link

(arxiv.org)

Against using stock prices to forecast AI timelines

basil.halperin, tmychow and J. Zachary Mazlish

10 Jan 2023 16:03 UTC

26 points

2 comments 2 min read LW link

Sorting Pebbles Into Correct Heaps: The Animation

Writer 10 Jan 2023 15:58 UTC

26 points

2 comments 1 min read LW link

(youtu.be)

Escape Velocity from Bullshit Jobs

Zvi 10 Jan 2023 14:30 UTC

61 points

17 comments 5 min read LW link

(thezvi.wordpress.com)

Scaling laws vs individual differences

beren 10 Jan 2023 13:22 UTC

45 points

21 comments 7 min read LW link

Notes on writing

RP 10 Jan 2023 4:01 UTC

35 points

11 comments 3 min read LW link

Idea: Learning How To Move Towards The Metagame

Algon 10 Jan 2023 0:58 UTC

10 points

3 comments 1 min read LW link

Review AI Alignment posts to help figure out how to make a proper AI Alignment review

habryka and Raemon

10 Jan 2023 0:19 UTC

85 points

31 comments 2 min read LW link

Against the paradox of tolerance

pchvykov 10 Jan 2023 0:12 UTC

1 point

11 comments 3 min read LW link

Increased Scam Quality/Quantity (Hypothesis in need of data)?

Beeblebrox 9 Jan 2023 22:57 UTC

9 points

6 comments 1 min read LW link

Wentworth and Larsen on buying time

Orpheus16, Thomas Larsen and johnswentworth

9 Jan 2023 21:31 UTC

74 points

6 comments 12 min read LW link