OpenAI’s new Preparedness team is hiring

leopold 26 Oct 2023 20:42 UTC

60 points

2 comments 1 min read LW link

Fake Deeply

Zack_M_Davis 26 Oct 2023 19:55 UTC

33 points

7 comments 1 min read LW link

(unremediatedgender.space)

Symbol/Referent Confusions in Language Model Alignment Experiments

johnswentworth 26 Oct 2023 19:49 UTC

120 points

51 comments 6 min read LW link 1 review

Unsupervised Methods for Concept Discovery in AlphaZero

aog 26 Oct 2023 19:05 UTC

9 points

0 comments 1 min read LW link

(arxiv.org)

[Question] Nonlinear limitations of ReLUs

magfrump 26 Oct 2023 18:51 UTC

13 points

1 comment 1 min read LW link

AI Alignment Problem: Requirement not optional (A Critical Analysis through Mass Effect Trilogy)

TAWSIF AHMED 26 Oct 2023 18:02 UTC

−9 points

0 comments 4 min read LW link

[Thought Experiment] Tomorrow’s Echo—The future of synthetic companionship.

Vimal Naran 26 Oct 2023 17:54 UTC

−7 points

2 comments 2 min read LW link

Disagreements over the prioritization of existential risk from AI

Olivier Coutu 26 Oct 2023 17:54 UTC

10 points

0 comments 6 min read LW link

[Question] What if AGI had its own universe to maybe wreck?

mseale 26 Oct 2023 17:49 UTC

−1 points

2 comments 1 min read LW link

Changing Contra Dialects

jefftk 26 Oct 2023 17:30 UTC

25 points

2 comments 1 min read LW link

(www.jefftk.com)

5 psychological reasons for dismissing x-risks from AGI

Igor Ivanov 26 Oct 2023 17:21 UTC

24 points

6 comments 4 min read LW link

5. Risks from preventing legitimate value change (value collapse)

Nora_Ammann 26 Oct 2023 14:38 UTC

13 points

1 comment 9 min read LW link

4. Risks from causing illegitimate value change (performative predictors)

Nora_Ammann 26 Oct 2023 14:38 UTC

8 points

3 comments 5 min read LW link

3. Premise three & Conclusion: AI systems can affect value change trajectories & the Value Change Problem

Nora_Ammann 26 Oct 2023 14:38 UTC

28 points

4 comments 4 min read LW link

2. Premise two: Some cases of value change are (il)legitimate

Nora_Ammann 26 Oct 2023 14:36 UTC

24 points

7 comments 6 min read LW link

1. Premise one: Values are malleable

Nora_Ammann 26 Oct 2023 14:36 UTC

21 points

1 comment 15 min read LW link

0. The Value Change Problem: introduction, overview and motivations

Nora_Ammann 26 Oct 2023 14:36 UTC

32 points

0 comments 5 min read LW link

EPUBs of MIRI Blog Archives and selected LW Sequences

mesaoptimizer 26 Oct 2023 14:17 UTC

44 points

5 comments 1 min read LW link

(git.sr.ht)

UK Government publishes “Frontier AI: capabilities and risks” Discussion Paper

A.H. 26 Oct 2023 13:55 UTC

5 points

0 comments 2 min read LW link

(www.gov.uk)

AI #35: Responsible Scaling Policies

Zvi 26 Oct 2023 13:30 UTC

66 points

10 comments 55 min read LW link

(thezvi.wordpress.com)

RA Bounty: Looking for feedback on screenplay about AI Risk

Writer 26 Oct 2023 13:23 UTC

32 points

6 comments 1 min read LW link

Notes on “How do we become confident in the safety of a machine learning system?”

RohanS 26 Oct 2023 3:13 UTC

4 points

0 comments 13 min read LW link

Apply to the Constellation Visiting Researcher Program and Astra Fellowship, in Berkeley this Winter

Nate Thomas 26 Oct 2023 3:07 UTC

42 points

10 comments 1 min read LW link

CHAI internship applications are open (due Nov 13)

Erik Jenner 26 Oct 2023 0:53 UTC

34 points

0 comments 3 min read LW link

Architects of Our Own Demise: We Should Stop Developing AI Carelessly

Roko 26 Oct 2023 0:36 UTC

170 points

75 comments 3 min read LW link

EA Infrastructure Fund: June 2023 grant recommendations

Linch 26 Oct 2023 0:35 UTC

21 points

0 comments 12 min read LW link

Responsible Scaling Policies Are Risk Management Done Wrong

simeon_c 25 Oct 2023 23:46 UTC

123 points

35 comments 22 min read LW link 1 review

(www.navigatingrisks.ai)

AI as a science, and three obstacles to alignment strategies

So8res 25 Oct 2023 21:00 UTC

194 points

80 comments 11 min read LW link

My hopes for alignment: Singular learning theory and whole brain emulation

Garrett Baker 25 Oct 2023 18:31 UTC

61 points

5 comments 12 min read LW link

[Question] Lying to chess players for alignment

Zane 25 Oct 2023 17:47 UTC

100 points

55 comments 1 min read LW link

Anthropic, Google, Microsoft & OpenAI announce Executive Director of the Frontier Model Forum & over $10 million for a new AI Safety Fund

Zach Stein-Perlman 25 Oct 2023 15:20 UTC

31 points

8 comments 4 min read LW link

(www.frontiermodelforum.org)

“The Economics of Time Travel”—call for reviewers (Seeds of Science)

rogersbacon 25 Oct 2023 15:13 UTC

4 points

2 comments 1 min read LW link

Compositional preference models for aligning LMs

Tomek Korbak 25 Oct 2023 12:17 UTC

18 points

2 comments 5 min read LW link

[Question] Should the US House of Representatives adopt rank choice voting for leadership positions?

jmh 25 Oct 2023 11:16 UTC

16 points

6 comments 1 min read LW link

Researchers believe they have found a way for artists to fight back against AI style capture

vernamcipher 25 Oct 2023 10:54 UTC

3 points

1 comment 1 min read LW link

(finance.yahoo.com)

Why We Disagree

zulupineapple 25 Oct 2023 10:50 UTC

7 points

2 comments 2 min read LW link

Beyond the Data: Why aid to poor doesn’t work

Lyrongolem 25 Oct 2023 5:03 UTC

2 points

31 comments 12 min read LW link

Announcing Epoch’s newly expanded Parameters, Compute and Data Trends in Machine Learning database

Robi Rahman, Jaime Sevilla Molina, Tamay, Ege Erdil, Pablo Villalobos, Ben Cottier and Matthew Barnett

25 Oct 2023 2:55 UTC

18 points

0 comments 1 min read LW link

(epochai.org)

What is a Sequencing Read?

jefftk 25 Oct 2023 2:10 UTC

17 points

2 comments 2 min read LW link

(www.jefftk.com)

Verifiable private execution of machine learning models with Risc0?

mako yass 25 Oct 2023 0:44 UTC

30 points

2 comments 2 min read LW link

[Question] How to Resolve Forecasts With No Central Authority?

Nathan Young 25 Oct 2023 0:28 UTC

17 points

6 comments 1 min read LW link

Thoughts on responsible scaling policies and regulation

paulfchristiano 24 Oct 2023 22:21 UTC

220 points

34 comments 6 min read LW link

The Screenplay Method

Yeshua God 24 Oct 2023 17:41 UTC

−15 points

0 comments 25 min read LW link

Blunt Razor

fryolysis 24 Oct 2023 17:27 UTC

3 points

0 comments 2 min read LW link

Halloween Problem

Saint Blasphemer 24 Oct 2023 16:46 UTC

−10 points

1 comment 1 min read LW link

Who is Harry Potter? Some predictions.

Donald Hobson 24 Oct 2023 16:14 UTC

23 points

7 comments 2 min read LW link

Book Review: Going Infinite

Zvi 24 Oct 2023 15:00 UTC

247 points

113 comments 97 min read LW link 1 review

(thezvi.wordpress.com)

[Interview w/ Quintin Pope] Evolution, values, and AI Safety

fowlertm 24 Oct 2023 13:53 UTC

11 points

0 comments 1 min read LW link

Lying is Cowardice, not Strategy

Connor Leahy and Gabriel Alfour

24 Oct 2023 13:24 UTC

30 points

73 comments 5 min read LW link

(cognition.cafe)

[Question] Anyone Else Using Brilliant?

Sable 24 Oct 2023 12:12 UTC

19 points

0 comments 1 min read LW link