Archive
Grokking, memorization, and generalization — a discussion · Kaarel and Dmitry Vaintrob · Oct 29, 2023, 11:17 PM · 75 points · 11 comments · 23 min read
Comp Sci in 2027 (Short story by Eliezer Yudkowsky) · sudo · Oct 29, 2023, 11:09 PM · 202 points · 24 comments · 10 min read · 1 review · (nitter.net)
Mathematically-Defined Optimization Captures A Lot of Useful Information · J Bostock · Oct 29, 2023, 5:17 PM · 19 points · 0 comments · 2 min read
Clarifying the free energy principle (with quotes) · Ryo · Oct 29, 2023, 4:03 PM · 8 points · 0 comments · 9 min read
A new intro to Quantum Physics, with the math fixed · titotal · Oct 29, 2023, 3:11 PM · 113 points · 24 comments · 17 min read · (titotal.substack.com)
My idea of sacredness, divinity, and religion · Kaj_Sotala · Oct 29, 2023, 12:50 PM · 40 points · 10 comments · 4 min read · (kajsotala.fi)
The AI Boom Mainly Benefits Big Firms, but long-term, markets will concentrate · Hauke Hillebrandt · Oct 29, 2023, 8:38 AM · −1 points · 0 comments
What’s up with “Responsible Scaling Policies”? · habryka and ryan_greenblatt · Oct 29, 2023, 4:17 AM · 99 points · 9 comments · 20 min read · 1 review
Experiments as a Third Alternative · Adam Zerner · Oct 29, 2023, 12:39 AM · 48 points · 21 comments · 5 min read
Comparing representation vectors between llama 2 base and chat · Nina Panickssery · Oct 28, 2023, 10:54 PM · 36 points · 5 comments · 2 min read
Vaniver’s thoughts on Anthropic’s RSP · Vaniver · Oct 28, 2023, 9:06 PM · 46 points · 4 comments · 3 min read
Book Review: Orality and Literacy: The Technologizing of the Word · Fergus Fettes · Oct 28, 2023, 8:12 PM · 13 points · 0 comments · 16 min read
Regrant up to $600,000 to AI safety projects with GiveWiki · Dawn Drescher · Oct 28, 2023, 7:56 PM · 33 points · 1 comment
Shane Legg interview on alignment · Seth Herd · Oct 28, 2023, 7:28 PM · 66 points · 20 comments · 2 min read · (www.youtube.com)
AI Existential Safety Fellowships · mmfli · Oct 28, 2023, 6:07 PM · 5 points · 0 comments · 1 min read
AI Safety Hub Serbia Official Opening · DusanDNesic and Tanja T · Oct 28, 2023, 5:03 PM · 55 points · 0 comments · 3 min read · (forum.effectivealtruism.org)
Managing AI Risks in an Era of Rapid Progress · Algon · Oct 28, 2023, 3:48 PM · 36 points · 5 comments · 11 min read · (managing-ai-risks.com)
[Question] ELI5 Why isn’t alignment *easier* as models get stronger? · Logan Zoellner · Oct 28, 2023, 2:34 PM · 3 points · 9 comments · 1 min read
Truthseeking, EA, Simulacra levels, and other stuff · Elizabeth and Vaniver · Oct 27, 2023, 11:56 PM · 45 points · 12 comments · 9 min read
[Question] Do you believe “E=mc^2” is a correct and/or useful equation, and, whether yes or no, precisely what are your reasons for holding this belief (with such a degree of confidence)? · l8c · Oct 27, 2023, 10:46 PM · 10 points · 14 comments · 1 min read
Value systematization: how values become coherent (and misaligned) · Richard_Ngo · Oct 27, 2023, 7:06 PM · 103 points · 49 comments · 13 min read
Techno-humanism is techno-optimism for the 21st century · Richard_Ngo · Oct 27, 2023, 6:37 PM · 88 points · 5 comments · 14 min read · (www.mindthefuture.info)
Sanctuary for Humans · Nikola Jurkovic · Oct 27, 2023, 6:08 PM · 22 points · 9 comments · 1 min read
Wireheading and misalignment by composition on NetHack · pierlucadoro · Oct 27, 2023, 5:43 PM · 34 points · 4 comments · 4 min read
We’re Not Ready: thoughts on “pausing” and responsible scaling policies · HoldenKarnofsky · Oct 27, 2023, 3:19 PM · 200 points · 33 comments · 8 min read
Aspiration-based Q-Learning · Clément Dumas and Jobst Heitzig · Oct 27, 2023, 2:42 PM · 38 points · 5 comments · 11 min read
Linkpost: Rishi Sunak’s Speech on AI (26th October) · bideup · Oct 27, 2023, 11:57 AM · 85 points · 8 comments · 7 min read · (www.gov.uk)
ASPR & WARP: Rationality Camps for Teens in Taiwan and Oxford · Anna Gajdova · Oct 27, 2023, 8:40 AM · 18 points · 0 comments · 1 min read
[Question] To what extent is the UK Government’s recent AI Safety push entirely due to Rishi Sunak? · Stephen Fowler · Oct 27, 2023, 3:29 AM · 23 points · 4 comments · 1 min read
Bayesian Punishment · Rob Lucas · Oct 27, 2023, 3:24 AM · 1 point · 1 comment · 6 min read
Online Dialogues Party — Sunday 5th November · Ben Pace · Oct 27, 2023, 2:41 AM · 28 points · 1 comment · 1 min read
OpenAI’s new Preparedness team is hiring · leopold · Oct 26, 2023, 8:42 PM · 60 points · 2 comments · 1 min read
Fake Deeply · Zack_M_Davis · Oct 26, 2023, 7:55 PM · 33 points · 7 comments · 1 min read · (unremediatedgender.space)
Symbol/Referent Confusions in Language Model Alignment Experiments · johnswentworth · Oct 26, 2023, 7:49 PM · 116 points · 50 comments · 6 min read · 1 review
Unsupervised Methods for Concept Discovery in AlphaZero · aog · Oct 26, 2023, 7:05 PM · 9 points · 0 comments · 1 min read · (arxiv.org)
[Question] Nonlinear limitations of ReLUs · magfrump · Oct 26, 2023, 6:51 PM · 13 points · 1 comment · 1 min read
AI Alignment Problem: Requirement not optional (A Critical Analysis through Mass Effect Trilogy) · TAWSIF AHMED · Oct 26, 2023, 6:02 PM · −9 points · 0 comments · 4 min read
[Thought Experiment] Tomorrow’s Echo—The future of synthetic companionship. · Vimal Naran · Oct 26, 2023, 5:54 PM · −7 points · 2 comments · 2 min read
Disagreements over the prioritization of existential risk from AI · Olivier Coutu · Oct 26, 2023, 5:54 PM · 10 points · 0 comments · 6 min read
[Question] What if AGI had its own universe to maybe wreck? · mseale · Oct 26, 2023, 5:49 PM · −1 points · 2 comments · 1 min read
Changing Contra Dialects · jefftk · Oct 26, 2023, 5:30 PM · 25 points · 2 comments · 1 min read · (www.jefftk.com)
5 psychological reasons for dismissing x-risks from AGI · Igor Ivanov · Oct 26, 2023, 5:21 PM · 24 points · 6 comments · 4 min read
5. Risks from preventing legitimate value change (value collapse) · Nora_Ammann · Oct 26, 2023, 2:38 PM · 13 points · 1 comment · 9 min read
4. Risks from causing illegitimate value change (performative predictors) · Nora_Ammann · Oct 26, 2023, 2:38 PM · 8 points · 3 comments · 5 min read
3. Premise three & Conclusion: AI systems can affect value change trajectories & the Value Change Problem · Nora_Ammann · Oct 26, 2023, 2:38 PM · 28 points · 4 comments · 4 min read
2. Premise two: Some cases of value change are (il)legitimate · Nora_Ammann · Oct 26, 2023, 2:36 PM · 24 points · 7 comments · 6 min read
1. Premise one: Values are malleable · Nora_Ammann · Oct 26, 2023, 2:36 PM · 21 points · 1 comment · 15 min read
0. The Value Change Problem: introduction, overview and motivations · Nora_Ammann · Oct 26, 2023, 2:36 PM · 32 points · 0 comments · 5 min read
EPUBs of MIRI Blog Archives and selected LW Sequences · mesaoptimizer · Oct 26, 2023, 2:17 PM · 44 points · 5 comments · 1 min read · (git.sr.ht)
UK Government publishes “Frontier AI: capabilities and risks” Discussion Paper · A.H. · Oct 26, 2023, 1:55 PM UTC · 5 points · 0 comments · 2 min read · (www.gov.uk)