Gradations of moral weight

MichaelStJules

Feb 29, 2024, 11:08 PM

1 point

0 comments LW link

Approaching Human-Level Forecasting with Language Models

Fred Zhang, dannyhalawi and jsteinhardt

Feb 29, 2024, 10:36 PM

60 points

6 comments 3 min read LW link

Paper review: “The Unreasonable Effectiveness of Easy Training Data for Hard Tasks”

Vassil Tashev

Feb 29, 2024, 6:44 PM

11 points

0 comments 4 min read LW link

What’s in the box?! – Towards interpretability by distinguishing niches of value within neural networks.

Joshua Clancy

Feb 29, 2024, 6:33 PM

3 points

4 comments 128 min read LW link

Short Post: Discerning Truth from Trash

FinalFormal2

Feb 29, 2024, 6:09 PM

−2 points

0 comments 1 min read LW link

AI #53: One More Leap

Zvi

Feb 29, 2024, 4:10 PM

45 points

0 comments 38 min read LW link

(thezvi.wordpress.com)

Cryonics p(success) estimates are only weakly associated with interest in pursuing cryonics in the LW 2023 Survey

Andy_McKenzie

Feb 29, 2024, 2:47 PM

28 points

6 comments 1 min read LW link

Bengio’s Alignment Proposal: “Towards a Cautious Scientist AI with Convergent Safety Bounds”

mattmacdermott

Feb 29, 2024, 1:59 PM

76 points

19 comments 14 min read LW link

(yoshuabengio.org)

Tips for Empirical Alignment Research

Ethan Perez

Feb 29, 2024, 6:04 AM

163 points

4 comments 23 min read LW link

[Question] Supposing the 1bit LLM paper pans out

O O

Feb 29, 2024, 5:31 AM

27 points

11 comments 1 min read LW link

Can RLLMv3’s ability to defend against jailbreaks be attributed to datasets containing stories about Jung’s shadow integration theory?

MiguelDev

Feb 29, 2024, 5:13 AM

7 points

2 comments 11 min read LW link

Post series on “Liability Law for reducing Existential Risk from AI”

Nora_Ammann

Feb 29, 2024, 4:39 AM

42 points

1 comment 1 min read LW link

(forum.effectivealtruism.org)

Tour Retrospective February 2024

jefftk

Feb 29, 2024, 3:50 AM

10 points

0 comments 4 min read LW link

(www.jefftk.com)

Locating My Eyes (Part 3 of “The Sense of Physical Necessity”)

LoganStrohl

Feb 29, 2024, 3:09 AM

43 points

4 comments 22 min read LW link

Conspiracy Theorists Aren’t Ignorant. They’re Bad At Epistemology.

omnizoid

Feb 28, 2024, 11:39 PM

18 points

10 comments 5 min read LW link

Discovering alignment windfalls reduces AI risk

goodgravy and stuhlmueller

Feb 28, 2024, 9:23 PM

15 points

1 comment 8 min read LW link

(blog.elicit.com)

my theory of the industrial revolution

bhauth

Feb 28, 2024, 9:07 PM

23 points

7 comments 3 min read LW link

(www.bhauth.com)

Wholesomeness and Effective Altruism

owencb

Feb 28, 2024, 8:28 PM

42 points

3 comments LW link

timestamping through the Singularity

throwaway918119127

Feb 28, 2024, 7:09 PM

−2 points

4 comments 8 min read LW link

Evidential Cooperation in Large Worlds: Potential Objections & FAQ

Chi Nguyen and _will_

Feb 28, 2024, 6:58 PM

42 points

5 comments LW link

Timaeus’s First Four Months

Jesse Hoogland, Daniel Murfet, Stan van Wingerden and Alexander Gietelink Oldenziel

Feb 28, 2024, 5:01 PM

173 points

6 comments 6 min read LW link

Notes on control evaluations for safety cases

ryan_greenblatt, Buck and Fabien Roger

Feb 28, 2024, 4:15 PM

49 points

0 comments 32 min read LW link

Corporate Governance for Frontier AI Labs: A Research Agenda

Matthew Wearden

Feb 28, 2024, 11:29 AM

4 points

0 comments 16 min read LW link

(matthewwearden.co.uk)

How AI Will Change Education

robotelvis

Feb 28, 2024, 5:30 AM

6 points

3 comments 5 min read LW link

(messyprogress.substack.com)

Band Lessons?

jefftk

Feb 28, 2024, 3:00 AM

13 points

3 comments 1 min read LW link

(www.jefftk.com)

New LessWrong review winner UI (“The LeastWrong” section and full-art post pages)

kave

Feb 28, 2024, 2:42 AM

105 points

64 comments 1 min read LW link

Counting arguments provide no evidence for AI doom

Nora Belrose and Quintin Pope

Feb 27, 2024, 11:03 PM

101 points

188 comments 14 min read LW link

Which animals realize which types of subjective welfare?

MichaelStJules

Feb 27, 2024, 7:31 PM

4 points

0 comments LW link

Biosecurity and AI: Risks and Opportunities

Steve Newman

Feb 27, 2024, 6:45 PM

11 points

1 comment 7 min read LW link

(www.safe.ai)

The Gemini Incident Continues

Zvi

Feb 27, 2024, 4:00 PM

45 points

6 comments 48 min read LW link

(thezvi.wordpress.com)

How I internalized my achievements to better deal with negative feelings

Raymond Koopmanschap

Feb 27, 2024, 3:10 PM

42 points

7 comments 6 min read LW link

On Frustration and Regret

silentbob

Feb 27, 2024, 12:19 PM

8 points

0 comments 4 min read LW link

San Francisco ACX Meetup “Third Saturday”

Nate Sternberg and guenael

Feb 27, 2024, 7:07 AM

7 points

0 comments 1 min read LW link

Examining Language Model Performance with Reconstructed Activations using Sparse Autoencoders

Evan Anders and Joseph Bloom

Feb 27, 2024, 2:43 AM

43 points

16 comments 15 min read LW link

Project idea: an iterated prisoner’s dilemma competition/game

Adam Zerner

Feb 26, 2024, 11:06 PM

8 points

0 comments 5 min read LW link

Acting Wholesomely

owencb

Feb 26, 2024, 9:49 PM

59 points

64 comments LW link

Getting rational now or later: navigating procrastination and time-inconsistent preferences for new rationalists

milo_thoughts

Feb 26, 2024, 7:38 PM

1 point

0 comments 8 min read LW link

[Question] Whom Do You Trust?

JackOfAllTrades

Feb 26, 2024, 7:38 PM

1 point

0 comments 1 min read LW link

Boundary Violations vs Boundary Dissolution

Chipmonk

Feb 26, 2024, 6:59 PM

8 points

4 comments 1 min read LW link

[Question] Can we get an AI to “do our alignment homework for us”?

Chris_Leong

Feb 26, 2024, 7:56 AM

53 points

33 comments 1 min read LW link

How I build and run behavioral interviews

benkuhn

Feb 26, 2024, 5:50 AM

32 points

6 comments 4 min read LW link

(www.benkuhn.net)

Hidden Cognition Detection Methods and Benchmarks

Paul Colognese

Feb 26, 2024, 5:31 AM

22 points

11 comments 4 min read LW link

Cellular respiration as a steam engine

dkl9

Feb 25, 2024, 8:17 PM

24 points

1 comment 1 min read LW link

(dkl9.net)

[Question] Rationalism and Dependent Origination?

Baometrus

Feb 25, 2024, 6:16 PM

2 points

3 comments 1 min read LW link

China-AI forecasts

NathanBarnard

Feb 25, 2024, 4:49 PM

40 points

29 comments 6 min read LW link

Ideological Bayesians

Kevin Dorst

Feb 25, 2024, 2:17 PM

96 points

4 comments 10 min read LW link

(kevindorst.substack.com)

Deconfusing In-Context Learning

Arjun Panickssery

Feb 25, 2024, 9:48 AM

37 points

1 comment 2 min read LW link

Everett branches, inter-light cone trade and other alien matters: Appendix to “An ECL explainer”

Chi Nguyen and _will_

Feb 24, 2024, 11:09 PM

17 points

0 comments LW link

Cooperating with aliens and AGIs: An ECL explainer

Chi Nguyen, _will_ and Orpheus16

Feb 24, 2024, 10:58 PM

55 points

8 comments LW link

Choosing My Quest (Part 2 of “The Sense Of Physical Necessity”)

LoganStrohl

Feb 24, 2024, 9:31 PM

40 points

7 comments 12 min read LW link