Why you, personally, should want a larger human population

jasoncrawford 23 Feb 2024 19:48 UTC

32 points

33 comments 5 min read LW link

(rootsofprogress.org)

Deliberative Cognitive Algorithms as Scaffolding

Cole Wyeth 23 Feb 2024 17:15 UTC

21 points

4 comments 3 min read LW link

The Shutdown Problem: Incomplete Preferences as a Solution

Elliott Thornley 23 Feb 2024 16:01 UTC

62 points

33 comments 41 min read LW link

In set theory, everything is a set

Jacob G-W 23 Feb 2024 14:35 UTC

12 points

9 comments 2 min read LW link

The role of philosophical thinking in understanding large language models: Calibrating and closing the gap between first-person experience and underlying mechanisms

Bill Benzon 23 Feb 2024 12:19 UTC

4 points

0 comments 10 min read LW link

Deep and obvious points in the gap between your thoughts and your pictures of thought

KatjaGrace 23 Feb 2024 7:30 UTC

54 points

7 comments 1 min read LW link 1 review

(worldspiritsockpuppet.com)

Parasocial relationship logic

KatjaGrace 23 Feb 2024 7:30 UTC

26 points

2 comments 1 min read LW link

(worldspiritsockpuppet.com)

Shaming with and without naming

KatjaGrace 23 Feb 2024 7:30 UTC

19 points

5 comments 2 min read LW link

(worldspiritsockpuppet.com)

Complexity of value but not disvalue implies more focus on s-risk. Moral uncertainty and preference utilitarianism also do.

Chi Nguyen 23 Feb 2024 6:10 UTC

53 points

18 comments 2 min read LW link

[Question] Does increasing the power of a multimodal LLM get you an agentic AI?

yanni kyriacos 23 Feb 2024 4:14 UTC

3 points

3 comments 1 min read LW link

Popular conceptions of “boundaries” don’t make sense

Chris Lakin 23 Feb 2024 1:09 UTC

12 points

5 comments 1 min read LW link 2 reviews

(chrislakin.blog)

Contra Ngo et al. “Every ‘Every Bay Area House Party’ Bay Area House Party”

Ricki Heicklen 22 Feb 2024 23:56 UTC

191 points

5 comments 4 min read LW link

(bayesshammai.substack.com)

AI #52: Oops

Zvi 22 Feb 2024 21:50 UTC

50 points

9 comments 29 min read LW link

(thezvi.wordpress.com)

Embed your second brain in your first brain

dkl9 22 Feb 2024 21:46 UTC

10 points

3 comments 1 min read LW link

(dkl9.net)

The Gemini Incident

Zvi 22 Feb 2024 21:00 UTC

80 points

19 comments 18 min read LW link

(thezvi.wordpress.com)

Some Thoughts On Using Auctions For Land Valuation

harsimony 22 Feb 2024 19:54 UTC

0 points

9 comments 9 min read LW link

(progressandpoverty.substack.com)

The Binding of Isaac & Transparent Newcomb’s Problem

suvjectibity 22 Feb 2024 18:56 UTC

−10 points

0 comments 10 min read LW link

Language Models Don’t Learn the Physical Manifestation of Language

Bruce W. Lee and Jaehyuk Lim

22 Feb 2024 18:52 UTC

39 points

23 comments 1 min read LW link

(arxiv.org)

Sora What

Zvi 22 Feb 2024 18:10 UTC

47 points

3 comments 9 min read LW link

(thezvi.wordpress.com)

Do sparse autoencoders find “true features”?

Demian Till 22 Feb 2024 18:06 UTC

76 points

33 comments 11 min read LW link

Everything Wrong with Roko’s Claims about an Engineered Pandemic

WitheringWeights 22 Feb 2024 15:59 UTC

97 points

11 comments 16 min read LW link

The One and a Half Gemini

Zvi 22 Feb 2024 13:10 UTC

73 points

4 comments 8 min read LW link

(thezvi.wordpress.com)

[Question] How do I make predictions about the future to make sense of what to do with my life?

Raj Thimmiah 22 Feb 2024 11:22 UTC

8 points

1 comment 1 min read LW link

How are voluntary commitments on vulnerability reporting going?

Adam Jones 22 Feb 2024 8:43 UTC

23 points

1 comment 1 min read LW link

(adamjones.me)

Notes on Internal Objectives in Toy Models of Agents

Paul Colognese 22 Feb 2024 8:02 UTC

16 points

0 comments 8 min read LW link

The Byronic Hero Always Loses

Cole Wyeth 22 Feb 2024 1:31 UTC

32 points

4 comments 2 min read LW link

Job Listing: Managing Editor / Writer

Gretta Duleba 21 Feb 2024 23:41 UTC

43 points

2 comments 1 min read LW link

The Pareto Best and the Curse of Doom

Screwtape 21 Feb 2024 23:10 UTC

132 points

22 comments 9 min read LW link 1 review

AISN #31: A New AI Policy Bill in California Plus, Precedents for AI Governance and The EU AI Office

Dan H 21 Feb 2024 21:58 UTC

17 points

0 comments 6 min read LW link

(newsletter.safe.ai)

Analogies between scaling labs and misaligned superintelligent AI

scasper 21 Feb 2024 19:29 UTC

77 points

5 comments 4 min read LW link

Extinction Risks from AI: Invisible to Science?

VojtaKovarik, Chris van Merwijk and Ida Mattsson

21 Feb 2024 18:07 UTC

24 points

7 comments 1 min read LW link

(arxiv.org)

Extinction-level Goodhart’s Law as a Property of the Environment

VojtaKovarik and Ida Mattsson

21 Feb 2024 17:56 UTC

23 points

0 comments 10 min read LW link

Dynamics Crucial to AI Risk Seem to Make for Complicated Models

VojtaKovarik and Ida Mattsson

21 Feb 2024 17:54 UTC

19 points

0 comments 9 min read LW link

Which Model Properties are Necessary for Evaluating an Argument?

VojtaKovarik and Ida Mattsson

21 Feb 2024 17:52 UTC

18 points

2 comments 7 min read LW link

Weak vs Quantitative Extinction-level Goodhart’s Law

VojtaKovarik and Ida Mattsson

21 Feb 2024 17:38 UTC

27 points

1 comment 2 min read LW link

Dual Wielding Kindle Scribes

mesaoptimizer 21 Feb 2024 17:17 UTC

57 points

18 comments 6 min read LW link

A Tale of Two Restaurant Types

Zvi 21 Feb 2024 13:50 UTC

15 points

0 comments 6 min read LW link

(thezvi.wordpress.com)

Less Wrong automated systems are inadvertently Censoring me

Roko 21 Feb 2024 12:57 UTC

4 points

52 comments 1 min read LW link

[Question] What is the research speed multiplier of the most advanced current LLMs?

wunan 21 Feb 2024 12:39 UTC

6 points

2 comments 1 min read LW link

Jailbreaking GPT-4 with the tool API

mishajw 21 Feb 2024 11:16 UTC

20 points

2 comments 4 min read LW link

Gut Renovating Another Bathroom

jefftk 21 Feb 2024 3:00 UTC

22 points

0 comments 2 min read LW link

(www.jefftk.com)

Thoughts for and against an ASI figuring out ethics for itself

sweenesm 20 Feb 2024 23:40 UTC

6 points

10 comments 3 min read LW link

AI #51: Altman’s Ambition

Zvi 20 Feb 2024 19:50 UTC

83 points

5 comments 38 min read LW link

(thezvi.wordpress.com)

The Third Gemini

Zvi 20 Feb 2024 19:50 UTC

30 points

2 comments 9 min read LW link

(thezvi.wordpress.com)

Why does generalization work?

Martín Soto 20 Feb 2024 17:51 UTC

43 points

18 comments 4 min read LW link 1 review

ChatGPT refuses to accept a challenge where it would get shot between the eyes [game theory]

Bill Benzon 20 Feb 2024 16:55 UTC

4 points

6 comments 4 min read LW link

Inducing human-like biases in moral reasoning LMs

Artem Karpov, Austin Meek, Bogdan Ionut Cirstea and SCho

20 Feb 2024 16:28 UTC

23 points

3 comments 14 min read LW link

Monthly Roundup #15: February 2024

Zvi 20 Feb 2024 13:10 UTC

22 points

7 comments 32 min read LW link

(thezvi.wordpress.com)

Selections From “The Trouble With Being Born”

Arjun Panickssery 20 Feb 2024 10:07 UTC

24 points

2 comments 2 min read LW link

(arjunpanickssery.substack.com)

Difficulty classes for alignment properties

Jozdien 20 Feb 2024 9:08 UTC

34 points

5 comments 2 min read LW link