24 Feb 2024 23:09 UTC

17 points

0 comments 11 min read LW link

Cooperating with aliens and AGIs: An ECL explainer

Chi Nguyen, _will_ and Orpheus16

24 Feb 2024 22:58 UTC

60 points

8 comments 20 min read LW link

Choosing My Quest (Part 2 of “The Sense Of Physical Necessity”)

LoganStrohl

24 Feb 2024 21:31 UTC

40 points

7 comments 12 min read LW link

Rationality Research Report: Towards 10x OODA Looping?

Raemon

24 Feb 2024 21:06 UTC

118 points

26 comments 15 min read LW link

Exercise: Planmaking, Surprise Anticipation, and “Baba is You”

Raemon

24 Feb 2024 20:33 UTC

72 points

31 comments 6 min read LW link

In search of God.

Spiritus Dei

24 Feb 2024 18:59 UTC

−19 points

3 comments 7 min read LW link

Impossibility of Anthropocentric-Alignment

False Name

24 Feb 2024 18:31 UTC

−8 points

2 comments 39 min read LW link

The Inner Alignment Problem

Jakub Halmeš

24 Feb 2024 17:55 UTC

1 point

1 comment 3 min read LW link

(jakubhalmes.substack.com)

We Need Major, But Not Radical, FDA Reform

Maxwell Tabarrok

24 Feb 2024 16:54 UTC

42 points

12 comments 7 min read LW link

(www.maximum-progress.com)

After Overmorrow: Scattered Musings on the Immediate Post-AGI World

Yuli_Ban

24 Feb 2024 15:49 UTC

−3 points

0 comments 26 min read LW link

[Question] CDT vs. EDT on Deterrence

Terence Coelho

24 Feb 2024 15:41 UTC

1 point

9 comments 1 min read LW link

Balancing Games

jefftk

24 Feb 2024 14:40 UTC

63 points

18 comments 1 min read LW link

(www.jefftk.com)

How well do truth probes generalise?

mishajw

24 Feb 2024 14:12 UTC

96 points

11 comments 9 min read LW link

Rawls’s Veil of Ignorance Doesn’t Make Any Sense

Arjun Panickssery

24 Feb 2024 13:18 UTC

9 points

9 comments 1 min read LW link

[Question] Can someone explain to me what went wrong with ChatGPT?

Valentin Baltadzhiev

24 Feb 2024 11:50 UTC

9 points

1 comment 1 min read LW link

The Sense Of Physical Necessity: A Naturalism Demo (Introduction)

LoganStrohl

24 Feb 2024 2:56 UTC

59 points

1 comment 6 min read LW link

Instrumental deception and manipulation in LLMs—a case study

Olli Järviniemi

24 Feb 2024 2:07 UTC

39 points

13 comments 12 min read LW link

A starting point for making sense of task structure (in machine learning)

Kaarel, RP and jake_mendel

24 Feb 2024 1:51 UTC

51 points

2 comments 12 min read LW link

Why you, personally, should want a larger human population

jasoncrawford

23 Feb 2024 19:48 UTC

32 points

33 comments 5 min read LW link

(rootsofprogress.org)

Deliberative Cognitive Algorithms as Scaffolding

Cole Wyeth

23 Feb 2024 17:15 UTC

21 points

4 comments 3 min read LW link

The Shutdown Problem: Incomplete Preferences as a Solution

Elliott Thornley

23 Feb 2024 16:01 UTC

62 points

33 comments 41 min read LW link

In set theory, everything is a set

Jacob G-W

23 Feb 2024 14:35 UTC

12 points

9 comments 2 min read LW link

The role of philosophical thinking in understanding large language models: Calibrating and closing the gap between first-person experience and underlying mechanisms

Bill Benzon

23 Feb 2024 12:19 UTC

4 points

0 comments 10 min read LW link

Deep and obvious points in the gap between your thoughts and your pictures of thought

KatjaGrace

23 Feb 2024 7:30 UTC

54 points

7 comments 1 min read LW link 1 review

(worldspiritsockpuppet.com)

Parasocial relationship logic

KatjaGrace

23 Feb 2024 7:30 UTC

26 points

2 comments 1 min read LW link

(worldspiritsockpuppet.com)

Shaming with and without naming

KatjaGrace

23 Feb 2024 7:30 UTC

19 points

5 comments 2 min read LW link

(worldspiritsockpuppet.com)

Complexity of value but not disvalue implies more focus on s-risk. Moral uncertainty and preference utilitarianism also do.

Chi Nguyen

23 Feb 2024 6:10 UTC

53 points

18 comments 2 min read LW link

[Question] Does increasing the power of a multimodal LLM get you an agentic AI?

yanni kyriacos

23 Feb 2024 4:14 UTC

3 points

3 comments 1 min read LW link

Popular conceptions of “boundaries” don’t make sense

Chris Lakin

23 Feb 2024 1:09 UTC

12 points

5 comments 1 min read LW link 2 reviews

(chrislakin.blog)

Contra Ngo et al. “Every ‘Every Bay Area House Party’ Bay Area House Party”

Ricki Heicklen

22 Feb 2024 23:56 UTC

191 points

5 comments 4 min read LW link

(bayesshammai.substack.com)

AI #52: Oops

Zvi

22 Feb 2024 21:50 UTC

50 points

9 comments 29 min read LW link

(thezvi.wordpress.com)

Embed your second brain in your first brain

dkl9

22 Feb 2024 21:46 UTC

10 points

3 comments 1 min read LW link

(dkl9.net)

The Gemini Incident

Zvi

22 Feb 2024 21:00 UTC

80 points

19 comments 18 min read LW link

(thezvi.wordpress.com)

Some Thoughts On Using Auctions For Land Valuation

harsimony

22 Feb 2024 19:54 UTC

0 points

9 comments 9 min read LW link

(progressandpoverty.substack.com)

The Binding of Isaac & Transparent Newcomb’s Problem

suvjectibity

22 Feb 2024 18:56 UTC

−10 points

0 comments 10 min read LW link

Language Models Don’t Learn the Physical Manifestation of Language

Bruce W. Lee and Jaehyuk Lim

22 Feb 2024 18:52 UTC

39 points

23 comments 1 min read LW link

(arxiv.org)

Sora What

Zvi

22 Feb 2024 18:10 UTC

47 points

3 comments 9 min read LW link

(thezvi.wordpress.com)

Do sparse autoencoders find “true features”?

Demian Till

22 Feb 2024 18:06 UTC

76 points

33 comments 11 min read LW link

Everything Wrong with Roko’s Claims about an Engineered Pandemic

WitheringWeights

22 Feb 2024 15:59 UTC

97 points

11 comments 16 min read LW link

The One and a Half Gemini

Zvi

22 Feb 2024 13:10 UTC

73 points

4 comments 8 min read LW link

(thezvi.wordpress.com)

[Question] How do I make predictions about the future to make sense of what to do with my life?

Raj Thimmiah

22 Feb 2024 11:22 UTC

8 points

1 comment 1 min read LW link

How are voluntary commitments on vulnerability reporting going?

Adam Jones

22 Feb 2024 8:43 UTC

23 points

1 comment 1 min read LW link

(adamjones.me)

Notes on Internal Objectives in Toy Models of Agents

Paul Colognese

22 Feb 2024 8:02 UTC

16 points

0 comments 8 min read LW link

The Byronic Hero Always Loses

Cole Wyeth

22 Feb 2024 1:31 UTC

32 points

4 comments 2 min read LW link

Job Listing: Managing Editor / Writer

Gretta Duleba

21 Feb 2024 23:41 UTC

43 points

2 comments 1 min read LW link

The Pareto Best and the Curse of Doom

Screwtape

21 Feb 2024 23:10 UTC

132 points

22 comments 9 min read LW link 1 review

AISN #31: A New AI Policy Bill in California Plus, Precedents for AI Governance and The EU AI Office

Dan H

21 Feb 2024 21:58 UTC

17 points

0 comments 6 min read LW link

(newsletter.safe.ai)

Analogies between scaling labs and misaligned superintelligent AI

scasper

21 Feb 2024 19:29 UTC

77 points

5 comments 4 min read LW link

Extinction Risks from AI: Invisible to Science?

VojtaKovarik, Chris van Merwijk and Ida Mattsson

21 Feb 2024 18:07 UTC

24 points

7 comments 1 min read LW link

(arxiv.org)

Extinction-level Goodhart’s Law as a Property of the Environment

VojtaKovarik and Ida Mattsson

21 Feb 2024 17:56 UTC

23 points

0 comments 10 min read LW link