The purpose of the (Mosaic) law

mruwnik 4 Sep 2023 23:38 UTC

7 points

5 comments 6 min read LW link

Against the Open Source / Closed Source Dichotomy: Regulated Source as a Model for Responsible AI Development

alex.herwix 4 Sep 2023 20:25 UTC

4 points

12 comments 6 min read LW link

(forum.effectivealtruism.org)

Notes on nukes, IR, and AI from “Arsenals of Folly” (and other books)

tlevin 4 Sep 2023 19:02 UTC

11 points

0 comments 6 min read LW link

Hertford, Sourbut (rationality lessons from University Challenge)

Oliver Sourbut 4 Sep 2023 18:44 UTC

30 points

7 comments 14 min read LW link

(www.oliversourbut.net)

a rant on politician-engineer coalitional conflict

bhauth 4 Sep 2023 17:15 UTC

64 points

12 comments 4 min read LW link

How ForumMagnum builds communities of inquiry

Jim Fisher 4 Sep 2023 16:52 UTC

35 points

21 comments 5 min read LW link

Interpreting a matrix-valued word embedding with a mathematically proven characterization of all optima

Joseph Van Name 4 Sep 2023 16:19 UTC

3 points

4 comments 12 min read LW link

Hard Questions Are Language Bugs

George3d6 4 Sep 2023 14:44 UTC

30 points

13 comments 7 min read LW link

(ontologi.cc)

Defunding My Mistake

ymeskhout 4 Sep 2023 14:43 UTC

185 points

41 comments 6 min read LW link

The omnizoid—Heighn FDT Debate #1: Why FDT Isn’t Crazy

Heighn 4 Sep 2023 12:57 UTC

24 points

4 comments 6 min read LW link

Paper: On measuring situational awareness in LLMs

Owain_Evans, Daniel Kokotajlo, Mikita Balesni, Tomek Korbak, Asa Cooper Stickland, Meg and Maximilian Kaufmann

4 Sep 2023 12:54 UTC

111 points

17 comments 5 min read LW link

(arxiv.org)

Impending AGI doesn’t make everything else unimportant

Igor Ivanov 4 Sep 2023 12:34 UTC

29 points

12 comments 5 min read LW link

Open Thread – Autumn 2023

Raemon 3 Sep 2023 22:54 UTC

26 points

113 comments 1 min read LW link

What must be the case that ChatGPT would have memorized “To be or not to be”? – Three kinds of conceptual objects for LLMs

Bill Benzon 3 Sep 2023 18:39 UTC

19 points

0 comments 12 min read LW link

Fundamental question: What determines a mind’s effects?

TsviBT 3 Sep 2023 17:15 UTC

16 points

4 comments 13 min read LW link

An embedding decoder model, trained with a different objective on a different dataset, can decode another model’s embeddings surprisingly accurately

Logan Zoellner 3 Sep 2023 11:34 UTC

20 points

1 comment 1 min read LW link

Series of absurd upgrades in nature’s great search

lemonhope 3 Sep 2023 9:35 UTC

15 points

8 comments 1 min read LW link

Conservation of Expected Evidence and Random Sampling in Anthropics

Ape in the coat 3 Sep 2023 6:55 UTC

9 points

9 comments 7 min read LW link

The goal of physics

Jim Pivarski 2 Sep 2023 23:08 UTC

47 points

4 comments 5 min read LW link

Will value of paid sex drop right before the end of the world?

azamatvaliev 2 Sep 2023 19:03 UTC

−9 points

0 comments 4 min read LW link

PIBBSS Summer Symposium 2023

Nora_Ammann and DusanDNesic

2 Sep 2023 17:22 UTC

25 points

2 comments 3 min read LW link

The smallest possible button (or: moth traps!)

Neil 2 Sep 2023 15:24 UTC

126 points

18 comments 3 min read LW link

(neilwarren.substack.com)

Steven Harnad: Symbol grounding and the structure of dictionaries

Bill Benzon 2 Sep 2023 12:28 UTC

5 points

3 comments 2 min read LW link

Is Metaethics Unnecessary Given Intent-Aligned AI?

Caleb Biddulph 2 Sep 2023 9:48 UTC

12 points

0 comments 7 min read LW link

Rational Agents Cooperate in the Prisoner’s Dilemma

Isaac King 2 Sep 2023 6:15 UTC

17 points

68 comments 12 min read LW link

[Linkpost] Large language models converge toward human-like concept organization

Bogdan Ionut Cirstea 2 Sep 2023 6:00 UTC

22 points

1 comment 1 min read LW link

Plum Cooking Temperature

jefftk 2 Sep 2023 1:30 UTC

11 points

0 comments 1 min read LW link

(www.jefftk.com)

[Question] What did you learn from leaked documents?

wassname 2 Sep 2023 1:28 UTC

15 points

10 comments 1 min read LW link

One Minute Every Moment

abramdemski 1 Sep 2023 20:23 UTC

126 points

24 comments 3 min read LW link

Tensor Trust: An online game to uncover prompt injection vulnerabilities

Luke Bailey and qxcv

1 Sep 2023 19:31 UTC

30 points

0 comments 5 min read LW link

(tensortrust.ai)

Reproducing ARC Evals’ recent report on language model agents

Thomas Broadley 1 Sep 2023 16:52 UTC

104 points

17 comments 3 min read LW link

(thomasbroadley.com)

[Question] Why aren’t more people in AIS familiar with PDP?

Prometheus 1 Sep 2023 15:27 UTC

12 points

9 comments 1 min read LW link

AGI isn’t just a technology

Seth Herd 1 Sep 2023 14:35 UTC

18 points

12 comments 2 min read LW link

Can an LLM identify ring-composition in a literary text? [ChatGPT]

Bill Benzon 1 Sep 2023 14:18 UTC

4 points

2 comments 11 min read LW link

What is OpenAI’s plan for making AI Safer?

brook 1 Sep 2023 11:15 UTC

6 points

0 comments 4 min read LW link

(aisafetyexplained.substack.com)

Progress links digest, 2023-09-01: How ancient people manipulated water, and more

jasoncrawford 1 Sep 2023 4:33 UTC

13 points

4 comments 6 min read LW link

(rootsofprogress.org)

A Golden Age of Building? Excerpts and lessons from Empire State, Pentagon, Skunk Works and SpaceX

Bird Concept 1 Sep 2023 4:03 UTC

188 points

26 comments 24 min read LW link 1 review

Meta Questions about Metaphilosophy

Wei Dai 1 Sep 2023 1:17 UTC

165 points

80 comments 3 min read LW link

[Linkpost] Michael Nielsen remarks on ‘Oppenheimer’

22tom 31 Aug 2023 15:46 UTC

78 points

7 comments 2 min read LW link

(michaelnotebook.com)

My thoughts on AI and personal future plan after learning about AI Safety for 4 months

Ziyue Wang 31 Aug 2023 15:32 UTC

7 points

0 comments 4 min read LW link

Which Questions Are Anthropic Questions?

dadadarren 31 Aug 2023 15:15 UTC

16 points

13 comments 3 min read LW link

The Tree of Life, and a Note on Job

Bill Benzon 31 Aug 2023 14:03 UTC

13 points

7 comments 4 min read LW link

Cleaning a SoundCraft Mixer

jefftk 31 Aug 2023 13:20 UTC

11 points

0 comments 1 min read LW link

(www.jefftk.com)

AI #27: Portents of Gemini

Zvi 31 Aug 2023 12:40 UTC

54 points

37 comments 47 min read LW link

(thezvi.wordpress.com)

[CANCELLED DUE TO ILLNESS] San Francisco ACX Meetup “First Saturday”

guenael 31 Aug 2023 12:34 UTC

1 point

0 comments 1 min read LW link

Long-Term Future Fund Ask Us Anything (September 2023)

Linch, calebp99, abergal, habryka, Thomas Larsen, LawrenceC and Lauro Langosco

31 Aug 2023 0:28 UTC

33 points

6 comments 1 min read LW link

(forum.effectivealtruism.org)

Responses to apparent rationalist confusions about game / decision theory

Anthony DiGiovanni 30 Aug 2023 22:02 UTC

143 points

20 comments 12 min read LW link 1 review

Invulnerable Incomplete Preferences: A Formal Statement

SCP 30 Aug 2023 21:59 UTC

139 points

39 comments 24 min read LW link

Report on Frontier Model Training

YafahEdelman 30 Aug 2023 20:02 UTC

124 points

21 comments 21 min read LW link

(docs.google.com)

An adversarial example for Direct Logit Attribution: memory management in gelu-4l

Can, Yeu-Tong Lau, James Dao and Jett Janiak

30 Aug 2023 17:36 UTC

17 points

0 comments 8 min read LW link

(arxiv.org)