Bias-Augmented Consistency Training Reduces Biased Reasoning in Chain-of-Thought

Miles Turpin 11 Mar 2024 23:46 UTC

16 points

0 comments · 1 min read · LW link

(arxiv.org)

AI Safety Action Plan—A report commissioned by the US State Department

agucova 11 Mar 2024 22:14 UTC

22 points

1 comment · 1 min read · LW link

(www.gladstone.ai)

A discussion of AI risk and the cost/benefit calculation of stopping or pausing AI development

DuncanFowler 11 Mar 2024 21:41 UTC

1 point

0 comments · 1 min read · LW link

Among the A.I. Doomsayers—The New Yorker

agucova 11 Mar 2024 21:35 UTC

12 points

1 comment · 1 min read · LW link

(www.newyorker.com)

Be More Katja

Nathan Young 11 Mar 2024 21:12 UTC

53 points

0 comments · 3 min read · LW link

AI Incident Reporting: A Regulatory Review

Deric Cheng and Elliot Mckernon

11 Mar 2024 21:03 UTC

16 points

0 comments · 6 min read · LW link

Results from an Adversarial Collaboration on AI Risk (FRI)

Josh Rosenberg, AvitalM, Molly and rosehadshar

11 Mar 2024 20:00 UTC

61 points

3 comments · 9 min read · LW link

(forecastingresearch.org)

The Astronomical Sacrifice Dilemma

Matthew McRedmond 11 Mar 2024 19:58 UTC

15 points

3 comments · 4 min read · LW link

Epiphenomenalism leads to eliminativism about qualia

Clément L 11 Mar 2024 19:53 UTC

4 points

0 comments · 7 min read · LW link

The Best Essay (Paul Graham)

Chris_Leong 11 Mar 2024 19:25 UTC

25 points

2 comments · 1 min read · LW link

(paulgraham.com)

Open Thread Spring 2024

habryka 11 Mar 2024 19:17 UTC

22 points

162 comments · 1 min read · LW link

New social credit formalizations

KatjaGrace 11 Mar 2024 19:00 UTC

23 points

3 comments · 2 min read · LW link

(worldspiritsockpuppet.com)

How disagreements about Evidential Correlations could be settled

Martín Soto 11 Mar 2024 18:28 UTC

12 points

3 comments · 4 min read · LW link

“Artificial General Intelligence”: an extremely brief FAQ

Steven Byrnes 11 Mar 2024 17:49 UTC

75 points

6 comments · 2 min read · LW link

Some (problematic) aesthetics of what constitutes good work in academia

Steven Byrnes 11 Mar 2024 17:47 UTC

158 points

12 comments · 12 min read · LW link

Storable Votes with a Pay as you win mechanism: a contribution for institutional design

Arturo Macias 11 Mar 2024 15:58 UTC

17 points

19 comments · 2 min read · LW link

Tend to your clarity, not your confusion

Severin T. Seehrich 11 Mar 2024 15:09 UTC

23 points

1 comment · 6 min read · LW link

[Question] What do we know about the AI knowledge and views, especially about existential risk, of the new OpenAI board members?

Zvi 11 Mar 2024 14:55 UTC

60 points

2 comments · 2 min read · LW link

“How could I have thought that faster?”

mesaoptimizer 11 Mar 2024 10:56 UTC

256 points

37 comments · 2 min read · LW link · 4 reviews

(twitter.com)

Simple versus Short: Higher-order degeneracy and error-correction

Daniel Murfet 11 Mar 2024 7:52 UTC

115 points

12 comments · 13 min read · LW link · 3 reviews

Deconstructing Bostrom’s Classic Argument for AI Doom

Nora Belrose 11 Mar 2024 5:58 UTC

16 points

15 comments · 1 min read · LW link

(www.youtube.com)

Advice Needed: Does Using a LLM Compromise My Personal Epistemic Security?

Naomi 11 Mar 2024 5:57 UTC

17 points

7 comments · 2 min read · LW link

Some Thoughts on Concept Formation and Use in Agents

CatGoddess 11 Mar 2024 5:03 UTC

12 points

0 comments · 8 min read · LW link

Steelmanning as an especially insidious form of strawmanning

Cornelius Dybdahl 11 Mar 2024 2:25 UTC

10 points

13 comments · 5 min read · LW link

One-shot strategy games?

Raemon 11 Mar 2024 0:19 UTC

41 points

42 comments · 1 min read · LW link

Understanding SAE Features with the Logit Lens

Joseph Bloom and Johnny Lin

11 Mar 2024 0:16 UTC

71 points

2 comments · 14 min read · LW link

Replacing the Water Heater’s Anode

jefftk 11 Mar 2024 0:00 UTC

22 points

0 comments · 2 min read · LW link

(www.jefftk.com)

Briefly Extending Differential Optimization to Distributions

J Bostock 10 Mar 2024 20:41 UTC

4 points

0 comments · 2 min read · LW link

Evolution did a surprisingly good job at aligning humans...to social status

Eli Tyre 10 Mar 2024 19:34 UTC

62 points

46 comments · 1 min read · LW link · 1 review

Pausing AI is Positive Expected Value

Liron 10 Mar 2024 17:10 UTC

9 points

2 comments · 3 min read · LW link

(twitter.com)

W2SG: Introduction

Maria Kapros 10 Mar 2024 16:25 UTC

2 points

2 comments · 10 min read · LW link

An Optimistic Solution to the Fermi Paradox

Glenn Clayton 10 Mar 2024 14:39 UTC

4 points

6 comments · 13 min read · LW link

Counterfactual Civilization Simulation Version −1.0 aka my application to Johannes Mayer’s SPAR project

Morphism 10 Mar 2024 10:10 UTC

1 point

0 comments · 14 min read · LW link

Notes from a Prompt Factory

Richard_Ngo 10 Mar 2024 5:13 UTC

114 points

19 comments · 9 min read · LW link

(www.narrativeark.xyz)

Investigating Basin Volume with XOR Networks

CatGoddess 10 Mar 2024 1:35 UTC

10 points

0 comments · 5 min read · LW link

[Linkpost] MindEye2: Shared-Subject Models Enable fMRI-To-Image With 1 Hour of Data

Bogdan Ionut Cirstea 10 Mar 2024 1:30 UTC

10 points

0 comments · 1 min read · LW link

(openreview.net)

0th Person and 1st Person Logic

Adele Lopez 10 Mar 2024 0:56 UTC

63 points

29 comments · 6 min read · LW link

Completion Estimates

Commander Zander 9 Mar 2024 22:56 UTC

7 points

2 comments · 3 min read · LW link

Semi-Simplicial Types, Part I: Motivation and History

astradiol 9 Mar 2024 22:07 UTC

20 points

3 comments · 10 min read · LW link

Distinctions when Discussing Utility Functions

ozziegooen 9 Mar 2024 20:14 UTC

24 points

7 comments · 8 min read · LW link

What is progress?

jasoncrawford 9 Mar 2024 16:28 UTC

10 points

4 comments · 6 min read · LW link

(rootsofprogress.org)

Fifteen Lawsuits against OpenAI

Remmelt 9 Mar 2024 12:22 UTC

27 points

4 comments · 1 min read · LW link

Cambridge ACX/SSC monthly meetup (location changed to Fort St George!)

hamishtodd1 9 Mar 2024 11:10 UTC

2 points

0 comments · 1 min read · LW link

MA E-ZPass Without a Car?

jefftk 9 Mar 2024 2:40 UTC

15 points

2 comments · 1 min read · LW link

(www.jefftk.com)

Closeness To the Issue (Part 5 of “The Sense Of Physical Necessity”)

LoganStrohl 9 Mar 2024 0:36 UTC

36 points

1 comment · 15 min read · LW link · 1 review

Exploring the Evolution and Migration of Different Layer Embedding in LLMs

Ruixuan Huang 8 Mar 2024 15:01 UTC

6 points

0 comments · 8 min read · LW link

[Question] When and why did ‘training’ become ‘pretraining’?

beren 8 Mar 2024 14:29 UTC

16 points

6 comments · 1 min read · LW link

A T-o-M test: ‘popcorn’ or ‘chocolate’

MiguelDev 8 Mar 2024 4:24 UTC

20 points

13 comments · 1 min read · LW link

Scenario Forecasting Workshop: Materials and Learnings

elifland and Charlie Griffin

8 Mar 2024 2:30 UTC

50 points

3 comments · 2 min read · LW link

Forecasting future gains due to post-training enhancements

elifland, Joel Becker and simeon_c

8 Mar 2024 2:11 UTC

31 points

2 comments · 1 min read · LW link

(docs.google.com)