Problems with predictive history classes

dkl9 20 Jul 2023 23:28 UTC

15 points

5 comments · 1 min read · LW link

Announcement: AI Narrations Available for All New LessWrong Posts

Solenoid_Entity, Ruby, Raemon, peter_hartree and TYPE III AUDIO

20 Jul 2023 22:17 UTC

71 points

28 comments · 1 min read · LW link

AI #21: The Cup Overfloweth

Zvi 20 Jul 2023 21:30 UTC

47 points

4 comments · 64 min read · LW link

(thezvi.wordpress.com)

All AGI Safety questions welcome (especially basic ones) [July 2023]

smallsilo 20 Jul 2023 20:20 UTC

38 points

42 comments · 2 min read · LW link

(forum.effectivealtruism.org)

Growth of Publicly Available Genetic Sequencing Data

jefftk 20 Jul 2023 19:50 UTC

11 points

2 comments · 1 min read · LW link

(www.jefftk.com)

Progress links and tweets, 2023-07-20: “A goddess enthroned on a car”

jasoncrawford 20 Jul 2023 18:28 UTC

12 points

4 comments · 2 min read · LW link

(rootsofprogress.org)

Boundary Placement Rebellion

tailcalled 20 Jul 2023 17:40 UTC

54 points

21 comments · 12 min read · LW link

Going Beyond Linear Mode Connectivity: The Layerwise Linear Feature Connectivity

zhanpeng_zhou 20 Jul 2023 17:38 UTC

22 points

13 comments · 3 min read · LW link

(openreview.net)

Even Superhuman Go AIs Have Surprising Failure Modes

AdamGleave, EuanMcLean, Tony Wang, Kellin Pelrine, Tom Tseng, Yawen Duan, Joseph Miller and MichaelDennis

20 Jul 2023 17:31 UTC

131 points

22 comments · 10 min read · LW link

(far.ai)

Paper digestion: “May We Have Your Attention Please? Human-Rights NGOs and the Problem of Global Communication”

Klara Helene Nielsen 20 Jul 2023 17:08 UTC

4 points

1 comment · 2 min read · LW link

(journals.sagepub.com)

The (short) case for predicting what Aliens value

Jim Buhler 20 Jul 2023 15:25 UTC

17 points

5 comments · 3 min read · LW link

Does Circuit Analysis Interpretability Scale? Evidence from Multiple Choice Capabilities in Chinchilla

Neel Nanda, Tom Lieberum, Matthew Rahtz, János Kramár, Geoffrey Irving, Rohin Shah and Vlad Mikulik

20 Jul 2023 10:50 UTC

44 points

3 comments · 2 min read · LW link

(arxiv.org)

Speculative inferences about path dependence in LLM supervised fine-tuning from results on linear mode connectivity and model souping

RobertKirk 20 Jul 2023 9:56 UTC

39 points

2 comments · 5 min read · LW link

A case for gamete personhood (reductio ad absurdum)

Ansyn1312 20 Jul 2023 8:25 UTC

−1 points

4 comments · 1 min read · LW link

Contra Contra the Social Model of Disability

DirectedEvolution 20 Jul 2023 6:59 UTC

21 points

22 comments · 16 min read · LW link

[Question] Do you speed up capabilities when you do AI integrations and consume overhangs?

Michael Tontchev 20 Jul 2023 6:40 UTC

6 points

1 comment · 1 min read · LW link

Project Lawful Audiobook: An Unofficial Fan Production with ElevenLabs AI

Askwho 19 Jul 2023 23:34 UTC

23 points

3 comments · 1 min read · LW link

(askwhocastsai.substack.com)

Using predictors in corrigible systems

porby 19 Jul 2023 22:29 UTC

21 points

6 comments · 27 min read · LW link

mental number lines

bhauth 19 Jul 2023 21:01 UTC

10 points

5 comments · 1 min read · LW link

[Question] Any suggestions for an impactful master’s thesis in Political Science?

Klara Helene Nielsen 19 Jul 2023 17:44 UTC

1 point

0 comments · 1 min read · LW link

Incident reporting for AI safety

Zach Stein-Perlman 19 Jul 2023 17:00 UTC

22 points

0 comments · 18 min read · LW link

Alignment Grantmaking is Funding-Limited Right Now

johnswentworth 19 Jul 2023 16:49 UTC

312 points

68 comments · 1 min read · LW link

Zener Science

Screwtape 19 Jul 2023 16:40 UTC

16 points

11 comments · 6 min read · LW link

Tallinn, Estonia ACX Summer Meetup

Andrew 19 Jul 2023 16:22 UTC

1 point

1 comment · 1 min read · LW link

Desiderata for an AI

Nathan Helm-Burger 19 Jul 2023 16:18 UTC

9 points

0 comments · 4 min read · LW link

Valuism—an approach to life for you to consider

spencerg 19 Jul 2023 15:23 UTC

17 points

2 comments · 1 min read · LW link

Hedonic Loops and Taming RL

beren 19 Jul 2023 15:12 UTC

20 points

14 comments · 9 min read · LW link

[Question] What Caused the Puzzling Decline in Activism Against Policy Violence Towards Black People?

ChristianKl 19 Jul 2023 14:40 UTC

12 points

2 comments · 1 min read · LW link

Lisa Feldman Barrett versus Paul Ekman on facial expressions & basic emotions

Steven Byrnes 19 Jul 2023 14:26 UTC

42 points

26 comments · 15 min read · LW link

AISN#15: China and the US take action to regulate AI, results from a tournament forecasting AI risk, updates on xAI’s plan, and Meta releases its open-source and commercially available Llama 2

Corin Katzke and Dan H

19 Jul 2023 13:01 UTC

16 points

0 comments · 6 min read · LW link

(newsletter.safe.ai)

Technological solutions to the climate crisis

dominicq 19 Jul 2023 12:39 UTC

6 points

5 comments · 3 min read · LW link

(sundaystopwatch.eu)

Secret Cosmos: Introduction

Al Link 19 Jul 2023 11:51 UTC

−35 points

3 comments · 14 min read · LW link

(allink.substack.com)

Critiques of prominent AI safety organizations: Introduction

Omega. 19 Jul 2023 6:54 UTC

7 points

0 comments · 5 min read · LW link

(forum.effectivealtruism.org)

House Grocery Spending

jefftk 19 Jul 2023 3:00 UTC

13 points

0 comments · 5 min read · LW link

(www.jefftk.com)

A brief history of computers

Adam Zerner 19 Jul 2023 2:59 UTC

72 points

18 comments · 33 min read · LW link

Simple alignment plan that maybe works

Iknownothing 18 Jul 2023 22:48 UTC

4 points

8 comments · 1 min read · LW link

Prospera-dump

tailcalled 18 Jul 2023 21:36 UTC

11 points

16 comments · 1 min read · LW link

Tiny Mech Interp Projects: Emergent Positional Embeddings of Words

Neel Nanda 18 Jul 2023 21:24 UTC

52 points

1 comment · 9 min read · LW link

Quick Thoughts on Language Models

RohanS 18 Jul 2023 20:38 UTC

6 points

0 comments · 4 min read · LW link

Still no Lie Detector for LLMs

Daniel Herrmann and ben_levinstein

18 Jul 2023 19:56 UTC

50 points

3 comments · 21 min read · LW link

Meta announces Llama 2; “open sources” it for commercial use

LawrenceC 18 Jul 2023 19:28 UTC

46 points

12 comments · 1 min read · LW link

(about.fb.com)

The Rope Management Theory: A Comprehensive Approach to Modulating Reward Perception and Mitigating Hedonic Adaptation

Eris Discordia 18 Jul 2023 17:45 UTC

−23 points

2 comments · 3 min read · LW link

AI Impacts Quarterly Newsletter, Apr-Jun 2023

Harlan and Richard Korzekwa

18 Jul 2023 17:14 UTC

6 points

0 comments · 3 min read · LW link

(blog.aiimpacts.org)

Clever arguers give weak evidence, not zero

dkl9 18 Jul 2023 17:07 UTC

7 points

2 comments · 1 min read · LW link

(dkl9.net)

Measuring and Improving the Faithfulness of Model-Generated Reasoning

Ansh Radhakrishnan, tamera, karinanguyen, Sam Bowman and Ethan Perez

18 Jul 2023 16:36 UTC

111 points

15 comments · 6 min read · LW link · 1 review

[Question] Least-problematic Resource for learning RL?

Dalcy 18 Jul 2023 16:30 UTC

24 points

9 comments · 1 min read · LW link

Charter Cities: why they’re exciting & how they might work

Jackson Wagner 18 Jul 2023 13:57 UTC

21 points

7 comments · 8 min read · LW link

Train for incorrigibility, then reverse it (Shutdown Problem Contest Submission)

Daniel_Eth 18 Jul 2023 8:26 UTC

9 points

1 comment · 2 min read · LW link

The shape of AGI: Cartoons and back of envelope

Boaz Barak 17 Jul 2023 20:57 UTC

33 points

19 comments · 6 min read · LW link · 1 review

Predictive history classes

dkl9 17 Jul 2023 20:48 UTC

69 points

17 comments · 2 min read · LW link

(dkl9.net)