NYT on the Manifest forecasting conference

Austin Chen 9 Oct 2023 21:40 UTC

45 points

14 comments 2 min read LW link

(www.nytimes.com)

Forecasting and prediction markets

CarlJ 9 Oct 2023 20:43 UTC

3 points

0 comments 1 min read LW link

Comparing Two Forecasters in an Ideal World

nikos 9 Oct 2023 19:52 UTC

5 points

0 comments 6 min read LW link

The case for aftermarket blind spot mirrors

Brendan Long 9 Oct 2023 19:30 UTC

60 points

14 comments 2 min read LW link

(www.brendanlong.com)

New contractor role: Web security task force contractor for AI safety announcements

Ethan Ashkie and Andrew_Critch

9 Oct 2023 18:36 UTC

11 points

0 comments 2 min read LW link

(survivalandflourishing.com)

[Question] Anyone working on D. Amodei’s Bartlett show transcript?

Leopard 9 Oct 2023 18:17 UTC

10 points

0 comments 1 min read LW link

Knowledge Base 3: Shopping advisor and other uses of knowledge base about products

iwis 9 Oct 2023 11:53 UTC

0 points

0 comments 4 min read LW link

Knowledge Base 2: The structure and the method of building

iwis 9 Oct 2023 11:53 UTC

2 points

4 comments 7 min read LW link

We don’t understand what happened with culture enough

Jan_Kulveit 9 Oct 2023 9:54 UTC

88 points

22 comments 6 min read LW link 1 review

Leveraging Bayes’ Theorem to Supercharge Memory Techniques

disoha 9 Oct 2023 3:34 UTC

−15 points

1 comment 4 min read LW link

Paper: Identifying the Risks of LM Agents with an LM-Emulated Sandbox—University of Toronto 2023 - Benchmark consisting of 36 high-stakes tools and 144 test cases!

Singularian2501 9 Oct 2023 0:00 UTC

6 points

0 comments 1 min read LW link

AI Alignment Breakthroughs this week (10/08/23)

Logan Zoellner 8 Oct 2023 23:30 UTC

32 points

14 comments 6 min read LW link

“The Heart of Gaming is the Power Fantasy”, and Cohabitive Games

Raemon 8 Oct 2023 21:02 UTC

81 points

50 comments 4 min read LW link

(bottomfeeder.substack.com)

FAQ: What the heck is goal agnosticism?

porby 8 Oct 2023 19:11 UTC

66 points

38 comments 28 min read LW link

Time is homogeneous sequentially-composable determination

TsviBT 8 Oct 2023 14:58 UTC

15 points

0 comments 21 min read LW link

Linkpost: Are Emergent Abilities in Large Language Models just In-Context Learning?

Erich_Grunewald 8 Oct 2023 12:14 UTC

12 points

7 comments 2 min read LW link

(arxiv.org)

Bird-eye view visualization of LLM activations

Sergii 8 Oct 2023 12:12 UTC

11 points

2 comments 1 min read LW link

(grgv.xyz)

Perspective Based Reasoning Could Absolve CDT

dadadarren 8 Oct 2023 11:22 UTC

4 points

5 comments 5 min read LW link

The Gradient – The Artificiality of Alignment

mic 8 Oct 2023 4:06 UTC

12 points

1 comment 5 min read LW link

(thegradient.pub)

Comparing Anthropic’s Dictionary Learning to Ours

Robert_AIZI 7 Oct 2023 23:30 UTC

137 points

8 comments 4 min read LW link

A thought about the constraints of debtlessness in online communities

mako yass 7 Oct 2023 21:26 UTC

60 points

23 comments 1 min read LW link

Arguments for utilitarianism are impossibility arguments under unbounded prospects

MichaelStJules 7 Oct 2023 21:08 UTC

7 points

7 comments 21 min read LW link

Sam Altman’s sister claims Sam sexually abused her—Part 1: Introduction, outline, author’s notes

pythagoras5015 7 Oct 2023 21:06 UTC

96 points

108 comments 8 min read LW link

Griffin Island

jefftk 7 Oct 2023 18:40 UTC

14 points

3 comments 1 min read LW link

(www.jefftk.com)

Every Mention of EA in “Going Infinite”

KirstenH 7 Oct 2023 14:42 UTC

48 points

0 comments 8 min read LW link

(open.substack.com)

Fixing Insider Threats in the AI Supply Chain

Madhav Malhotra 7 Oct 2023 13:19 UTC

20 points

2 comments 5 min read LW link

Contra Nora Belrose on Orthogonality Thesis Being Trivial

tailcalled 7 Oct 2023 11:47 UTC

18 points

21 comments 1 min read LW link

Related Discussion from Thomas Kwa’s MIRI Research Experience

Raemon 7 Oct 2023 6:25 UTC

72 points

140 comments 1 min read LW link

[Question] Current State of Probabilistic Logic

Alexander Heckett 7 Oct 2023 5:06 UTC

3 points

2 comments 1 min read LW link

On the Relationship Between Variability and the Evolutionary Outcomes of Systems in Nature

Artyom Shaposhnikov 7 Oct 2023 3:06 UTC

2 points

0 comments 1 min read LW link

Announcing Dialogues

Ben Pace 7 Oct 2023 2:57 UTC

160 points

60 comments 4 min read LW link

Don’t Dismiss Simple Alignment Approaches

Chris_Leong 7 Oct 2023 0:35 UTC

139 points

9 comments 4 min read LW link

Linking Alt Accounts

jefftk 6 Oct 2023 17:00 UTC

70 points

33 comments 1 min read LW link

(www.jefftk.com)

Super-Exponential versus Exponential Growth in Compute Price-Performance

moridinamael 6 Oct 2023 16:23 UTC

37 points

25 comments 2 min read LW link

A personal explanation of ELK concept and task.

Zeyu Qin 6 Oct 2023 3:55 UTC

1 point

0 comments 1 min read LW link

The Long-Term Future Fund is looking for a full-time fund chair

Linch, calebp and abergal

5 Oct 2023 22:18 UTC

52 points

0 comments 7 min read LW link

(forum.effectivealtruism.org)

Provably Safe AI

PeterMcCluskey 5 Oct 2023 22:18 UTC

37 points

15 comments 4 min read LW link

(bayesianinvestor.com)

Stampy’s AI Safety Info soft launch

steven0461 and Robert Miles

5 Oct 2023 22:13 UTC

120 points

9 comments 2 min read LW link

Impacts of AI on the housing markets

PottedRosePetal 5 Oct 2023 21:24 UTC

8 points

0 comments 5 min read LW link

Towards Monosemanticity: Decomposing Language Models With Dictionary Learning

Zac Hatfield-Dodds 5 Oct 2023 21:01 UTC

289 points

22 comments 2 min read LW link 1 review

(transformer-circuits.pub)

Ideation and Trajectory Modelling in Language Models

NickyP 5 Oct 2023 19:21 UTC

16 points

2 comments 10 min read LW link

A well-defined history in measurable factor spaces

Matthias G. Mayer 5 Oct 2023 18:36 UTC

25 points

0 comments 2 min read LW link

Evaluating the historical value misspecification argument

Matthew Barnett 5 Oct 2023 18:34 UTC

192 points

163 comments 7 min read LW link 3 reviews

Translations Should Invert

abramdemski 5 Oct 2023 17:44 UTC

54 points

19 comments 3 min read LW link

Censorship in LLMs is here to stay because it mirrors how our own intelligence is structured

mnvr 5 Oct 2023 17:37 UTC

3 points

0 comments 1 min read LW link

Twin Cities ACX Meetup October 2023

Timothy M. 5 Oct 2023 16:29 UTC

1 point

2 comments 1 min read LW link

This anime storyboard doesn’t exist: a graphic novel written and illustrated by GPT4

RomanS 5 Oct 2023 14:01 UTC

12 points

7 comments 55 min read LW link

AI #32: Lie Detector

Zvi 5 Oct 2023 13:50 UTC

45 points

19 comments 44 min read LW link

(thezvi.wordpress.com)

Can the House Legislate?

jefftk 5 Oct 2023 13:40 UTC

26 points

6 comments 2 min read LW link

(www.jefftk.com)

Making progress on the “what alignment target should be aimed at?” question, is urgent

ThomasCederborg 5 Oct 2023 12:55 UTC

2 points

0 comments 18 min read LW link