Starting in mechanistic interpretability

Jakub Smékal

22 Jan 2024 23:40 UTC

1 point

0 comments · 3 min read · LW link

(jakubsmekal.com)

We need a Science of Evals

Marius Hobbhahn and Jérémy Scheurer

22 Jan 2024 20:30 UTC

75 points

13 comments · 9 min read · LW link

Announcing the SoS Research Collective for independent researchers (and academics thinking independently)

rogersbacon

22 Jan 2024 20:13 UTC

15 points

0 comments · 8 min read · LW link

(www.theseedsofscience.pub)

A Brief Assessment of OpenAI’s Preparedness Framework & Some Suggestions for Improvement

simeon_c

22 Jan 2024 20:08 UTC

14 points

0 comments · 6 min read · LW link

(uploads-ssl.webflow.com)

D&D.Sci(-fi): Colonizing the SuperHyperSphere [Evaluation and Ruleset]

abstractapplic

22 Jan 2024 19:20 UTC

40 points

7 comments · 3 min read · LW link

‘ petertodd’’s last stand: The final days of open GPT-3 research

mwatkins

22 Jan 2024 18:47 UTC

109 points

16 comments · 45 min read · LW link

InterLab – a toolkit for experiments with multi-agent interactions

Tomáš Gavenčiak, Ada Böhm and Jan_Kulveit

22 Jan 2024 18:23 UTC

69 points

0 comments · 8 min read · LW link

(acsresearch.org)

San Fernando Valley Rationalist Meetup

Thomas Broadley

22 Jan 2024 16:49 UTC

3 points

1 comment · 1 min read · LW link

Who Organizes Dances?

jefftk

22 Jan 2024 14:30 UTC

12 points

0 comments · 1 min read · LW link

(www.jefftk.com)

Values Darwinism

pchvykov

22 Jan 2024 10:44 UTC

11 points

13 comments · 3 min read · LW link

[Question] The akrasia doom loop and executive function disorders: a question

TeaTieAndHat

22 Jan 2024 7:01 UTC

20 points

7 comments · 2 min read · LW link

Predicting AGI by the Turing Test

Yuxi_Liu

22 Jan 2024 4:22 UTC

21 points

2 comments · 10 min read · LW link

(yuxi-liu-wired.github.io)

Incorporating Justice Theory into Decision Theory

StrivingForLegibility

21 Jan 2024 19:17 UTC

13 points

20 comments · 5 min read · LW link

Deliberate Dysentery: Q&A about Human Challenge Trials

Niko_McCarty

21 Jan 2024 19:05 UTC

16 points

1 comment · 18 min read · LW link

(www.asimov.press)

When Does Altruism Strengthen Altruism?

jefftk

21 Jan 2024 18:50 UTC

47 points

2 comments · 3 min read · LW link

(www.jefftk.com)

A Shutdown Problem Proposal

johnswentworth and David Lorell

21 Jan 2024 18:12 UTC

126 points

67 comments · 6 min read · LW link

Is principled mass-outreach possible, for AGI X-risk?

Nicholas Kross

21 Jan 2024 17:45 UTC

9 points

5 comments · 3 min read · LW link

Vacuum: Theory and Technologies

nomagicpill

21 Jan 2024 17:23 UTC

34 points

0 comments · 25 min read · LW link

(210ethan.github.io)

Another Non-Anthropic Paradox: The Unsurprising Rareness of Rare Events

Ape in the coat

21 Jan 2024 15:58 UTC

21 points

16 comments · 6 min read · LW link

Book review: Cuisine and Empire

eukaryote

21 Jan 2024 6:15 UTC

40 points

2 comments · 12 min read · LW link

(eukaryotewritesblog.com)

Catalogue of POLITICO Reports and Other Cited Articles on Effective Altruism and AI Safety Connections in Washington, DC

Evan_Gaensbauer

21 Jan 2024 2:15 UTC

4 points

0 comments · 1 min read · LW link

(docs.google.com)

You can rack up massive amounts of data quickly by asking questions to all your friends

Neil

21 Jan 2024 1:27 UTC

14 points

2 comments · 2 min read · LW link

[Question] Party for biomedical rejuvenation research: European parliament elections

Iakov Dudinsky

21 Jan 2024 0:35 UTC

2 points

0 comments · 1 min read · LW link

[Question] Why have insurance markets succeeded where prediction markets have not?

JNank

21 Jan 2024 0:35 UTC

13 points

13 comments · 1 min read · LW link

[linkpost] Self-Rewarding Language Models

Jacob G-W

21 Jan 2024 0:30 UTC

13 points

2 comments · 1 min read · LW link

(arxiv.org)

Why Improving Dialogue Feels So Hard

matto

20 Jan 2024 21:26 UTC

22 points

8 comments · 3 min read · LW link

Research Log, RLLMv2: Phi-1.5, GPT2XL and Falcon-RW-1B as paperclip maximizers

MiguelDev

20 Jan 2024 15:30 UTC

6 points

0 comments · 10 min read · LW link

Against the Burden of Knowledge

Maxwell Tabarrok

20 Jan 2024 14:37 UTC

22 points

6 comments · 6 min read · LW link

(maximumprogress.substack.com)

legged robot scaling laws

bhauth

20 Jan 2024 5:45 UTC

34 points

8 comments · 7 min read · LW link

(www.bhauth.com)

Legibility Makes Logical Line-Of-Sight Transitive

StrivingForLegibility

19 Jan 2024 23:39 UTC

13 points

0 comments · 5 min read · LW link

Decent plan prize winner & highlights

lemonhope

19 Jan 2024 23:30 UTC

25 points

2 comments · 4 min read · LW link

A quick investigation of AI pro-AI bias

Fabien Roger

19 Jan 2024 23:26 UTC

55 points

1 comment · 2 min read · LW link

On “Geeks, MOPs, and Sociopaths”

alkjash and Gordon Seidoh Worley

19 Jan 2024 21:04 UTC

31 points

35 comments · 8 min read · LW link

There is way too much serendipity

Malmesbury

19 Jan 2024 19:37 UTC

401 points

59 comments · 7 min read · LW link · 1 review

Estimating efficiency improvements in LLM pre-training

Daan

19 Jan 2024 19:32 UTC

43 points

3 comments · 21 min read · LW link

Update: Orienting Ourselves in 2024 | Guild of the ROSE

moridinamael

19 Jan 2024 16:48 UTC

14 points

0 comments · 1 min read · LW link

(guildoftherose.org)

I Want XMP But I Know Why I Can’t Have It

jefftk

19 Jan 2024 15:30 UTC

23 points

0 comments · 3 min read · LW link

(www.jefftk.com)

Arguments for Robustness in AI Alignment

Fabian Schimpf

19 Jan 2024 10:24 UTC

2 points

1 comment · 1 min read · LW link

[Question] What rationality failure modes are there?

Ulisse Mini

19 Jan 2024 9:12 UTC

42 points

11 comments · 1 min read · LW link

[Question] What’s up with online media and our ability to get sh*t done?

TeaTieAndHat

19 Jan 2024 9:12 UTC

2 points

0 comments · 6 min read · LW link

Logical Line-Of-Sight Makes Games Sequential or Loopy

StrivingForLegibility

19 Jan 2024 4:05 UTC

40 points

0 comments · 7 min read · LW link

[Question] Are there high-quality surveys available detailing the rates of polyamory among Americans age 18-45 in metropolitan areas in the United States?

Evan_Gaensbauer

18 Jan 2024 23:50 UTC

23 points

0 comments · 1 min read · LW link

Manifund: 2023 in Review

Austin Chen

18 Jan 2024 23:50 UTC

32 points

0 comments · 23 min read · LW link

(manifund.substack.com)

The Underreaction to OpenAI

Sherrinford

18 Jan 2024 22:08 UTC

21 points

0 comments · 6 min read · LW link

Against Nonlinear (Thing Of Things)

tailcalled

18 Jan 2024 21:40 UTC

58 points

18 comments · 1 min read · LW link

(thingofthings.substack.com)

Toward A Mathematical Framework for Computation in Superposition

Dmitry Vaintrob, jake_mendel and Kaarel

18 Jan 2024 21:06 UTC

214 points

19 comments · 63 min read · LW link

The True Story of How GPT-2 Became Maximally Lewd

Writer and Jai

18 Jan 2024 21:03 UTC

74 points

7 comments · 6 min read · LW link

(youtu.be)

Gaia Network: An Illustrated Primer

Rafael Kaufmann Nedal and Roman Leventov

18 Jan 2024 18:23 UTC

3 points

2 comments · 15 min read · LW link

On the abolition of man

Joe Carlsmith

18 Jan 2024 18:17 UTC

98 points

19 comments · 41 min read · LW link · 1 review

More Usable Recipes

jefftk

18 Jan 2024 17:40 UTC

14 points

1 comment · 1 min read · LW link

(www.jefftk.com)