Metaculus Launches 2023/2024 FluSight Challenge Supporting CDC, $5K in Prizes

ChristianWilliams 27 Sep 2023 21:35 UTC

5 points

0 comments 1 min read LW link

(www.metaculus.com)

Projects I would like to see (possibly at AI Safety Camp)

Linda Linsefors 27 Sep 2023 21:27 UTC

22 points

12 comments 4 min read LW link

Towards Better Milestones for Monitoring AI Capabilities

snewman 27 Sep 2023 21:18 UTC

11 points

0 comments 14 min read LW link

[Question] Is Bjorn Lomborg roughly right about climate change policy?

yhoiseth 27 Sep 2023 20:06 UTC

29 points

14 comments 2 min read LW link

(www.sciencedirect.com)

Commonsense Good, Creative Good

jefftk 27 Sep 2023 19:50 UTC

70 points

11 comments 3 min read LW link

(www.jefftk.com)

Petrov Day [Spoiler Warning]

lsusr 27 Sep 2023 19:20 UTC

6 points

5 comments 1 min read LW link

The Hidden Complexity of Wishes—The Animation

Writer 27 Sep 2023 17:59 UTC

33 points

0 comments 1 min read LW link

(youtu.be)

MMLU’s Moral Scenarios Benchmark Doesn’t Measure What You Think it Measures

corey morris 27 Sep 2023 17:54 UTC

18 points

3 comments 4 min read LW link

(medium.com)

[Question] What’s your standard for good work performance?

Chi Nguyen 27 Sep 2023 16:58 UTC

30 points

3 comments 1 min read LW link

The Role of Groups in the Progression of Human Understanding

Chris_Leong 27 Sep 2023 15:09 UTC

11 points

0 comments 2 min read LW link

The Great Disembedding

rogersbacon 27 Sep 2023 14:53 UTC

16 points

6 comments 16 min read LW link

(www.secretorum.life)

[Question] how do short-timeliners reason about the differences between brain and AI?

JavierCC 27 Sep 2023 8:13 UTC

2 points

11 comments 1 min read LW link

[Question] Is there a widely accepted metric for ‘genuineness’ in interpersonal communication?

M. Y. Zuo 27 Sep 2023 5:30 UTC

6 points

2 comments 1 min read LW link

Bariatric surgery seems like a no-brainer for most morbidly obese people

lc 27 Sep 2023 1:05 UTC

12 points

12 comments 3 min read LW link

Jacob on the Precipice

Richard_Ngo 26 Sep 2023 21:16 UTC

48 points

8 comments 11 min read LW link

(narrativeark.substack.com)

Text Posts from the Kids Group: 2022

jefftk 26 Sep 2023 20:40 UTC

33 points

2 comments 7 min read LW link

(www.jefftk.com)

GPT-4 for personal productivity: online distraction blocker

Sergii 26 Sep 2023 17:41 UTC

67 points

13 comments 2 min read LW link

(grgv.xyz)

ARENA 2.0 - Impact Report

CallumMcDougall 26 Sep 2023 17:13 UTC

35 points

5 comments 13 min read LW link

Mechanistic Interpretability Reading group

1stuserhere and woog

26 Sep 2023 16:26 UTC

15 points

0 comments 1 min read LW link

Announcing the CNN Interpretability Competition

scasper 26 Sep 2023 16:21 UTC

22 points

0 comments 4 min read LW link

Making AIs less likely to be spiteful

Nicolas Macé, Anthony DiGiovanni and JesseClifton

26 Sep 2023 14:12 UTC

118 points

7 comments 10 min read LW link

[Linkpost] Mark Zuckerberg confronted about Meta’s Llama 2 AI’s ability to give users detailed guidance on making anthrax—Business Insider

mic 26 Sep 2023 12:05 UTC

18 points

11 comments 2 min read LW link

(www.businessinsider.com)

Enforcing Far-Future Contracts for Governments

FCCC 26 Sep 2023 4:26 UTC

−7 points

49 comments 3 min read LW link

Carioca Petrov Day

Giskard 26 Sep 2023 0:30 UTC

1 point

0 comments 1 min read LW link

[Question] A few Alignment questions: utility optimizers, SLT, sharp left turn and identifiability

Igor Timofeev 26 Sep 2023 0:27 UTC

6 points

1 comment 2 min read LW link

Impact stories for model internals: an exercise for interpretability researchers

jenny 25 Sep 2023 23:15 UTC

29 points

3 comments 7 min read LW link

Autonomic Sanity

Sable 25 Sep 2023 22:37 UTC

20 points

9 comments 4 min read LW link

(affablyevil.substack.com)

[Question] What is wrong with this “utility switch button problem” approach?

Donald Hobson 25 Sep 2023 21:36 UTC

14 points

3 comments 1 min read LW link

You should just smile at strangers a lot

chaosmage 25 Sep 2023 20:12 UTC

18 points

10 comments 1 min read LW link

The King and the Golem

Richard_Ngo 25 Sep 2023 19:51 UTC

210 points

19 comments 5 min read LW link 1 review

(narrativeark.substack.com)

Public Opinion on AI Safety: AIMS 2023 and 2021 Summary

Jacy Reese Anthis, Janet Pauketat and Ali

25 Sep 2023 18:55 UTC

3 points

2 comments 3 min read LW link

(www.sentienceinstitute.org)

Welcome to Apply: The 2024 Vitalik Buterin Fellowships in AI Existential Safety by FLI!

Zhijing Jin 25 Sep 2023 18:42 UTC

5 points

2 comments 2 min read LW link

Evaluating hidden directions on the utility dataset: classification, steering and removal

Annah and shash42

25 Sep 2023 17:19 UTC

25 points

3 comments 7 min read LW link

Linkpost: A model of biases as arising from meta-beliefs

JuanGarcia 25 Sep 2023 17:14 UTC

5 points

0 comments 1 min read LW link

[Question] What causes a decision theory to be used?

Dagon 25 Sep 2023 16:33 UTC

8 points

2 comments 1 min read LW link

Understanding strategic deception and deceptive alignment

Marius Hobbhahn, Mikita Balesni, Jérémy Scheurer and Dan Braun

25 Sep 2023 16:27 UTC

64 points

16 comments 7 min read LW link

(www.apolloresearch.ai)

The Merits of Contrarianism & Why I hate Chatbots. [My Experience with the Ideological Turing Test @ a Less Wrong meetup]

Amina V. 25 Sep 2023 16:13 UTC

4 points

1 comment 1 min read LW link

(bimbollectual.com)

Inside Views, Impostor Syndrome, and the Great LARP

johnswentworth 25 Sep 2023 16:08 UTC

339 points

54 comments 5 min read LW link

“X distracts from Y” as a thinly-disguised fight over group status / politics

Steven Byrnes 25 Sep 2023 15:18 UTC

113 points

14 comments 8 min read LW link

Amazon to invest up to $4 billion in Anthropic

Davis_Kingsley 25 Sep 2023 14:55 UTC

44 points

8 comments 1 min read LW link

(twitter.com)

Should Effective Altruists be Valuists instead of utilitarians?

spencerg and AmberDawn

25 Sep 2023 14:03 UTC

1 point

3 comments 6 min read LW link

Feedly Breaks MathML

jefftk 25 Sep 2023 13:40 UTC

15 points

3 comments 1 min read LW link

(www.jefftk.com)

[Question] How have you become more hard-working?

Chi Nguyen 25 Sep 2023 12:37 UTC

84 points

42 comments 1 min read LW link

Automating Intelligence: A Cursory Glance at How AutoML Brings Precision to AI Development

RoscoHunter 25 Sep 2023 9:39 UTC

3 points

0 comments 3 min read LW link

Interpreting OpenAI’s Whisper

EllenaR 24 Sep 2023 17:53 UTC

116 points

13 comments 7 min read LW link

Contradiction Appeal Bias

onur 24 Sep 2023 17:03 UTC

3 points

2 comments 1 min read LW link

RAIN: Your Language Models Can Align Themselves without Finetuning—Microsoft Research 2023 - Reduces the adversarial prompt attack success rate from 94% to 19%!

Singularian2501 24 Sep 2023 16:48 UTC

5 points

0 comments 1 min read LW link

Honor System for Vaccination?

jefftk 24 Sep 2023 11:50 UTC

17 points

22 comments 1 min read LW link

(www.jefftk.com)

Far-Future Commitments as a Policy Consensus Strategy

FCCC 24 Sep 2023 6:34 UTC

7 points

40 comments 1 min read LW link

Five neglected work areas that could reduce AI risk

CharlotteS and Aaron_Scher

24 Sep 2023 2:03 UTC

17 points

5 comments 9 min read LW link