Just because an LLM said it doesn’t mean it’s true: an illustrative example

dirk

21 Aug 2024 21:05 UTC

26 points

12 comments · 3 min read · LW link

[Question] How do you finish your tasks faster?

Cipolla

21 Aug 2024 20:01 UTC

4 points

2 comments · 1 min read · LW link

AI Safety Newsletter #40: California AI Legislation Plus, NVIDIA Delays Chip Production, and Do AI Safety Benchmarks Actually Measure Safety?

Corin Katzke, Julius, Alexa Pan and Dan H

21 Aug 2024 18:09 UTC

11 points

0 comments · 6 min read · LW link

(newsletter.safe.ai)

[Question] Should LW suggest standard metaprompts?

Dagon

21 Aug 2024 16:41 UTC

4 points

6 comments · 1 min read · LW link

Eternal Existence and Eternal Boredom: The Case for AI and Immortal Humans

Tuan Tu Nguyen

21 Aug 2024 9:58 UTC

−12 points

2 comments · 5 min read · LW link

Please do not use AI to write for you

Richard_Kennaway

21 Aug 2024 9:53 UTC

67 points

35 comments · 4 min read · LW link

Apply to Aether—Independent LLM Agent Safety Research Group

RohanS

21 Aug 2024 9:47 UTC

13 points

0 comments · 7 min read · LW link

(forum.effectivealtruism.org)

the Giga Press was a mistake

bhauth

21 Aug 2024 4:51 UTC

103 points

26 comments · 5 min read · LW link

(bhauth.com)

Exploring the Boundaries of Cognitohazards and the Nature of Reality

Victor Novikov

21 Aug 2024 3:42 UTC

−2 points

2 comments · 1 min read · LW link

[Question] What is the point of 2v2 debates?

Axel Ahlqvist

20 Aug 2024 21:59 UTC

2 points

1 comment · 1 min read · LW link

[Question] Where should I look for information on gut health?

FinalFormal2

20 Aug 2024 19:44 UTC

10 points

10 comments · 1 min read · LW link

Would you benefit from, or object to, a page with LW users’ reacts?

Raemon

20 Aug 2024 16:35 UTC

23 points

6 comments · 1 min read · LW link

Freedom of Speech

Zero Contradictions

20 Aug 2024 16:34 UTC

−13 points

2 comments · 2 min read · LW link

(thewaywardaxolotl.blogspot.com)

AGI Safety and Alignment at Google DeepMind: A Summary of Recent Work

Rohin Shah, Seb Farquhar and Anca Dragan

20 Aug 2024 16:22 UTC

217 points

33 comments · 9 min read · LW link

Trying to be rational for the wrong reasons

Viliam

20 Aug 2024 16:18 UTC

26 points

9 comments · 3 min read · LW link

[Question] How great is the utility of “saving” endangered languages?

SpectrumDT

20 Aug 2024 13:14 UTC

18 points

29 comments · 1 min read · LW link

Guide to SB 1047

Zvi

20 Aug 2024 13:10 UTC

71 points

18 comments · 53 min read · LW link

(thezvi.wordpress.com)

Finding Deception in Language Models

Esben Kran and Archana Vaidheeswaran

20 Aug 2024 9:42 UTC

20 points

4 comments · 4 min read · LW link

Next automated reasoning grand challenge: CompCert

sanxiyn

20 Aug 2024 5:27 UTC

−5 points

0 comments · 1 min read · LW link

Thiel on AI & Racing with China

Ben Pace

20 Aug 2024 3:19 UTC

55 points

10 comments · 12 min read · LW link

Reflecting on the transhumanist rebuttal to AI existential risk and critique of our debate methodologies and misuse of statistics

catgirlsruletheworld

20 Aug 2024 1:59 UTC

−5 points

0 comments · 4 min read · LW link

Artificial Intelligence and Eternal Torture and Suffering

Tuan Tu Nguyen

20 Aug 2024 1:53 UTC

0 points

0 comments · 4 min read · LW link

AI #77: A Few Upgrades

Zvi

20 Aug 2024 0:20 UTC

23 points

3 comments · 52 min read · LW link

(thezvi.wordpress.com)

Monthly Roundup #21: August 2024

Zvi

20 Aug 2024 0:20 UTC

22 points

6 comments · 40 min read · LW link

(thezvi.wordpress.com)

[Linkpost] Automated Design of Agentic Systems

Bogdan Ionut Cirstea

19 Aug 2024 23:06 UTC

8 points

1 comment · 1 min read · LW link

(arxiv.org)

Limitations on Formal Verification for AI Safety

Andrew Dickson

19 Aug 2024 23:03 UTC

135 points

60 comments · 23 min read · LW link

The Conscious River: Conscious Turing machines negate materialism

blallo

19 Aug 2024 21:54 UTC

0 points

4 comments · 7 min read · LW link

LLM Applications I Want To See

sarahconstantin

19 Aug 2024 21:10 UTC

102 points

6 comments · 8 min read · LW link

(sarahconstantin.substack.com)

Defining alignment research

Richard_Ngo

19 Aug 2024 20:42 UTC

131 points

26 comments · 7 min read · LW link · 2 reviews

Vilnius – ACX Meetups Everywhere Fall 2024

NoUsernameSelected and Mnephisto

19 Aug 2024 17:38 UTC

3 points

1 comment · 1 min read · LW link

Can Current LLMs be Trusted To Produce Paperclips Safely?

Rohit Chatterjee

19 Aug 2024 17:17 UTC

4 points

0 comments · 9 min read · LW link

A primer on why computational predictive toxicology is hard

Abhishaike Mahajan

19 Aug 2024 17:16 UTC

63 points

2 comments · 12 min read · LW link

(www.owlposting.com)

Introduction and Exploration of AI Ethics Through a Global Lens

ThePathYouWillChoose

19 Aug 2024 17:11 UTC

1 point

0 comments · 1 min read · LW link

Trustworthy and untrustworthy models

Olli Järviniemi

19 Aug 2024 16:27 UTC

47 points

3 comments · 8 min read · LW link

Apartment Price Map Discontinuity

jefftk

19 Aug 2024 15:30 UTC

12 points

0 comments · 1 min read · LW link

(www.jefftk.com)

Will we ever run out of new jobs?

Kevin Kohler

19 Aug 2024 15:04 UTC

17 points

7 comments · 7 min read · LW link

(machinocene.substack.com)

[Question] What are the best resources for building gears-level models of how governments actually work?

adamShimi

19 Aug 2024 14:05 UTC

20 points

6 comments · 1 min read · LW link

[Cross-post] Book Review: Bureaucracy, by James Q Wilson

davekasten

19 Aug 2024 13:57 UTC

18 points

0 comments · 7 min read · LW link

[Question] If AI is in a bubble and the bubble bursts, what would you do?

Remmelt

19 Aug 2024 10:56 UTC

12 points

18 comments · 1 min read · LW link

Thinking About Propensity Evaluations

Maxime Riché, Harrison G, JaimeRV and Edoardo Pona

19 Aug 2024 9:23 UTC

12 points

0 comments · 27 min read · LW link

A Taxonomy Of AI System Evaluations

Maxime Riché, JaimeRV, Harrison G and Edoardo Pona

19 Aug 2024 9:07 UTC

13 points

0 comments · 14 min read · LW link

Beware the science fiction bias in predictions of the future

nsokolsky

19 Aug 2024 5:32 UTC

25 points

20 comments · 4 min read · LW link

(nsokolsky.substack.com)

Interdictor Ship

lsusr

19 Aug 2024 4:59 UTC

63 points

10 comments · 7 min read · LW link

Why you should be using a retinoid

GeneSmith

19 Aug 2024 3:07 UTC

109 points

60 comments · 5 min read · LW link

Liability regimes for AI

Ege Erdil

19 Aug 2024 1:25 UTC

153 points

34 comments · 5 min read · LW link

Something Is Lost When AI Makes Art

utilistrutil

18 Aug 2024 22:53 UTC

18 points

1 comment · 11 min read · LW link

Scaling Laws and Likely Limits to AI

Davidmanheim

18 Aug 2024 17:19 UTC

19 points

0 comments · 3 min read · LW link

What is “True Love”?

johnswentworth

18 Aug 2024 16:05 UTC

76 points

11 comments · 1 min read · LW link

Quick look: applications of chaos theory

Elizabeth and Alex_Altair

18 Aug 2024 15:00 UTC

83 points

51 comments · 8 min read · LW link

(acesounderglass.com)

Restructuring Pop Songs for Contra

jefftk

18 Aug 2024 14:10 UTC

11 points

0 comments · 2 min read · LW link

(www.jefftk.com)