An A.I. Safety Presentation at RIT

Nicholas Kross 27 Mar 2023 23:49 UTC

8 points

0 comments 1 min read LW link

(www.youtube.com)

Which AI outputs should humans check for shenanigans, to avoid AI takeover? A simple model

Tom Davidson 27 Mar 2023 23:36 UTC

16 points

3 comments 8 min read LW link

The Prospect of an AI Winter

Erich_Grunewald 27 Mar 2023 20:55 UTC

62 points

24 comments 15 min read LW link

(www.erichgrunewald.com)

[Question] Best arguments against the outside view that AGI won’t be a huge deal, thus we survive.

Noosphere89 27 Mar 2023 20:49 UTC

4 points

7 comments 1 min read LW link

EA & LW Forum Weekly Summary (20th − 26th March 2023)

Zoe Williams 27 Mar 2023 20:46 UTC

4 points

0 comments 6 min read LW link

Three of my beliefs about upcoming AGI

Robert_AIZI 27 Mar 2023 20:27 UTC

6 points

0 comments 3 min read LW link

(aizi.substack.com)

Nobody knows how to reliably test for AI safety

marcusarvan 27 Mar 2023 19:48 UTC

1 point

0 comments 5 min read LW link

New blog: Planned Obsolescence

Ajeya Cotra 27 Mar 2023 19:46 UTC

96 points

7 comments 1 min read LW link

(www.planned-obsolescence.org)

South Bay ACX/SSC Spring Meetups Everywhere

allisona 27 Mar 2023 19:39 UTC

2 points

0 comments 1 min read LW link

[Question] Resources to see how people think/approach mathematics and problem-solving

zef 27 Mar 2023 19:12 UTC

7 points

2 comments 1 min read LW link

Staggering Hunters

Screwtape 27 Mar 2023 19:11 UTC

12 points

2 comments 5 min read LW link

Neurotechnology is Critical for AI Alignment

Milan Cvitkovic 27 Mar 2023 18:27 UTC

10 points

3 comments 1 min read LW link

(milan.cvitkovic.net)

[Question] Best resources to learn philosophy of mind and AI?

Sky Moo 27 Mar 2023 18:22 UTC

1 point

0 comments 1 min read LW link

the tensor is a lonely place

jml6 27 Mar 2023 18:22 UTC

−11 points

0 comments 4 min read LW link

(ekjsgrjelrbno.substack.com)

[Question] Bermudez Interface Problem

Motor Vehicle 27 Mar 2023 18:11 UTC

1 point

2 comments 1 min read LW link

Would you be a better RLHF labeler than GPT-4?

kache 27 Mar 2023 18:10 UTC

1 point

1 comment 1 min read LW link

LLM Powered LW Search

odraode17 27 Mar 2023 18:09 UTC

−1 points

0 comments 1 min read LW link

Announcing the Swiss Existential Risk Initiative (CHERI) 2023 Research Fellowship

Tobias H 27 Mar 2023 16:36 UTC

3 points

0 comments 2 min read LW link

Industrialization/Computerization Analogies

Gordon Seidoh Worley 27 Mar 2023 16:34 UTC

16 points

2 comments 2 min read LW link

Lessons from Convergent Evolution for AI Alignment

Jan_Kulveit and rosehadshar

27 Mar 2023 16:25 UTC

54 points

9 comments 8 min read LW link

GPT-4 is bad at strategic thinking

Christopher King 27 Mar 2023 15:11 UTC

22 points

8 comments 1 min read LW link

The salt in pasta water fallacy

Thomas Sepulchre 27 Mar 2023 14:53 UTC

244 points

52 comments 3 min read LW link 2 reviews

CAIS-inspired approach towards safer and more interpretable AGIs

Peter Hroššo 27 Mar 2023 14:36 UTC

13 points

7 comments 1 min read LW link

An Overview of Sparks of Artificial General Intelligence: Early experiments with GPT-4

Annapurna 27 Mar 2023 13:44 UTC

10 points

0 comments 7 min read LW link

(jorgevelez.substack.com)

A Hivemind of GPT-4 bots REALLY IS A HIVEMIND!

Erlja Jkdf. 27 Mar 2023 12:44 UTC

−10 points

1 comment 1 min read LW link

Duploish Marble Runs

jefftk 27 Mar 2023 12:20 UTC

26 points

1 comment 1 min read LW link

(www.jefftk.com)

GPT-4 Plugs In

Zvi 27 Mar 2023 12:10 UTC

198 points

47 comments 6 min read LW link

(thezvi.wordpress.com)

Please help me sense-check my assumptions about the needs of the AI Safety community and related career plans

peterslattery 27 Mar 2023 8:23 UTC

6 points

4 comments 2 min read LW link

Practical Pitfalls of Causal Scrubbing

Jérémy Scheurer, Phil3, tony, jacquesthibs and David Lindner

27 Mar 2023 7:47 UTC

87 points

17 comments 13 min read LW link

[Question] What If: An Earthquake in Taiwan?

Sable 27 Mar 2023 7:31 UTC

8 points

2 comments 1 min read LW link

What can we learn from Lex Fridman’s interview with Sam Altman?

Karl von Wendt 27 Mar 2023 6:27 UTC

56 points

22 comments 9 min read LW link

[Question] Steelmanning OpenAI’s Short-Timelines Slow-Takeoff Goal

FinalFormal2 27 Mar 2023 2:55 UTC

5 points

2 comments 1 min read LW link

The default outcome for aligned AGI still looks pretty bad

GeneSmith 27 Mar 2023 0:02 UTC

14 points

19 comments 3 min read LW link

LLM Modularity: The Separability of Capabilities in Large Language Models

NickyP 26 Mar 2023 21:57 UTC

99 points

3 comments 41 min read LW link

Testing ChatGPT for white lies

twkaiser 26 Mar 2023 21:32 UTC

3 points

2 comments 6 min read LW link

Don’t take bad options away from people

Dumbledore's Army 26 Mar 2023 20:12 UTC

42 points

100 comments 5 min read LW link

What would a compute monitoring plan look like? [Linkpost]

Orpheus16 26 Mar 2023 19:33 UTC

158 points

10 comments 4 min read LW link

(arxiv.org)

[Question] GPT-4 Specs: 1 Trillion Parameters?

infinibot27 26 Mar 2023 18:56 UTC

6 points

8 comments 1 min read LW link

If it quacks like a duck...

RationalMindset 26 Mar 2023 18:54 UTC

−4 points

0 comments 4 min read LW link

Chronostasis: The Time-Capsule Conundrum of Language Models

RationalMindset 26 Mar 2023 18:54 UTC

−5 points

0 comments 1 min read LW link

[Question] What happens with logical induction when...

Donald Hobson 26 Mar 2023 18:31 UTC

18 points

2 comments 1 min read LW link

Draft: Introduction to optimization

Alex_Altair 26 Mar 2023 17:25 UTC

43 points

8 comments 16 min read LW link

Chat bot as CEO at NetDragon Websoft

ChristianKl 26 Mar 2023 16:01 UTC

8 points

2 comments 1 min read LW link

(www.firstpost.com)

Datapoint: median 10% AI x-risk mentioned on Dutch public TV channel

Chris van Merwijk 26 Mar 2023 12:50 UTC

17 points

1 comment 1 min read LW link

[Question] How Politics interacts with AI ?

qbolec 26 Mar 2023 9:53 UTC

−11 points

4 comments 1 min read LW link

Descriptive vs. specifiable values

TsviBT 26 Mar 2023 9:10 UTC

17 points

2 comments 2 min read LW link

The alignment stability problem

Seth Herd 26 Mar 2023 2:10 UTC

35 points

15 comments 4 min read LW link

Survey on lifeloggers for a research project

Mati_Roy 26 Mar 2023 0:02 UTC

20 points

0 comments 1 min read LW link

Manifold: If okay AGI, why?

Eliezer Yudkowsky 25 Mar 2023 22:43 UTC

121 points

37 comments 1 min read LW link

(manifold.markets)

A stylized dialogue on John Wentworth’s claims about markets and optimization

So8res 25 Mar 2023 22:32 UTC

169 points

22 comments 8 min read LW link