Adversarial training, importance sampling, and anti-adversarial training for AI whistleblowing

Buck · Jun 2, 2022, 11:48 PM
42 points
0 comments · 3 min read · LW link

Tao, Kontsevich & others on HLAI in Math

interstice · Jun 10, 2022, 2:25 AM
41 points
5 comments · 2 min read · LW link
(www.youtube.com)

Linkpost: Robin Hanson—Why Not Wait On AI Risk?

Yair Halberstadt · Jun 24, 2022, 2:23 PM
41 points
14 comments · 1 min read · LW link
(www.overcomingbias.com)

Blake Richards on Why he is Skeptical of Existential Risk from AI

Michaël Trazzi · Jun 14, 2022, 7:09 PM
41 points
12 comments · 4 min read · LW link
(theinsideview.ai)

Georgism, in theory

Stuart_Armstrong · Jun 15, 2022, 3:20 PM
40 points
22 comments · 4 min read · LW link

Key Papers in Language Model Safety

aog · Jun 20, 2022, 3:00 PM
40 points
1 comment · 22 min read · LW link

D&D.Sci June 2022: A Goddess Tried To Reincarnate Me Into A Fantasy World, But I Insisted On Using Data Science To Select An Optimal Combination Of Cheat Skills!

abstractapplic · Jun 4, 2022, 1:28 AM
40 points
22 comments · 3 min read · LW link

A Litany Missing from the Canon

benwr · Jun 17, 2022, 1:39 AM
39 points
3 comments · 1 min read · LW link
(www.benwr.net)

Four reasons I find AI safety emotionally compelling

Jun 28, 2022, 2:10 PM
39 points
3 comments · 4 min read · LW link

Another Calming Example

jefftk · Jun 3, 2022, 2:20 AM
39 points
13 comments · 2 min read · LW link
(www.jefftk.com)

The table of different sampling assumptions in anthropics

avturchin · Jun 29, 2022, 10:41 AM
39 points
5 comments · 12 min read · LW link

[Yann Lecun] A Path Towards Autonomous Machine Intelligence

DragonGod · Jun 27, 2022, 7:24 PM
38 points
14 comments · 1 min read · LW link
(openreview.net)

Grokking “Forecasting TAI with biological anchors”

anson.ho · Jun 6, 2022, 6:58 PM
38 points
0 comments · 14 min read · LW link

Beauty and the Beast

Tomás B. · Jun 11, 2022, 6:59 PM
38 points
8 comments · 6 min read · LW link

Gradient hacking: definitions and examples

Richard_Ngo · Jun 29, 2022, 9:35 PM
38 points
2 comments · 5 min read · LW link

Vael Gates: Risks from Advanced AI (June 2022)

Vael Gates · Jun 14, 2022, 12:54 AM
38 points
2 comments · 30 min read · LW link

[Question] What’s the “This AI is of moral concern.” fire alarm?

Quintin Pope · Jun 13, 2022, 8:05 AM
37 points
56 comments · 2 min read · LW link

Quick Look: Asymptomatic Herpes Shedding

Elizabeth · Jun 4, 2022, 9:40 PM
37 points
4 comments · 2 min read · LW link
(acesounderglass.com)

Scott Aaronson and Steven Pinker Debate AI Scaling

Liron · Jun 28, 2022, 4:04 PM
37 points
7 comments · 1 min read · LW link
(scottaaronson.blog)

Why agents are powerful

Daniel Kokotajlo · Jun 6, 2022, 1:37 AM
37 points
7 comments · 7 min read · LW link

Announcing the Clearer Thinking Regrants program

spencerg · Jun 17, 2022, 1:14 PM
36 points
1 comment · 1 min read · LW link

[Link] Adversarially trained neural representations may already be as robust as corresponding biological neural representations

Gunnar_Zarncke · Jun 24, 2022, 8:51 PM
35 points
9 comments · 1 min read · LW link

Optimization and Adequacy in Five Bullets

james.lucassen · Jun 6, 2022, 5:48 AM
35 points
2 comments · 4 min read · LW link
(jlucassen.com)

Alignment Risk Doesn’t Require Superintelligence

JustisMills · Jun 15, 2022, 3:12 AM
35 points
4 comments · 2 min read · LW link

D&D.Sci June 2022 Evaluation and Ruleset

abstractapplic · Jun 13, 2022, 10:31 AM
34 points
11 comments · 4 min read · LW link

Steganography and the CycleGAN—alignment failure case study

Jan Czechowski · Jun 11, 2022, 9:41 AM
34 points
0 comments · 4 min read · LW link

[Question] Are long-form dating profiles productive?

AABoyles · Jun 27, 2022, 5:03 PM
34 points
32 comments · 1 min read · LW link

[Question] How much does cybersecurity reduce AI risk?

Darmani · Jun 12, 2022, 10:13 PM
34 points
23 comments · 1 min read · LW link

[Question] Why don’t you introduce really impressive people you personally know to AI alignment (more often)?

Verden · Jun 11, 2022, 3:59 PM
33 points
14 comments · 1 min read · LW link

To what extent have ideas and scientific discoveries gotten harder to find?

lsusr · Jun 18, 2022, 7:15 AM
33 points
10 comments · 6 min read · LW link

Reflection Mechanisms as an Alignment target: A survey

Jun 22, 2022, 3:05 PM
32 points
1 comment · 14 min read · LW link

Google’s new text-to-image model—Parti, a demonstration of scaling benefits

Kayden · Jun 22, 2022, 8:00 PM
32 points
4 comments · 1 min read · LW link

A claim that Google’s LaMDA is sentient

Ben Livengood · Jun 12, 2022, 4:18 AM
31 points
133 comments · 1 min read · LW link

[Question] How are compute assets distributed in the world?

Chris van Merwijk · Jun 12, 2022, 10:13 PM
30 points
7 comments · 1 min read · LW link

[Question] Why don’t we think we’re in the simplest universe with intelligent life?

ADifferentAnonymous · Jun 18, 2022, 3:05 AM
30 points
33 comments · 1 min read · LW link

Assessing AlephAlphas Multimodal Model

p.b. · Jun 28, 2022, 9:28 AM
30 points
5 comments · 3 min read · LW link

Common but neglected risk factors that may let you get Paxlovid

DirectedEvolution · Jun 21, 2022, 7:34 AM
29 points
8 comments · 4 min read · LW link

Covid 6/16/22: Do Not Hand it to Them

Zvi · Jun 16, 2022, 2:40 PM
29 points
5 comments · 7 min read · LW link
(thezvi.wordpress.com)

Entitlement as a major amplifier of unhappiness

VipulNaik · Jun 8, 2022, 10:08 PM
29 points
6 comments · 7 min read · LW link

Forecasting Fusion Power

Daniel Kokotajlo · Jun 18, 2022, 12:04 AM
29 points
8 comments · 1 min read · LW link
(astralcodexten.substack.com)

Juneberry Cake

jefftk · Jun 19, 2022, 1:40 AM
29 points
0 comments · 1 min read · LW link
(www.jefftk.com)

A Butterfly’s View of Probability

Gabriel Wu · Jun 15, 2022, 2:14 AM
29 points
17 comments · 11 min read · LW link

Why it’s bad to kill Grandma

dynomight · Jun 9, 2022, 6:12 PM
29 points
14 comments · 8 min read · LW link
(dynomight.substack.com)

Was the Industrial Revolution The Industrial Revolution?

Davis Kedrosky · Jun 14, 2022, 2:48 PM
29 points
0 comments · 12 min read · LW link
(daviskedrosky.substack.com)

Wielding civilization

dominicq · Jun 1, 2022, 7:11 AM
29 points
2 comments · 2 min read · LW link

[Link-post] On Deference and Yudkowsky’s AI Risk Estimates

bmg · Jun 19, 2022, 5:25 PM
29 points
8 comments · 1 min read · LW link

Investigating causal understanding in LLMs

Jun 14, 2022, 1:57 PM
28 points
6 comments · 13 min read · LW link

[Question] Is CIRL a promising agenda?

Chris_Leong · Jun 23, 2022, 5:12 PM
28 points
16 comments · 1 min read · LW link

Intelligence in Commitment Races

David Udell · Jun 24, 2022, 2:30 PM
28 points
8 comments · 5 min read · LW link

Limits of Bodily Autonomy

jefftk · Jun 27, 2022, 7:50 PM
28 points
18 comments · 1 min read · LW link
(www.jefftk.com)