
OpenAI: GPT-based LLMs show ability to discriminate between its own wrong answers, but inability to explain how/why it makes that discrimination, even as model scales

Aditya Jain, 13 Jun 2022 23:33 UTC

14 points

5 comments, 1 min read, LW link

(openai.com)

[Question] Who said something like “The fact that putting 2 apples next to 2 other apples leads to there being 4 apples there has nothing to do with the fact that 2 + 2 = 4”?

hunterglenn, 13 Jun 2022 22:23 UTC

1 point

2 comments, 1 min read, LW link

Continuity Assumptions

Jan_Kulveit, 13 Jun 2022 21:31 UTC

53 points

13 comments, 4 min read, LW link

Crypto-fed Computation

aaguirre, 13 Jun 2022 21:20 UTC

24 points

7 comments, 7 min read, LW link

A Modest Pivotal Act

anonymousaisafety, 13 Jun 2022 19:24 UTC

−16 points

1 comment, 5 min read, LW link

Contra EY: Can AGI destroy us without trial & error?

nsokolsky, 13 Jun 2022 18:26 UTC

137 points

72 comments, 15 min read, LW link

What are some smaller-but-concrete challenges related to AI safety that are impacting people today?

nonzerosum, 13 Jun 2022 17:36 UTC

4 points

3 comments, 1 min read, LW link

[Link] New SEP article on Bayesian Epistemology

Aryeh Englander, 13 Jun 2022 15:03 UTC

6 points

0 comments, 1 min read, LW link

Training Trace Priors

Adam Jermyn, 13 Jun 2022 14:22 UTC

12 points

17 comments, 4 min read, LW link

[Question] Can you MRI a deep learning model?

Yair Halberstadt, 13 Jun 2022 13:43 UTC

3 points

3 comments, 1 min read, LW link

On A List of Lethalities

Zvi, 13 Jun 2022 12:30 UTC

165 points

50 comments, 54 min read, LW link, 1 review

(thezvi.wordpress.com)

D&D.Sci June 2022 Evaluation and Ruleset

abstractapplic, 13 Jun 2022 10:31 UTC

34 points

11 comments, 4 min read, LW link

[Question] What’s the “This AI is of moral concern.” fire alarm?

Quintin Pope, 13 Jun 2022 8:05 UTC

37 points

56 comments, 2 min read, LW link

The beautiful magical enchanted golden Dall-e Mini is underrated

p.b., 13 Jun 2022 7:58 UTC

14 points

0 comments, 1 min read, LW link

Why so little AI risk on rationalist-adjacent blogs?

Grant Demaree, 13 Jun 2022 6:31 UTC

46 points

23 comments, 8 min read, LW link

Code Quality and Rule Consequentialism

Adam Zerner, 13 Jun 2022 3:12 UTC

17 points

13 comments, 6 min read, LW link

Grokking “Semi-informative priors over AI timelines”

anson.ho, 12 Jun 2022 22:17 UTC

15 points

7 comments, 14 min read, LW link

[Question] How much does cybersecurity reduce AI risk?

Darmani, 12 Jun 2022 22:13 UTC

34 points

23 comments, 1 min read, LW link

[Question] How are compute assets distributed in the world?

Chris van Merwijk, 12 Jun 2022 22:13 UTC

30 points

7 comments, 1 min read, LW link

Intuitive Explanation of AIXI

Thomas Larsen, 12 Jun 2022 21:41 UTC

22 points

2 comments, 5 min read, LW link

Why all the fuss about recursive self-improvement?

So8res, 12 Jun 2022 20:53 UTC

166 points

63 comments, 7 min read, LW link, 1 review

Why the Kaldor-Hicks criterion can be non-transitive

Rupert, 12 Jun 2022 17:26 UTC

4 points

10 comments, 2 min read, LW link

[Question] How do you post links here?

skybrian, 12 Jun 2022 16:23 UTC

1 point

1 comment, 1 min read, LW link

[Question] Filter out tags from the front page?

jaspax, 12 Jun 2022 10:59 UTC

9 points

2 comments, 1 min read, LW link

How To: A Workshop (or anything)

Duncan Sabien (Inactive), 12 Jun 2022 8:00 UTC

53 points

13 comments, 38 min read, LW link, 1 review

A claim that Google’s LaMDA is sentient

Ben Livengood, 12 Jun 2022 4:18 UTC

31 points

133 comments, 1 min read, LW link

[Question] How much stupider than humans can AI be and still kill us all through sheer numbers and resource access?

Shmi, 12 Jun 2022 1:01 UTC

11 points

11 comments, 1 min read, LW link

ELK Proposal—Make the Reporter care about the Predictor’s beliefs

Adam Jermyn and Nicholas Schiefer, 11 Jun 2022 22:53 UTC

8 points

0 comments, 6 min read, LW link

[Question] Why has no person / group ever taken over the world?

Aryeh Englander, 11 Jun 2022 20:51 UTC

25 points

19 comments, 1 min read, LW link

[Question] Are there English-speaking meetups in Frankfurt/Munich/Zurich?

Grant Demaree, 11 Jun 2022 20:02 UTC

6 points

2 comments, 1 min read, LW link

Beauty and the Beast

Tomás B., 11 Jun 2022 18:59 UTC

57 points

10 comments, 6 min read, LW link

Poorly-Aimed Death Rays

Thane Ruthenis, 11 Jun 2022 18:29 UTC

49 points

5 comments, 4 min read, LW link

AGI Safety Communications Initiative

ines, 11 Jun 2022 17:34 UTC

7 points

0 comments, 1 min read, LW link

A gaming group for rationality-aware people

dhatas, 11 Jun 2022 16:04 UTC

7 points

0 comments, 1 min read, LW link

[Question] Why don’t you introduce really impressive people you personally know to AI alignment (more often)?

Verden, 11 Jun 2022 15:59 UTC

33 points

14 comments, 1 min read, LW link

Godzilla Strategies

johnswentworth, 11 Jun 2022 15:44 UTC

174 points

72 comments, 3 min read, LW link

Steganography and the CycleGAN—alignment failure case study

Jan Czechowski, 11 Jun 2022 9:41 UTC

34 points

0 comments, 4 min read, LW link

The Mountain Troll

lsusr, 11 Jun 2022 9:14 UTC

108 points

26 comments, 2 min read, LW link

Show LW: YodaTimer.com

Adam Zerner, 11 Jun 2022 8:52 UTC

27 points

4 comments, 1 min read, LW link

How fast can we perform a forward pass?

jsteinhardt, 10 Jun 2022 23:30 UTC

53 points

9 comments, 15 min read, LW link

(bounded-regret.ghost.io)

Summary of “AGI Ruin: A List of Lethalities”

Stephen McAleese, 10 Jun 2022 22:35 UTC

45 points

2 comments, 8 min read, LW link

How dangerous is human-level AI?

Alex_Altair, 10 Jun 2022 17:38 UTC

21 points

4 comments, 8 min read, LW link

Another plausible scenario of AI risk: AI builds military infrastructure while collaborating with humans, defects later.

avturchin, 10 Jun 2022 17:24 UTC

10 points

2 comments, 1 min read, LW link

Leaving Google, Joining the Nucleic Acid Observatory

jefftk, 10 Jun 2022 17:00 UTC

114 points

4 comments, 3 min read, LW link

(www.jefftk.com)

On The Spectrum, On The Guest List: (v) The Fleur Room

party girl, 10 Jun 2022 14:50 UTC

8 points

1 comment, 14 min read, LW link

(onthespectrumontheguestlist.substack.com)

Progress Report 6: get the tool working

Nathan Helm-Burger, 10 Jun 2022 11:18 UTC

4 points

0 comments, 2 min read, LW link

[Question] Is AI Alignment Impossible?

Heighn, 10 Jun 2022 10:08 UTC

3 points

3 comments, 1 min read, LW link

I No Longer Believe Intelligence to be “Magical”

DragonGod, 10 Jun 2022 8:58 UTC

28 points

34 comments, 6 min read, LW link

[linkpost] The final AI benchmark: BIG-bench

RomanS, 10 Jun 2022 8:53 UTC

25 points

21 comments, 1 min read, LW link

[Question] Could Patent-Trolling delay AI timelines?

Pablo Repetto, 10 Jun 2022 2:53 UTC

1 point

3 comments, 1 min read, LW link