16 Dec 2024 22:42 UTC

53 points

1 comment 2 min read LW link

(arxiv.org)

A practical guide to tiling the universe with hedonium

Vittu Perkele 16 Dec 2024 21:25 UTC

−8 points

1 comment 1 min read LW link

(perkeleperusing.substack.com)

AI Safety Seed Funding Network—Join as a Donor or Investor

Alexandra Bos 16 Dec 2024 19:30 UTC

30 points

0 comments 2 min read LW link

I read every major AI lab’s safety plan so you don’t have to

sarahhw 16 Dec 2024 18:51 UTC

20 points

0 comments 12 min read LW link

(longerramblings.substack.com)

Grokking revisited: reverse engineering grokking modulo addition in LSTM

Nikita Khomich and Danik

16 Dec 2024 18:48 UTC

4 points

0 comments 6 min read LW link

Progress links and short notes, 2024-12-16

jasoncrawford 16 Dec 2024 17:24 UTC

7 points

0 comments 2 min read LW link

(newsletter.rootsofprogress.org)

Effective Altruism FAQ

Bentham's Bulldog 16 Dec 2024 16:27 UTC

0 points

7 comments 12 min read LW link

Variably compressibly studies are fun

dkl9 16 Dec 2024 16:00 UTC

0 points

0 comments 2 min read LW link

(dkl9.net)

AIs Will Increasingly Attempt Shenanigans

Zvi 16 Dec 2024 15:20 UTC

119 points

2 comments 26 min read LW link

(thezvi.wordpress.com)

Testing which LLM architectures can do hidden serial reasoning

Filip Sondej 16 Dec 2024 13:48 UTC

86 points

9 comments 4 min read LW link

NeuroAI for AI safety: A Differential Path

nz and Patrick Mineault

16 Dec 2024 13:17 UTC

23 points

0 comments 7 min read LW link

(arxiv.org)

Circling as practice for “just be yourself”

Kaj_Sotala 16 Dec 2024 7:40 UTC

88 points

6 comments 4 min read LW link

(kajsotala.fi)

Reanalyzing the 2023 Expert Survey on Progress in AI

AI Impacts 16 Dec 2024 6:10 UTC

8 points

0 comments 1 min read LW link

(blog.aiimpacts.org)

Ideas for benchmarking LLM creativity

gwern 16 Dec 2024 5:18 UTC

60 points

11 comments 1 min read LW link

(gwern.net)

Comparing the AirFanta 3Pro to the Coway AP-1512

jefftk 16 Dec 2024 1:40 UTC

13 points

0 comments 1 min read LW link

(www.jefftk.com)

[Question] are IQ tests a good measure of intelligence?

KvmanThinking 15 Dec 2024 23:06 UTC

0 points

5 comments 1 min read LW link

Madison Secular Solstice

svfritz 15 Dec 2024 21:52 UTC

1 point

0 comments 1 min read LW link

[Question] Is AI alignment a purely functional property?

Roko 15 Dec 2024 21:42 UTC

13 points

8 comments 1 min read LW link

[Question] How counterfactual are logical counterfactuals?

Donald Hobson 15 Dec 2024 21:16 UTC

11 points

10 comments 1 min read LW link

Debunking the myth of safe AI

henophilia 15 Dec 2024 17:44 UTC

−11 points

8 comments 1 min read LW link

(henophilia.substack.com)

Introducing Avatarism: A Rational Framework for Building actual Heaven

ratiba ro 15 Dec 2024 17:17 UTC

2 points

2 comments 2 min read LW link

A Public Choice Take on Effective Altruism

vaishnav92 15 Dec 2024 16:58 UTC

10 points

4 comments 3 min read LW link

(www.optimaloutliers.com)

World Models I’m Currently Building

temporary 15 Dec 2024 16:29 UTC

5 points

1 comment 1 min read LW link

(samuelshadrach.com)

Dress Up For Secular Solstice

Gordon H.S. 15 Dec 2024 16:28 UTC

33 points

13 comments 7 min read LW link

Remap your caps lock key

bilalchughtai 15 Dec 2024 14:03 UTC

82 points

21 comments 1 min read LW link

Effective Evil’s AI Misalignment Plan

lsusr 15 Dec 2024 7:39 UTC

83 points

9 comments 3 min read LW link

How to Edit an Essay into a Solstice Speech?

Czynski 15 Dec 2024 4:30 UTC

5 points

1 comment 1 min read LW link

(thepdv.wordpress.com)

How Your Physiology Affects the Mind’s Projection Fallacy

YanLyutnev 14 Dec 2024 21:10 UTC

−1 points

0 comments 6 min read LW link

Introducing the Evidence Color Wheel

Larry Lee 14 Dec 2024 16:08 UTC

6 points

0 comments 3 min read LW link

An Illustrated Summary of “Robust Agents Learn Causal World Model”

Dalcy 14 Dec 2024 15:02 UTC

75 points

2 comments 10 min read LW link

Best-of-N Jailbreaking

John Hughes, saraprice, Aengus Lynch, Rylan Schaeffer, fbarez, Henry Sleight, Ethan Perez and mrinank_sharma

14 Dec 2024 4:58 UTC

79 points

5 comments 2 min read LW link

(arxiv.org)

D&D.Sci Dungeonbuilding: the Dungeon Tournament

aphyer 14 Dec 2024 4:30 UTC

50 points

16 comments 3 min read LW link

Creating Interpretable Latent Spaces with Gradient Routing

Jacob G-W 14 Dec 2024 4:00 UTC

26 points

6 comments 2 min read LW link

(jacobgw.com)

Probability of death by suicide by a 26 year old

John Wiseman 14 Dec 2024 3:33 UTC

−25 points

4 comments 1 min read LW link

Matryoshka Sparse Autoencoders

Noa Nabeshima 14 Dec 2024 2:52 UTC

100 points

15 comments 11 min read LW link

[Question] What is MIRI currently doing?

Roko 14 Dec 2024 2:39 UTC

33 points

14 comments 1 min read LW link

The o1 System Card Is Not About o1

Zvi 13 Dec 2024 20:30 UTC

116 points

5 comments 16 min read LW link

(thezvi.wordpress.com)

Arch-anarchy and The Fable of the Dragon-Tyrant

Peter lawless 13 Dec 2024 20:15 UTC

−10 points

0 comments 1 min read LW link

Communications in Hard Mode (My new job at MIRI)

tanagrabeast 13 Dec 2024 20:13 UTC

211 points

25 comments 5 min read LW link

How to Build Heaven: A Constrained Boltzmann Brain Generator

High Tides 13 Dec 2024 1:04 UTC

−8 points

3 comments 5 min read LW link

Representing Irrationality in Game Theory

Larry Lee 13 Dec 2024 0:50 UTC

−1 points

3 comments 11 min read LW link

“Charity” as a conflationary alliance term

Jan_Kulveit 12 Dec 2024 21:49 UTC

35 points

2 comments 5 min read LW link

The Dangers of Mirrored Life

Niko_McCarty and fin

12 Dec 2024 20:58 UTC

121 points

9 comments 29 min read LW link

(www.asimov.press)

Effective Networking as Sending Hard to Fake Signals

vaishnav92 12 Dec 2024 20:32 UTC

27 points

2 comments 7 min read LW link

(www.optimaloutliers.com)

Mini PAPR Review

jefftk 12 Dec 2024 19:10 UTC

10 points

0 comments 2 min read LW link

(www.jefftk.com)

Biological risk from the mirror world

jasoncrawford 12 Dec 2024 19:07 UTC

338 points

40 comments 7 min read LW link 1 review

(newsletter.rootsofprogress.org)

Naturalistic dualism

Arturo Macias 12 Dec 2024 16:19 UTC

−4 points

0 comments 4 min read LW link

AI #94: Not Now, Google

Zvi 12 Dec 2024 15:40 UTC

49 points

3 comments 64 min read LW link

(thezvi.wordpress.com)

Consciousness, Intelligence, and AI – Some Quick Notes [call it a mini-ramble]

Bill Benzon 12 Dec 2024 15:04 UTC

−3 points

0 comments 4 min read LW link

The Dissolution of AI Safety

Roko 12 Dec 2024 10:34 UTC

8 points

44 comments 1 min read LW link

(www.transhumanaxiology.com)