1 Jul 2024 21:35 UTC

75 points

12 comments · 9 min read · LW link

Honest science is spirituality

pchvykov

1 Jul 2024 20:33 UTC

−1 points

10 comments · 4 min read · LW link

New Executive Team & Board — PIBBSS

Nora_Ammann

1 Jul 2024 19:30 UTC

43 points

1 comment · 1 min read · LW link

Uncursing Civilization

Lorec

1 Jul 2024 18:44 UTC

0 points

3 comments · 5 min read · LW link

[Question] Self-censoring on AI x-risk discussions?

Decaeneus

1 Jul 2024 18:24 UTC

17 points

2 comments · 1 min read · LW link

Rationalists As People Who Build Piles Of Rocks

Sable

1 Jul 2024 10:32 UTC

11 points

0 comments · 5 min read · LW link

(affablyevil.substack.com)

How good are LLMs at doing ML on an unknown dataset?

Håvard Tveit Ihle

1 Jul 2024 9:04 UTC

33 points

4 comments · 13 min read · LW link

Whirlwind Tour of Chain of Thought Literature Relevant to Automating Alignment Research.

sevdeawesome

1 Jul 2024 5:50 UTC

25 points

0 comments · 17 min read · LW link

Probabilistic Logic ⇔ Oracles?

yudhister

1 Jul 2024 5:36 UTC

23 points

0 comments · 4 min read · LW link

Important open problems in voting

Closed Limelike Curves

1 Jul 2024 2:53 UTC

33 points

1 comment · 1 min read · LW link

In Defense of Lawyers Playing Their Part

Isaac King

1 Jul 2024 1:32 UTC

32 points

9 comments · 9 min read · LW link

Review of METR’s public evaluation protocol

nahoj and JaimeRV

30 Jun 2024 22:03 UTC

10 points

0 comments · 5 min read · LW link

Superposition, Self-Modeling, and the Path to AGI: A New Perspective

Peterpiper

30 Jun 2024 17:20 UTC

−13 points

0 comments · 2 min read · LW link

The Xerox Parc/ARPA version of the intellectual Turing test: Class 1 vs Class 2 disagreement

hamishtodd1

30 Jun 2024 15:34 UTC

6 points

3 comments · 1 min read · LW link

LLMs Universally Learn a Feature Representing Token Frequency / Rarity

Sean Osier

30 Jun 2024 2:48 UTC

13 points

5 comments · 6 min read · LW link

(github.com)

My 5-step program for losing weight

nsokolsky

30 Jun 2024 1:05 UTC

22 points

20 comments · 5 min read · LW link

(nsokolsky.substack.com)

Datasets that change the odds you exist

dynomight

29 Jun 2024 18:45 UTC

56 points

4 comments · 6 min read · LW link

(dynomight.net)

A “Scaling Monosemanticity” Explainer

latterframe and TheoR

29 Jun 2024 17:50 UTC

10 points

0 comments · 3 min read · LW link

Analysis of key AI analogies

Kevin Kohler

29 Jun 2024 10:55 UTC

10 points

2 comments · 15 min read · LW link

Georgism Crash Course

Zero Contradictions

29 Jun 2024 6:18 UTC

9 points

5 comments · 1 min read · LW link

(zerocontradictions.net)

Activation Pattern SVD: A proposal for SAE Interpretability

Daniel Tan

28 Jun 2024 22:12 UTC

15 points

2 comments · 2 min read · LW link

Podcast: Elizabeth & Austin on “What Manifold was allowed to do”

Austin Chen

28 Jun 2024 22:10 UTC

20 points

0 comments · 24 min read · LW link

(share.descript.com)

The Incredible Fentanyl-Detecting Machine

sarahconstantin

28 Jun 2024 22:10 UTC

158 points

26 comments · 7 min read · LW link

(sarahconstantin.substack.com)

Saving Lives Reduces Over-Population—A Counter-Intuitive Non-Zero-Sum Game

James Stephen Brown

28 Jun 2024 19:29 UTC

6 points

0 comments · 5 min read · LW link

(nonzerosum.games)

Mentorship in AGI Safety: Applications for mentorship are open!

Valentin2026 and Joe Rogero

28 Jun 2024 14:49 UTC

5 points

0 comments · 1 min read · LW link

Contra Acemoglu on AI

Maxwell Tabarrok

28 Jun 2024 13:13 UTC

48 points

0 comments · 5 min read · LW link

(www.maximum-progress.com)

Five toy worlds to think about heritability

David Hugh-Jones

28 Jun 2024 13:11 UTC

13 points

0 comments · 9 min read · LW link

(wyclif.substack.com)

[Question] How do natural sciences prove causation?

Kongo Landwalker

28 Jun 2024 11:58 UTC

1 point

3 comments · 1 min read · LW link

LessWrong/ACX meetup Transilvanya tour—Sibiu

Marius Adrian Nicoară

28 Jun 2024 11:41 UTC

1 point

1 comment · 1 min read · LW link

Bayes’ Theorem: In Search of Gold (Lesson 1)

bayesyatina

28 Jun 2024 8:39 UTC

3 points

0 comments · 3 min read · LW link

How a chip is designed

YM

28 Jun 2024 8:04 UTC

65 points

4 comments · 5 min read · LW link

The Wisdom of Living for 200 Years

Martin Sustrik

28 Jun 2024 4:44 UTC

25 points

3 comments · 4 min read · LW link

A Generally Intelligent Game

snerx

28 Jun 2024 1:31 UTC

−1 points

1 comment · 4 min read · LW link

Corrigibility = Tool-ness?

johnswentworth and David Lorell

28 Jun 2024 1:19 UTC

85 points

8 comments · 9 min read · LW link

Situational Awareness

PeterMcCluskey

28 Jun 2024 1:08 UTC

11 points

0 comments · 12 min read · LW link

(bayesianinvestor.com)

Toward a taxonomy of cognitive benchmarks for agentic AGIs

Ben Smith

27 Jun 2024 23:50 UTC

15 points

0 comments · 5 min read · LW link

How Big a Deal are MatMul-Free Transformers?

JustisMills

27 Jun 2024 22:28 UTC

19 points

6 comments · 5 min read · LW link

(justismills.substack.com)

Secondary forces of debt

KatjaGrace

27 Jun 2024 21:10 UTC

81 points

21 comments · 2 min read · LW link

(worldspiritsockpuppet.com)

Distillation of ‘Do language models plan for future tokens’

TheManxLoiner

27 Jun 2024 20:57 UTC

26 points

2 comments · 6 min read · LW link

how birds sense magnetic fields

bhauth

27 Jun 2024 18:59 UTC

53 points

4 comments · 5 min read · LW link

(www.bhauth.com)

Representation Tuning

Christopher Ackerman

27 Jun 2024 17:44 UTC

35 points

9 comments · 13 min read · LW link

An issue with training schemers with supervised fine-tuning

Fabien Roger

27 Jun 2024 15:37 UTC

47 points

14 comments · 6 min read · LW link

AI #70: A Beautiful Sonnet

Zvi

27 Jun 2024 14:40 UTC

38 points

0 comments · 44 min read · LW link

(thezvi.wordpress.com)

Detecting Genetically Engineered Viruses With Metagenomic Sequencing

jefftk

27 Jun 2024 14:01 UTC

87 points

10 comments · 8 min read · LW link

(naobservatory.org)

Cross Robin

jefftk

27 Jun 2024 3:10 UTC

11 points

2 comments · 1 min read · LW link

(www.jefftk.com)

Live Theory Part 0: Taking Intelligence Seriously

Sahil

26 Jun 2024 21:37 UTC

105 points

3 comments · 8 min read · LW link

Instrumental vs Terminal Desiderata

Max Harms

26 Jun 2024 20:57 UTC

22 points

1 comment · 3 min read · LW link

Imbue (Generally Intelligent) continue to make progress

Nathan Helm-Burger

26 Jun 2024 20:41 UTC

18 points

0 comments · 1 min read · LW link

(imbue.com)

Tracing the steps

matimissona

26 Jun 2024 19:22 UTC

−8 points

2 comments · 4 min read · LW link

Countering AI disinformation and deep fakes with digital signatures

Dave92F1

26 Jun 2024 18:09 UTC

13 points

5 comments · 1 min read · LW link