16 Nov 2023 23:53 UTC

101 points

48 comments 14 min read LW link 1 review

How much to update on recent AI governance moves?

habryka and So8res

16 Nov 2023 23:46 UTC

112 points

5 comments 29 min read LW link

New LessWrong feature: Dialogue Matching

Bird Concept

16 Nov 2023 21:27 UTC

107 points

22 comments 3 min read LW link

Towards Evaluating AI Systems for Moral Status Using Self-Reports

Ethan Perez and Robbo

16 Nov 2023 20:18 UTC

45 points

3 comments 1 min read LW link

(arxiv.org)

Social Dark Matter

Duncan Sabien (Inactive)

16 Nov 2023 20:00 UTC

388 points

131 comments 34 min read LW link 2 reviews

AI #38: Let’s Make a Deal

Zvi

16 Nov 2023 19:50 UTC

44 points

2 comments 55 min read LW link

(thezvi.wordpress.com)

Forecasting AI (Overview)

jsteinhardt

16 Nov 2023 19:00 UTC

35 points

0 comments 2 min read LW link

(bounded-regret.ghost.io)

We Should Talk About This More. Epistemic World Collapse as Imminent Safety Risk of Generative AI.

Joerg Weiss

16 Nov 2023 18:46 UTC

11 points

2 comments 29 min read LW link

Intelligence in systems (human, AI) can be conceptualized as the resolution and throughput at which a system can process and affect Shannon information.

AiresJL

16 Nov 2023 17:46 UTC

0 points

0 comments 2 min read LW link

Life on the Grid (Part 2)

rogersbacon

16 Nov 2023 17:22 UTC

7 points

0 comments 15 min read LW link

(www.secretorum.life)

The impossibility of rationally analyzing partisan news

RationalDino

16 Nov 2023 16:19 UTC

4 points

4 comments 1 min read LW link

We are Peacecraft.ai!

MadHatter

16 Nov 2023 14:15 UTC

15 points

20 comments 2 min read LW link

A dialectical view of the history of AI, Part 1: We’re only in the antithesis phase. [A synthesis is in the future.]

Bill Benzon

16 Nov 2023 12:34 UTC

6 points

0 comments 12 min read LW link

[Question] How much fraud is there in academia?

ChristianKl

16 Nov 2023 11:50 UTC

23 points

10 comments 1 min read LW link

Learning coefficient estimation: the details

Zach Furman

16 Nov 2023 3:19 UTC

37 points

0 comments 2 min read LW link

(colab.research.google.com)

[Question] AI Safety orgs- what’s your biggest bottleneck right now?

Kabir Kumar

16 Nov 2023 2:02 UTC

1 point

0 comments 1 min read LW link

My critique of Eliezer’s deeply irrational beliefs

Jorterder

16 Nov 2023 0:34 UTC

−35 points

1 comment 9 min read LW link

(docs.google.com)

Extrapolating from Five Words

Gordon Seidoh Worley

15 Nov 2023 23:21 UTC

40 points

11 comments 2 min read LW link

In Defense of Parselmouths

Screwtape

15 Nov 2023 23:02 UTC

56 points

12 comments 10 min read LW link 1 review

Life on the Grid (Part 1)

rogersbacon

15 Nov 2023 22:37 UTC

12 points

4 comments 9 min read LW link

(www.secretorum.life)

Testbed evals: evaluating AI safety even when it can’t be directly measured

joshc

15 Nov 2023 19:00 UTC

72 points

2 comments 4 min read LW link

EA/ACX/LW November Santa Cruz Meetup

madmail

15 Nov 2023 18:39 UTC

1 point

0 comments 1 min read LW link

New report: “Scheming AIs: Will AIs fake alignment during training in order to get power?”

Joe Carlsmith

15 Nov 2023 17:16 UTC

83 points

28 comments 30 min read LW link 1 review

Large Language Models can Strategically Deceive their Users when Put Under Pressure.

ReaderM

15 Nov 2023 16:36 UTC

90 points

9 comments 2 min read LW link 1 review

(arxiv.org)

AISN #26: National Institutions for AI Safety, Results From the UK Summit, and New Releases From OpenAI and xAI

Corin Katzke, allison huang and Dan H

15 Nov 2023 16:07 UTC

13 points

0 comments 6 min read LW link

(newsletter.safe.ai)

‘Theories of Values’ and ‘Theories of Agents’: confusions, musings and desiderata

Mateusz Bagiński and Nora_Ammann

15 Nov 2023 16:00 UTC

35 points

8 comments 24 min read LW link

Experiences and learnings from both sides of the AI safety job market

Marius Hobbhahn

15 Nov 2023 15:40 UTC

111 points

4 comments 18 min read LW link

A conceptual precursor to today’s language machines [Shannon]

Bill Benzon

15 Nov 2023 13:50 UTC

24 points

6 comments 2 min read LW link

[Question] Should Advanced Placement High School classes discuss Israel-Palestine? If so, how? If not, why? Who should make this decision?

Gesild Muka

15 Nov 2023 4:50 UTC

−1 points

5 comments 1 min read LW link

Reinforcement Via Giving People Cookies

Screwtape

15 Nov 2023 4:34 UTC

70 points

9 comments 6 min read LW link

Incidental polysemanticity

Victor Lecomte, Kushal Thaman, tmychow and Rylan Schaeffer

15 Nov 2023 4:00 UTC

43 points

7 comments 11 min read LW link

LLMs May Find It Hard to FOOM

RogerDearnaley

15 Nov 2023 2:52 UTC

13 points

30 comments 12 min read LW link

Linearity Fallacies

hippo

15 Nov 2023 2:23 UTC

15 points

0 comments 5 min read LW link

SIA Is Just Being a Bayesian About the Fact That One Exists

Bentham's Bulldog

14 Nov 2023 22:55 UTC

3 points

5 comments 4 min read LW link

AI Alignment [progress] this Week (11/12/2023)

Logan Zoellner

14 Nov 2023 22:21 UTC

6 points

0 comments 2 min read LW link

(midwitalignment.substack.com)

[Question] When did Eliezer Yudkowsky change his mind about neural networks?

[deactivated]

14 Nov 2023 21:24 UTC

32 points

15 comments 1 min read LW link

Betting on what is un-falsifiable and un-verifiable

Abhimanyu Pallavi Sudhir

14 Nov 2023 21:11 UTC

15 points

0 comments 15 min read LW link

Facebook is Paying Me to Post

jefftk

14 Nov 2023 19:10 UTC

26 points

5 comments 1 min read LW link

(www.jefftk.com)

Feelings, Nothing More than Feelings, About AI

PaulBecon

14 Nov 2023 18:50 UTC

7 points

0 comments 3 min read LW link

Kids or No kids

Kids or no kids

14 Nov 2023 18:37 UTC

100 points

10 comments 13 min read LW link

Raemon’s Deliberate (“Purposeful?”) Practice Club

Raemon, Elizabeth, lynettebye and Alex_Altair

14 Nov 2023 18:24 UTC

62 points

11 comments 22 min read LW link

More metal less ore

Logan Kieller

14 Nov 2023 16:59 UTC

10 points

3 comments 2 min read LW link

(logankieller.substack.com)

Monthly Roundup #12: November 2023

Zvi

14 Nov 2023 15:20 UTC

34 points

5 comments 33 min read LW link

(thezvi.wordpress.com)

Do you want a first-principled preparedness guide to prepare yourself and loved ones for potential catastrophes?

Ulrik Horn

14 Nov 2023 12:13 UTC

16 points

5 comments 15 min read LW link

[Question] Is there Work on Embedded Agency in Cellular Automata Toy Models?

Johannes C. Mayer

14 Nov 2023 9:08 UTC

10 points

0 comments 1 min read LW link

[Question] Would this be Progress in Solving Embedded Agency?

Johannes C. Mayer

14 Nov 2023 9:08 UTC

9 points

2 comments 2 min read LW link

Is Interpretability All We Need?

RogerDearnaley

14 Nov 2023 5:31 UTC

2 points

1 comment 1 min read LW link

What is wisdom?

TsviBT

14 Nov 2023 2:13 UTC

47 points

3 comments 13 min read LW link

Festival Stats 2023

jefftk

14 Nov 2023 1:20 UTC

9 points

0 comments 1 min read LW link

(www.jefftk.com)

Out of the Box

jesseduffield

13 Nov 2023 23:43 UTC

5 points

1 comment 7 min read LW link