3 Apr 2023 23:09 UTC

11 points

0 comments 1 min read LW link

(forum.effectivealtruism.org)

Twin Cities ACX Meetup—April 2023

Timothy M. 3 Apr 2023 23:07 UTC

5 points

3 comments 1 min read LW link

Communicating effectively under Knightian norms

Richard_Ngo 3 Apr 2023 22:39 UTC

88 points

54 comments 6 min read LW link

If interpretability research goes well, it may get dangerous

So8res 3 Apr 2023 21:48 UTC

197 points

10 comments 2 min read LW link

Towards empathy in RL agents and beyond: Insights from cognitive science for AI Alignment

Marc Carauleanu 3 Apr 2023 19:59 UTC

15 points

6 comments 1 min read LW link

(clipchamp.com)

Monthly Roundup #5: April 2023

Zvi 3 Apr 2023 18:50 UTC

26 points

12 comments 14 min read LW link

(thezvi.wordpress.com)

Exploring non-anthropocentric aspects of AI existential safety

mishka 3 Apr 2023 18:07 UTC

8 points

0 comments 3 min read LW link

[Question] GJP on AGI

Suh_Prance_Alot 3 Apr 2023 17:21 UTC

2 points

0 comments 1 min read LW link

Do we have a plan for the “first critical try” problem?

Christopher King 3 Apr 2023 16:27 UTC

−3 points

14 comments 1 min read LW link

Exploratory Analysis of RLHF Transformers with TransformerLens

Curt Tigges 3 Apr 2023 16:09 UTC

21 points

2 comments 11 min read LW link

(blog.eleuther.ai)

AWS Has Raised Prices Before

jefftk 3 Apr 2023 16:00 UTC

7 points

3 comments 1 min read LW link

(www.jefftk.com)

Mati’s introduction to pausing giant AI experiments

Mati_Roy 3 Apr 2023 15:56 UTC

7 points

0 comments 2 min read LW link

Superintelligence will outsmart us or it isn’t superintelligence

Neil 3 Apr 2023 15:01 UTC

−4 points

4 comments 1 min read LW link

AI-kills-everyone scenarios require robotic infrastructure, but not necessarily nanotech

avturchin 3 Apr 2023 12:45 UTC

52 points

47 comments 4 min read LW link

Orthogonality is expensive

beren 3 Apr 2023 10:20 UTC

34 points

8 comments 3 min read LW link

Repeated Play of Imperfect Newcomb’s Paradox in Infra-Bayesian Physicalism

Sven Nilsen 3 Apr 2023 10:06 UTC

2 points

0 comments 2 min read LW link

Effective Altruism Virtual Programs Apr-May 2023

Yve Nichols-Evans 3 Apr 2023 6:40 UTC

1 point

0 comments 1 min read LW link

Board Game Theory

Optimization Process 3 Apr 2023 6:23 UTC

8 points

0 comments 3 min read LW link

Planecrash Podcast

planecrashpodcast 3 Apr 2023 4:34 UTC

10 points

5 comments 1 min read LW link

[Question] I’m just starting to grasp Shard Theory. Is that a normal feeling?

twkaiser 3 Apr 2023 3:08 UTC

−20 points

1 comment 1 min read LW link

Rules for living in a 99.9+% lizardman world

at_the_zoo 3 Apr 2023 2:39 UTC

−1 points

12 comments 1 min read LW link

The Friendly Drunk Fool Alignment Strategy

JenniferRM 3 Apr 2023 1:26 UTC

23 points

18 comments 11 min read LW link

Slack Group: Rationalist Startup Founders

Adam Zerner 3 Apr 2023 0:44 UTC

31 points

0 comments 3 min read LW link

Orthogonality is Expensive

DragonGod 3 Apr 2023 0:43 UTC

21 points

3 comments 1 min read LW link

(www.beren.io)

GTP4 capable of limited recursive improving?

Boris Kashirin 2 Apr 2023 21:38 UTC

2 points

3 comments 1 min read LW link

[Question] Scared about the future of AI

eitan weiss 2 Apr 2023 20:37 UTC

−1 points

0 comments 1 min read LW link

“a dialogue with myself concerning eliezer yudkowsky” (not author)

the gears to ascension 2 Apr 2023 20:12 UTC

13 points

18 comments 3 min read LW link

Fine-insured bounties as AI deterrent

Virtual Instinct 2 Apr 2023 19:44 UTC

1 point

0 comments 2 min read LW link

[Question] What could EA’s new name be?

trevor 2 Apr 2023 19:25 UTC

17 points

20 comments 2 min read LW link

Exposure to Lizardman is Lethal

[DEACTIVATED] Duncan Sabien 2 Apr 2023 18:57 UTC

70 points

96 comments 3 min read LW link

Talkbox Bagpipe Drones

jefftk 2 Apr 2023 18:50 UTC

5 points

0 comments 1 min read LW link

(www.jefftk.com)

[Question] Is there a LessWrong-adjacent place to hire freelancers/seek freelance work?

nonzerosum 2 Apr 2023 16:39 UTC

5 points

3 comments 1 min read LW link

AISC 2023, Progress Report for March: Team Interpretable Architectures

Robert Kralisch, Eris, teahorse and Sohaib Imran

2 Apr 2023 16:19 UTC

14 points

0 comments 14 min read LW link

Ultimate ends may be easily hidable behind convergent subgoals

TsviBT 2 Apr 2023 14:51 UTC

57 points

4 comments 22 min read LW link

Transparency for Generalizing Alignment from Toy Models

Johannes C. Mayer 2 Apr 2023 10:47 UTC

13 points

3 comments 4 min read LW link

Ask First

intellectronica 2 Apr 2023 10:45 UTC

3 points

1 comment 1 min read LW link

(intellectronica.net)

[Question] When should a neural network-based approach for plant control systems be preferred over a traditional control method?

Bob Guran 2 Apr 2023 10:18 UTC

11 points

0 comments 1 min read LW link

Pessimism about AI Safety

Max_He-Ho and Peter Kuhn

2 Apr 2023 7:43 UTC

4 points

1 comment 25 min read LW link

Some lesser-known megaproject ideas

Linch 2 Apr 2023 1:14 UTC

19 points

4 comments 1 min read LW link

Analysis of GPT-4 competence in assessing complex legal language: Example of Bill C-11 of the Canadian Parliament. - Part 1

M. Y. Zuo 2 Apr 2023 0:01 UTC

12 points

2 comments 14 min read LW link

Policy discussions follow strong contextualizing norms

Richard_Ngo 1 Apr 2023 23:51 UTC

230 points

61 comments 3 min read LW link

A report about LessWrong karma volatility from a different universe

Ben Pace 1 Apr 2023 21:48 UTC

176 points

7 comments 1 min read LW link

A Confession about the LessWrong Team

Ruby 1 Apr 2023 21:47 UTC

87 points

5 comments 2 min read LW link

Shutting down AI is not enough. We need to destroy all technology.

Matthew Barnett 1 Apr 2023 21:03 UTC

150 points

36 comments 1 min read LW link

A policy guaranteed to increase AI timelines

Richard Korzekwa 1 Apr 2023 20:50 UTC

46 points

1 comment 2 min read LW link

(aiimpacts.org)

Why I Think the Current Trajectory of AI Research has Low P(doom) - LLMs

GaPa 1 Apr 2023 20:35 UTC

2 points

1 comment 10 min read LW link

Repairing the Effort Asymmetry

[DEACTIVATED] Duncan Sabien 1 Apr 2023 20:23 UTC

41 points

11 comments 2 min read LW link

Draft: Inferring minimizers

Alex_Altair 1 Apr 2023 20:20 UTC

9 points

0 comments 1 min read LW link

AI Safety via Luck

Jozdien 1 Apr 2023 20:13 UTC

76 points

7 comments 11 min read LW link

Ho Chi Minh ACX Meetup

cygnus 1 Apr 2023 19:41 UTC

1 point

0 comments 1 min read LW link