Running a Prediction Market Mafia Game

Arjun Panickssery

1 Feb 2024 23:24 UTC

22 points

5 comments

1 min read

(arjunpanickssery.substack.com)

Evaluating Stability of Unreflective Alignment

james.lucassen

1 Feb 2024 22:15 UTC

30 points

3 comments

18 min read

(jlucassen.com)

Davidad’s Provably Safe AI Architecture—ARIA’s Programme Thesis

simeon_c

1 Feb 2024 21:30 UTC

69 points

17 comments

1 min read

(www.aria.org.uk)

Alignment has a Basin of Attraction: Beyond the Orthogonality Thesis

RogerDearnaley

1 Feb 2024 21:15 UTC

4 points

15 comments

13 min read

OpenAI report also finds no effect of current LLMs on viability of bioterrorism attacks

lberglund

1 Feb 2024 20:18 UTC

19 points

4 comments

2 min read

(openai.com)

Wrong answer bias

lukehmiles

1 Feb 2024 20:05 UTC

49 points

24 comments

1 min read

On Not Requiring Vaccination

jefftk

1 Feb 2024 19:20 UTC

31 points

21 comments

1 min read

(www.jefftk.com)

The economy is mostly newbs (strat predictions)

lukehmiles

1 Feb 2024 19:15 UTC

27 points

6 comments

2 min read

Managing risks while trying to do good

Wei Dai

1 Feb 2024 18:08 UTC

58 points

26 comments

1 min read

Putting multimodal LLMs to the Tetris test

Lovre and gabrielagc

1 Feb 2024 16:02 UTC

30 points

5 comments

7 min read

AI #49: Bioweapon Testing Begins

Zvi

1 Feb 2024 15:30 UTC

37 points

11 comments

42 min read

(thezvi.wordpress.com)

Some Notes on Ethics

Pareto Optimal

1 Feb 2024 10:18 UTC

−3 points

0 comments

1 min read

(paretooptimal.substack.com)

Increasingly vague interpersonal welfare comparisons

MichaelStJules

1 Feb 2024 6:45 UTC

5 points

0 comments

1 min read

PIBBSS Speaker events coming up in February

DusanDNesic, Nora_Ammann and Lucas Teixeira

1 Feb 2024 3:28 UTC

10 points

2 comments

1 min read

Drone Wars Endgame

RussellThor

1 Feb 2024 2:30 UTC

34 points

71 comments

8 min read

Sequencing Swabs

jefftk

1 Feb 2024 1:50 UTC

19 points

1 comment

5 min read

(www.jefftk.com)

Leading The Parade

johnswentworth

31 Jan 2024 22:39 UTC

142 points

30 comments

9 min read

Proposal for an AI Safety Prize

sweenesm

31 Jan 2024 18:35 UTC

3 points

0 comments

2 min read

Literally Everything is Infinite

Spiral

31 Jan 2024 18:31 UTC

−10 points

8 comments

5 min read

What fuels your ambition?

Cissy

31 Jan 2024 18:30 UTC

29 points

1 comment

5 min read

(www.moremyself.xyz)

“Genlangs” and Zipf’s Law: Do languages generated by ChatGPT statistically look human?

Justin-Diamond

31 Jan 2024 18:30 UTC

2 points

2 comments

1 min read

(arxiv.org)

AI, Intellectual Property, and the Techno-Optimist Revolution

Justin-Diamond

31 Jan 2024 18:30 UTC

1 point

0 comments

1 min read

(www.researchgate.net)

A response to an attempted rebuttal of maximising ethics

JacobBowden

31 Jan 2024 17:49 UTC

−5 points

8 comments

3 min read

My Alignment “Plan”: Avoid Strong Optimisation and Align Economy

VojtaKovarik

31 Jan 2024 17:03 UTC

24 points

9 comments

7 min read

Where freedom comes from

Logan Kieller

31 Jan 2024 16:53 UTC

−5 points

1 comment

3 min read

(logankieller.substack.com)

Per protocol analysis as medical malpractice

braces

31 Jan 2024 16:22 UTC

53 points

8 comments

1 min read

Adam Smith Meets AI Doomers

James_Miller

31 Jan 2024 15:53 UTC

24 points

9 comments

5 min read

Ten Modes of Culture War Discourse

jchan

31 Jan 2024 13:58 UTC

54 points

15 comments

15 min read

Without Fundamental Advances, Rebellion and Coup d’État are the Inevitable Outcomes of Dictators & Monarchs Trying to Control Large, Capable Countries

Roko

31 Jan 2024 10:14 UTC

27 points

34 comments

1 min read

Explaining Impact Markets

Saul Munn

31 Jan 2024 9:51 UTC

95 points

2 comments

3 min read

(www.brasstacks.blog)

Exploring OpenAI’s Latent Directions: Tests, Observations, and Poking Around

Johnny Lin

31 Jan 2024 6:01 UTC

26 points

4 comments

14 min read

Clip keys together with tiny carabiners

Brendan Long

31 Jan 2024 4:26 UTC

10 points

5 comments

1 min read

The problem with proportional extrapolation

pathos_bot

30 Jan 2024 23:40 UTC

6 points

0 comments

1 min read

Counterfactual Mechanism Networks

StrivingForLegibility

30 Jan 2024 20:30 UTC

4 points

0 comments

5 min read

Control vs Selection: Civilisation is best at control, but navigating AGI requires selection

VojtaKovarik

30 Jan 2024 19:06 UTC

7 points

1 comment

1 min read

AI governance frames

NathanBarnard

30 Jan 2024 18:18 UTC

3 points

0 comments

3 min read

Deciding What Project/Org to Start: A Guide to Prioritization Research

Alexandra Bos

30 Jan 2024 18:13 UTC

8 points

0 comments

1 min read

on neodymium magnets

bhauth

30 Jan 2024 15:58 UTC

47 points

6 comments

4 min read

(www.bhauth.com)

[Question] Can we create self-improving AIs that perfect their own ethics?

Gabi QUENE

30 Jan 2024 14:45 UTC

1 point

10 comments

1 min read

Childhood and Education Roundup #4

Zvi

30 Jan 2024 13:50 UTC

43 points

10 comments

24 min read

(thezvi.wordpress.com)

Last call for submissions for TAIS 2024!

Blaine

30 Jan 2024 12:08 UTC

4 points

0 comments

1 min read

(tais2024.cc)

[Question] Has anyone actually changed their mind regarding Sleeping Beauty problem?

Ape in the coat

30 Jan 2024 8:34 UTC

14 points

50 comments

1 min read

San Fernando Valley Rationality: February 15, 2024

Thomas Broadley

30 Jan 2024 4:40 UTC

3 points

0 comments

1 min read

The case for more ambitious language model evals

Jozdien

30 Jan 2024 0:01 UTC

108 points

25 comments

5 min read

A short ‘derivation’ of Watanabe’s Free Energy Formula

Wuschel Schulz

29 Jan 2024 23:41 UTC

13 points

6 comments

7 min read

How important is AI hacking as LLMs advance?

Artyom Karpov

29 Jan 2024 18:41 UTC

1 point

0 comments

6 min read

LLM Psychometrics: A Speculative Approach to AI Safety

pskl

29 Jan 2024 18:38 UTC

3 points

4 comments

1 min read

(pascal.cc)

[Question] How to write better?

TeaTieAndHat

29 Jan 2024 17:02 UTC

7 points

24 comments

1 min read

Processor clock speeds are not how fast AIs think

Ege Erdil

29 Jan 2024 14:39 UTC

129 points

55 comments

2 min read

Natural selection for ingame character build optimisation

Kongo Landwalker

29 Jan 2024 11:34 UTC

8 points

5 comments

2 min read