Run­ning a Pre­dic­tion Mar­ket Mafia Game

Arjun Panickssery
1 Feb 2024 23:24 UTC
22 points
5 comments
1 min read
(arjunpanickssery.substack.com)

Eval­u­at­ing Sta­bil­ity of Un­re­flec­tive Alignment

james.lucassen
1 Feb 2024 22:15 UTC
30 points
3 comments
18 min read
(jlucassen.com)

Davi­dad’s Prov­ably Safe AI Ar­chi­tec­ture—ARIA’s Pro­gramme Thesis

simeon_c
1 Feb 2024 21:30 UTC
69 points
17 comments
1 min read
(www.aria.org.uk)

Align­ment has a Basin of At­trac­tion: Beyond the Orthog­o­nal­ity Thesis

RogerDearnaley
1 Feb 2024 21:15 UTC
4 points
15 comments
13 min read

OpenAI re­port also finds no effect of cur­rent LLMs on vi­a­bil­ity of bioter­ror­ism attacks

lberglund
1 Feb 2024 20:18 UTC
19 points
4 comments
2 min read
(openai.com)

Wrong an­swer bias

lukehmiles
1 Feb 2024 20:05 UTC
49 points
24 comments
1 min read

On Not Re­quiring Vaccination

jefftk
1 Feb 2024 19:20 UTC
31 points
21 comments
1 min read
(www.jefftk.com)

The econ­omy is mostly newbs (strat pre­dic­tions)

lukehmiles
1 Feb 2024 19:15 UTC
27 points
6 comments
2 min read

Manag­ing risks while try­ing to do good

Wei Dai
1 Feb 2024 18:08 UTC
58 points
26 comments
1 min read

Put­ting mul­ti­modal LLMs to the Tetris test

1 Feb 2024 16:02 UTC
30 points
5 comments
7 min read

AI #49: Bioweapon Test­ing Begins

Zvi
1 Feb 2024 15:30 UTC
37 points
11 comments
42 min read
(thezvi.wordpress.com)

Some Notes on Ethics

Pareto Optimal
1 Feb 2024 10:18 UTC
−3 points
0 comments
1 min read
(paretooptimal.substack.com)

In­creas­ingly vague in­ter­per­sonal welfare comparisons

MichaelStJules
1 Feb 2024 6:45 UTC
5 points
0 comments
1 min read

PIBBSS Speaker events com­ings up in February

1 Feb 2024 3:28 UTC
10 points
2 comments
1 min read

Drone Wars Endgame

RussellThor
1 Feb 2024 2:30 UTC
34 points
71 comments
8 min read

Se­quenc­ing Swabs

jefftk
1 Feb 2024 1:50 UTC
19 points
1 comment
5 min read
(www.jefftk.com)

Lead­ing The Parade

johnswentworth
31 Jan 2024 22:39 UTC
142 points
30 comments
9 min read

Pro­posal for an AI Safety Prize

sweenesm
31 Jan 2024 18:35 UTC
3 points
0 comments
2 min read

Liter­ally Every­thing is Infinite

Spiral
31 Jan 2024 18:31 UTC
−10 points
8 comments
5 min read

What fuels your am­bi­tion?

Cissy
31 Jan 2024 18:30 UTC
29 points
1 comment
5 min read
(www.moremyself.xyz)

“Gen­langs” and Zipf’s Law: Do lan­guages gen­er­ated by ChatGPT statis­ti­cally look hu­man?

Justin-Diamond
31 Jan 2024 18:30 UTC
2 points
2 comments
1 min read
(arxiv.org)

AI, In­tel­lec­tual Prop­erty, and the Techno-Op­ti­mist Revolution

Justin-Diamond
31 Jan 2024 18:30 UTC
1 point
0 comments
1 min read
(www.researchgate.net)

A re­sponse to an at­tempted re­but­tal of max­imis­ing ethics

JacobBowden
31 Jan 2024 17:49 UTC
−5 points
8 comments
3 min read

My Align­ment “Plan”: Avoid Strong Op­ti­mi­sa­tion and Align Economy

VojtaKovarik
31 Jan 2024 17:03 UTC
24 points
9 comments
7 min read

Where free­dom comes from

Logan Kieller
31 Jan 2024 16:53 UTC
−5 points
1 comment
3 min read
(logankieller.substack.com)

Per pro­to­col anal­y­sis as med­i­cal malpractice

braces
31 Jan 2024 16:22 UTC
53 points
8 comments
1 min read

Adam Smith Meets AI Doomers

James_Miller
31 Jan 2024 15:53 UTC
24 points
9 comments
5 min read

Ten Modes of Cul­ture War Discourse

jchan
31 Jan 2024 13:58 UTC
54 points
15 comments
15 min read

Without Fun­da­men­tal Ad­vances, Re­bel­lion and Coup d’État are the Inevitable Out­comes of Dic­ta­tors & Monar­chs Try­ing to Con­trol Large, Ca­pable Countries

Roko
31 Jan 2024 10:14 UTC
27 points
34 comments
1 min read

Ex­plain­ing Im­pact Markets

Saul Munn
31 Jan 2024 9:51 UTC
95 points
2 comments
3 min read
(www.brasstacks.blog)

Ex­plor­ing OpenAI’s La­tent Direc­tions: Tests, Ob­ser­va­tions, and Pok­ing Around

Johnny Lin
31 Jan 2024 6:01 UTC
26 points
4 comments
14 min read

Clip keys to­gether with tiny carabiners

Brendan Long
31 Jan 2024 4:26 UTC
10 points
5 comments
1 min read

The prob­lem with pro­por­tional extrapolation

pathos_bot
30 Jan 2024 23:40 UTC
6 points
0 comments
1 min read

Coun­ter­fac­tual Mechanism Networks

StrivingForLegibility
30 Jan 2024 20:30 UTC
4 points
0 comments
5 min read

Con­trol vs Selec­tion: Civil­i­sa­tion is best at con­trol, but nav­i­gat­ing AGI re­quires selection

VojtaKovarik
30 Jan 2024 19:06 UTC
7 points
1 comment
1 min read

AI gov­er­nance frames

NathanBarnard
30 Jan 2024 18:18 UTC
3 points
0 comments
3 min read

De­cid­ing What Pro­ject/​Org to Start: A Guide to Pri­ori­ti­za­tion Research

Alexandra Bos
30 Jan 2024 18:13 UTC
8 points
0 comments
1 min read

on neodymium magnets

bhauth
30 Jan 2024 15:58 UTC
47 points
6 comments
4 min read
(www.bhauth.com)

[Question] Can we cre­ate self-im­prov­ing AIs that perfect their own ethics?

Gabi QUENE
30 Jan 2024 14:45 UTC
1 point
10 comments
1 min read

Child­hood and Ed­u­ca­tion Roundup #4

Zvi
30 Jan 2024 13:50 UTC
43 points
10 comments
24 min read
(thezvi.wordpress.com)

Last call for sub­mis­sions for TAIS 2024!

Blaine
30 Jan 2024 12:08 UTC
4 points
0 comments
1 min read
(tais2024.cc)

[Question] Has any­one ac­tu­ally changed their mind re­gard­ing Sleep­ing Beauty prob­lem?

Ape in the coat
30 Jan 2024 8:34 UTC
14 points
50 comments
1 min read

San Fer­nando Valley Ra­tion­al­ity: Fe­bru­ary 15, 2024

Thomas Broadley
30 Jan 2024 4:40 UTC
3 points
0 comments
1 min read

The case for more am­bi­tious lan­guage model evals

Jozdien
30 Jan 2024 0:01 UTC
108 points
25 comments
5 min read

A short ‘deriva­tion’ of Watan­abe’s Free En­ergy Formula

Wuschel Schulz
29 Jan 2024 23:41 UTC
13 points
6 comments
7 min read

How im­por­tant is AI hack­ing as LLMs ad­vance?

Artyom Karpov
29 Jan 2024 18:41 UTC
1 point
0 comments
6 min read

LLM Psy­cho­met­rics: A Spec­u­la­tive Ap­proach to AI Safety

pskl
29 Jan 2024 18:38 UTC
3 points
4 comments
1 min read
(pascal.cc)

[Question] How to write bet­ter?

TeaTieAndHat
29 Jan 2024 17:02 UTC
7 points
24 comments
1 min read

Pro­ces­sor clock speeds are not how fast AIs think

Ege Erdil
29 Jan 2024 14:39 UTC
129 points
55 comments
2 min read

Nat­u­ral se­lec­tion for ingame char­ac­ter build optimisation

Kongo Landwalker
29 Jan 2024 11:34 UTC
8 points
5 comments
2 min read