An A.I. Safety Presentation at RIT

NicholasKross · 27 Mar 2023 23:49 UTC
8 points
0 comments · 1 min read · LW link
(www.youtube.com)

Which AI outputs should humans check for shenanigans, to avoid AI takeover? A simple model

Tom Davidson · 27 Mar 2023 23:36 UTC
16 points
3 comments · 8 min read · LW link

The Prospect of an AI Winter

Erich_Grunewald · 27 Mar 2023 20:55 UTC
62 points
24 comments · 15 min read · LW link
(www.erichgrunewald.com)

[Question] Best arguments against the outside view that AGI won’t be a huge deal, thus we survive.

Noosphere89 · 27 Mar 2023 20:49 UTC
4 points
7 comments · 1 min read · LW link

EA & LW Forum Weekly Summary (20th − 26th March 2023)

Zoe Williams · 27 Mar 2023 20:46 UTC
4 points
0 comments · 1 min read · LW link

Three of my beliefs about upcoming AGI

Robert_AIZI · 27 Mar 2023 20:27 UTC
6 points
0 comments · 3 min read · LW link
(aizi.substack.com)

Nobody knows how to reliably test for AI safety

marcusarvan · 27 Mar 2023 19:48 UTC
1 point
0 comments · 5 min read · LW link

New blog: Planned Obsolescence

Ajeya Cotra · 27 Mar 2023 19:46 UTC
96 points
7 comments · 1 min read · LW link
(www.planned-obsolescence.org)

South Bay ACX/SSC Spring Meetups Everywhere

allisona · 27 Mar 2023 19:39 UTC
2 points
0 comments · 1 min read · LW link

[Question] Resources to see how people think/approach mathematics and problem-solving

zef · 27 Mar 2023 19:12 UTC
7 points
2 comments · 1 min read · LW link

Staggering Hunters

Screwtape · 27 Mar 2023 19:11 UTC
12 points
2 comments · 5 min read · LW link

Neurotechnology is Critical for AI Alignment

Milan Cvitkovic · 27 Mar 2023 18:27 UTC
10 points
3 comments · 1 min read · LW link
(milan.cvitkovic.net)

[Question] Best resources to learn philosophy of mind and AI?

Sky Moo · 27 Mar 2023 18:22 UTC
1 point
0 comments · 1 min read · LW link

the tensor is a lonely place

jml6 · 27 Mar 2023 18:22 UTC
−11 points
0 comments · 4 min read · LW link
(ekjsgrjelrbno.substack.com)

[Question] Bermudez Interface Problem

Motor Vehicle · 27 Mar 2023 18:11 UTC
1 point
2 comments · 1 min read · LW link

Would you be a better RLHF labeler than GPT-4?

kache · 27 Mar 2023 18:10 UTC
1 point
1 comment · 1 min read · LW link

LLM Powered LW Search

odraode17 · 27 Mar 2023 18:09 UTC
−1 points
0 comments · 1 min read · LW link

Announcing the Swiss Existential Risk Initiative (CHERI) 2023 Research Fellowship

Tobias H · 27 Mar 2023 16:36 UTC
3 points
0 comments · 1 min read · LW link

Industrialization/Computerization Analogies

Gordon Seidoh Worley · 27 Mar 2023 16:34 UTC
16 points
2 comments · 2 min read · LW link

Lessons from Convergent Evolution for AI Alignment

27 Mar 2023 16:25 UTC
53 points
9 comments · 8 min read · LW link

GPT-4 is bad at strategic thinking

Christopher King · 27 Mar 2023 15:11 UTC
22 points
8 comments · 1 min read · LW link

The salt in pasta water fallacy

Thomas Sepulchre · 27 Mar 2023 14:53 UTC
157 points
38 comments · 3 min read · LW link

CAIS-inspired approach towards safer and more interpretable AGIs

Peter Hroššo · 27 Mar 2023 14:36 UTC
13 points
7 comments · 1 min read · LW link

An Overview of Sparks of Artificial General Intelligence: Early experiments with GPT-4

Annapurna · 27 Mar 2023 13:44 UTC
10 points
0 comments · 7 min read · LW link
(jorgevelez.substack.com)

A Hivemind of GPT-4 bots REALLY IS A HIVEMIND!

Erlja Jkdf. · 27 Mar 2023 12:44 UTC
−10 points
1 comment · 1 min read · LW link

Duploish Marble Runs

jefftk · 27 Mar 2023 12:20 UTC
26 points
1 comment · 1 min read · LW link
(www.jefftk.com)

GPT-4 Plugs In

Zvi · 27 Mar 2023 12:10 UTC
198 points
47 comments · 6 min read · LW link
(thezvi.wordpress.com)

Please help me sense-check my assumptions about the needs of the AI Safety community and related career plans

peterslattery · 27 Mar 2023 8:23 UTC
6 points
4 comments · 2 min read · LW link

Practical Pitfalls of Causal Scrubbing

27 Mar 2023 7:47 UTC
87 points
17 comments · 13 min read · LW link

[Question] What If: An Earthquake in Taiwan?

Sable · 27 Mar 2023 7:31 UTC
8 points
2 comments · 1 min read · LW link

What can we learn from Lex Fridman’s interview with Sam Altman?

Karl von Wendt · 27 Mar 2023 6:27 UTC
56 points
22 comments · 9 min read · LW link

[Question] Steelmanning OpenAI’s Short-Timelines Slow-Takeoff Goal

FinalFormal2 · 27 Mar 2023 2:55 UTC
5 points
2 comments · 1 min read · LW link

The default outcome for aligned AGI still looks pretty bad

GeneSmith · 27 Mar 2023 0:02 UTC
14 points
18 comments · 3 min read · LW link

LLM Modularity: The Separability of Capabilities in Large Language Models

NickyP · 26 Mar 2023 21:57 UTC
98 points
3 comments · 41 min read · LW link

Testing ChatGPT for white lies

twkaiser · 26 Mar 2023 21:32 UTC
3 points
2 comments · 6 min read · LW link

Don’t take bad options away from people

Dumbledore's Army · 26 Mar 2023 20:12 UTC
43 points
100 comments · 5 min read · LW link

What would a compute monitoring plan look like? [Linkpost]

Akash · 26 Mar 2023 19:33 UTC
157 points
9 comments · 4 min read · LW link
(arxiv.org)

[Question] GPT-4 Specs: 1 Trillion Parameters?

infinibot27 · 26 Mar 2023 18:56 UTC
6 points
8 comments · 1 min read · LW link

Sen­tience in Machines—How Do We Test for This Ob­jec­tively?

Mayowa Osibodu · 26 Mar 2023 18:56 UTC
−2 points
0 comments · 2 min read · LW link
(www.researchgate.net)

If it quacks like a duck...

RationalMindset · 26 Mar 2023 18:54 UTC
−4 points
0 comments · 4 min read · LW link

Chronostasis: The Time-Capsule Conundrum of Language Models

RationalMindset · 26 Mar 2023 18:54 UTC
−5 points
0 comments · 1 min read · LW link

[Question] What happens with logical induction when...

Donald Hobson · 26 Mar 2023 18:31 UTC
18 points
2 comments · 1 min read · LW link

Draft: Introduction to optimization

Alex_Altair · 26 Mar 2023 17:25 UTC
43 points
8 comments · 16 min read · LW link

Chat bot as CEO at NetDragon Websoft

ChristianKl · 26 Mar 2023 16:01 UTC
8 points
2 comments · 1 min read · LW link
(www.firstpost.com)

Datapoint: median 10% AI x-risk mentioned on Dutch public TV channel

Chris van Merwijk · 26 Mar 2023 12:50 UTC
17 points
1 comment · 1 min read · LW link

[Question] How Politics interacts with AI?

qbolec · 26 Mar 2023 9:53 UTC
−18 points
4 comments · 1 min read · LW link

Descriptive vs. specifiable values

TsviBT · 26 Mar 2023 9:10 UTC
17 points
2 comments · 2 min read · LW link

The alignment stability problem

Seth Herd · 26 Mar 2023 2:10 UTC
24 points
10 comments · 4 min read · LW link

Survey on lifeloggers for a research project

Mati_Roy · 26 Mar 2023 0:02 UTC
20 points
0 comments · 1 min read · LW link

Manifold: If okay AGI, why?

Eliezer Yudkowsky · 25 Mar 2023 22:43 UTC
116 points
37 comments · 1 min read · LW link
(manifold.markets)