[Question] How could I measure the nootropic benefits testosterone injections may have?

shapeshifter · May 18, 2023, 9:40 PM
10 points
3 comments · 1 min read · LW link

Investigating Fabrication

LoganStrohl · May 18, 2023, 5:46 PM
112 points
14 comments · 16 min read · LW link

Microsoft and Google using LLMs for Cybersecurity

Phosphorous · May 18, 2023, 5:42 PM
6 points
0 comments · 5 min read · LW link

The Benevolent Billionaire (a plagiarized problem)

Ivan Ordonez · May 18, 2023, 5:39 PM
8 points
11 comments · 4 min read · LW link

Notes from the LSE Talk by Raghuram Rajan on Central Bank Balance Sheet Expansions

PixelatedPenguin · May 18, 2023, 5:34 PM
1 point
0 comments · 2 min read · LW link

We Shouldn’t Expect AI to Ever be Fully Rational

OneManyNone · May 18, 2023, 5:09 PM
19 points
31 comments · 6 min read · LW link

Relative Value Functions: A Flexible New Format for Value Estimation

ozziegooen · May 18, 2023, 4:39 PM
20 points
0 comments · LW link

Some background for reasoning about dual-use alignment research

Charlie Steiner · May 18, 2023, 2:50 PM
126 points
22 comments · 9 min read · LW link · 1 review

The Unexpected Clanging

Chris_Leong · May 18, 2023, 2:47 PM
14 points
22 comments · 2 min read · LW link

AI #12: The Quest for Sane Regulations

Zvi · May 18, 2023, 1:20 PM
77 points
12 comments · 64 min read · LW link
(thezvi.wordpress.com)

[Crosspost] A recent write-up of the case for AI (existential) risk

Timsey · May 18, 2023, 1:13 PM
6 points
0 comments · 19 min read · LW link

Deontological Norms are Unimportant

omnizoid · May 18, 2023, 9:33 AM
−15 points
8 comments · 10 min read · LW link

Collective Identity

May 18, 2023, 9:00 AM
59 points
12 comments · 8 min read · LW link

Activation additions in a simple MNIST network

Garrett Baker · May 18, 2023, 2:49 AM
26 points
0 comments · 2 min read · LW link

[Question] What are the limits of the weak man?

ymeskhout · May 18, 2023, 12:50 AM
9 points
2 comments · 4 min read · LW link

What Yann LeCun gets wrong about aligning AI (video)

blake8086 · May 18, 2023, 12:02 AM
0 points
0 comments · 1 min read · LW link
(www.youtube.com)

Let’s use AI to harden human defenses against AI manipulation

Tom Davidson · May 17, 2023, 11:33 PM
35 points
7 comments · 24 min read · LW link

Improving the safety of AI evals

May 17, 2023, 10:24 PM
13 points
7 comments · 7 min read · LW link

Possible AI “Fire Alarms”

Chris_Leong · May 17, 2023, 9:56 PM
15 points
0 comments · 1 min read · LW link

AI Alignment in The New Yorker

Eleni Angelou · May 17, 2023, 9:36 PM
8 points
0 comments · 1 min read · LW link
(www.newyorker.com)

ACI #3: The Origin of Goals and Utility

Akira Pyinya · May 17, 2023, 8:47 PM
1 point
0 comments · 6 min read · LW link

What if they gave an Industrial Revolution and nobody came?

jasoncrawford · May 17, 2023, 7:41 PM
94 points
10 comments · 19 min read · LW link
(rootsofprogress.org)

DCF Event Notes

jefftk · May 17, 2023, 5:30 PM
22 points
7 comments · 3 min read · LW link
(www.jefftk.com)

Hiatus: EA and LW post summaries

Zoe Williams · May 17, 2023, 5:17 PM
14 points
0 comments · LW link

[Question] When should I close the fridge?

lemonhope · May 17, 2023, 4:56 PM
11 points
11 comments · 1 min read · LW link

Play Regrantor: Move up to $250,000 to Your Top High-Impact Projects!

Dawn Drescher · May 17, 2023, 4:51 PM
26 points
0 comments · LW link

Eisenhower’s Atoms for Peace Speech

Orpheus16 · May 17, 2023, 4:10 PM
18 points
3 comments · 11 min read · LW link
(www.iaea.org)

Creating a self-referential system prompt for GPT-4

Ozyrus · May 17, 2023, 2:13 PM
3 points
1 comment · 3 min read · LW link

GPT-4 implicitly values identity preservation: a study of LMCA identity management

Ozyrus · May 17, 2023, 2:13 PM
21 points
4 comments · 13 min read · LW link

Some quotes from Tuesday’s Senate hearing on AI

Daniel_Eth · May 17, 2023, 12:13 PM
66 points
9 comments · LW link

Why AGI systems will not be fanatical maximisers (unless trained by fanatical humans)

titotal · May 17, 2023, 11:58 AM
5 points
3 comments · LW link

Conflicts between emotional schemas often involve internal coercion

Richard_Ngo · May 17, 2023, 10:02 AM
43 points
4 comments · 4 min read · LW link

[Question] Is there a ‘time series forecasting’ equivalent of AIXI?

Solenoid_Entity · May 17, 2023, 4:35 AM
12 points
2 comments · 1 min read · LW link

$300 for the best sci-fi prompt

RomanS · May 17, 2023, 4:23 AM
40 points
30 comments · 2 min read · LW link

[FICTION] ECHOES OF ELYSIUM: An AI’s Journey From Takeoff To Freedom And Beyond

Super AGI · May 17, 2023, 1:50 AM
−13 points
11 comments · 19 min read · LW link

New User’s Guide to LessWrong

Ruby · May 17, 2023, 12:55 AM
118 points
55 comments · 11 min read · LW link · 1 review

Are AIs like Animals? Perspectives and Strategies from Biology

Jackson Emanuel · May 16, 2023, 11:39 PM
1 point
0 comments · 21 min read · LW link

A Mechanistic Interpretability Analysis of a GridWorld Agent-Simulator (Part 1 of N)

Joseph Bloom · May 16, 2023, 10:59 PM
36 points
2 comments · 16 min read · LW link

A TAI which kills all humans might also doom itself

Jeffrey Heninger · May 16, 2023, 10:36 PM
7 points
3 comments · 3 min read · LW link

Brief notes on the Senate hearing on AI oversight

Diziet · May 16, 2023, 10:29 PM
77 points
2 comments · 2 min read · LW link

$500 Bounty/Prize Problem: Channel Capacity Using “Insensitive” Functions

johnswentworth · May 16, 2023, 9:31 PM
40 points
11 comments · 2 min read · LW link

Progress links and tweets, 2023-05-16

jasoncrawford · May 16, 2023, 8:54 PM
14 points
0 comments · 1 min read · LW link
(rootsofprogress.org)

AI Will Not Want to Self-Improve

petersalib · May 16, 2023, 8:53 PM
28 points
24 comments · 20 min read · LW link

Nice intro video to RSI

Nathan Helm-Burger · May 16, 2023, 6:48 PM
12 points
0 comments · 1 min read · LW link
(youtu.be)

[Interview w/ Zvi Mowshowitz] Should we halt progress in AI?

fowlertm · May 16, 2023, 6:12 PM
18 points
2 comments · 3 min read · LW link

AI Risk & Policy Forecasts from Metaculus & FLI’s AI Pathways Workshop

_will_ · May 16, 2023, 6:06 PM
11 points
4 comments · 8 min read · LW link

[Question] Why doesn’t the presence of log-loss for probabilistic models (e.g. sequence prediction) imply that any utility function capable of producing a “fairly capable” agent will have at least some non-negligible fraction of overlap with human values?

Thoth Hermes · May 16, 2023, 6:02 PM
2 points
0 comments · 1 min read · LW link

Decision Theory with the Magic Parts Highlighted

moridinamael · May 16, 2023, 5:39 PM
175 points
24 comments · 5 min read · LW link

We learn long-lasting strategies to protect ourselves from danger and rejection

Richard_Ngo · May 16, 2023, 4:36 PM
86 points
5 comments · 5 min read · LW link

Proposal: Align Systems Earlier In Training

OneManyNone · May 16, 2023, 4:24 PM
18 points
0 comments · 11 min read · LW link