Ruining an expected-log-money maximizer

philh Aug 20, 2023, 9:20 PM
33 points
33 comments · 1 min read · LW link · 1 review
(reasonableapproximation.net)

Steven Wolfram on AI Alignment

Bill Benzon Aug 20, 2023, 7:49 PM
66 points
15 comments · 4 min read · LW link

[Question] What value does personal prediction tracking have?

fx Aug 20, 2023, 6:43 PM
7 points
3 comments · 1 min read · LW link

Jan Kulveit’s Corrigibility Thoughts Distilled

brook Aug 20, 2023, 5:52 PM
22 points
1 comment · 5 min read · LW link

Memetic Judo #3: The Intelligence of Stochastic Parrots v.2

Max TK Aug 20, 2023, 3:18 PM
8 points
33 comments · 6 min read · LW link

ACX/SSC Boulder meetup- September 23

Josh Sacks Aug 20, 2023, 2:16 PM
1 point
4 comments · 1 min read · LW link

“Dirty concepts” in AI alignment discourses, and some guesses for how to deal with them

Aug 20, 2023, 9:13 AM
66 points
4 comments · 3 min read · LW link

Call for Papers on Global AI Governance from the UN

Chris_Leong Aug 20, 2023, 8:56 AM
19 points
0 comments · LW link
(www.linkedin.com)

How do I read things on the internet

Vlad Sitalo Aug 20, 2023, 5:43 AM
16 points
2 comments · 8 min read · LW link
(vlad.roam.garden)

AI Forecasting: Two Years In

jsteinhardt Aug 19, 2023, 11:40 PM
72 points
15 comments · 11 min read · LW link
(bounded-regret.ghost.io)

Four management/leadership book summaries

Nikola Jurkovic Aug 19, 2023, 11:38 PM
25 points
2 comments · 7 min read · LW link

Interpreting a dimensionality reduction of a collection of matrices as two positive semidefinite block diagonal matrices

Joseph Van Name Aug 19, 2023, 7:52 PM
16 points
2 comments · 5 min read · LW link

Will AI kill everyone? Here’s what the godfathers of AI have to say [RA video]

Writer Aug 19, 2023, 5:29 PM
58 points
8 comments · LW link
(youtu.be)

Ten variations on red-pill-blue-pill

Richard_Kennaway Aug 19, 2023, 4:34 PM
23 points
34 comments · 3 min read · LW link

Are we running out of new music/movies/art from a metaphysical perspective? (updated)

stephen_s Aug 19, 2023, 4:24 PM
4 points
23 comments · 1 min read · LW link

[Question] Any ideas for a prediction market observable that quantifies “culture-warisation”?

Ppau Aug 19, 2023, 3:11 PM
6 points
1 comment · 1 min read · LW link

[Question] Clarifying how misalignment can arise from scaling LLMs

Util Aug 19, 2023, 2:16 PM
3 points
1 comment · 1 min read · LW link

Chess as a case study in hidden capabilities in ChatGPT

AdamYedidia Aug 19, 2023, 6:35 AM
47 points
32 comments · 6 min read · LW link

We can do better than DoWhatIMean (inextricably kind AI)

lemonhope Aug 19, 2023, 5:41 AM
25 points
8 comments · 2 min read · LW link

Supervised Program for Alignment Research (SPAR) at UC Berkeley: Spring 2023 summary

Aug 19, 2023, 2:27 AM
23 points
2 comments · 6 min read · LW link

Could fabs own AI?

lemonhope Aug 19, 2023, 12:16 AM
15 points
0 comments · 3 min read · LW link

Is Chinese total factor productivity lower today than it was in 1956?

Ege Erdil Aug 18, 2023, 10:33 PM
43 points
0 comments · 26 min read · LW link

Rationality-ish Meetups Showcase: 2019-2021

jenn Aug 18, 2023, 10:22 PM
10 points
0 comments · 5 min read · LW link

The U.S. is becoming less stable

lc Aug 18, 2023, 9:13 PM
149 points
68 comments · 2 min read · LW link

Meetup Tip: Board Games

Screwtape Aug 18, 2023, 6:11 PM
10 points
4 comments · 7 min read · LW link

[Question] AI labs’ requests for input

Zach Stein-Perlman Aug 18, 2023, 5:00 PM
29 points
0 comments · 1 min read · LW link

6 non-obvious mental health issues specific to AI safety

Igor Ivanov Aug 18, 2023, 3:46 PM
147 points
24 comments · 4 min read · LW link

When discussing AI doom barriers propose specific plausible scenarios

anithite Aug 18, 2023, 4:06 AM
5 points
0 comments · 3 min read · LW link

Risks from AI Overview: Summary

Aug 18, 2023, 1:21 AM
25 points
1 comment · 13 min read · LW link
(www.safe.ai)

Managing risks of our own work

Beth Barnes Aug 18, 2023, 12:41 AM
66 points
0 comments · 2 min read · LW link

ACI#5: From Human-AI Co-evolution to the Evolution of Value Systems

Akira Pyinya Aug 18, 2023, 12:38 AM
0 points
0 comments · 9 min read · LW link

Memetic Judo #1: On Doomsday Prophets v.3

Max TK Aug 18, 2023, 12:14 AM
25 points
17 comments · 3 min read · LW link

Looking for judges for critiques of Alignment Plans

Iknownothing Aug 17, 2023, 10:35 PM
6 points
0 comments · 1 min read · LW link

How is ChatGPT’s behavior changing over time?

worse Aug 17, 2023, 8:54 PM
3 points
0 comments · 1 min read · LW link
(arxiv.org)

Progress links digest, 2023-08-17: Cloud seeding, robotic sculptors, and rogue planets

jasoncrawford Aug 17, 2023, 8:29 PM
15 points
1 comment · 4 min read · LW link
(rootsofprogress.org)

Model of psychosis, take 2

Steven Byrnes Aug 17, 2023, 7:11 PM
34 points
13 comments · 4 min read · LW link

[Linkpost] Robustified ANNs Reveal Wormholes Between Human Category Percepts

Bogdan Ionut Cirstea Aug 17, 2023, 7:10 PM
6 points
2 comments · 1 min read · LW link

Against Almost Every Theory of Impact of Interpretability

Charbel-Raphaël Aug 17, 2023, 6:44 PM
329 points
91 comments · 26 min read · LW link · 2 reviews

Goldilocks and the Three Optimisers

dkl9 Aug 17, 2023, 6:15 PM
−10 points
0 comments · 5 min read · LW link
(dkl9.net)

Announcing Foresight Institute’s AI Safety Grants Program

Allison Duettmann Aug 17, 2023, 5:34 PM
35 points
2 comments · 1 min read · LW link

The Negentropy Cliff

mephistopheles Aug 17, 2023, 5:08 PM
6 points
10 comments · 1 min read · LW link

“AI Wellbeing” and the Ongoing Debate on Phenomenal Consciousness

FlorianH Aug 17, 2023, 3:47 PM
10 points
6 comments · 7 min read · LW link

AI #25: Inflection Point

Zvi Aug 17, 2023, 2:40 PM
59 points
9 comments · 36 min read · LW link
(thezvi.wordpress.com)

[Question] Why might General Intelligences have long term goals?

yrimon Aug 17, 2023, 2:10 PM
3 points
17 comments · 1 min read · LW link

Understanding Counterbalanced Subtractions for Better Activation Additions

ojorgensen Aug 17, 2023, 1:53 PM
21 points
0 comments · 14 min read · LW link

Reflections on “Making the Atomic Bomb”

boazbarak Aug 17, 2023, 2:48 AM
51 points
7 comments · 8 min read · LW link

Autonomous replication and adaptation: an attempt at a concrete danger threshold

Hjalmar_Wijk Aug 17, 2023, 1:31 AM
45 points
0 comments · 13 min read · LW link

[Question] (Thought experiment) If you had to choose, which would you prefer?

kuira Aug 17, 2023, 12:57 AM
9 points
2 comments · 1 min read · LW link

Some rules for life (v.0,0)

Neil Aug 17, 2023, 12:43 AM
43 points
13 comments · 12 min read · LW link
(neilwarren.substack.com)

When AI critique works even with misaligned models

Fabien Roger Aug 17, 2023, 12:12 AM
23 points
0 comments · 2 min read · LW link