AI Forecasting: Two Years In

jsteinhardt · 19 Aug 2023 23:40 UTC
65 points
15 comments · 11 min read · LW link
(bounded-regret.ghost.io)

Four management/leadership book summaries

nikola · 19 Aug 2023 23:38 UTC
11 points
0 comments · 7 min read · LW link

Interpreting a dimensionality reduction of a collection of matrices as two positive semidefinite block diagonal matrices

Joseph Van Name · 19 Aug 2023 19:52 UTC
15 points
2 comments · 5 min read · LW link

Will AI kill everyone? Here’s what the godfathers of AI have to say [RA video]

Writer · 19 Aug 2023 17:29 UTC
56 points
8 comments · 1 min read · LW link
(youtu.be)

Ten variations on red-pill-blue-pill

Richard_Kennaway · 19 Aug 2023 16:34 UTC
21 points
34 comments · 3 min read · LW link

Are we running out of new music/movies/art from a metaphysical perspective? (updated)

stephen_s · 19 Aug 2023 16:24 UTC
4 points
23 comments · 1 min read · LW link

[Question] Any ideas for a prediction market observable that quantifies “culture-warisation”?

Ppau · 19 Aug 2023 15:11 UTC
6 points
1 comment · 1 min read · LW link

[Question] Clarifying how misalignment can arise from scaling LLMs

Util · 19 Aug 2023 14:16 UTC
3 points
1 comment · 1 min read · LW link

Chess as a case study in hidden capabilities in ChatGPT

AdamYedidia · 19 Aug 2023 6:35 UTC
45 points
32 comments · 6 min read · LW link

We can do better than DoWhatIMean (inextricably kind AI)

lukehmiles · 19 Aug 2023 5:41 UTC
25 points
8 comments · 2 min read · LW link

Supervised Program for Alignment Research (SPAR) at UC Berkeley: Spring 2023 summary

19 Aug 2023 2:27 UTC
20 points
2 comments · 6 min read · LW link

Could fabs own AI?

lukehmiles · 19 Aug 2023 0:16 UTC
15 points
0 comments · 3 min read · LW link

Is Chinese total factor productivity lower today than it was in 1956?

Ege Erdil · 18 Aug 2023 22:33 UTC
43 points
0 comments · 26 min read · LW link

Rationality-ish Meetups Showcase: 2019-2021

jenn · 18 Aug 2023 22:22 UTC
10 points
0 comments · 5 min read · LW link

The U.S. is becoming less stable

lc · 18 Aug 2023 21:13 UTC
135 points
66 comments · 2 min read · LW link

Meetup Tip: Board Games

Screwtape · 18 Aug 2023 18:11 UTC
9 points
4 comments · 7 min read · LW link

[Question] AI labs’ requests for input

Zach Stein-Perlman · 18 Aug 2023 17:00 UTC
29 points
0 comments · 1 min read · LW link

6 non-obvious mental health issues specific to AI safety

Igor Ivanov · 18 Aug 2023 15:46 UTC
144 points
24 comments · 4 min read · LW link

That time I went a little bit insane

belkarx · 18 Aug 2023 6:53 UTC
37 points
4 comments · 4 min read · LW link

When discussing AI doom barriers propose specific plausible scenarios

anithite · 18 Aug 2023 4:06 UTC
5 points
0 comments · 3 min read · LW link

Risks from AI Overview: Summary

18 Aug 2023 1:21 UTC
25 points
0 comments · 13 min read · LW link
(www.safe.ai)

Managing risks of our own work

Beth Barnes · 18 Aug 2023 0:41 UTC
66 points
0 comments · 2 min read · LW link

ACI#5: From Human-AI Co-evolution to the Evolution of Value Systems

Akira Pyinya · 18 Aug 2023 0:38 UTC
0 points
0 comments · 9 min read · LW link

Memetic Judo #1: On Doomsday Prophets v.3

Max TK · 18 Aug 2023 0:14 UTC
25 points
17 comments · 3 min read · LW link

Looking for judges for critiques of Alignment Plans

Iknownothing · 17 Aug 2023 22:35 UTC
5 points
0 comments · 1 min read · LW link

How is ChatGPT’s behavior changing over time?

Phib · 17 Aug 2023 20:54 UTC
3 points
0 comments · 1 min read · LW link
(arxiv.org)

Progress links digest, 2023-08-17: Cloud seeding, robotic sculptors, and rogue planets

jasoncrawford · 17 Aug 2023 20:29 UTC
15 points
1 comment · 4 min read · LW link
(rootsofprogress.org)

Model of psychosis, take 2

Steven Byrnes · 17 Aug 2023 19:11 UTC
32 points
2 comments · 7 min read · LW link

[Linkpost] Robustified ANNs Reveal Wormholes Between Human Category Percepts

Bogdan Ionut Cirstea · 17 Aug 2023 19:10 UTC
6 points
2 comments · 1 min read · LW link

Against Almost Every Theory of Impact of Interpretability

Charbel-Raphaël · 17 Aug 2023 18:44 UTC
315 points
83 comments · 26 min read · LW link

Goldilocks and the Three Optimisers

dkl9 · 17 Aug 2023 18:15 UTC
−10 points
0 comments · 5 min read · LW link
(dkl9.net)

Announcing Foresight Institute’s AI Safety Grants Program

Allison Duettmann · 17 Aug 2023 17:34 UTC
35 points
2 comments · 1 min read · LW link

The Negentropy Cliff

mephistopheles · 17 Aug 2023 17:08 UTC
6 points
10 comments · 1 min read · LW link

“AI Wellbeing” and the Ongoing Debate on Phenomenal Consciousness

FlorianH · 17 Aug 2023 15:47 UTC
10 points
6 comments · 7 min read · LW link

AI #25: Inflection Point

Zvi · 17 Aug 2023 14:40 UTC
59 points
9 comments · 36 min read · LW link
(thezvi.wordpress.com)

[Question] Why might General Intelligences have long term goals?

yrimon · 17 Aug 2023 14:10 UTC
3 points
17 comments · 1 min read · LW link

Understanding Counterbalanced Subtractions for Better Activation Additions

ojorgensen · 17 Aug 2023 13:53 UTC
21 points
0 comments · 14 min read · LW link

Reflections on “Making the Atomic Bomb”

boazbarak · 17 Aug 2023 2:48 UTC
51 points
7 comments · 8 min read · LW link

Autonomous replication and adaptation: an attempt at a concrete danger threshold

Hjalmar_Wijk · 17 Aug 2023 1:31 UTC
42 points
0 comments · 13 min read · LW link

[Question] (Thought experiment) If you had to choose, which would you prefer?

kuira · 17 Aug 2023 0:57 UTC
9 points
2 comments · 1 min read · LW link

Some rules for life (v.0,0)

Neil · 17 Aug 2023 0:43 UTC
33 points
13 comments · 12 min read · LW link
(neilwarren.substack.com)

When AI critique works even with misaligned models

Fabien Roger · 17 Aug 2023 0:12 UTC
23 points
0 comments · 2 min read · LW link

Book Launch: “The Carving of Reality,” Best of LessWrong vol. III

Raemon · 16 Aug 2023 23:52 UTC
131 points
22 comments · 5 min read · LW link

One example of how LLM propaganda attacks can hack the brain

trevor · 16 Aug 2023 21:41 UTC
24 points
8 comments · 4 min read · LW link

If we had known the atmosphere would ignite

Jeffs · 16 Aug 2023 20:28 UTC
53 points
49 comments · 2 min read · LW link

Stampy’s AI Safety Info—New Distillations #4 [July 2023]

markov · 16 Aug 2023 19:03 UTC
22 points
10 comments · 1 min read · LW link
(aisafety.info)

A Proof of Löb’s Theorem using Computability Theory

jessicata · 16 Aug 2023 18:57 UTC
71 points
0 comments · 17 min read · LW link
(unstableontology.com)

Summary of and Thoughts on the Hotz/Yudkowsky Debate

Zvi · 16 Aug 2023 16:50 UTC
105 points
47 comments · 9 min read · LW link
(thezvi.wordpress.com)

Red Pill vs Blue Pill, Bayes style

ErickBall · 16 Aug 2023 15:23 UTC
26 points
33 comments · 1 min read · LW link

What does it mean to “trust science”?

jasoncrawford · 16 Aug 2023 14:56 UTC
34 points
9 comments · 1 min read · LW link
(rootsofprogress.org)