A ranking scale for how severe the side effects of solutions to AI x-risk are

Christopher King · 8 Mar 2023 22:53 UTC
3 points
0 comments · 2 min read · LW link

Progress links and tweets, 2023-03-08

jasoncrawford · 8 Mar 2023 20:37 UTC
16 points
0 comments · 1 min read · LW link
(rootsofprogress.org)

Project “MIRI as a Service”

RomanS · 8 Mar 2023 19:22 UTC
42 points
4 comments · 1 min read · LW link

2022 Survey Results

Screwtape · 8 Mar 2023 19:16 UTC
48 points
8 comments · 20 min read · LW link

Use the Nato Alphabet

Cedar · 8 Mar 2023 19:14 UTC
6 points
10 comments · 1 min read · LW link

LessWrong needs a sage mechanic

lc · 8 Mar 2023 18:57 UTC
34 points
5 comments · 1 min read · LW link

[Question] Mathematical models of Ethics

Victors · 8 Mar 2023 17:40 UTC
4 points
2 comments · 1 min read · LW link

Against LLM Reductionism

Erich_Grunewald · 8 Mar 2023 15:52 UTC
138 points
17 comments · 18 min read · LW link
(www.erichgrunewald.com)

Agency, LLMs and AI Safety—A First Pass

Giulio · 8 Mar 2023 15:42 UTC
2 points
0 comments · 4 min read · LW link
(www.giuliostarace.com)

Why Uncontrollable AI Looks More Likely Than Ever

8 Mar 2023 15:41 UTC
18 points
0 comments · 4 min read · LW link
(time.com)

Universal Modelers

George3d6 · 8 Mar 2023 15:39 UTC
6 points
4 comments · 20 min read · LW link
(epistem.ink)

The Kids are Not Okay

Zvi · 8 Mar 2023 13:30 UTC
85 points
43 comments · 32 min read · LW link
(thezvi.wordpress.com)

Alignment Targets and The Natural Abstraction Hypothesis

Stephen Fowler · 8 Mar 2023 11:45 UTC
10 points
0 comments · 3 min read · LW link

Computer Input Sucks—A Brain Dump

Johannes C. Mayer · 8 Mar 2023 11:06 UTC
14 points
11 comments · 3 min read · LW link

Under-Appreciated Ways to Use Flashcards—Part II

Florence Hinder · 8 Mar 2023 9:54 UTC
25 points
6 comments · 4 min read · LW link
(blog.thoughtsaver.com)

Squeezing foundations research assistance out of formal logic narrow AI.

Donald Hobson · 8 Mar 2023 9:38 UTC
16 points
1 comment · 2 min read · LW link

Monthly Shorts 1&2/23

Celer · 8 Mar 2023 7:10 UTC
9 points
0 comments · 2 min read · LW link
(keller.substack.com)

Chapter 1: Pursuing Understanding

Xavier Shrier · 8 Mar 2023 6:40 UTC
2 points
0 comments · 10 min read · LW link

[Question] Is religion locally correct for consequentialists in some instances?

Robert Feinstein · 8 Mar 2023 4:02 UTC
4 points
8 comments · 1 min read · LW link

A Polemic

Wofsen · 8 Mar 2023 3:51 UTC
−15 points
1 comment · 1 min read · LW link

AI Safety in a World of Vulnerable Machine Learning Systems

8 Mar 2023 2:40 UTC
70 points
27 comments · 29 min read · LW link
(far.ai)

[Question] Educating people about rationality: where are we?

plurple · 8 Mar 2023 1:59 UTC
5 points
3 comments · 1 min read · LW link

[Question] What are MIRI’s big achievements in AI alignment?

tailcalled · 7 Mar 2023 21:30 UTC
29 points
7 comments · 1 min read · LW link

Language models are not inherently safe

Olli Järviniemi · 7 Mar 2023 21:15 UTC
11 points
1 comment · 3 min read · LW link

A Brief Defense of Athleticism

Wofsen · 7 Mar 2023 20:48 UTC
46 points
5 comments · 1 min read · LW link

[Question] How “grifty” is the Foresight Institute? Are they making button soup?

Cedar · 7 Mar 2023 19:43 UTC
7 points
3 comments · 1 min read · LW link

[Question] What’s in your list of unsolved problems in AI alignment?

jacquesthibs · 7 Mar 2023 18:58 UTC
60 points
9 comments · 1 min read · LW link

Introducing AI Alignment Inc., a California public benefit corporation...

TherapistAI · 7 Mar 2023 18:47 UTC
1 point
4 comments · 1 min read · LW link

Abuse in LessWrong and rationalist communities in Bloomberg News

whistleblower67 · 7 Mar 2023 18:45 UTC
−2 points
72 comments · 7 min read · LW link
(www.bloomberg.com)

Test post for formatting

Solenoid_Entity · 7 Mar 2023 17:48 UTC
0 points
2 comments · 1 min read · LW link

The Pinnacle

nem · 7 Mar 2023 17:07 UTC
11 points
0 comments · 8 min read · LW link

Podcast Transcript: Daniela and Dario Amodei on Anthropic

remember · 7 Mar 2023 16:47 UTC
46 points
2 comments · 79 min read · LW link
(futureoflife.org)

The View from 30,000 Feet: Preface to the Second EleutherAI Retrospective

7 Mar 2023 16:22 UTC
14 points
0 comments · 4 min read · LW link
(blog.eleuther.ai)

Breaking Rank (Calibration Game)

jenn · 7 Mar 2023 15:40 UTC
11 points
0 comments · 2 min read · LW link

Outrangeous (Calibration Game)

jenn · 7 Mar 2023 15:29 UTC
32 points
3 comments · 9 min read · LW link

[Linkpost] Some high-level thoughts on the DeepMind alignment team’s strategy

7 Mar 2023 11:55 UTC
128 points
13 comments · 5 min read · LW link
(drive.google.com)

Alignment works both ways

Karl von Wendt · 7 Mar 2023 10:41 UTC
22 points
21 comments · 2 min read · LW link

Google’s PaLM-E: An Embodied Multimodal Language Model

SandXbox · 7 Mar 2023 4:11 UTC
86 points
7 comments · 1 min read · LW link
(palm-e.github.io)

GÖDEL GOING DOWN

Jimdrix_Hendri · 6 Mar 2023 23:06 UTC
−9 points
3 comments · 1 min read · LW link

Against ubiquitous alignment taxes

beren · 6 Mar 2023 19:50 UTC
56 points
10 comments · 2 min read · LW link

Addendum: basic facts about language models during training

beren · 6 Mar 2023 19:24 UTC
22 points
2 comments · 5 min read · LW link

Understanding The Roots Of Mathematics Before Finding The Roots Of A Function.

LiesLaris · 6 Mar 2023 18:47 UTC
2 points
0 comments · 1 min read · LW link

Discussion: LLaMA Leak & Whistleblowing in pre-AGI era

jirahim · 6 Mar 2023 18:47 UTC
1 point
4 comments · 1 min read · LW link

[Question] Are we too confident about unaligned AGI killing off humanity?

RomanS · 6 Mar 2023 16:19 UTC
21 points
63 comments · 1 min read · LW link

Introducing Leap Labs, an AI interpretability startup

Jessica Rumbelow · 6 Mar 2023 16:16 UTC
99 points
11 comments · 1 min read · LW link

Monthly Roundup #4: March 2023

Zvi · 6 Mar 2023 14:10 UTC
31 points
0 comments · 24 min read · LW link
(thezvi.wordpress.com)

Fundamental Uncertainty: Chapter 6 - How can we be certain about the truth?

Gordon Seidoh Worley · 6 Mar 2023 13:52 UTC
10 points
18 comments · 16 min read · LW link

The idea

JNS · 6 Mar 2023 13:42 UTC
3 points
0 comments · 9 min read · LW link

Honesty, Openness, Trustworthiness, and Secrets

NormanPerlmutter · 6 Mar 2023 9:03 UTC
13 points
0 comments · 9 min read · LW link

EA & LW Forum Weekly Summary (27th Feb − 5th Mar 2023)

Zoe Williams · 6 Mar 2023 3:18 UTC
12 points
0 comments · 1 min read · LW link