Why Not Just Outsource Alignment Research To An AI?

johnswentworth Mar 9, 2023, 9:49 PM
151 points
50 comments 9 min read LW link 1 review

What’s Not Our Problem

Jacob Falkovich Mar 9, 2023, 8:07 PM
22 points
6 comments 9 min read LW link

Questions about Conjecture’s CoEm proposal

Mar 9, 2023, 7:32 PM
51 points
4 comments 2 min read LW link

What Jason has been reading, March 2023

jasoncrawford Mar 9, 2023, 6:46 PM
12 points
0 comments 6 min read LW link
(rootsofprogress.org)

[Question] “Provide C++ code for a function that outputs a Fibonacci sequence of n terms, where n is provided as a parameter to the function

Thembeka99 Mar 9, 2023, 6:37 PM
−21 points
2 comments 1 min read LW link

Anthropic: Core Views on AI Safety: When, Why, What, and How

jonmenaster Mar 9, 2023, 5:34 PM
17 points
1 comment 22 min read LW link
(www.anthropic.com)

Why do we assume there is a “real” shoggoth behind the LLM? Why not masks all the way down?

Robert_AIZI Mar 9, 2023, 5:28 PM
63 points
48 comments 2 min read LW link

Anthropic’s Core Views on AI Safety

Zac Hatfield-Dodds Mar 9, 2023, 4:55 PM
172 points
39 comments 2 min read LW link
(www.anthropic.com)

Some ML-Related Math I Now Understand Better

Fabien Roger Mar 9, 2023, 4:35 PM
50 points
6 comments 4 min read LW link

The Translucent Thoughts Hypotheses and Their Implications

Fabien Roger Mar 9, 2023, 4:30 PM
142 points
7 comments 19 min read LW link

IRL in General Environments

michaelcohen Mar 9, 2023, 1:32 PM
8 points
20 comments 1 min read LW link

Utility uncertainty vs. expected information gain

michaelcohen Mar 9, 2023, 1:32 PM
13 points
9 comments 1 min read LW link

Value Learning is only Asymptotically Safe

michaelcohen Mar 9, 2023, 1:32 PM
5 points
19 comments 1 min read LW link

Impact Measure Testing with Honey Pots and Myopia

michaelcohen Mar 9, 2023, 1:32 PM
13 points
9 comments 1 min read LW link

Just Imitate Humans?

michaelcohen Mar 9, 2023, 1:31 PM
11 points
72 comments 1 min read LW link

Build a Causal Decision Theorist

michaelcohen Mar 9, 2023, 1:31 PM
−2 points
14 comments 4 min read LW link

ChatGPT explores the semantic differential

Bill Benzon Mar 9, 2023, 1:09 PM
7 points
2 comments 7 min read LW link

AI #3

Zvi Mar 9, 2023, 12:20 PM
55 points
12 comments 62 min read LW link
(thezvi.wordpress.com)

The Scientific Approach To Anything and Everything

Rami Rustom Mar 9, 2023, 11:27 AM
6 points
5 comments 16 min read LW link

Paper Summary: The Effectiveness of AI Existential Risk Communication to the American and Dutch Public

otto.barten Mar 9, 2023, 10:47 AM
14 points
6 comments 4 min read LW link

Speed running everyone through the bad alignment bingo. $5k bounty for a LW conversational agent

ArthurB Mar 9, 2023, 9:26 AM
140 points
33 comments 2 min read LW link

Chomsky on ChatGPT (link)

mukashi Mar 9, 2023, 7:00 AM
2 points
6 comments 1 min read LW link

How bad a future do ML researchers expect?

KatjaGrace Mar 9, 2023, 4:50 AM
122 points
8 comments 2 min read LW link
(aiimpacts.org)

Challenge: construct a Gradient Hacker

Mar 9, 2023, 2:38 AM
39 points
10 comments 1 min read LW link

Basic Facts Beanbag

Screwtape Mar 9, 2023, 12:05 AM
6 points
0 comments 4 min read LW link

A ranking scale for how severe the side effects of solutions to AI x-risk are

Christopher King Mar 8, 2023, 10:53 PM
3 points
0 comments 2 min read LW link

Progress links and tweets, 2023-03-08

jasoncrawford Mar 8, 2023, 8:37 PM
16 points
0 comments 1 min read LW link
(rootsofprogress.org)

Project “MIRI as a Service”

RomanS Mar 8, 2023, 7:22 PM
42 points
4 comments 1 min read LW link

2022 Survey Results

Screwtape Mar 8, 2023, 7:16 PM
48 points
8 comments 20 min read LW link

Use the Nato Alphabet

Cedar Mar 8, 2023, 7:14 PM
6 points
10 comments 1 min read LW link

LessWrong needs a sage mechanic

lc Mar 8, 2023, 6:57 PM
34 points
5 comments 1 min read LW link

[Question] Mathematical models of Ethics

Victors Mar 8, 2023, 5:40 PM
4 points
2 comments 1 min read LW link

Against LLM Reductionism

Erich_Grunewald Mar 8, 2023, 3:52 PM
140 points
17 comments 18 min read LW link
(www.erichgrunewald.com)

Agency, LLMs and AI Safety—A First Pass

Giulio Mar 8, 2023, 3:42 PM
2 points
0 comments 4 min read LW link
(www.giuliostarace.com)

Why Uncontrollable AI Looks More Likely Than Ever

Mar 8, 2023, 3:41 PM
18 points
0 comments 4 min read LW link
(time.com)

Universal Modelers

George3d6 Mar 8, 2023, 3:39 PM
6 points
4 comments 20 min read LW link
(epistem.ink)

The Kids are Not Okay

Zvi Mar 8, 2023, 1:30 PM
85 points
43 comments 32 min read LW link
(thezvi.wordpress.com)

Alignment Targets and The Natural Abstraction Hypothesis

Stephen Fowler Mar 8, 2023, 11:45 AM
10 points
0 comments 3 min read LW link

Computer Input Sucks—A Brain Dump

Johannes C. Mayer Mar 8, 2023, 11:06 AM
14 points
11 comments 3 min read LW link

Under-Appreciated Ways to Use Flashcards—Part II

Florence Hinder Mar 8, 2023, 9:54 AM
25 points
6 comments 4 min read LW link
(blog.thoughtsaver.com)

Squeezing foundations research assistance out of formal logic narrow AI.

Donald Hobson Mar 8, 2023, 9:38 AM
16 points
1 comment 2 min read LW link

Monthly Shorts 1&2/23

Celer Mar 8, 2023, 7:10 AM
9 points
0 comments 2 min read LW link
(keller.substack.com)

[Question] Is religion locally correct for consequentialists in some instances?

Robert Feinstein Mar 8, 2023, 4:02 AM
4 points
8 comments 1 min read LW link

A Polemic

Wofsen Mar 8, 2023, 3:51 AM
−15 points
1 comment 1 min read LW link

AI Safety in a World of Vulnerable Machine Learning Systems

Mar 8, 2023, 2:40 AM
70 points
28 comments 29 min read LW link
(far.ai)

[Question] Educating people about rationality: where are we?

plurple Mar 8, 2023, 1:59 AM
5 points
3 comments 1 min read LW link

[Question] What are MIRI’s big achievements in AI alignment?

tailcalled Mar 7, 2023, 9:30 PM
29 points
7 comments 1 min read LW link

A Brief Defense of Athleticism

Wofsen Mar 7, 2023, 8:48 PM
46 points
5 comments 1 min read LW link

[Question] How “grifty” is the Foresight Institute? Are they making button soup?

Cedar Mar 7, 2023, 7:43 PM
7 points
3 comments 1 min read LW link

[Question] What’s in your list of unsolved problems in AI alignment?

jacquesthibs Mar 7, 2023, 6:58 PM
60 points
9 comments 1 min read LW link