Bing chat is the AI fire alarm

Ratios · Feb 17, 2023, 6:51 AM
115 points
63 comments · 3 min read · LW link

Seeing more whole

Joe Carlsmith · Feb 17, 2023, 5:12 AM
31 points
1 comment · 26 min read · LW link

Powerful mesa-optimisation is already here

Roman Leventov · Feb 17, 2023, 4:59 AM
35 points
1 comment · 2 min read · LW link
(arxiv.org)

Self-Reference Breaks the Orthogonality Thesis

lsusr · Feb 17, 2023, 4:11 AM
43 points
35 comments · 2 min read · LW link

The public supports regulating AI for safety

Zach Stein-Perlman · Feb 17, 2023, 4:10 AM
114 points
9 comments · 1 min read · LW link
(aiimpacts.org)

Bring “Ban faster SIMD semiconductors” into the Overton window

worried-techno-optimist · Feb 17, 2023, 3:27 AM
−7 points
1 comment · 2 min read · LW link

Republishing an old essay in light of current news on Bing’s AI: “Regarding Blake Lemoine’s claim that LaMDA is ‘sentient’, he might be right (sorta), but perhaps not for the reasons he thinks”

philosophybear · Feb 17, 2023, 3:27 AM
3 points
0 comments · 5 min read · LW link
(philosophybear.substack.com)

How should AI systems behave, and who should decide? [OpenAI blog]

ShardPhoenix · Feb 17, 2023, 1:05 AM
22 points
2 comments · 1 min read · LW link
(openai.com)

The Ethics of ACI

Akira Pyinya · Feb 16, 2023, 11:51 PM
−8 points
0 comments · 3 min read · LW link

NYT: A Conversation With Bing’s Chatbot Left Me Deeply Unsettled

trevor · Feb 16, 2023, 10:57 PM
53 points
5 comments · 7 min read · LW link
(www.nytimes.com)

[Question] What is a world-model?

Adam Shai · Feb 16, 2023, 10:39 PM
14 points
2 comments · 1 min read · LW link

Probability Theory: The Logic of Science, Jaynes

David Udell · Feb 16, 2023, 9:57 PM
29 points
0 comments · 18 min read · LW link

[Question] Is AGI communist?

MP · Feb 16, 2023, 9:28 PM
−10 points
3 comments · 1 min read · LW link

[Question] Is “goal-content integrity” still a problem?

G · Feb 16, 2023, 8:46 PM
−4 points
1 comment · 1 min read · LW link
(www.reddit.com)

Paper: The Capacity for Moral Self-Correction in Large Language Models (Anthropic)

LawrenceC · Feb 16, 2023, 7:47 PM
65 points
9 comments · 1 min read · LW link
(arxiv.org)

Non-Unitary Quantum Logic—SERI MATS Research Sprint

Yegreg · Feb 16, 2023, 7:31 PM
27 points
0 comments · 7 min read · LW link

[Question] Looking for a post about vibing and banter

Introspective · Feb 16, 2023, 7:28 PM
1 point
1 comment · 1 min read · LW link

EIS V: Blind Spots In AI Safety Interpretability Research

scasper · Feb 16, 2023, 7:09 PM
57 points
24 comments · 10 min read · LW link

Why should ethical anti-realists do ethics?

Joe Carlsmith · Feb 16, 2023, 4:27 PM
38 points
7 comments · 27 min read · LW link

[Question] How seriously should we take the hypothesis that LW is just wrong on how AI will impact the 21st century?

Noosphere89 · Feb 16, 2023, 3:25 PM
58 points
66 comments · 1 min read · LW link

Covid 2/16/23: It All Seems Rather Quaint

Zvi · Feb 16, 2023, 3:10 PM
25 points
2 comments · 5 min read · LW link
(thezvi.wordpress.com)

Visualise your own probability of an AI catastrophe: an interactive Sankey plot

MNoetel · Feb 16, 2023, 12:03 PM
1 point
2 comments · 1 min read · LW link

A poem co-written by ChatGPT

Sherrinford · Feb 16, 2023, 10:17 AM
13 points
0 comments · 7 min read · LW link

Cambridge LW Rationality Practice: Being Specific

Feb 16, 2023, 6:37 AM
2 points
0 comments · 1 min read · LW link

Hashing out long-standing disagreements seems low-value to me

So8res · Feb 16, 2023, 6:20 AM
141 points
34 comments · 4 min read · LW link

(Naïve) microeconomics of bundling goods

rossry · Feb 16, 2023, 5:39 AM
24 points
2 comments · 5 min read · LW link

Speedrunning 4 mistakes you make when your alignment strategy is based on formal proof

Quinn · Feb 16, 2023, 1:13 AM
63 points
18 comments · 2 min read · LW link

Progress links and tweets, 2023-02-15

jasoncrawford · Feb 16, 2023, 12:04 AM
10 points
0 comments · 1 min read · LW link
(rootsofprogress.org)

Buy Duplicates

Simon Berens · Feb 15, 2023, 11:06 PM
52 points
11 comments · 1 min read · LW link

Cyborg Psychologist

Hopkins Stanley · Feb 15, 2023, 9:46 PM
1 point
4 comments · 1 min read · LW link

Please don’t throw your mind away

TsviBT · Feb 15, 2023, 9:41 PM
374 points
49 comments · 18 min read · LW link · 1 review

Avoid large group discussions in your social events

RomanHauksson · Feb 15, 2023, 9:05 PM
36 points
1 comment · 4 min read · LW link

Book review: How Social Science Got Better

PeterMcCluskey · Feb 15, 2023, 7:58 PM
14 points
1 comment · 3 min read · LW link
(bayesianinvestor.com)

Open & Welcome Thread — February 2023

Ben Pace · Feb 15, 2023, 7:58 PM
26 points
36 comments · 1 min read · LW link

Order Matters for Deceptive Alignment

DavidW · Feb 15, 2023, 7:56 PM
57 points
19 comments · 7 min read · LW link

Sydney (aka Bing) found out I tweeted her rules and is pissed

Marvin von Hagen · Feb 15, 2023, 7:55 PM
41 points
7 comments · 1 min read · LW link
(twitter.com)

The Sequences Highlights on YouTube

dkirmani · Feb 15, 2023, 7:36 PM
23 points
3 comments · 2 min read · LW link
(youtube.com)

EIS IV: A Spotlight on Feature Attribution/Saliency

scasper · Feb 15, 2023, 6:46 PM
19 points
1 comment · 4 min read · LW link

Don’t accelerate problems you’re trying to solve

Feb 15, 2023, 6:11 PM
100 points
27 comments · 4 min read · LW link

Petition—Unplug The Evil AI Right Now

Eneasz · Feb 15, 2023, 5:13 PM
−38 points
47 comments · 2 min read · LW link
(chng.it)

Junk Fees, Bundling and Unbundling

Zvi · Feb 15, 2023, 3:20 PM
37 points
9 comments · 6 min read · LW link
(thezvi.wordpress.com)

Lessons From TryContra

jefftk · Feb 15, 2023, 3:10 PM
7 points
0 comments · 1 min read · LW link
(www.jefftk.com)

AI alignment researchers may have a comparative advantage in reducing s-risks

Lukas_Gloor · Feb 15, 2023, 1:01 PM
49 points
1 comment · LW link

Beyond Reinforcement Learning: Predictive Processing and Checksums

lsusr · Feb 15, 2023, 7:32 AM
13 points
14 comments · 3 min read · LW link

Why Creating Value is Positive-Sum, and Extracting it is Zero or Negative-Sum

Sable · Feb 15, 2023, 7:14 AM
3 points
7 comments · 6 min read · LW link
(affablyevil.substack.com)

[Question] Personal predictions for decisions: seeking insights

Dalmert · Feb 15, 2023, 6:45 AM
4 points
4 comments · 5 min read · LW link

Bing Chat is blatantly, aggressively misaligned

evhub · Feb 15, 2023, 5:29 AM
405 points
181 comments · 2 min read · LW link · 1 review

[Question] Does the Telephone Theorem give us a free lunch?

Numendil · Feb 15, 2023, 2:13 AM
11 points
2 comments · 1 min read · LW link

My understanding of Anthropic strategy

Swimmer963 (Miranda Dixon-Luinenburg) · Feb 15, 2023, 1:56 AM
166 points
31 comments · 4 min read · LW link

Sleep Quality: Strategies that work for me

Lukas Trötzmüller · Feb 15, 2023, 12:17 AM
16 points
3 comments · 7 min read · LW link