OpenAI’s new Preparedness team is hiring

leopold · Oct 26, 2023, 8:42 PM
60 points
2 comments · 1 min read · LW link

Fake Deeply

Zack_M_Davis · Oct 26, 2023, 7:55 PM
33 points
7 comments · 1 min read · LW link
(unremediatedgender.space)

Symbol/Referent Confusions in Language Model Alignment Experiments

johnswentworth · Oct 26, 2023, 7:49 PM
116 points
50 comments · 6 min read · LW link · 1 review

Unsupervised Methods for Concept Discovery in AlphaZero

aog · Oct 26, 2023, 7:05 PM
9 points
0 comments · 1 min read · LW link
(arxiv.org)

[Question] Nonlinear limitations of ReLUs

magfrump · Oct 26, 2023, 6:51 PM
13 points
1 comment · 1 min read · LW link

AI Alignment Problem: Requirement not optional (A Critical Analysis through Mass Effect Trilogy)

TAWSIF AHMED · Oct 26, 2023, 6:02 PM
−9 points
0 comments · 4 min read · LW link

[Thought Experiment] Tomorrow’s Echo—The future of synthetic companionship.

Vimal Naran · Oct 26, 2023, 5:54 PM
−7 points
2 comments · 2 min read · LW link

Disagreements over the prioritization of existential risk from AI

Olivier Coutu · Oct 26, 2023, 5:54 PM
10 points
0 comments · 6 min read · LW link

[Question] What if AGI had its own universe to maybe wreck?

mseale · Oct 26, 2023, 5:49 PM
−1 points
2 comments · 1 min read · LW link

Changing Contra Dialects

jefftk · Oct 26, 2023, 5:30 PM
25 points
2 comments · 1 min read · LW link
(www.jefftk.com)

5 psychological reasons for dismissing x-risks from AGI

Igor Ivanov · Oct 26, 2023, 5:21 PM
24 points
6 comments · 4 min read · LW link

5. Risks from preventing legitimate value change (value collapse)

Nora_Ammann · Oct 26, 2023, 2:38 PM
13 points
1 comment · 9 min read · LW link

4. Risks from causing illegitimate value change (performative predictors)

Nora_Ammann · Oct 26, 2023, 2:38 PM
8 points
3 comments · 5 min read · LW link

3. Premise three & Conclusion: AI systems can affect value change trajectories & the Value Change Problem

Nora_Ammann · Oct 26, 2023, 2:38 PM
28 points
4 comments · 4 min read · LW link

2. Premise two: Some cases of value change are (il)legitimate

Nora_Ammann · Oct 26, 2023, 2:36 PM
24 points
7 comments · 6 min read · LW link

1. Premise one: Values are malleable

Nora_Ammann · Oct 26, 2023, 2:36 PM
21 points
1 comment · 15 min read · LW link

0. The Value Change Problem: introduction, overview and motivations

Nora_Ammann · Oct 26, 2023, 2:36 PM
32 points
0 comments · 5 min read · LW link

EPUBs of MIRI Blog Archives and selected LW Sequences

mesaoptimizer · Oct 26, 2023, 2:17 PM
44 points
5 comments · 1 min read · LW link
(git.sr.ht)

UK Government publishes “Frontier AI: capabilities and risks” Discussion Paper

A.H. · Oct 26, 2023, 1:55 PM
5 points
0 comments · 2 min read · LW link
(www.gov.uk)

AI #35: Responsible Scaling Policies

Zvi · Oct 26, 2023, 1:30 PM
66 points
10 comments · 55 min read · LW link
(thezvi.wordpress.com)

RA Bounty: Looking for feedback on screenplay about AI Risk

Writer · Oct 26, 2023, 1:23 PM
32 points
6 comments · 1 min read · LW link

Sensor Exposure can Compromise the Human Brain in the 2020s

trevor · Oct 26, 2023, 3:31 AM
17 points
6 comments · 10 min read · LW link

Notes on “How do we become confident in the safety of a machine learning system?”

RohanS · Oct 26, 2023, 3:13 AM
4 points
0 comments · 13 min read · LW link

Apply to the Constellation Visiting Researcher Program and Astra Fellowship, in Berkeley this Winter

Nate Thomas · Oct 26, 2023, 3:07 AM
42 points
10 comments · 1 min read · LW link

CHAI internship applications are open (due Nov 13)

Erik Jenner · Oct 26, 2023, 12:53 AM
34 points
0 comments · 3 min read · LW link

Architects of Our Own Demise: We Should Stop Developing AI Carelessly

Roko · Oct 26, 2023, 12:36 AM
170 points
75 comments · 3 min read · LW link

EA Infrastructure Fund: June 2023 grant recommendations

Linch · Oct 26, 2023, 12:35 AM
21 points
0 comments · LW link

Responsible Scaling Policies Are Risk Management Done Wrong

simeon_c · Oct 25, 2023, 11:46 PM
123 points
35 comments · 22 min read · LW link · 1 review
(www.navigatingrisks.ai)

AI as a science, and three obstacles to alignment strategies

So8res · Oct 25, 2023, 9:00 PM
193 points
80 comments · 11 min read · LW link

My hopes for alignment: Singular learning theory and whole brain emulation

Garrett Baker · Oct 25, 2023, 6:31 PM
61 points
5 comments · 12 min read · LW link

[Question] Lying to chess players for alignment

Zane · Oct 25, 2023, 5:47 PM
97 points
54 comments · 1 min read · LW link

Anthropic, Google, Microsoft & OpenAI announce Executive Director of the Frontier Model Forum & over $10 million for a new AI Safety Fund

Zach Stein-Perlman · Oct 25, 2023, 3:20 PM
31 points
8 comments · 4 min read · LW link
(www.frontiermodelforum.org)

“The Economics of Time Travel”—call for reviewers (Seeds of Science)

rogersbacon · Oct 25, 2023, 3:13 PM
4 points
2 comments · 1 min read · LW link

Compositional preference models for aligning LMs

Tomek Korbak · Oct 25, 2023, 12:17 PM
18 points
2 comments · 5 min read · LW link

[Question] Should the US House of Representatives adopt rank choice voting for leadership positions?

jmh · Oct 25, 2023, 11:16 AM
16 points
6 comments · 1 min read · LW link

Researchers believe they have found a way for artists to fight back against AI style capture

vernamcipher · Oct 25, 2023, 10:54 AM
3 points
1 comment · 1 min read · LW link
(finance.yahoo.com)

Why We Disagree

zulupineapple · Oct 25, 2023, 10:50 AM
7 points
2 comments · 2 min read · LW link

Beyond the Data: Why aid to poor doesn’t work

Lyrongolem · Oct 25, 2023, 5:03 AM
2 points
31 comments · 12 min read · LW link

Announcing Epoch’s newly expanded Parameters, Compute and Data Trends in Machine Learning database

Oct 25, 2023, 2:55 AM
18 points
0 comments · 1 min read · LW link
(epochai.org)

What is a Sequencing Read?

jefftk · Oct 25, 2023, 2:10 AM
17 points
2 comments · 2 min read · LW link
(www.jefftk.com)

Verifiable private execution of machine learning models with Risc0?

mako yass · Oct 25, 2023, 12:44 AM
30 points
2 comments · 2 min read · LW link

[Question] How to Resolve Forecasts With No Central Authority?

Nathan Young · Oct 25, 2023, 12:28 AM
17 points
6 comments · 1 min read · LW link

Thoughts on responsible scaling policies and regulation

paulfchristiano · Oct 24, 2023, 10:21 PM
221 points
33 comments · 6 min read · LW link

The Screenplay Method

Yeshua God · Oct 24, 2023, 5:41 PM
−15 points
0 comments · 25 min read · LW link

Blunt Razor

fryolysis · Oct 24, 2023, 5:27 PM
3 points
0 comments · 2 min read · LW link

Halloween Problem

Saint Blasphemer · Oct 24, 2023, 4:46 PM
−10 points
1 comment · 1 min read · LW link

Who is Harry Potter? Some predictions.

Donald Hobson · Oct 24, 2023, 4:14 PM
23 points
7 comments · 2 min read · LW link

Book Review: Going Infinite

Zvi · Oct 24, 2023, 3:00 PM
244 points
113 comments · 97 min read · LW link · 1 review
(thezvi.wordpress.com)

[Interview w/ Quintin Pope] Evolution, values, and AI Safety

fowlertm · Oct 24, 2023, 1:53 PM
11 points
0 comments · 1 min read · LW link

Lying is Cowardice, not Strategy

Oct 24, 2023, 1:24 PM
29 points
73 comments · 5 min read · LW link
(cognition.cafe)