Top YouTube channel Veritasium releases video on Sleeping Beauty Problem

Alex_Altair · Feb 11, 2023, 8:36 PM
25 points
22 comments · 1 min read · LW link
(www.youtube.com)

Shortening Timelines: There’s No Buffer Anymore

Jeff Rose · Feb 11, 2023, 7:53 PM
10 points
5 comments · 1 min read · LW link

We Found An Neuron in GPT-2

Feb 11, 2023, 6:27 PM
143 points
23 comments · 7 min read · LW link
(clementneo.com)

The Practitioner’s Path 2.0: the Pragmatist Archetype

Evenflair · Feb 11, 2023, 3:48 PM
21 points
0 comments · 2 min read · LW link
(guildoftherose.org)

The Illusion of Simplicity: Monetary Policy as a Problem of Complexity and Alignment

Edward P. Könings · Feb 11, 2023, 3:04 PM
8 points
0 comments · 8 min read · LW link
(edwardknings.substack.com)

In Defense of Chatbot Romance

Kaj_Sotala · Feb 11, 2023, 2:30 PM
124 points
53 comments · 11 min read · LW link
(kajsotala.fi)

Threatening to do the impossible: A solution to spurious counterfactuals for functional decision theory via proof theory

Christopher King · Feb 11, 2023, 7:57 AM
5 points
4 comments · 5 min read · LW link

Rationality-related things I don’t know as of 2023

Adam Zerner · Feb 11, 2023, 6:04 AM
64 points
59 comments · 3 min read · LW link

A note on ‘semiotic physics’

metasemi · Feb 11, 2023, 5:12 AM
11 points
13 comments · 6 min read · LW link

Inequality Penalty: Morality in Many Worlds

Shmi · Feb 11, 2023, 4:08 AM
11 points
17 comments · 6 min read · LW link

The Importance of AI Alignment, explained in 5 points

Daniel_Eth · Feb 11, 2023, 2:56 AM
33 points
2 comments · LW link

Acting Normal is Good, Actually

Gordon Seidoh Worley · Feb 10, 2023, 11:35 PM
14 points
5 comments · 3 min read · LW link

[S] D&D.Sci: All the D8a. Allllllll of it.

aphyer · Feb 10, 2023, 9:14 PM
43 points
17 comments · 6 min read · LW link

A Different Kind of Ark: My failed attempt to build a bridge between universes

ChrisM · Feb 10, 2023, 8:49 PM
2 points
2 comments · 6 min read · LW link
(www.vesselproject.io)

Prizes for the 2021 Review

Raemon · Feb 10, 2023, 7:47 PM
69 points
2 comments · 4 min read · LW link

A proposed method for forecasting transformative AI

Matthew Barnett · Feb 10, 2023, 7:34 PM
121 points
21 comments · 10 min read · LW link

The best way so far to explain AI risk: The Precipice (p. 137-149)

trevor · Feb 10, 2023, 7:33 PM
53 points
2 comments · 17 min read · LW link

Is this a weak pivotal act: creating nanobots that eat evil AGIs (but nothing else)?

Christopher King · Feb 10, 2023, 7:26 PM
0 points
3 comments · 1 min read · LW link

Why I’m not working on {debate, RRM, ELK, natural abstractions}

Steven Byrnes · Feb 10, 2023, 7:22 PM
71 points
19 comments · 10 min read · LW link

Conditioning Predictive Models: Open problems, Conclusion, and Appendix

Feb 10, 2023, 7:21 PM
36 points
3 comments · 11 min read · LW link

Jobs that can help with the most important century

HoldenKarnofsky · Feb 10, 2023, 6:20 PM
24 points
0 comments · 19 min read · LW link
(www.cold-takes.com)

[Question] Is it a coincidence that GPT-3 requires roughly the same amount of compute as is necessary to emulate the human brain?

RomanS · Feb 10, 2023, 4:26 PM
11 points
10 comments · 1 min read · LW link

Contra: Changing Role Terms

jefftk · Feb 10, 2023, 3:00 PM
8 points
0 comments · 3 min read · LW link
(www.jefftk.com)

Cyborgism

Feb 10, 2023, 2:47 PM
332 points
46 comments · 35 min read · LW link · 2 reviews

FLI Podcast: Connor Leahy on AI Progress, Chimps, Memes, and Markets (Part 1/3)

Feb 10, 2023, 1:55 PM
39 points
0 comments · 43 min read · LW link

[Question] What’s actually going on in the “mind” of the model when we fine-tune GPT-3 to InstructGPT?

rpglover64 · Feb 10, 2023, 7:57 AM
18 points
3 comments · 1 min read · LW link

Mechanism Design for AI Safety—Agenda Creation Retreat

Rubi J. Hudson · Feb 10, 2023, 3:05 AM
24 points
2 comments · LW link

[Question] On utility functions

jodaru · Feb 10, 2023, 1:22 AM
11 points
10 comments · 1 min read · LW link

Security Mindset—Fire Alarms and Trigger Signatures

elspood · Feb 9, 2023, 9:15 PM
23 points
0 comments · 4 min read · LW link

Impostor syndrome: how to cure it with spreadsheets and meditation

KatWoods · Feb 9, 2023, 9:04 PM
31 points
2 comments · 19 min read · LW link

Conditioning Predictive Models: Deployment strategy

Feb 9, 2023, 8:59 PM
28 points
0 comments · 10 min read · LW link

Make Conflict of Interest Policies Public

jefftk · Feb 9, 2023, 7:30 PM
33 points
7 comments · 2 min read · LW link
(www.jefftk.com)

Curated blind auction prediction markets and a reputation system as an alternative to editorial review in news publication.

ciaran · Feb 9, 2023, 6:48 PM
2 points
0 comments · 2 min read · LW link

Tools for finding information on the internet

RomanHauksson · Feb 9, 2023, 5:05 PM
79 points
11 comments · 2 min read · LW link
(roman.computer)

Covid 2/9/23: Interferon λ

Zvi · Feb 9, 2023, 4:50 PM
48 points
8 comments · 12 min read · LW link
(thezvi.wordpress.com)

EIS II: What is “Interpretability”?

scasper · Feb 9, 2023, 4:48 PM
28 points
6 comments · 4 min read · LW link

The Engineer’s Interpretability Sequence (EIS) I: Intro

scasper · Feb 9, 2023, 4:28 PM
46 points
24 comments · 3 min read · LW link

[Question] Do the Safety Properties of Powerful AI Systems Need to be Adversarially Robust? Why?

DragonGod · Feb 9, 2023, 1:36 PM
22 points
42 comments · 2 min read · LW link

Which ML skills are useful for finding a new AIS research agenda?

Yonatan Cale · Feb 9, 2023, 1:09 PM
16 points
1 comment · 1 min read · LW link

When To Stop

Alok Singh · Feb 9, 2023, 9:10 AM
31 points
5 comments · 1 min read · LW link
(alok.github.io)

The Pervasive Illusion of Seeing the Complete World

Shmi · Feb 9, 2023, 6:47 AM
39 points
1 comment · 2 min read · LW link

Religion is Good, Actually

Gordon Seidoh Worley · Feb 9, 2023, 6:34 AM
−1 points
39 comments · 4 min read · LW link

Using PICT against PastaGPT Jailbreaking

Quentin FEUILLADE--MONTIXI · Feb 9, 2023, 4:30 AM
26 points
0 comments · 9 min read · LW link

Notes on the Mathematics of LLM Architectures

carboniferous_umbraculum · Feb 9, 2023, 1:45 AM
12 points
2 comments · 1 min read · LW link
(drive.google.com)

On Developing a Mathematical Theory of Interpretability

carboniferous_umbraculum · Feb 9, 2023, 1:45 AM
64 points
8 comments · 6 min read · LW link

Anomalous tokens reveal the original identities of Instruct models

Feb 9, 2023, 1:30 AM
140 points
16 comments · 9 min read · LW link
(generative.ink)

[Question] How would you use video gamey tech to help with AI safety?

porby · Feb 9, 2023, 12:20 AM
9 points
5 comments · 1 min read · LW link

A (EtA: quick) note on terminology: AI Alignment != AI x-safety

David Scott Krueger (formerly: capybaralet) · Feb 8, 2023, 10:33 PM
46 points
20 comments · 1 min read · LW link

GPT-175bee

Feb 8, 2023, 6:58 PM
122 points
14 comments · 1 min read · LW link

EigenKarma: trust at scale

Henrik Karlsson · Feb 8, 2023, 6:52 PM
186 points
52 comments · 5 min read · LW link