A (EtA: quick) note on terminology: AI Alignment != AI x-safety

David Scott Krueger (formerly: capybaralet), 8 Feb 2023 22:33 UTC
46 points
20 comments, 1 min read, LW link

GPT-175bee

8 Feb 2023 18:58 UTC
119 points
13 comments, 1 min read, LW link

EigenKarma: trust at scale

Henrik Karlsson, 8 Feb 2023 18:52 UTC
182 points
50 comments, 5 min read, LW link

Conditioning Predictive Models: Interactions with other approaches

8 Feb 2023 18:19 UTC
32 points
2 comments, 11 min read, LW link

Wanted: Technical animator and/or front-end developer for interactive diagrams of invention

jasoncrawford, 8 Feb 2023 17:14 UTC
30 points
3 comments, 1 min read, LW link
(rootsofprogress.org)

A multi-disciplinary view on AI safety research

Roman Leventov, 8 Feb 2023 16:50 UTC
43 points
4 comments, 26 min read, LW link

Community building: Lessons from ten years of facilitation experience

Severin T. Seehrich, 8 Feb 2023 16:26 UTC
17 points
0 comments, 1 min read, LW link

Progress links and tweets, 2023-02-08

jasoncrawford, 8 Feb 2023 15:52 UTC
10 points
0 comments, 1 min read, LW link
(rootsofprogress.org)

A Particular Equilibrium

Algon, 8 Feb 2023 15:16 UTC
13 points
0 comments, 2 min read, LW link
(algon-33.github.io)

Self-Awareness (and possible mode collapse around it) in ChatGPT

Yitz, 8 Feb 2023 9:57 UTC
18 points
2 comments, 2 min read, LW link

Drugs are Sometimes Good, Actually

Gordon Seidoh Worley, 8 Feb 2023 2:24 UTC
12 points
8 comments, 4 min read, LW link

House Covid Infection Retrospective

jefftk, 8 Feb 2023 2:20 UTC
25 points
1 comment, 2 min read, LW link
(www.jefftk.com)

Noting an error in Inadequate Equilibria

Matthew Barnett, 8 Feb 2023 1:33 UTC
359 points
56 comments, 2 min read, LW link

Living Nomadically: My 80/20 Guide

KatWoods, 8 Feb 2023 1:31 UTC
35 points
18 comments, 1 min read, LW link

OpenAI/Microsoft announce “next generation language model” integrated into Bing/Edge

LawrenceC, 7 Feb 2023 20:38 UTC
79 points
4 comments, 1 min read, LW link
(blogs.microsoft.com)

How evals might (or might not) prevent catastrophic risks from AI

Akash, 7 Feb 2023 20:16 UTC
43 points
0 comments, 9 min read, LW link

Conditioning Predictive Models: Making inner alignment as easy as possible

7 Feb 2023 20:04 UTC
27 points
2 comments, 19 min read, LW link

On The Current Status Of AI Dating

Nikita Brancatisano, 7 Feb 2023 20:00 UTC
52 points
8 comments, 6 min read, LW link

Framing AI strategy

Zach Stein-Perlman, 7 Feb 2023 19:20 UTC
33 points
1 comment, 18 min read, LW link
(aiimpacts.org)

Review of AI Alignment Progress

PeterMcCluskey, 7 Feb 2023 18:57 UTC
72 points
32 comments, 7 min read, LW link
(bayesianinvestor.com)

The Economics of Contracts

Edward P. Könings, 7 Feb 2023 13:52 UTC
21 points
3 comments, 8 min read, LW link
(edwardknings.substack.com)

Two very different experiences with ChatGPT

Sherrinford, 7 Feb 2023 13:09 UTC
38 points
15 comments, 5 min read, LW link

[About Me] Cinera’s Home Page

DragonGod, 7 Feb 2023 12:56 UTC
30 points
2 comments, 9 min read, LW link

Stuff I Recommend You Use

Arjun Panickssery, 7 Feb 2023 12:18 UTC
16 points
2 comments, 2 min read, LW link
(arjunpanickssery.substack.com)

AXRP: Store, Patreon, Video

DanielFilan, 7 Feb 2023 4:50 UTC
12 points
0 comments, 1 min read, LW link

Duckbill Masks Are Great

jefftk, 7 Feb 2023 3:00 UTC
22 points
14 comments, 1 min read, LW link
(www.jefftk.com)

EA & LW Forum Weekly Summary (30th Jan - 5th Feb 2023)

Zoe Williams, 7 Feb 2023 2:13 UTC
3 points
3 comments, 1 min read, LW link

so you think you’re not qualified to do technical alignment research?

Tamsin Leake, 7 Feb 2023 1:54 UTC
55 points
7 comments, 1 min read, LW link
(carado.moe)

[ASoT] Policy Trajectory Visualization

Ulisse Mini, 7 Feb 2023 0:13 UTC
9 points
2 comments, 1 min read, LW link

English is a Terrible Programming Language—And other reasons AI won’t displace programmers

dawsoneliasen, 6 Feb 2023 22:12 UTC
26 points
8 comments, 8 min read, LW link
(orbistertius.substack.com)

African Wild Dogs Vote By Sneezing—Can AI Help Us Do Better?

Augmented Assembly, 6 Feb 2023 21:09 UTC
10 points
6 comments, 4 min read, LW link

In defense of the MBTI

ZZZZZZ, 6 Feb 2023 21:08 UTC
−13 points
22 comments, 4 min read, LW link

Early situational awareness and its implications, a story

Jacob Pfau, 6 Feb 2023 20:45 UTC
29 points
6 comments, 3 min read, LW link

Conditioning Predictive Models: The case for competitiveness

6 Feb 2023 20:08 UTC
20 points
3 comments, 11 min read, LW link

Google announces ‘Bard’ powered by LaMDA

M. Y. Zuo, 6 Feb 2023 19:40 UTC
31 points
3 comments, 2 min read, LW link

SolidGoldMagikarp II: technical details and more recent findings

6 Feb 2023 19:09 UTC
109 points
45 comments, 13 min read, LW link

Addendum: More Efficient FFNs via Attention

Robert_AIZI, 6 Feb 2023 18:55 UTC
10 points
2 comments, 5 min read, LW link
(aizi.substack.com)

Here’s Why I’m Hesitant To Respond In More Depth

DirectedEvolution, 6 Feb 2023 18:36 UTC
63 points
7 comments, 4 min read, LW link

Childhoods of exceptional people

Henrik Karlsson, 6 Feb 2023 17:27 UTC
324 points
62 comments, 15 min read, LW link
(escapingflatland.substack.com)

Foodpairing and Embeddings

jurabrazdil, 6 Feb 2023 15:09 UTC
13 points
2 comments, 5 min read, LW link

Monthly Roundup #3

Zvi, 6 Feb 2023 13:00 UTC
41 points
9 comments, 27 min read, LW link
(thezvi.wordpress.com)

Project Idea: Lots of Cause-area-specific Online Unconferences

Linda Linsefors, 6 Feb 2023 11:05 UTC
27 points
1 comment, 1 min read, LW link

Oxford Essay Writing

6 Feb 2023 8:24 UTC
5 points
0 comments, 1 min read, LW link

Decision Transformer Interpretability

6 Feb 2023 7:29 UTC
84 points
13 comments, 24 min read, LW link

Why is Everyone So Boring? By Robin Hanson

trevor, 6 Feb 2023 4:17 UTC
54 points
10 comments, 1 min read, LW link
(www.overcomingbias.com)

Gradient surfing: the hidden role of regularization

Jesse Hoogland, 6 Feb 2023 3:50 UTC
35 points
6 comments, 14 min read, LW link
(www.jessehoogland.com)

Why Are Bacteria So Simple?

aysja, 6 Feb 2023 3:00 UTC
171 points
33 comments, 10 min read, LW link

The Law of Identity

Chris_Leong, 6 Feb 2023 2:59 UTC
5 points
5 comments, 4 min read, LW link

Robin Hanson on “Explaining the Sacred”

Raemon, 6 Feb 2023 0:50 UTC
13 points
4 comments, 3 min read, LW link
(www.overcomingbias.com)

Interview Daniel Murfet on Universal Phenomena in Learning Machines

Alexander Gietelink Oldenziel, 6 Feb 2023 0:00 UTC
44 points
1 comment, 16 min read, LW link