Get an Electric Toothbrush.

Cervera · Jan 5, 2023, 9:08 PM
21 points
4 comments · 1 min read · LW link

Discursive Competence in ChatGPT, Part 1: Talking with Dragons

Bill Benzon · Jan 5, 2023, 9:01 PM
2 points
0 comments · 6 min read · LW link

Transformative AI issues (not just misalignment): an overview

HoldenKarnofsky · Jan 5, 2023, 8:20 PM
34 points
6 comments · 18 min read · LW link
(www.cold-takes.com)

How to slow down scientific progress, according to Leo Szilard

jasoncrawford · Jan 5, 2023, 6:26 PM
134 points
18 comments · 2 min read · LW link
(rootsofprogress.org)

Paper: Superposition, Memorization, and Double Descent (Anthropic)

LawrenceC · Jan 5, 2023, 5:54 PM
53 points
11 comments · 1 min read · LW link
(transformer-circuits.pub)

Collapse Might Not Be Desirable

Dzoldzaya · Jan 5, 2023, 5:29 PM
−2 points
9 comments · 2 min read · LW link

Singapore—Small casual dinner in Chinatown #6

Joe Rocca · Jan 5, 2023, 5:00 PM
2 points
1 comment · 1 min read · LW link

[Question] Image generation and alignment

rpglover64 · Jan 5, 2023, 4:05 PM
3 points
3 comments · 1 min read · LW link

[Question] Machine Learning vs Differential Privacy

Ilio · Jan 5, 2023, 3:14 PM
10 points
10 comments · 1 min read · LW link

Covid 1/5/23: Various XBB Takes

Zvi · Jan 5, 2023, 2:20 PM
21 points
18 comments · 15 min read · LW link
(thezvi.wordpress.com)

Running by Default

jefftk · Jan 5, 2023, 1:50 PM
113 points
40 comments · 1 min read · LW link
(www.jefftk.com)

PSA: reward is part of the habit loop too

Alok Singh · Jan 5, 2023, 11:00 AM
22 points
2 comments · 1 min read · LW link
(alok.github.io)

Infohazards vs Fork Hazards

jimrandomh · Jan 5, 2023, 9:45 AM
68 points
16 comments · 1 min read · LW link

Monthly Shorts 12/22

Celer · Jan 5, 2023, 7:20 AM
5 points
2 comments · 1 min read · LW link
(keller.substack.com)

The 2021 Review Phase

Raemon · Jan 5, 2023, 7:12 AM
34 points
7 comments · 3 min read · LW link

Illusion of truth effect and Ambiguity effect: Bias in Evaluating AGI X-Risks

Remmelt · Jan 5, 2023, 4:05 AM
−13 points
2 comments · LW link

When you plan according to your AI timelines, should you put more weight on the median future, or the median future | eventual AI alignment success? ⚖️

Jeffrey Ladish · Jan 5, 2023, 1:21 AM
25 points
10 comments · 2 min read · LW link

Why I’m joining Anthropic

evhub · Jan 5, 2023, 1:12 AM
118 points
4 comments · 2 min read · LW link

Contra Common Knowledge

abramdemski · Jan 4, 2023, 10:50 PM
52 points
31 comments · 16 min read · LW link

Additional space complexity isn’t always a useful metric

Brendan Long · Jan 4, 2023, 9:53 PM
4 points
3 comments · 3 min read · LW link
(www.brendanlong.com)

List of links for getting into AI safety

zef · Jan 4, 2023, 7:45 PM
6 points
0 comments · 1 min read · LW link

Opening Facebook Links Externally

jefftk · Jan 4, 2023, 7:00 PM
12 points
3 comments · 1 min read · LW link
(www.jefftk.com)

Conversational canyons

Henrik Karlsson · Jan 4, 2023, 6:55 PM
59 points
4 comments · 7 min read · LW link
(escapingflatland.substack.com)

Progress links and tweets, 2023-01-04

jasoncrawford · Jan 4, 2023, 6:23 PM
15 points
0 comments · 1 min read · LW link
(rootsofprogress.org)

200 COP in MI: Analysing Training Dynamics

Neel Nanda · Jan 4, 2023, 4:08 PM
16 points
0 comments · 14 min read · LW link

What’s up with ChatGPT and the Turing Test?

Jan 4, 2023, 3:37 PM
13 points
19 comments · 3 min read · LW link

2022 was the year AGI arrived (Just don’t call it that)

Logan Zoellner · Jan 4, 2023, 3:19 PM
102 points
60 comments · 3 min read · LW link

From Simon’s ant to machine learning, a parable

Bill Benzon · Jan 4, 2023, 2:37 PM
6 points
5 comments · 2 min read · LW link

Basic Facts about Language Model Internals

Jan 4, 2023, 1:01 PM
130 points
19 comments · 9 min read · LW link

Ritual as the only tool for overwriting values and goals

mrcbarbier · Jan 4, 2023, 11:11 AM
41 points
24 comments · 32 min read · LW link

Normalcy bias and Base rate neglect: Bias in Evaluating AGI X-Risks

Remmelt · Jan 4, 2023, 3:16 AM
−16 points
0 comments · LW link

Causal representation learning as a technique to prevent goal misgeneralization

PabloAMC · Jan 4, 2023, 12:07 AM
21 points
0 comments · 8 min read · LW link

What makes a probability question “well-defined”? (Part II: Bertrand’s Paradox)

Noah Topper · Jan 3, 2023, 10:39 PM
7 points
3 comments · 9 min read · LW link
(naivebayes.substack.com)

“AI” is an indexical

TW123 · Jan 3, 2023, 10:00 PM
10 points
0 comments · 6 min read · LW link
(aiwatchtower.substack.com)

An ML interpretation of Shard Theory

beren · Jan 3, 2023, 8:30 PM
39 points
5 comments · 4 min read · LW link

Talking to God

abramdemski · Jan 3, 2023, 8:14 PM
30 points
7 comments · 2 min read · LW link

My Advice for Incoming SERI MATS Scholars

Johannes C. Mayer · Jan 3, 2023, 7:25 PM
58 points
6 comments · 4 min read · LW link

Touch reality as soon as possible (when doing machine learning research)

LawrenceC · Jan 3, 2023, 7:11 PM
117 points
9 comments · 8 min read · LW link · 1 review

Kolb’s: an approach to consciously get better at anything

jacquesthibs · Jan 3, 2023, 6:16 PM
12 points
1 comment · 6 min read · LW link

[Question] {M|Im|Am}oral Mazes—any large-scale counterexamples?

Dagon · Jan 3, 2023, 4:43 PM
24 points
4 comments · 1 min read · LW link

Effectively self-studying over the Internet

libai · Jan 3, 2023, 4:23 PM
11 points
1 comment · 4 min read · LW link

Set-like mathematics in type theory

Thomas Kehrenberg · Jan 3, 2023, 2:33 PM
5 points
1 comment · 13 min read · LW link

Monthly Roundup #2

Zvi · Jan 3, 2023, 12:50 PM
23 points
3 comments · 23 min read · LW link
(thezvi.wordpress.com)

Whisper’s Wild Implications

Ollie J · Jan 3, 2023, 12:17 PM UTC
19 points
6 comments · 5 min read · LW link

How to eat potato chips while typing

KatjaGrace · Jan 3, 2023, 11:50 AM UTC
45 points
12 comments · 1 min read · LW link
(worldspiritsockpuppet.com)

[Question] I have thousands of copies of HPMOR in Russian. How to use them with the most impact?

Mikhail Samin · Jan 3, 2023, 10:21 AM UTC
26 points
3 comments · 1 min read · LW link

Is recursive self-alignment possible?

No77e · Jan 3, 2023, 9:15 AM UTC
5 points
5 comments · 1 min read · LW link

On the naturalistic study of the linguistic behavior of artificial intelligence

Bill Benzon · Jan 3, 2023, 9:06 AM UTC
1 point
0 comments · 4 min read · LW link

SF Severe Weather Warning

stavros · Jan 3, 2023, 6:04 AM UTC
3 points
3 comments · 1 min read · LW link
(news.ycombinator.com)

Status quo bias; System justification: Bias in Evaluating AGI X-Risks

Jan 3, 2023, 2:50 AM UTC
−11 points
0 comments · 1 min read · LW link