Why it’s necessary to shoot yourself in the foot

Jacob G-W · Jul 11, 2023, 9:17 PM
39 points
7 comments · 2 min read · LW link
(g-w1.github.io)

How do low level hypotheses constrain high level ones? The mystery of the disappearing diamond.

Christopher King · Jul 11, 2023, 7:27 PM
17 points
11 comments · 2 min read · LW link

[Question] Do we automatically accept propositions?

Aaron Graifman · Jul 11, 2023, 5:45 PM
10 points
5 comments · 1 min read · LW link

fMRI LIKE APPROACH TO AI ALIGNMENT / DECEPTIVE BEHAVIOUR

Escaque 66 · Jul 11, 2023, 5:17 PM
−1 points
3 comments · 2 min read · LW link

Introducing Fatebook: the fastest way to make and track predictions

Jul 11, 2023, 3:28 PM
132 points
41 comments · 1 min read · LW link · 2 reviews
(fatebook.io)

My Weirdest Experience

Bridgett Kay · Jul 11, 2023, 2:44 PM
38 points
19 comments · 1 min read · LW link
(dxmrevealed.wordpress.com)

Announcing The Roots of Progress Blog-Building Intensive

jasoncrawford · Jul 11, 2023, 2:04 PM
10 points
0 comments · 1 min read · LW link
(rootsofprogress.org)

OpenAI Launches Superalignment Taskforce

Zvi · Jul 11, 2023, 1:00 PM
150 points
40 comments · 49 min read · LW link
(thezvi.wordpress.com)

Critiquing Risks From Learned Optimization, and Avoiding Cached Theories

ProofBySonnet · Jul 11, 2023, 11:38 AM
1 point
0 comments · 6 min read · LW link

[UPDATE: deadline extended to July 24!] New wind in rationality’s sails: Applications for Epistea Residency 2023 are now open

Jul 11, 2023, 11:02 AM
80 points
7 comments · 3 min read · LW link

Two Hot Takes about Quine

Charlie Steiner · Jul 11, 2023, 6:42 AM
17 points
0 comments · 2 min read · LW link

Disincentivizing deception in mesa optimizers with Model Tampering

martinkunev · Jul 11, 2023, 12:44 AM
3 points
0 comments · 2 min read · LW link

Drawn Out: a story

Richard_Ngo · Jul 11, 2023, 12:08 AM
80 points
2 comments · 8 min read · LW link

Definitions are about efficiency and consistency with common language.

Nacruno96 · Jul 10, 2023, 11:46 PM
1 point
0 comments · 4 min read · LW link

Reframing Evolution - An information wavefront traveling through time

Joshua Clancy · Jul 10, 2023, 10:36 PM
1 point
0 comments · 5 min read · LW link
(midflip.org)

GPT-7: The Tale of the Big Computer (An Experimental Story)

Justin Bullock · Jul 10, 2023, 8:22 PM
4 points
4 comments · 5 min read · LW link

Cost-effectiveness of professional field-building programs for AI safety research

Dan H · Jul 10, 2023, 6:28 PM
8 points
5 comments · LW link

Cost-effectiveness of student programs for AI safety research

Dan H · Jul 10, 2023, 6:28 PM
15 points
2 comments · LW link

Modeling the impact of AI safety field-building programs

Dan H · Jul 10, 2023, 6:27 PM
21 points
0 comments · LW link

I think Michael Bailey’s dismissal of my autogynephilia questions for Scott Alexander and Aella makes very little sense

tailcalled · Jul 10, 2023, 5:39 PM
46 points
45 comments · 2 min read · LW link

Incentives from a causal perspective

Jul 10, 2023, 5:16 PM
27 points
0 comments · 6 min read · LW link

Is the Endowment Effect Due to Incomparability?

Kevin Dorst · Jul 10, 2023, 4:26 PM
21 points
10 comments · 7 min read · LW link
(kevindorst.substack.com)

Frontier AI Regulation

Zach Stein-Perlman · Jul 10, 2023, 2:30 PM
21 points
4 comments · 8 min read · LW link
(arxiv.org)

Why is it so hard to change people’s minds? Well, imagine if it wasn’t...

Celarix · Jul 10, 2023, 1:55 PM
6 points
9 comments · 6 min read · LW link

Consider Joining the UK Foundation Model Taskforce

Zvi · Jul 10, 2023, 1:50 PM
105 points
12 comments · 1 min read · LW link
(thezvi.wordpress.com)

“Reframing Superintelligence” + LLMs + 4 years

Eric Drexler · Jul 10, 2023, 1:42 PM
118 points
9 comments · 12 min read · LW link

Open-minded updatelessness

Jul 10, 2023, 11:08 AM
66 points
21 comments · 12 min read · LW link

Consciousness as a conflationary alliance term for intrinsically valued internal experiences

Andrew_Critch · Jul 10, 2023, 8:09 AM
214 points
54 comments · 11 min read · LW link · 2 reviews

The world where LLMs are possible

Ape in the coat · Jul 10, 2023, 8:00 AM
20 points
10 comments · 3 min read · LW link

The virtue of determination

Richard_Ngo · Jul 10, 2023, 5:11 AM
65 points
5 comments · 4 min read · LW link

Some reasons to not say “Doomer”

Ruby · Jul 9, 2023, 9:05 PM
46 points
18 comments · 4 min read · LW link

The Seeker’s Game – Vignettes from the Bay

Yulia · Jul 9, 2023, 7:32 PM
141 points
19 comments · 16 min read · LW link

[Question] Why have exposure notification apps been (mostly) discontinued?

VipulNaik · Jul 9, 2023, 7:07 PM
10 points
5 comments · 2 min read · LW link

[Question] The Necessity of Privacy: A Condition for Social Change and Experimentation?

Blake · Jul 9, 2023, 6:42 PM
−8 points
1 comment · 1 min read · LW link

Attempting to Deconstruct “Real”

herschel · Jul 9, 2023, 4:40 PM
21 points
23 comments · 2 min read · LW link

Quick proposal: Decision market regrantor using manifund (please improve)

Nathan Young · Jul 9, 2023, 12:49 PM
10 points
5 comments · 5 min read · LW link

[Question] Where are the people building AGI in the non-dumb way?

Johannes C. Mayer · Jul 9, 2023, 11:39 AM
10 points
19 comments · 2 min read · LW link

[Question] What to read on the “informal multi-world model”?

mishka · Jul 9, 2023, 4:48 AM
13 points
23 comments · 1 min read · LW link

Whether LLMs “understand” anything is mostly a terminological dispute

RobertM · Jul 9, 2023, 3:31 AM
10 points
1 comment · 1 min read · LW link

Taboo Truth

Tomás B. · Jul 8, 2023, 11:23 PM
36 points
16 comments · 2 min read · LW link

“View”

herschel · Jul 8, 2023, 11:19 PM
6 points
0 comments · 2 min read · LW link

[Question] H5N1. Just how bad is the situation?

Q Home · Jul 8, 2023, 10:09 PM
16 points
8 comments · 1 min read · LW link

A Two-Part System for Practical Self-Care

Jonathan Moregård · Jul 8, 2023, 9:23 PM
11 points
0 comments · 3 min read · LW link
(honestliving.substack.com)

Really Strong Features Found in Residual Stream

Logan Riggs · Jul 8, 2023, 7:40 PM
69 points
6 comments · 2 min read · LW link

Eight Strategies for Tackling the Hard Part of the Alignment Problem

scasper · Jul 8, 2023, 6:55 PM
42 points
11 comments · 7 min read · LW link

“Concepts of Agency in Biology” (Okasha, 2023) - Brief Paper Summary

Nora_Ammann · Jul 8, 2023, 6:22 PM
40 points
3 comments · 7 min read · LW link

Blanchard’s Dangerous Idea and the Plight of the Lucid Crossdreamer

Zack_M_Davis · Jul 8, 2023, 6:03 PM
38 points
135 comments · 72 min read · LW link
(unremediatedgender.space)

Continuous Adversarial Quality Assurance: Extending RLHF and Constitutional AI

Benaya Koren · Jul 8, 2023, 5:32 PM
6 points
0 comments · 9 min read · LW link

Commentless downvoting is not a good way to fight infohazards

DirectedEvolution · Jul 8, 2023, 5:29 PM
6 points
9 comments · 3 min read · LW link

[Question] Why does anxiety (?) make me dumb?

TeaTieAndHat · Jul 8, 2023, 4:13 PM
18 points
14 comments · 3 min read · LW link