Using Consensus Mechanisms as an approach to Alignment

Prometheus Jun 10, 2023, 11:38 PM
11 points
2 comments 6 min read LW link

Humanities first math problem, The shallow gene pool.

archeon Jun 10, 2023, 11:09 PM
−2 points
0 comments 1 min read LW link

I can see how I am Dumb

Johannes C. Mayer Jun 10, 2023, 7:18 PM
46 points
11 comments 5 min read LW link

Ethodynamics of Omelas

dr_s Jun 10, 2023, 4:24 PM
83 points
18 comments 9 min read LW link 1 review

Dealing with UFO claims

ChristianKl Jun 10, 2023, 3:45 PM
3 points
32 comments 1 min read LW link

A Theory of Unsupervised Translation Motivated by Understanding Animal Communication

jsd Jun 10, 2023, 3:44 PM
19 points
0 comments 1 min read LW link
(arxiv.org)

[Question] What are brains?

Valentine Jun 10, 2023, 2:46 PM
10 points
22 comments 2 min read LW link

EY in the New York Times

Blueberry Jun 10, 2023, 12:21 PM
6 points
14 comments 1 min read LW link
(www.nytimes.com)

Goal-misgeneralization is ELK-hard

rokosbasilisk Jun 10, 2023, 9:32 AM
2 points
0 comments 1 min read LW link

[Question] What do beneficial TDT trades for humanity concretely look like?

Stephen Fowler Jun 10, 2023, 6:50 AM
4 points
0 comments 1 min read LW link

cloud seeding doesn’t work

bhauth Jun 10, 2023, 5:14 AM
7 points
2 comments 1 min read LW link

[FICTION] Unboxing Elysium: An AI’S Escape

Super AGI Jun 10, 2023, 4:41 AM
−16 points
4 comments 14 min read LW link

[FICTION] Prometheus Rising: The Emergence of an AI Consciousness

Super AGI Jun 10, 2023, 4:41 AM
−14 points
0 comments 9 min read LW link

formalizing the QACI alignment formal-goal

Jun 10, 2023, 3:28 AM
54 points
6 comments 13 min read LW link
(carado.moe)

Expert trap: Why is it happening? (Part 2 of 3) – how hindsight, hierarchy, and confirmation biases break conductivity and accuracy of knowledge

Paweł Sysiak Jun 9, 2023, 11:00 PM
3 points
0 comments 7 min read LW link

Expert trap: What is it? (Part 1 of 3) – how hindsight, hierarchy, and confirmation biases break conductivity and accuracy of knowledge

Paweł Sysiak Jun 9, 2023, 11:00 PM
6 points
2 comments 8 min read LW link

[Question] How accurate is data about past earth temperatures?

tailcalled Jun 9, 2023, 9:29 PM
10 points
2 comments 1 min read LW link

Proxi-Antipodes: A Geometrical Intuition For The Difficulty Of Aligning AI With Multitudinous Human Values

Matthew_Opitz Jun 9, 2023, 9:21 PM
7 points
0 comments 5 min read LW link

Why AI may not save the World

Alberto Zannoni Jun 9, 2023, 5:42 PM
0 points
0 comments 4 min read LW link
(a16z.com)

You can now listen to the “AI Safety Fundamentals” courses

PeterH Jun 9, 2023, 4:45 PM
6 points
0 comments 1 min read LW link
(forum.effectivealtruism.org)

Exploring Concept-Specific Slices in Weight Matrices for Network Interpretability

DuncanFowler Jun 9, 2023, 4:39 PM
1 point
0 comments 6 min read LW link

A plea for solutionism on AI safety

jasoncrawford Jun 9, 2023, 4:29 PM
72 points
6 comments 6 min read LW link
(rootsofprogress.org)

Michael Shellenberger: US Has 12 Or More Alien Spacecraft, Say Military And Intelligence Contractors

lc Jun 9, 2023, 4:11 PM
11 points
31 comments 3 min read LW link
(public.substack.com)

Improvement on MIRI’s Corrigibility

Jun 9, 2023, 4:10 PM
54 points
8 comments 13 min read LW link

D&D.Sci 5E: Return of the League of Defenders Evaluation & Ruleset

aphyer Jun 9, 2023, 3:25 PM
30 points
8 comments 6 min read LW link

InternLM—China’s Best (Unverified)

Lao Mein Jun 9, 2023, 7:39 AM
51 points
4 comments 1 min read LW link

[Question] Mark for follow up?

JNS Jun 9, 2023, 5:59 AM
5 points
4 comments 2 min read LW link

Bringing Little Kids to Contra Dances

jefftk Jun 9, 2023, 2:20 AM
22 points
0 comments 2 min read LW link
(www.jefftk.com)

[Question] (solved) how do i find others’ shortform posts?

kuira Jun 9, 2023, 2:15 AM
1 point
1 comment 1 min read LW link

[Question] AI Rights: In your view, what would be required for an AGI to gain rights and protections from the various Governments of the World?

Super AGI Jun 9, 2023, 1:24 AM
10 points
26 comments 1 min read LW link

A comparison of causal scrubbing, causal abstractions, and related methods

Jun 8, 2023, 11:40 PM
73 points
3 comments 22 min read LW link

Updates and Reflections on Optimal Exercise after Nearly a Decade

romeostevensit Jun 8, 2023, 11:02 PM
213 points
57 comments 2 min read LW link 1 review

Takeaways from the Mechanistic Interpretability Challenges

scasper Jun 8, 2023, 6:56 PM
94 points
5 comments 6 min read LW link

Leave an Emotional Line of Retreat

Johannes C. Mayer Jun 8, 2023, 6:36 PM
23 points
1 comment 1 min read LW link

Current AI harms are also sci-fi

Christopher King Jun 8, 2023, 5:49 PM
26 points
3 comments 1 min read LW link

Two Ways To Reduce Unhappiness That Comes From Distorted Views of Reality

Anne Hsu Jun 8, 2023, 5:43 PM
3 points
0 comments 7 min read LW link

Collaboration in Science: Happier People ↔ Better Research

nadinespy Jun 8, 2023, 5:42 PM
3 points
0 comments 32 min read LW link

Biomimetic alignment: Alignment between animal genes and animal brains as a model for alignment between humans and AI systems

geoffreymiller Jun 8, 2023, 4:05 PM
10 points
1 comment 16 min read LW link

A potentially high impact differential technological development area

Noosphere89 Jun 8, 2023, 2:33 PM
5 points
2 comments 2 min read LW link

[Question] Question for Prediction Market people: where is the money supposed to come from?

Robert_AIZI Jun 8, 2023, 1:58 PM
25 points
26 comments 1 min read LW link

AI #15: The Principle of Charity

Zvi Jun 8, 2023, 12:10 PM
73 points
16 comments 44 min read LW link
(thezvi.wordpress.com)

if you’re reading this it’s too late (a new theory on what is causing the Great Stagnation)

rogersbacon Jun 8, 2023, 11:49 AM
−10 points
2 comments 13 min read LW link
(www.secretorum.life)

[Linkpost] Scaling laws for language encoding models in fMRI

Bogdan Ionut Cirstea Jun 8, 2023, 10:52 AM
30 points
0 comments 1 min read LW link

Transformative AI is a process

meijer1973 Jun 8, 2023, 8:57 AM
2 points
0 comments 5 min read LW link

Crisis of Faith case study: beyond reductionism?

MalcolmOcean Jun 8, 2023, 6:11 AM
6 points
9 comments 19 min read LW link

I wrote this because of watermelon

Arti Jun 8, 2023, 3:55 AM
4 points
2 comments 1 min read LW link

Learning Transformer Programs [Linkpost]

aog Jun 8, 2023, 12:16 AM
7 points
0 comments 1 min read LW link
(arxiv.org)

What will GPT-2030 look like?

jsteinhardt Jun 7, 2023, 11:40 PM
185 points
43 comments 23 min read LW link
(bounded-regret.ghost.io)

Progress links and tweets, 2023-06-07

jasoncrawford Jun 7, 2023, 11:26 PM
11 points
0 comments 1 min read LW link
(rootsofprogress.org)

LEAst-squares Concept Erasure (LEACE)

tricky_labyrinth Jun 7, 2023, 9:51 PM
68 points
10 comments 1 min read LW link
(twitter.com)