Notes on Love

David Gross · 13 Jul 2022 23:35 UTC
17 points
3 comments · 29 min read · LW link

Deep learning curriculum for large language model alignment

Jacob_Hilton · 13 Jul 2022 21:58 UTC
57 points
3 comments · 1 min read · LW link
(github.com)

Artificial Sandwiching: When can we test scalable alignment protocols without humans?

Sam Bowman · 13 Jul 2022 21:14 UTC
41 points
6 comments · 5 min read · LW link

[Question] Any tips for eliciting one’s own latent knowledge?

MSRayne · 13 Jul 2022 21:12 UTC
16 points
20 comments · 2 min read · LW link

Goal Alignment Is Robust To the Sharp Left Turn

Thane Ruthenis · 13 Jul 2022 20:23 UTC
47 points
16 comments · 4 min read · LW link

Making decisions using multiple worldviews

Richard_Ngo · 13 Jul 2022 19:15 UTC
50 points
10 comments · 11 min read · LW link

[Question] App idea to help with reading STEM textbooks (feedback request)

DirectedEvolution · 13 Jul 2022 18:28 UTC
16 points
8 comments · 2 min read · LW link

MIRI Conversations: Technology Forecasting & Gradualism (Distillation)

CallumMcDougall · 13 Jul 2022 15:55 UTC
31 points
1 comment · 20 min read · LW link

Passing Up Pay

jefftk · 13 Jul 2022 14:10 UTC
29 points
8 comments · 5 min read · LW link
(www.jefftk.com)

[Question] How could the universe be infinitely large?

amarai · 13 Jul 2022 13:45 UTC
0 points
8 comments · 1 min read · LW link

John von Neumann on how to safely progress with technology

Dalton Mabery · 13 Jul 2022 11:07 UTC
14 points
0 comments · 1 min read · LW link

Everyone is an Imposter

Tharin · 13 Jul 2022 8:46 UTC
19 points
1 comment · 9 min read · LW link
(echoesandchimes.com)

[Question] Which AI Safety research agendas are the most promising?

Chris_Leong · 13 Jul 2022 7:54 UTC
27 points
5 comments · 1 min read · LW link

Straw-Steelmanning

Chris van Merwijk · 13 Jul 2022 5:48 UTC
30 points
2 comments · 1 min read · LW link

Alien Message Contest: Solution

DaemonicSigil · 13 Jul 2022 4:07 UTC
29 points
2 comments · 4 min read · LW link

[Question] What is wrong with this approach to corrigibility?

Rafael Cosman · 12 Jul 2022 22:55 UTC
7 points
8 comments · 1 min read · LW link

Acceptability Verification: A Research Agenda

12 Jul 2022 20:11 UTC
50 points
0 comments · 1 min read · LW link
(docs.google.com)

Progress links and tweets, 2022-07-12

jasoncrawford · 12 Jul 2022 15:30 UTC
12 points
0 comments · 1 min read · LW link
(rootsofprogress.org)

Response to Blake Richards: AGI, generality, alignment, & loss functions

Steven Byrnes · 12 Jul 2022 13:56 UTC
62 points
9 comments · 15 min read · LW link

Three Minimum Pivotal Acts Possible by Narrow AI

Michael Soareverix · 12 Jul 2022 9:51 UTC
0 points
4 comments · 2 min read · LW link

Mosaic and Palimpsests: Two Shapes of Research

adamShimi · 12 Jul 2022 9:05 UTC
39 points
3 comments · 9 min read · LW link

[Question] How do you concisely communicate & navigate the politics / culture at your job working at a large corporation or institution?

Willa · 12 Jul 2022 3:22 UTC
10 points
6 comments · 1 min read · LW link

On how various plans miss the hard bits of the alignment challenge

So8res · 12 Jul 2022 2:49 UTC
302 points
88 comments · 29 min read · LW link · 3 reviews

Rainmaking

WalterL · 12 Jul 2022 0:42 UTC
25 points
5 comments · 1 min read · LW link
(www.youtube.com)

Book Review: Neal Stephenson’s “Termination Shock”

Tyler Simmons · 12 Jul 2022 0:07 UTC
13 points
0 comments · 30 min read · LW link
(www.words-and-dirt.com)

Announcing Future Forum—Apply Now

11 Jul 2022 22:57 UTC
8 points
0 comments · 4 min read · LW link
(forum.effectivealtruism.org)

Defining Optimization in a Deeper Way Part 2

J Bostock · 11 Jul 2022 20:29 UTC
7 points
0 comments · 4 min read · LW link

Marriage, the Giving What We Can Pledge, and the damage caused by vague public commitments

Jeffrey Ladish · 11 Jul 2022 19:38 UTC
98 points
27 comments · 6 min read · LW link · 1 review

Systemization

CFAR!Duncan · 11 Jul 2022 18:39 UTC
40 points
5 comments · 12 min read · LW link

How Can I Maximize My Happiness?

UtilityMonster · 11 Jul 2022 17:40 UTC
6 points
2 comments · 6 min read · LW link

[Question] How do AI timelines affect how you live your life?

Quadratic Reciprocity · 11 Jul 2022 13:54 UTC
80 points
50 comments · 1 min read · LW link

Cambridge LW Meetup: Free Speech

Darmani · 11 Jul 2022 4:36 UTC
7 points
0 comments · 1 min read · LW link

Checksum Sensor Alignment

lsusr · 11 Jul 2022 3:31 UTC
12 points
2 comments · 1 min read · LW link

The Alignment Problem

lsusr · 11 Jul 2022 3:03 UTC
46 points
18 comments · 3 min read · LW link

Immanuel Kant and the Decision Theory App Store

Daniel Kokotajlo · 10 Jul 2022 16:04 UTC
88 points
12 comments · 5 min read · LW link

Metaculus is seeking experienced leaders, researchers & operators for high-impact roles

ChristianWilliams · 10 Jul 2022 14:27 UTC
9 points
0 comments · 1 min read · LW link
(apply.workable.com)

Avoid the abbreviation “FLOPs” – use “FLOP” or “FLOP/s” instead

Daniel_Eth · 10 Jul 2022 10:44 UTC
69 points
13 comments · 1 min read · LW link

My Opportunity Costs

abstractapplic · 10 Jul 2022 10:14 UTC
21 points
3 comments · 3 min read · LW link

Why Portland

Adam Zerner · 10 Jul 2022 7:20 UTC
25 points
18 comments · 9 min read · LW link

Hessian and Basin volume

Vivek Hebbar · 10 Jul 2022 6:59 UTC
35 points
10 comments · 4 min read · LW link

Taste & Shaping

CFAR!Duncan · 10 Jul 2022 5:50 UTC
64 points
1 comment · 16 min read · LW link

Comment on “Propositions Concerning Digital Minds and Society”

Zack_M_Davis · 10 Jul 2022 5:48 UTC
99 points
12 comments · 8 min read · LW link

Heaven: The last part of dystopia

Existism · 9 Jul 2022 22:36 UTC
−1 points
1 comment · 6 min read · LW link

Hope Can = Heaven

Existism · 9 Jul 2022 22:35 UTC
−2 points
0 comments · 3 min read · LW link

Report from a civilizational observer on Earth

owencb · 9 Jul 2022 17:26 UTC
49 points
12 comments · 6 min read · LW link

Grouped Loss may disfavor discontinuous capabilities

Adam Jermyn · 9 Jul 2022 17:22 UTC
14 points
2 comments · 4 min read · LW link

Train first VS prune first in neural networks.

Donald Hobson · 9 Jul 2022 15:53 UTC
20 points
5 comments · 2 min read · LW link

Visualizing Neural networks, how to blame the bias

Donald Hobson · 9 Jul 2022 15:52 UTC
7 points
1 comment · 6 min read · LW link

Using Ngram to estimate depression prevalence over time

David Gross · 9 Jul 2022 14:57 UTC
10 points
3 comments · 2 min read · LW link
(www.pnas.org)

Making it harder for an AGI to “trick” us, with STVs

Tor Økland Barstad · 9 Jul 2022 14:42 UTC
15 points
5 comments · 22 min read · LW link