Counting arguments provide no evidence for AI doom

Feb 27, 2024, 11:03 PM
101 points
188 comments · 14 min read

Which animals realize which types of subjective welfare?

MichaelStJules · Feb 27, 2024, 7:31 PM
4 points
0 comments

Biosecurity and AI: Risks and Opportunities

Steve Newman · Feb 27, 2024, 6:45 PM
11 points
1 comment · 7 min read
(www.safe.ai)

The Gemini Incident Continues

Zvi · Feb 27, 2024, 4:00 PM
45 points
6 comments · 48 min read
(thezvi.wordpress.com)

How I internalized my achievements to better deal with negative feelings

Raymond Koopmanschap · Feb 27, 2024, 3:10 PM
42 points
7 comments · 6 min read

On Frustration and Regret

silentbob · Feb 27, 2024, 12:19 PM
8 points
0 comments · 4 min read

San Francisco ACX Meetup “Third Saturday”

Feb 27, 2024, 7:07 AM
7 points
0 comments · 1 min read

Examining Language Model Performance with Reconstructed Activations using Sparse Autoencoders

Feb 27, 2024, 2:43 AM
43 points
16 comments · 15 min read

Project idea: an iterated prisoner’s dilemma competition/game

Adam Zerner · Feb 26, 2024, 11:06 PM
8 points
0 comments · 5 min read

Acting Wholesomely

owencb · Feb 26, 2024, 9:49 PM
59 points
64 comments

Getting rational now or later: navigating procrastination and time-inconsistent preferences for new rationalists

milo_thoughts · Feb 26, 2024, 7:38 PM
1 point
0 comments · 8 min read

[Question] Whom Do You Trust?

JackOfAllTrades · Feb 26, 2024, 7:38 PM
1 point
0 comments · 1 min read

Boundary Violations vs Boundary Dissolution

Chipmonk · Feb 26, 2024, 6:59 PM
8 points
4 comments · 1 min read

[Question] Can we get an AI to “do our alignment homework for us”?

Chris_Leong · Feb 26, 2024, 7:56 AM
53 points
33 comments · 1 min read

How I build and run behavioral interviews

benkuhn · Feb 26, 2024, 5:50 AM
32 points
6 comments · 4 min read
(www.benkuhn.net)

Hidden Cognition Detection Methods and Benchmarks

Paul Colognese · Feb 26, 2024, 5:31 AM
22 points
11 comments · 4 min read

Cellular respiration as a steam engine

dkl9 · Feb 25, 2024, 8:17 PM
24 points
1 comment · 1 min read
(dkl9.net)

[Question] Rationalism and Dependent Origination?

Baometrus · Feb 25, 2024, 6:16 PM
2 points
3 comments · 1 min read

China-AI forecasts

NathanBarnard · Feb 25, 2024, 4:49 PM
40 points
29 comments · 6 min read

Ideological Bayesians

Kevin Dorst · Feb 25, 2024, 2:17 PM
96 points
4 comments · 10 min read
(kevindorst.substack.com)

Deconfusing In-Context Learning

Arjun Panickssery · Feb 25, 2024, 9:48 AM
37 points
1 comment · 2 min read

Everett branches, inter-light cone trade and other alien matters: Appendix to “An ECL explainer”

Feb 24, 2024, 11:09 PM
17 points
0 comments

Cooperating with aliens and AGIs: An ECL explainer

Feb 24, 2024, 10:58 PM
55 points
8 comments

Choosing My Quest (Part 2 of “The Sense Of Physical Necessity”)

LoganStrohl · Feb 24, 2024, 9:31 PM
40 points
7 comments · 12 min read

Rationality Research Report: Towards 10x OODA Looping?

Raemon · Feb 24, 2024, 9:06 PM
117 points
25 comments · 15 min read

Let’s ask some of the largest LLMs for tips and ideas on how to take over the world

Super AGI · Feb 24, 2024, 8:35 PM
1 point
0 comments · 7 min read

Exercise: Planmaking, Surprise Anticipation, and “Baba is You”

Raemon · Feb 24, 2024, 8:33 PM
67 points
31 comments · 6 min read

In search of God.

Spiritus Dei · Feb 24, 2024, 6:59 PM
−19 points
3 comments · 7 min read

Impossibility of Anthropocentric-Alignment

False Name · Feb 24, 2024, 6:31 PM
−8 points
2 comments · 39 min read

The Inner Alignment Problem

Jakub Halmeš · Feb 24, 2024, 5:55 PM
1 point
1 comment · 3 min read
(jakubhalmes.substack.com)

We Need Major, But Not Radical, FDA Reform

Maxwell Tabarrok · Feb 24, 2024, 4:54 PM
42 points
12 comments · 7 min read
(www.maximum-progress.com)

After Overmorrow: Scattered Musings on the Immediate Post-AGI World

Yuli_Ban · Feb 24, 2024, 3:49 PM
−3 points
0 comments · 26 min read

[Question] CDT vs. EDT on Deterrence

Terence Coelho · Feb 24, 2024, 3:41 PM
1 point
9 comments · 1 min read

Balancing Games

jefftk · Feb 24, 2024, 2:40 PM
62 points
18 comments · 1 min read
(www.jefftk.com)

How well do truth probes generalise?

mishajw · Feb 24, 2024, 2:12 PM
93 points
11 comments · 9 min read

Rawls’s Veil of Ignorance Doesn’t Make Any Sense

Arjun Panickssery · Feb 24, 2024, 1:18 PM
10 points
9 comments · 1 min read

[Question] Can someone explain to me what went wrong with ChatGPT?

Valentin Baltadzhiev · Feb 24, 2024, 11:50 AM
9 points
1 comment · 1 min read

The Sense Of Physical Necessity: A Naturalism Demo (Introduction)

LoganStrohl · Feb 24, 2024, 2:56 AM
59 points
1 comment · 6 min read

Instrumental deception and manipulation in LLMs—a case study

Olli Järviniemi · Feb 24, 2024, 2:07 AM
39 points
13 comments · 12 min read

A starting point for making sense of task structure (in machine learning)

Feb 24, 2024, 1:51 AM
45 points
2 comments · 12 min read

Why you, personally, should want a larger human population

jasoncrawford · Feb 23, 2024, 7:48 PM
32 points
32 comments · 5 min read
(rootsofprogress.org)

Deliberative Cognitive Algorithms as Scaffolding

Cole Wyeth · Feb 23, 2024, 5:15 PM
20 points
4 comments · 3 min read

The Shutdown Problem: Incomplete Preferences as a Solution

EJT · Feb 23, 2024, 4:01 PM
53 points
33 comments · 42 min read

In set theory, everything is a set

Jacob G-W · Feb 23, 2024, 2:35 PM
11 points
9 comments · 2 min read

The role of philosophical thinking in understanding large language models: Calibrating and closing the gap between first-person experience and underlying mechanisms

Bill Benzon · Feb 23, 2024, 12:19 PM
4 points
0 comments · 10 min read

Deep and obvious points in the gap between your thoughts and your pictures of thought

KatjaGrace · Feb 23, 2024, 7:30 AM
42 points
6 comments · 1 min read
(worldspiritsockpuppet.com)

Parasocial relationship logic

KatjaGrace · Feb 23, 2024, 7:30 AM
20 points
1 comment · 1 min read
(worldspiritsockpuppet.com)

Shaming with and without naming

KatjaGrace · Feb 23, 2024, 7:30 AM
17 points
5 comments · 2 min read
(worldspiritsockpuppet.com)

Complexity of value but not disvalue implies more focus on s-risk. Moral uncertainty and preference utilitarianism also do.

Chi Nguyen · Feb 23, 2024, 6:10 AM
52 points
18 comments

[Question] Does increasing the power of a multimodal LLM get you an agentic AI?

yanni kyriacos · Feb 23, 2024, 4:14 AM
3 points
3 comments · 1 min read