An EA used deceptive messaging to advance their project; we need mechanisms to avoid deontologically dubious plans

Mikhail Samin
Feb 13, 2024, 11:15 PM
24 points
1 comment

Useful starting code for interpretability

eggsyntax
Feb 13, 2024, 11:13 PM
26 points
2 comments
1 min read

Masterpiece

Richard_Ngo
Feb 13, 2024, 11:10 PM
166 points
21 comments
4 min read
(www.narrativeark.xyz)

A Bridge Between Utilitarianism & Stoicism

Jonathan Moregård
Feb 13, 2024, 10:46 PM
5 points
0 comments
5 min read
(honestliving.substack.com)

The “context window” analogy for human minds

Ruby
Feb 13, 2024, 7:29 PM
38 points
0 comments
2 min read

More on the Apple Vision Pro

Zvi
Feb 13, 2024, 5:40 PM
33 points
5 comments
8 min read
(thezvi.wordpress.com)

Linear White

Teja Prabhu
Feb 13, 2024, 4:31 PM
−3 points
3 comments
3 min read
(krez.expert)

Causality is Everywhere

silentbob
Feb 13, 2024, 1:44 PM
26 points
12 comments
8 min read

Technologies and Terminology: AI isn’t Software, it’s… Deepware?

Feb 13, 2024, 1:37 PM
40 points
10 comments
8 min read

[Question] LessWrong Is Very Wrong: Ultimately All Social Media Platforms Are The Same

Amritesh Kumar
Feb 13, 2024, 6:53 AM
−16 points
2 comments
1 min read

Lsusr’s Rationality Dojo

lsusr
Feb 13, 2024, 5:52 AM
103 points
17 comments
2 min read

[Question] Where is the Town Square?

Gretta Duleba
Feb 13, 2024, 3:53 AM
46 points
8 comments
1 min read

My cover story in Jacobin on AI capitalism and the x-risk debates

garrison
Feb 12, 2024, 11:34 PM
98 points
5 comments
(jacobin.com)

What is Ontology?

martinkunev
Feb 12, 2024, 11:01 PM
4 points
0 comments
4 min read

Thank you for triggering me

Cissy
Feb 12, 2024, 8:09 PM
6 points
1 comment
6 min read
(www.moremyself.xyz)

Interpreting Quantum Mechanics in Infra-Bayesian Physicalism

Yegreg
Feb 12, 2024, 6:56 PM
30 points
6 comments
43 min read

I played the AI box game as the Gate­keeper — and lost

datawitch
Feb 12, 2024, 6:39 PM
33 points
54 comments
4 min read

The Last Laugh: Exploring the Role of Humor as a Benchmark for Large Language Models

Greg Robison
Feb 12, 2024, 6:34 PM
4 points
6 comments
11 min read

Natural abstractions are observer-dependent: a conversation with John Wentworth

Martín Soto
Feb 12, 2024, 5:28 PM
39 points
13 comments
7 min read

Tort Law Can Play an Important Role in Mitigating AI Risk

Gabriel Weil
Feb 12, 2024, 5:17 PM
39 points
9 comments
5 min read

On the Proposed California SB 1047

Zvi
Feb 12, 2024, 4:40 PM
46 points
18 comments
12 min read
(thezvi.wordpress.com)

Thoughts on “The Offense-Defense Balance Rarely Changes”

Cullen
Feb 12, 2024, 3:26 AM
46 points
4 comments

Skepticism About DeepMind’s “Grandmaster-Level” Chess Without Search

Arjun Panickssery
Feb 12, 2024, 12:56 AM
57 points
13 comments
3 min read

[Question] What are the known difficulties with this alignment approach?

tailcalled
Feb 11, 2024, 10:52 PM
18 points
24 comments
1 min read

[Question] What are the deciding factors of human cognitive endurance?

koratkar
Feb 11, 2024, 9:56 PM
22 points
3 comments
1 min read

Carl Shulman On Dwarkesh Podcast June 2023

Moonicker
Feb 11, 2024, 9:02 PM
18 points
0 comments
159 min read

How do you actually obtain and report a likelihood function for scientific research?

Peter Berggren
Feb 11, 2024, 5:42 PM
55 points
4 comments
1 min read

The entropy maxim for binary questions

dkl9
Feb 11, 2024, 5:17 PM
2 points
1 comment
1 min read
(dkl9.net)

GPT2XL_RLLMv3 vs. BetterDAN, AI Machiavelli & Oppo Jailbreaks

MiguelDev
Feb 11, 2024, 11:03 AM
16 points
4 comments
14 min read

[Question] What’s the theory of impact for activation vectors?

Chris_Leong
Feb 11, 2024, 7:34 AM
61 points
12 comments
1 min read

Experimenting With Footboard Piezos

jefftk
Feb 11, 2024, 3:00 AM
11 points
2 comments
2 min read
(www.jefftk.com)

The Core Values of Life—A pro­posal for a uni­ver­sal the­ory of ethics

Thomas Gjøstøl
Feb 10, 2024, 9:48 PM
2 points
4 comments
18 min read

And All the Shoggoths Merely Players

Zack_M_Davis
Feb 10, 2024, 7:56 PM
170 points
57 comments
12 min read

Sam Altman’s Chip Ambitions Undercut OpenAI’s Safety Strategy

garrison
Feb 10, 2024, 7:52 PM
198 points
52 comments
(garrisonlovely.substack.com)

The lattice of partial updatelessness

Martín Soto
Feb 10, 2024, 5:34 PM
23 points
5 comments
5 min read

A Strange ACH Corner Case

jefftk
Feb 10, 2024, 3:00 AM
27 points
2 comments
2 min read
(www.jefftk.com)

Dreams of AI alignment: The danger of suggestive names

TurnTrout
Feb 10, 2024, 1:22 AM
103 points
59 comments
4 min read

Scenario planning for AI x-risk

Corin Katzke
Feb 10, 2024, 12:14 AM
24 points
12 comments
14 min read
(forum.effectivealtruism.org)

Close the Gates to an Inhuman Future: How and why we should choose to not develop superhuman general-purpose artificial intelligence

aaguirre
Feb 9, 2024, 8:25 PM
13 points
0 comments
1 min read
(arxiv.org)

[Cross­post] Deep Dive: The Com­ing Tech­nolog­i­cal Sin­gu­lar­ity—How to sur­vive in a Post-hu­man Era

simulacra.exe
Feb 9, 2024, 6:49 PM
2 points
2 comments
9 min read

The Ideal Speech Situation as a Tool for AI Ethical Reflection: A Framework for Alignment

kenneth myers
Feb 9, 2024, 6:40 PM
6 points
12 comments
3 min read

What’s ChatGPT’s Favorite Ice Cream Flavor? An Investigation Into Synthetic Respondents

Greg Robison
Feb 9, 2024, 6:38 PM
19 points
4 comments
15 min read

OpenAI wants to raise 5-7 trillion

O O
Feb 9, 2024, 4:15 PM
13 points
29 comments
1 min read
(decrypt.co)

[Question] Constituency-sized AI congress?

Nathan Helm-Burger
Feb 9, 2024, 4:01 PM
11 points
5 comments
1 min read

One True Love

Zvi
Feb 9, 2024, 3:10 PM
34 points
7 comments
10 min read
(thezvi.wordpress.com)

[Question] Executive function advice from people who are good at it?

TeaTieAndHat
Feb 9, 2024, 10:11 AM
7 points
1 comment
1 min read

[Question] Do you want to make an AI Alignment song?

Kabir Kumar
Feb 9, 2024, 8:22 AM
4 points
0 comments
1 min read

Skills I’d like my collaborators to have

Raemon
Feb 9, 2024, 8:20 AM
106 points
9 comments
8 min read

Transfer learning and generalization-qua-capability in Babbage and Davinci (or, why division is better than Spanish)

RP and agg
Feb 9, 2024, 7:00 AM
50 points
6 comments
3 min read

Biden-Harris Administration Announces First-Ever Consortium Dedicated to AI Safety

Ben Smith
Feb 9, 2024, 6:40 AM
22 points
0 comments
(www.nist.gov)