Some perspectives on the discipline of Physics

Tahp · 20 May 2024 18:19 UTC
12 points
2 comments · 13 min read · LW link
(quark.rodeo)

Interpretability: Integrated Gradients is a decent attribution method

20 May 2024 17:55 UTC
9 points
3 comments · 6 min read · LW link

The Local Interaction Basis: Identifying Computationally-Relevant and Sparsely Interacting Features in Neural Networks

20 May 2024 17:53 UTC
60 points
1 comment · 3 min read · LW link

Infra-Bayesian haggling

hannagabor · 20 May 2024 12:23 UTC
10 points
0 comments · 20 min read · LW link

Jaan Tallinn’s 2023 Philanthropy Overview

jaan · 20 May 2024 12:11 UTC
102 points
2 comments · 1 min read · LW link
(jaan.info)

D&D.Sci (Easy Mode): On The Construction Of Impossible Structures [Evaluation and Ruleset]

abstractapplic · 20 May 2024 9:38 UTC
22 points
1 comment · 1 min read · LW link

Why I find Davidad’s plan interesting

Paul W · 20 May 2024 8:13 UTC
17 points
0 comments · 6 min read · LW link

Anthropic: Reflections on our Responsible Scaling Policy

Zac Hatfield-Dodds · 20 May 2024 4:14 UTC
45 points
11 comments · 10 min read · LW link
(www.anthropic.com)

The consistent guessing problem is easier than the halting problem

jessicata · 20 May 2024 4:02 UTC
28 points
5 comments · 4 min read · LW link
(unstableontology.com)

Against Computers (infinite play)

rogersbacon · 20 May 2024 0:43 UTC
−12 points
0 comments · 14 min read · LW link
(www.secretorum.life)

[Question] Can environmental laws/NEPA be used for decelism?

Alex K. Chen (parrot) · 19 May 2024 18:43 UTC
−4 points
0 comments · 1 min read · LW link

Testing for parallel reasoning in LLMs

19 May 2024 15:28 UTC
2 points
7 comments · 9 min read · LW link

Some “meta-cruxes” for AI x-risk debates

Aryeh Englander · 19 May 2024 0:21 UTC
14 points
2 comments · 3 min read · LW link

On Privilege

shminux · 18 May 2024 22:36 UTC
15 points
10 comments · 2 min read · LW link

To Limit Impact, Limit KL-Divergence

J Bostock · 18 May 2024 18:52 UTC
7 points
1 comment · 5 min read · LW link

[Crosspost] Introducing the Save State Paradox

Suzie. EXE · 18 May 2024 17:00 UTC
−1 points
0 comments · 7 min read · LW link

Scientific Notation Options

jefftk · 18 May 2024 15:10 UTC
23 points
10 comments · 1 min read · LW link
(www.jefftk.com)

“If we go extinct due to misaligned AI, at least nature will continue, right? … right?”

plex · 18 May 2024 14:09 UTC
46 points
23 comments · 2 min read · LW link
(aisafety.info)

What Are Non-Zero-Sum Games?—A Primer

James Stephen Brown · 18 May 2024 9:19 UTC
4 points
1 comment · 3 min read · LW link

DeepMind’s “Frontier Safety Framework” is weak and unambitious

Zach Stein-Perlman · 18 May 2024 3:00 UTC
141 points
13 comments · 4 min read · LW link