That Alien Message—The Animation

Writer · Sep 7, 2024, 2:53 PM
144 points
10 comments · 8 min read · LW link
(youtu.be)

Jonothan Gorard: The territory is isomorphic to an equivalence class of its maps

Daniel C · Sep 7, 2024, 10:04 AM
19 points
18 comments · 2 min read · LW link
(x.com)

Pay Risk Evaluators in Cash, Not Equity

Adam Scholl · Sep 7, 2024, 2:37 AM
215 points
19 comments · 1 min read · LW link

Excerpts from “A Reader’s Manifesto”

Arjun Panickssery · Sep 6, 2024, 10:37 PM
72 points
1 comment · 13 min read · LW link
(arjunpanickssery.substack.com)

Fun With CellxGene

sarahconstantin · Sep 6, 2024, 10:00 PM
30 points
2 comments · 7 min read · LW link
(sarahconstantin.substack.com)

[Question] Is this voting system strategy proof?

Donald Hobson · Sep 6, 2024, 8:44 PM
17 points
9 comments · 1 min read · LW link

Adam Optimizer Causes Privileged Basis in Transformer LM Residual Stream

Sep 6, 2024, 5:55 PM
70 points
7 comments · 4 min read · LW link

Backdoors as an analogy for deceptive alignment

Sep 6, 2024, 3:30 PM
104 points
2 comments · 8 min read · LW link
(www.alignment.org)

A Cable Holder for 2 Cent

Johannes C. Mayer · Sep 6, 2024, 11:01 AM
1 point
1 comment · 1 min read · LW link

Perhaps Try a Little Therapy, As a Treat?

segfault · Sep 6, 2024, 8:51 AM
−188 points
61 comments · 16 min read · LW link

Investigating Sensitive Directions in GPT-2: An Improved Baseline and Comparative Analysis of SAEs

Sep 6, 2024, 2:28 AM
28 points
0 comments · 12 min read · LW link

Distinguish worst-case analysis from instrumental training-gaming

Sep 5, 2024, 7:13 PM
38 points
0 comments · 5 min read · LW link

AI x Human Flourishing: Introducing the Cosmos Institute

Brendan McCord · Sep 5, 2024, 6:23 PM
14 points
5 comments · 6 min read · LW link
(cosmosinstitute.substack.com)

What is SB 1047 *for*?

Raemon · Sep 5, 2024, 5:39 PM
61 points
8 comments · 3 min read · LW link

instruction tuning and autoregressive distribution shift

nostalgebraist · Sep 5, 2024, 4:53 PM
40 points
5 comments · 5 min read · LW link

Conflating value alignment and intent alignment is causing confusion

Seth Herd · Sep 5, 2024, 4:39 PM
49 points
18 comments · 5 min read · LW link

A bet for Samo Burja

Nathan Helm-Burger · Sep 5, 2024, 4:01 PM
14 points
2 comments · 2 min read · LW link

Universal basic income isn’t always AGI-proof

Kevin Kohler · Sep 5, 2024, 3:39 PM
5 points
3 comments · 7 min read · LW link
(machinocene.substack.com)

Why Reflective Stability is Important

Johannes C. Mayer · Sep 5, 2024, 3:28 PM
19 points
2 comments · 1 min read · LW link

Why Swiss watches and Taylor Swift are AGI-proof

Kevin Kohler · Sep 5, 2024, 1:23 PM
18 points
11 comments · 6 min read · LW link
(machinocene.substack.com)

Is Redistributive Taxation Justifiable? Part 1: Do the Rich Deserve their Wealth?

Alexander de Vries · Sep 5, 2024, 10:23 AM
7 points
20 comments · 10 min read · LW link
(2ndhandecon.substack.com)

What program structures enable efficient induction?

Daniel C · Sep 5, 2024, 10:12 AM
23 points
5 comments · 3 min read · LW link

How to Fake Decryption

ohmurphy · Sep 5, 2024, 9:18 AM
12 points
0 comments · 4 min read · LW link
(ohmurphy.substack.com)

We Should Try to Directly Measure the Value of Scientific Papers

ohmurphy · Sep 5, 2024, 9:08 AM
1 point
0 comments · 5 min read · LW link
(ohmurphy.substack.com)

on Science Beakers and DDT

bhauth · Sep 5, 2024, 3:21 AM
23 points
13 comments · 9 min read · LW link
(bhauth.com)

Massive Activations and why <bos> is important in Tokenized SAE Unigrams

Louka Ewington-Pitsos · Sep 5, 2024, 2:19 AM
1 point
0 comments · 3 min read · LW link

The Forging of the Great Minds: An Unfinished Tale

Aryeh Englander · Sep 5, 2024, 12:58 AM
−3 points
0 comments · 5 min read · LW link

The Chatbot of Babble

Aryeh Englander · Sep 5, 2024, 12:56 AM
−3 points
0 comments · 7 min read · LW link

[Question] Is it Legal to Maintain Turing Tests using Data Poisoning, and would it work?

Double · Sep 5, 2024, 12:35 AM
8 points
9 comments · 1 min read · LW link

Executable philosophy as a failed totalizing meta-worldview

jessicata · Sep 4, 2024, 10:50 PM
93 points
40 comments · 4 min read · LW link
(unstableontology.com)

Against Explosive Growth

c.trout · Sep 4, 2024, 9:45 PM
14 points
1 comment · 5 min read · LW link

The Fragility of Life Hypothesis and the Evolution of Cooperation

KristianRonn · Sep 4, 2024, 9:04 PM
50 points
6 comments · 11 min read · LW link

Emotion-Informed Valuation Mechanism for Improved AI Alignment in Large Language Models

Javier Marin Valenzuela · Sep 4, 2024, 5:00 PM
2 points
4 comments · 6 min read · LW link

What happens if you present 500 people with an argument that AI is risky?

Sep 4, 2024, 4:40 PM
110 points
8 comments · 3 min read · LW link
(blog.aiimpacts.org)

Automating LLM Auditing with Developmental Interpretability

Sep 4, 2024, 3:50 PM
19 points
0 comments · 3 min read · LW link

Michael Dickens’ Caffeine Tolerance Research

niplav · Sep 4, 2024, 3:41 PM
46 points
5 comments · 2 min read · LW link
(mdickens.me)

[Question] Are UV-C Air purifiers so useful?

SebastianG · Sep 4, 2024, 2:16 PM
9 points
0 comments · 1 min read · LW link

AI and the Technological Richter Scale

Zvi · Sep 4, 2024, 2:00 PM
52 points
9 comments · 13 min read · LW link
(thezvi.wordpress.com)

[Question] Is there any rigorous work on using anthropic uncertainty to prevent situational awareness / deception?

David Scott Krueger (formerly: capybaralet) · Sep 4, 2024, 12:40 PM
19 points
7 comments · 1 min read · LW link

A Comparison Between The Pragmatosphere And Less Wrong

Zero Contradictions · Sep 4, 2024, 9:39 AM
−19 points
10 comments · 2 min read · LW link
(zerocontradictions.net)

Announcing the Ultimate Jailbreaking Championship

InnerHufflepuff · Sep 4, 2024, 12:35 AM
15 points
1 comment · 1 min read · LW link

AI Safety at the Frontier: Paper Highlights, August ’24

gasteigerjo · Sep 3, 2024, 7:17 PM
28 points
0 comments · 6 min read · LW link
(aisafetyfrontier.substack.com)

The Checklist: What Succeeding at AI Safety Will Involve

Sam Bowman · Sep 3, 2024, 6:18 PM
151 points
49 comments · 22 min read · LW link
(sleepinyourhat.github.io)

Democracy beyond majoritarianism

Arturo Macias · Sep 3, 2024, 3:10 PM
5 points
2 comments · 4 min read · LW link

On the UBI Paper

Zvi · Sep 3, 2024, 2:50 PM
60 points
6 comments · 19 min read · LW link
(thezvi.wordpress.com)

An Opinionated Look at Inference Rules

Gianluca Calcagni · Sep 3, 2024, 1:32 PM
−5 points
2 comments · 13 min read · LW link

Announcing the PIBBSS Symposium ’24!

Sep 3, 2024, 11:19 AM
19 points
0 comments · 3 min read · LW link

Reducing global AI competition through the Commerce Control List and Immigration reform: a dual-pronged approach

Ben Smith · Sep 3, 2024, 5:28 AM
16 points
2 comments · LW link

How I got 4.2M YouTube views without making a single video

Closed Limelike Curves · Sep 3, 2024, 3:52 AM
396 points
36 comments · 1 min read · LW link

Duped: AI and the Making of a Global Suicide Cult

izzyness · Sep 2, 2024, 6:51 PM
−8 points
0 comments · 1 min read · LW link