
The Quantum Mars Teleporter: An Empirical Test Of Personal Identity Theories

avturchin 22 Jan 2025 11:48 UTC
6 points
0 comments, 2 min read, LW link

Bayesian Reasoning on Maps

Sjlver 22 Jan 2025 10:45 UTC
2 points
0 comments, 4 min read, LW link
(blog.purpureus.net)

Against blanket arguments against interpretability

Dmitry Vaintrob 22 Jan 2025 9:46 UTC
21 points
1 comment, 7 min read, LW link

February 2025 meetup

jn2 22 Jan 2025 9:41 UTC
1 point
0 comments, 1 min read, LW link

The real political spectrum

Hzn 22 Jan 2025 8:55 UTC
−4 points
0 comments, 1 min read, LW link

Evolution and the Low Road to Nash

22 Jan 2025 7:06 UTC
5 points
0 comments, 9 min read, LW link

The Human Alignment Problem for AIs

rife 22 Jan 2025 4:06 UTC
6 points
4 comments, 3 min read, LW link

When does capability elicitation bound risk?

joshc 22 Jan 2025 3:42 UTC
5 points
0 comments, 17 min read, LW link
(redwoodresearch.substack.com)

[Question] Popular materials about environmental goals/agent foundations? People wanting to discuss such topics?

Q Home 22 Jan 2025 3:30 UTC
5 points
0 comments, 1 min read, LW link

Kitchen Air Purifier Comparison

jefftk 22 Jan 2025 3:20 UTC
27 points
2 comments, 3 min read, LW link
(www.jefftk.com)

November-December 2024 Progress in Guaranteed Safe AI

Quinn 22 Jan 2025 1:20 UTC
16 points
0 comments, 4 min read, LW link
(gsai.substack.com)

Quotes from the Stargate press conference

Nikola Jurkovic 22 Jan 2025 0:50 UTC
108 points
1 comment, 1 min read, LW link
(www.c-span.org)

Tell me about yourself: LLMs are aware of their implicit behaviors

22 Jan 2025 0:47 UTC
56 points
0 comments, 6 min read, LW link
(bit.ly)

Using the probabilistic method to bound the performance of toy transformers

Alex Gibson 21 Jan 2025 23:01 UTC
1 point
0 comments, 3 min read, LW link

Austin LW Online Meetup/Game Night, 1/21/2025

SilasBarta 21 Jan 2025 22:17 UTC
8 points
0 comments, 1 min read, LW link

Training on Documents About Reward Hacking Induces Reward Hacking

evhub 21 Jan 2025 21:32 UTC
83 points
7 comments, 2 min read, LW link
(alignment.anthropic.com)

Veo-2 Can Produce Realistic Ads

Logan Riggs 21 Jan 2025 19:13 UTC
12 points
0 comments, 1 min read, LW link

Computational Limits on Efficiency

vibhumeh 21 Jan 2025 18:29 UTC
1 point
0 comments, 5 min read, LW link

Democratizing AI Governance: Balancing Expertise and Public Participation

Lucile Ter-Minassian 21 Jan 2025 18:29 UTC
1 point
0 comments, 15 min read, LW link

Hitler was not a monster

halgir 21 Jan 2025 18:21 UTC
−3 points
0 comments, 1 min read, LW link