Lecture Series on Tiling Agents

abramdemski · 14 Jan 2025 21:34 UTC
38 points
14 comments · 1 min read · LW link

Is AI Physical?

Lauren Greenspan · 14 Jan 2025 21:21 UTC
23 points
6 comments · 7 min read · LW link

Heritability: Five Battles

Steven Byrnes · 14 Jan 2025 18:21 UTC
94 points
23 comments · 60 min read · LW link

The Philosophical Glossary of AI

David Gross · 14 Jan 2025 17:36 UTC
11 points
0 comments · 1 min read · LW link
(www.aiglossary.co.uk)

I’m offering free math consultations!

Gurkenglas · 14 Jan 2025 16:30 UTC
83 points
7 comments · 1 min read · LW link

Why abandon “probability is in the mind” when it comes to quantum dynamics?

Maxwell Peterson · 14 Jan 2025 15:53 UTC
23 points
24 comments · 1 min read · LW link

How do you deal w/ Super Stimuli?

Logan Riggs · 14 Jan 2025 15:14 UTC
112 points
25 comments · 3 min read · LW link

curate

technicalities · 14 Jan 2025 14:40 UTC
12 points
0 comments · 2 min read · LW link

Our new video about goal misgeneralization, plus an apology

Writer · 14 Jan 2025 14:07 UTC
33 points
0 comments · 7 min read · LW link
(youtu.be)

NYC Congestion Pricing: Early Days

Zvi · 14 Jan 2025 14:00 UTC
29 points
0 comments · 15 min read · LW link
(thezvi.wordpress.com)

Do humans really learn from “little” data?

Alice Wanderland · 14 Jan 2025 10:46 UTC
14 points
5 comments · 1 min read · LW link
(aliceandbobinwanderland.substack.com)

Basics of Bayesian learning

Dmitry Vaintrob · 14 Jan 2025 10:00 UTC
12 points
0 comments · 13 min read · LW link

[Question] Why do futurists care about the culture war?

Knight Lee · 14 Jan 2025 7:35 UTC
23 points
22 comments · 2 min read · LW link

Don’t Legalize Drugs

Mr. Keating · 14 Jan 2025 6:51 UTC
38 points
10 comments · 9 min read · LW link

Mini Go: Gateway Game

jefftk · 14 Jan 2025 3:30 UTC
32 points
1 comment · 1 min read · LW link
(www.jefftk.com)

Finding Features Causally Upstream of Refusal

14 Jan 2025 2:30 UTC
54 points
5 comments · 12 min read · LW link

Implications of the inference scaling paradigm for AI safety

Ryan Kidd · 14 Jan 2025 2:14 UTC
96 points
70 comments · 5 min read · LW link

Biden administration unveils global AI export controls aimed at China

Chris_Leong · 14 Jan 2025 1:01 UTC
9 points
0 comments · 1 min read · LW link
(www.axios.com)

My latest attempt to understand decision theory: I asked ChatGPT to debate me.

bokov · 13 Jan 2025 19:37 UTC
−8 points
5 comments · 19 min read · LW link

AI models inherently alter “human values.” So, alignment-based AI safety approaches must better account for value drift

bfitzgerald3132 · 13 Jan 2025 19:22 UTC
5 points
2 comments · 13 min read · LW link

Chance is in the Map, not the Territory

13 Jan 2025 19:17 UTC
67 points
18 comments · 7 min read · LW link

Progress links and short notes, 2025-01-13

jasoncrawford · 13 Jan 2025 18:35 UTC
13 points
2 comments · 3 min read · LW link
(newsletter.rootsofprogress.org)

Better antibodies by engineering targets, not engineering antibodies (Nabla Bio)

Abhishaike Mahajan · 13 Jan 2025 15:05 UTC
4 points
0 comments · 14 min read · LW link
(www.owlposting.com)

Zvi’s 2024 In Movies

Zvi · 13 Jan 2025 13:40 UTC
44 points
4 comments · 15 min read · LW link
(thezvi.wordpress.com)

Paper club: He et al. on modular arithmetic (part I)

Dmitry Vaintrob · 13 Jan 2025 11:18 UTC
14 points
0 comments · 8 min read · LW link

Cast it into the fire! Destroy it!

Aram Panasenco · 13 Jan 2025 7:30 UTC
6 points
9 comments · 2 min read · LW link

Moderately More Than You Wanted To Know: Depressive Realism

JustisMills · 13 Jan 2025 2:57 UTC
73 points
4 comments · 6 min read · LW link
(justismills.substack.com)

Applying traditional economic thinking to AGI: a trilemma

Steven Byrnes · 13 Jan 2025 1:23 UTC
153 points
32 comments · 3 min read · LW link

Building AI Research Fleets

12 Jan 2025 18:23 UTC
132 points
11 comments · 5 min read · LW link

Do Antidepressants work? (First Take)

Jacob Goldsmith · 12 Jan 2025 17:11 UTC
7 points
9 comments · 7 min read · LW link

A Novel Idea for Harnessing Magnetic Reconnection as an Energy Source

resonova · 12 Jan 2025 17:11 UTC
0 points
8 comments · 3 min read · LW link

How quickly could robots scale up?

Benjamin_Todd · 12 Jan 2025 17:01 UTC
46 points
25 comments · 1 min read · LW link
(benjamintodd.substack.com)

AGI Will Not Make Labor Worthless

Maxwell Tabarrok · 12 Jan 2025 15:09 UTC
−8 points
16 comments · 5 min read · LW link
(www.maximum-progress.com)

The purposeful drunkard

Dmitry Vaintrob · 12 Jan 2025 12:27 UTC
98 points
13 comments · 6 min read · LW link

No one has the ball on 1500 Russian olympiad winners who’ve received HPMOR

Mikhail Samin · 12 Jan 2025 11:43 UTC
81 points
21 comments · 1 min read · LW link

Why modelling multi-objective homeostasis is essential for AI alignment (and how it helps with AI safety as well). Subtleties and Open Challenges.

Roland Pihlakas · 12 Jan 2025 3:37 UTC
47 points
7 comments · 12 min read · LW link

Extending control evaluations to non-scheming threats

joshc · 12 Jan 2025 1:42 UTC
30 points
1 comment · 12 min read · LW link

Rolling Thresholds for AGI Scaling Regulation

Larks · 12 Jan 2025 1:30 UTC
40 points
6 comments · 6 min read · LW link

AI Safety at the Frontier: Paper Highlights, December ’24

gasteigerjo · 11 Jan 2025 22:54 UTC
7 points
2 comments · 7 min read · LW link
(aisafetyfrontier.substack.com)

Fluoridation: The RCT We Still Haven’t Run (But Should)

ChristianKl · 11 Jan 2025 21:02 UTC
22 points
5 comments · 2 min read · LW link

In Defense of a Butlerian Jihad

sloonz · 11 Jan 2025 19:30 UTC
10 points
25 comments · 9 min read · LW link

Near term discussions need something smaller and more concrete than AGI

ryan_b · 11 Jan 2025 18:24 UTC
13 points
0 comments · 6 min read · LW link

A proposal for iterated interpretability with known-interpretable narrow AIs

Peter Berggren · 11 Jan 2025 14:43 UTC
6 points
0 comments · 2 min read · LW link

Have frontier AI systems surpassed the self-replicating red line?

nsage · 11 Jan 2025 5:31 UTC
4 points
0 comments · 4 min read · LW link

We need a universal definition of ‘agency’ and related words

CstineSublime · 11 Jan 2025 3:22 UTC
18 points
1 comment · 5 min read · LW link

[Question] AI for medical care for hard-to-treat diseases?

CronoDAS · 10 Jan 2025 23:55 UTC
12 points
1 comment · 1 min read · LW link

Beliefs and state of mind into 2025

RussellThor · 10 Jan 2025 22:07 UTC
18 points
10 comments · 7 min read · LW link

Recommendations for Technical AI Safety Research Directions

Sam Marks · 10 Jan 2025 19:34 UTC
64 points
1 comment · 17 min read · LW link
(alignment.anthropic.com)

Is AI Alignment Enough?

Aram Panasenco · 10 Jan 2025 18:57 UTC
30 points
6 comments · 6 min read · LW link

[Question] What are some scenarios where an aligned AGI actually helps humanity, but many/most people don’t like it?

RomanS · 10 Jan 2025 18:13 UTC
14 points
6 comments · 3 min read · LW link