(My) self-referential reason to believe in free will

jacek Jan 6, 2025, 11:35 PM
12 points
6 comments 1 min read LW link

Definition of alignment science I like

quetzal_rainbow Jan 6, 2025, 8:40 PM
19 points
0 comments 3 min read LW link

How will we update about scheming?

ryan_greenblatt Jan 6, 2025, 8:21 PM
171 points
20 comments 37 min read LW link

What Indicators Should We Watch to Disambiguate AGI Timelines?

snewman Jan 6, 2025, 7:57 PM
139 points
57 comments 13 min read LW link

Generating Cognateful Sentences with Large Language Models

vkethana Jan 6, 2025, 6:40 PM
8 points
0 comments 10 min read LW link

Really radical empathy

MichaelStJules Jan 6, 2025, 5:46 PM
19 points
0 comments LW link

Independent research article analyzing consistent self-reports of experience in ChatGPT and Claude

rife Jan 6, 2025, 5:34 PM
4 points
20 comments 1 min read LW link
(awakenmoon.ai)

[Question] Meal Replacements in 2025?

alkjash Jan 6, 2025, 3:37 PM
24 points
9 comments 1 min read LW link

AI safety content you could create

Adam Jones Jan 6, 2025, 3:35 PM
19 points
0 comments 5 min read LW link
(adamjones.me)

Childhood and Education #8: Dealing with the Internet

Zvi Jan 6, 2025, 2:00 PM
37 points
7 comments 13 min read LW link
(thezvi.wordpress.com)

Latent Adversarial Training (LAT) Improves the Representation of Refusal

Jan 6, 2025, 10:24 AM
20 points
6 comments 10 min read LW link

Alternative Cancer Care As Biohacking & Book Review: Surviving “Terminal” Cancer

DenizT Jan 6, 2025, 7:43 AM
34 points
6 comments 15 min read LW link

Estimating the benefits of a new flu drug (BXM)

DirectedEvolution Jan 6, 2025, 4:31 AM
41 points
2 comments 3 min read LW link

Measuring Nonlinear Feature Interactions in Sparse Crosscoders [Project Proposal]

Jan 6, 2025, 4:22 AM
19 points
0 comments 12 min read LW link

“We know how to build AGI”—Sam Altman

Nikola Jurkovic Jan 6, 2025, 2:05 AM
62 points
5 comments 1 min read LW link
(blog.samaltman.com)

[Question] Is “hidden complexity of wishes problem” solved?

Roman Malov Jan 5, 2025, 10:59 PM
10 points
4 comments 1 min read LW link

A Ground-Level Perspective on Capacity Building in International Development

Sean Aubin Jan 5, 2025, 8:36 PM
12 points
1 comment 8 min read LW link

Why Linear AI Safety Hits a Wall and How Fractal Intelligence Unlocks Non-Linear Solutions

Andy E Williams Jan 5, 2025, 5:08 PM
−5 points
6 comments 5 min read LW link

How to Do a PhD (in AI Safety)

Lewis Hammond Jan 5, 2025, 4:57 PM
12 points
0 comments LW link
(lewishammond.com)

Reasons for and against working on technical AI safety at a frontier AI lab

bilalchughtai Jan 5, 2025, 2:49 PM
100 points
12 comments 12 min read LW link

Oppression and production are competing explanations for wealth inequality.

Benquo Jan 5, 2025, 2:13 PM
45 points
16 comments 8 min read LW link
(benjaminrosshoffman.com)

Maximizing Communication, not Traffic

jefftk Jan 5, 2025, 1:00 PM
161 points
10 comments 1 min read LW link
(www.jefftk.com)

Policymakers don’t have access to paywalled articles

Adam Jones Jan 5, 2025, 10:56 AM
71 points
11 comments 2 min read LW link
(adamjones.me)

Capital Ownership Will Not Prevent Human Disempowerment

beren Jan 5, 2025, 6:00 AM
150 points
18 comments 14 min read LW link

Chinese Researchers Crack ChatGPT: Replicating OpenAI’s Advanced AI Model

Evan_Gaensbauer Jan 5, 2025, 3:50 AM
−8 points
1 comment 1 min read LW link
(www.geeky-gadgets.com)

Orange and Strawberry Truffles

jefftk Jan 5, 2025, 1:50 AM
10 points
1 comment 1 min read LW link
(www.jefftk.com)

AXRP Episode 38.4 - Shakeel Hashim on AI Journalism

DanielFilan Jan 5, 2025, 12:20 AM
11 points
0 comments 12 min read LW link

How i’m building my ai system, how it’s going so far, and my thoughts on it

ollie_ Jan 4, 2025, 6:20 PM
−3 points
3 comments 5 min read LW link

Parkinson’s Law and the Ideology of Statistics

Benquo Jan 4, 2025, 3:49 PM
127 points
7 comments 8 min read LW link
(benjaminrosshoffman.com)

The Laws of Large Numbers

Dmitry Vaintrob Jan 4, 2025, 11:54 AM
38 points
11 comments 12 min read LW link

The Golden Opportunity for American AI

Annapurna Jan 4, 2025, 10:26 AM
2 points
8 comments 1 min read LW link
(blogs.microsoft.com)

A Generalization of the Good Regulator Theorem

Alfred Harwood Jan 4, 2025, 9:55 AM
20 points
6 comments 10 min read LW link

Logic vs intuition ⇔ algorithm vs ML

pchvykov Jan 4, 2025, 9:06 AM
5 points
0 comments 7 min read LW link

debating buying NVDA in 2019

bhauth Jan 4, 2025, 5:06 AM
27 points
1 comment 3 min read LW link
(bhauth.com)

Making progress bars for Alignment

Kabir Kumar Jan 3, 2025, 9:25 PM
2 points
0 comments 1 min read LW link
(lu.ma)

The Intelligence Curse

lukedrago Jan 3, 2025, 7:07 PM
133 points
27 comments 18 min read LW link
(lukedrago.substack.com)

The case for pay-on-results coaching

Chipmonk Jan 3, 2025, 6:40 PM
16 points
3 comments 1 min read LW link

Introducing Squiggle AI

ozziegooen Jan 3, 2025, 5:53 PM
92 points
15 comments LW link

Human study on AI spear phishing campaigns

Jan 3, 2025, 3:11 PM
79 points
8 comments 5 min read LW link

The subset parity learning problem: much more than you wanted to know

Dmitry Vaintrob Jan 3, 2025, 9:13 AM
94 points
18 comments 11 min read LW link

Building AI safety benchmark environments on themes of universal human values

Roland Pihlakas Jan 3, 2025, 4:24 AM
18 points
3 comments 8 min read LW link
(docs.google.com)

Emotional Superrationality

nullproxy Jan 2, 2025, 10:54 PM
−6 points
4 comments 11 min read LW link

Playing with Otamatones

jefftk Jan 2, 2025, 7:50 PM
12 points
0 comments 1 min read LW link
(www.jefftk.com)

7. Iterate the Game: Racing Where?

Allison Duettmann Jan 2, 2025, 7:06 PM
11 points
0 comments 9 min read LW link

6. Increase Intelligence: Welcome AI Players

Allison Duettmann Jan 2, 2025, 7:06 PM
6 points
1 comment 19 min read LW link

5. Uphold Voluntarism: Digital Defense

Allison Duettmann Jan 2, 2025, 7:05 PM
3 points
0 comments 18 min read LW link

4. Uphold Voluntarism: Physical Defense

Allison Duettmann Jan 2, 2025, 7:04 PM
6 points
2 comments 23 min read LW link

3. Improve Cooperation: Better Technologies

Allison Duettmann Jan 2, 2025, 7:03 PM
4 points
2 comments 23 min read LW link

2. Skim the Manual: Intelligent Voluntary Cooperation

Allison Duettmann Jan 2, 2025, 7:02 PM
13 points
3 comments 18 min read LW link

1. Meet the Players: Value Diversity

Allison Duettmann Jan 2, 2025, 7:00 PM
32 points
2 comments 11 min read LW link