A mind needn’t be curious to reap the benefits of curiosity

Cosmopolitan values don’t come free

Sentience matters

Request: stop advancing AI capabilities

Would we even want AI to solve all our problems?

How could you possibly choose what an AI wants?

But why would the AI kill us?

Misgeneralization as a misnomer

If interpretability research goes well, it may get dangerous

Hooray for stepping out of the limelight

A rough and incomplete review of some of John Wentworth’s research

A stylized dialogue on John Wentworth’s claims about markets and optimization

Truth and Advantage: Response to a draft of “AI safety seems hard to measure”

Deep Deceptiveness

Comments on OpenAI’s “Planning for AGI and beyond”

Enemies vs Malefactors

AI alignment researchers don’t (seem to) stack

Hashing out long-standing disagreements seems low-value to me

Focus on the places where you feel shocked everyone’s dropping the ball

What I mean by “alignment is in large part about making cognition aimable at all”

