Joe Carlsmith
Senior research analyst at Open Philanthropy. Recently completed a doctorate in philosophy at the University of Oxford. Opinions my own.
Predictable updating about AI risk
[Linkpost] Shorter version of report on existential risk from power-seeking AI
A Stranger Priority? Topics at the Outer Reaches of Effective Altruism (my dissertation)
Oops! You’re right, this isn’t the right formulation of the relevant principle. Will edit to reflect.
Seeing more whole
Why should ethical anti-realists do ethics?
[Linkpost] Human-narrated audio version of “Is Power-Seeking AI an Existential Risk?”
Really appreciated this sequence overall, thanks for writing.
I really like this post. It’s a crisp, useful insight, made via a memorable concrete example (plus a few others), in a very efficient way. And it has stayed with me.
Thanks for these thoughtful comments, Paul.
I think the account you offer here is a plausible tack re: unification — I’ve added a link to it in the “empirical approaches” section.
“Facilitates a certain flavor of important engagement in the vicinity of persuasion, negotiation and trade” is a helpful handle, and another strong sincerity association for me (cf “a space that feels ready to collaborate, negotiate, figure stuff out, make stuff happen”).
I agree that it’s not necessarily desirable for sincerity (especially in your account’s sense) to permeate your whole life (though on my intuitive notion, it’s possible for some underlying sincerity to co-exist with things like play, joking around, etc), and that you can’t necessarily get to sincerity by some simple move like “just letting go of pretense.”
This stuff about encouraging more effective delusion by probing for sincerity via introspection is interesting, as are these questions about whether I’m underestimating the costs of sincerity. In this latter respect, maybe worth distinguishing the stakes of “thoughts in the privacy of your own head” (e.g., questions about the value of self-deception, non-attention to certain things, etc) from more mundane costs re: e.g., sincerity takes effort, it’s not always the most fun thing, and so on. Sounds like you’ve got the former especially in mind, and they seem like the most salient source of possible disagreement. I agree it’s a substantive question how the trade-offs here shake out, and at some point would be curious to hear more about your take.
Glad to hear you liked it :)
On sincerity
Against meta-ethical hedonism
Against the normative realist’s wager
:) -- nice glasses
Video and Transcript of Presentation on Existential Risk from Power-Seeking AI
On expected utility, part 4: Dutch books, Cox, and Complete Class
Oops! Yep, thanks for catching.
Thanks! Fixed.
Re: “0.000002 would be one in five hundred thousand, but with the percent sign it’s one in fifty million.”—thanks, edited.
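For concreteness, the arithmetic behind that correction (just spelling out the quoted figures):

$$0.000002 = 2 \times 10^{-6} = \tfrac{1}{500{,}000}, \qquad 0.000002\% = \tfrac{0.000002}{100} = 2 \times 10^{-8} = \tfrac{1}{50{,}000{,}000}.$$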
Re: volatility—thanks, that sounds right to me, and like a potentially useful dynamic to have in mind.