So overall, despite the limitations on the role of philosophy in AI alignment that I discussed above, I do think it’s nevertheless important that our AIs do philosophy in the ways we’d endorse.
I find the tone or vibe of this essay fluctuates. Sometimes the tone is ‘powerful AI is coming and we better make sure it wants to do the right kind of philosophy’, which imo seems incredibly fraught. The world where the manipulation example is a live problem is absurdly dangerous.
Other times, especially in section 3, I get the vibe that philosophy is less relevant than not building strong AI in the first place (limiting the extent of optimization, keeping AI as a tool, or confining it to local contexts, &c).
The effect is disconcerting. I think I am confused because the background model of AI progress is missing. I.e., do you think a pause is ideal but impossible, so this is the next best thing?