“even ‘ASI’ may be bad at [...] moral philosophy and social planning.”
To be clear, this thought was mine; I don’t think it appeared in the linked post. There are people thinking about this sort of thing at Redwood (or rather, about conceptual reasoning, which I claim includes moral philosophy and social planning, among other things). The most up-to-date link I can point to is “Current AIs seem pretty misaligned to me”; also, this post gives some idea of a mitigation.
I could imagine a world where AI “truly generalizes” enough to run a company, but its attempts at moral philosophy and social planning are still mostly slop. Successfully running a company is in fact measurable, even if it involves pretty long timescales (just maximize revenue and/or the stock price). But for other domains, we have no good way to measure success. I feel very uncertain about the relationship between these two types of capabilities.
Social planning is very measurable. Could you achieve goals in a complicated social situation?
I agree moral philosophy is inherently different—when we say we want an AI to be “good” at moral reasoning, we’re self-consciously referencing our own vague human standards for good moral reasoning. The problem an AI has to solve to get good at moral reasoning is not just about induction, but about communication, and we might build AI that does the former but not the latter.
I meant the kind of social planning described on this (SEO slop) webpage. Basically “how do we develop prosocial government policies?”