A short complaint (which I hope to expand upon at some later point): there are a lot of definitions floating around which refer to outcomes rather than processes. In most cases I think that the corresponding concepts would be much better understood if we worked in terms of process definitions.
Some examples: Legg’s definition of intelligence; Karnofsky’s definition of “transformative AI”; Critch and Krueger’s definition of misalignment (from ARCHES).
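(For concreteness, Legg's is perhaps the purest outcome definition of the three: the Legg-Hutter universal intelligence measure, as I understand it, scores an agent $\pi$ as $\Upsilon(\pi) = \sum_{\mu \in E} 2^{-K(\mu)} V^{\pi}_{\mu}$, i.e. its expected performance $V^{\pi}_{\mu}$ summed over all computable environments $\mu$, weighted by simplicity via the Kolmogorov complexity $K(\mu)$. It is entirely silent on what internal process produces that performance.)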
Sure, these definitions pin down what you’re talking about more clearly—but that comes at the cost of understanding how and why it might come about.
E.g. when we hypothesise that AGI will be built, we know roughly what the key variables are. Whereas transformative AI could refer to all sorts of things, and what counts as transformative could depend on many different political, economic, and societal factors.
If we do not fully understand the mechanism of (e.g. human) intelligence, isn’t referring to the outcome preferable to a made-up story about the process?
(Of course, it would be even better if we understood the process and then referred to it.)
Do you think that these are mutually exclusive, or something like that? I’ve always been confused by what I take to be the position in this shortform, that defining the outcomes makes it somehow harder to define the process. Sure, you can define a process without defining an outcome (e.g. writing a program or training an NN), but since what we are confused about is what we even want at the end, that’s the priority for me. And pinning down the outcome would help in searching for processes that lead to it.
That being said, if your point is that defining outcomes isn’t enough, in that we also need to define/deconfuse/study the processes leading to those outcomes, then I agree with that.