In articles I read, I often see a case made that optimization processes tend to sacrifice as much value as possible on dimensions the agent/optimizer does not care about in exchange for a minuscule gain on the dimensions that do affect its perceived total value. For example, an AI that creates a dystopia that scores very well on some measures but terribly on others, just to refine the ones that matter to it.
What I don't see analyzed as much is that agents need to be self-referential in their thought process and, on a meta level, also treat that thought process itself, with its limits and consequences, as part of their value function.
We live in a finite world where:
- All data has measurement error; you can't measure things perfectly, and precision depends on the resources spent on the measurement (better measurement devices cost more energy, time, and other resources).
- The decision to optimize or think more itself uses time and energy, so an agent needs a self-referential model that sensibly decides when to stop optimizing.
- The world around us often does not wait; things happen, and there are time constraints.
I see this as a limiting factor on over-optimizing for minuscule results. Too much thinking, or too detailed simulation and optimization, burns useful resources (energy, matter, time) for very small gains, so the agent should weigh the negative value of that loss as far greater than the positive value of the gain.
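To make this concrete, here is a minimal sketch in Python (the toy objective, the `step_cost` parameter, and the crude marginal-gain estimate are my own illustrative assumptions, not anything from the post): an optimizer that charges itself for each extra step of thinking and stops once the expected marginal gain no longer covers the cost of computing it.

```python
# A bounded optimizer sketch: compute is not free, so each further step of
# "thinking" is only taken if its expected gain exceeds its cost.

def bounded_optimize(evaluate, improve, x0, step_cost, max_steps=1000):
    """Greedy improvement that stops when the last gain, used as a crude
    estimate of the next one, is worth less than step_cost."""
    x = x0
    value = evaluate(x)
    last_gain = float("inf")  # optimistic before the first step
    for _ in range(max_steps):
        # If another step of optimization is expected to gain less than it
        # costs in resources, stop: more thinking destroys net value.
        if last_gain <= step_cost:
            break
        candidate = improve(x)
        new_value = evaluate(candidate)
        last_gain = new_value - value
        if last_gain > 0:
            x, value = candidate, new_value
    return x, value


if __name__ == "__main__":
    # Toy objective with diminishing returns: each step gains half as much.
    evaluate = lambda x: -abs(x)   # maximum at x = 0
    improve = lambda x: x / 2      # each step halves the distance to 0
    x, v = bounded_optimize(evaluate, improve, x0=8.0, step_cost=0.5)
    print(x, v)
```

On this toy problem the loop stops at x = 0.5 instead of grinding toward the exact optimum, because the remaining improvements are smaller than what each further step costs.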
This is also why we are not agents who think everything through and exert precise control over every aspect of our lives. On the contrary, we rely on cognitive biases, heuristics, and automatic responses so that our brains don't use as much energy.
I also don't think that intelligence is about predictive power in itself. It would be in an ideal world where computation is free. In our universe, optimal intelligence is strong predictive power that uses simplification and discretization to stay efficient and quick. Our whole language works this way: it takes things that are not discrete and differ in many small details (every cat is different) and clusters them into named classes of things, attributes, and actions (yes, I'm simplifying, but I only want to paint the idea).
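As a rough illustration of that clustering idea, here is a minimal sketch (the toy animal data and all names are hypothetical, chosen by me for illustration): many slightly different measurements are compressed into a few named classes, and predictions are made from a class prototype instead of the full detail of every individual.

```python
# Discretization as cheap intelligence: collapse messy, continuous detail
# into a handful of named classes and predict from class prototypes.

from statistics import mean

# Each animal is a (weight_kg, height_cm) pair; every cat differs a little.
observations = {
    "cat":   [(4.0, 25.0), (4.5, 26.0), (3.8, 24.0)],
    "horse": [(450.0, 160.0), (500.0, 170.0), (480.0, 165.0)],
}

# Compress each class to a single prototype (its centroid).
prototypes = {
    label: tuple(mean(dim) for dim in zip(*examples))
    for label, examples in observations.items()
}

def classify(sample):
    """Assign a new measurement to the nearest prototype, ignoring the
    small within-class details that were deliberately thrown away."""
    def dist(proto):
        return sum((a - b) ** 2 for a, b in zip(sample, proto))
    return min(prototypes, key=lambda label: dist(prototypes[label]))

print(classify((4.2, 25.5)))     # -> "cat"
print(classify((470.0, 162.0)))  # -> "horse"
```

The class labels lose information about individuals, but they make prediction fast and cheap, which is the trade-off the paragraph above is pointing at.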
Just food for thought.