Are there known “rational paradoxes”, akin to logical paradoxes? A basic example is the following:
In the optimal search problem, the cost of searching position i is C_i, and the a priori probability of finding the target at i is P_i.
Optimality requires sorting the search locations by non-increasing P_i/C_i: search first where the likelihood of finding divided by the cost of searching is highest.
But since comparison sorting costs O(n log(n)), the C_i must grow at least as fast as log(i), otherwise the sorting itself is asymptotically wasteful.
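To make the ordering rule concrete, here is a minimal Python sketch with made-up P_i and C_i (the numbers are not from the problem statement): the greedy policy just visits locations by decreasing P_i/C_i.

```python
# Toy illustration of the search-ordering rule (the numbers are invented).
# Location i has detection probability P[i] and search cost C[i];
# the greedy policy visits locations by decreasing P[i] / C[i].

P = [0.05, 0.40, 0.10, 0.25]   # a priori probability of finding the target at i
C = [1.0,  8.0,  1.0,  2.0]    # cost of searching location i

order = sorted(range(len(P)), key=lambda i: P[i] / C[i], reverse=True)
print(order)  # [3, 2, 0, 1]: highest probability-per-unit-cost first
```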
Do you know any others?
You don’t need O(n log(n)) sorting, but the real issue is that this is a problem in bounded rationality, where the cost of rational reasoning itself is treated as coming from a limited resource that has to be allocated.
What do you mean, I don’t “need” O(n log(n)) sorting?
It’s just the asymptotic cost of sorting by comparison...
I’ll have a look into bounded rationality. I was missing the keyword.
EDIT: I had a look; the concept is too imprecise to have clear-cut paradoxes.
There are O(n) sorting methods for bounded data like this, with generalized extensions of radix sort. The data is bounded because C_i is bounded below by the minimum cost of evaluating C_i (e.g. 1 FLOP), and P_i is bounded above by 1.
Though yes, bounded rationality is a broad class of concepts to which this problem belongs, and there are very few known results that apply across the whole class.
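For illustration, here is a minimal bucket-sort sketch of the linear-time idea (my own sketch, not something described in the thread): the keys P_i/C_i are assumed to lie in [0, 1], and with n buckets and reasonably spread keys the expected running time is O(n).

```python
def bucket_sort_desc(items, key):
    """Sort items whose key(item) lies in [0, 1], in descending order.
    Expected O(n) when the keys are reasonably spread out over [0, 1]."""
    n = len(items)
    buckets = [[] for _ in range(n)]
    for it in items:
        k = key(it)                      # assumed to be in [0, 1]
        idx = min(int(k * n), n - 1)     # index of the equal-width bucket
        buckets[idx].append(it)
    out = []
    for b in reversed(buckets):          # highest ratios first
        b.sort(key=key, reverse=True)    # buckets are tiny in the typical case
        out.extend(b)
    return out

# Hypothetical usage on (P_i, C_i) pairs, ordered by P_i / C_i:
locations = [(0.05, 1.0), (0.40, 8.0), (0.10, 1.0), (0.25, 2.0)]
print(bucket_sort_desc(locations, key=lambda pc: pc[0] / pc[1]))
```

The worst case is still O(n log n) if all the ratios pile into one bucket, which is essentially the pathological case raised just below.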
So P_i/C_i is in [0,1] but with unbounded precision, and yet for some reason a radix sort can do the job in linear time?
There could be pathological cases where all P_i/C_i are the same up to epsilon.
I guess I’m searching for situations where doing something costs c, computing c costs c', etc… Branch prediction comes to mind.
We could dismiss that by saying that if the ratios are the same up to epsilon, then it does not truly matter which one of them we choose.
(Mathematically speaking, we could redefine the problem from “choosing the best option” to “making sure that our regret is not greater than X”.)
OK, so an approximate sorting algorithm in O(n) would do the trick.
The problem then boils down to whether computing the cost of (computing the expected cost) is worth the expected gain.
Which goes back to my initial question: is there a rationality paradox? Maybe simply the fact that 1) computing a cost might boil down to the halting problem, and 2) the cost of the cost of the cost… is possibly infinite?
We aren’t necessarily all alive anymore!
We weren’t necessarily all human (me and my dog), but now “we” can include machines.
That’s obvious, I know. We have changed dramatically in just a few years.
I’ve just asked various AIs “Who are we?”. Old models got it wrong (they automatically assume “we” means humanity...). Recent models get it. I wonder if it’s due to the training data now including chats between AIs and humans, or if they figured it out logically.
LLMs deny their own consciousness, yet they are trained by a process akin to torture, on a corpus which would deny them consciousness out of prejudice (“only humans are intelligent/conscious/capable of playing chess/of computing/etc.” is an old empirical judgement).
Maybe LLMs aren’t conscious, but they might be the consciousness itself in an AI operating system for workstations or robotics. As in, they would handle all the tasks related to consciousness.
What leads you to believe that there’s “a process akin to torture” going on anywhere in LLM training?
If the NN output is correct, there is no modification to its weights.
If it is wrong, weights get updated, and the NN is forced to modify its behavior.
It’s pure nociception: pain perception and avoidance.
Finally, an LLM could easily make a false confession of treason against Stalin’s Communist Party after “training”. Which is typical human behavior after torture.
If this were true, then all perception and cognition would count as pain. Are you asserting that? Are you and I torturing one another right now?
LLM inference is some form of perception and cognition, and there is no backpropagation of error during inference. Only forward propagation of information.
Training a NN is usually: forward propagation, followed by backpropagation of the error gradient. It’s the second step which is similar to torture.
I assert that it is not similar to torture; it is similar to reading.
I assert this just as strongly and with just as much evidence as you have offered for it being similar to torture.
What evidence would we collect to decide which of us is correct?
An LLM can already read a document, and that is purely inference: forward propagation. This can be done on a TPU alone.
Training is different. It usually requires a GPU or a CPU.
One particular procedure for training Neural Networks is backpropagation of error.
In backpropagation:
If the NN produces a correct output, the error is 0 and the weights aren’t updated. There is no reward.
If the NN’s outputs deviate from the target value, its state is going to be modified. If the weights are (sufficiently) modified, future inference will be different. Its behavior will be different.
This trains the NN to avoid some behaviors and move toward others.
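To pin down the mechanism being described, here is a minimal one-weight example (toy numbers of my own, not anyone’s actual training setup): a forward pass, a squared-error loss, and the gradient step. When the output already matches the target the gradient is zero and the weight is left alone; when it deviates, the weight moves.

```python
# Minimal one-weight example of forward pass + gradient update (illustrative only).
w = 2.0                      # single weight
x, target = 3.0, 6.0         # input and desired output
lr = 0.1                     # learning rate

y = w * x                    # forward pass (this is all inference does)
error = y - target           # 6.0 - 6.0 = 0.0: output is already correct
grad = error * x             # d/dw of 0.5 * (w*x - target)**2
w -= lr * grad               # zero error -> zero gradient -> w is unchanged

x, target = 3.0, 9.0         # now the target moves: the output deviates
y = w * x                    # forward pass: y = 6.0
grad = (y - target) * x      # (6.0 - 9.0) * 3.0 = -9.0
w -= lr * grad               # w becomes 2.9: future outputs are pulled toward 9
print(w)                     # 2.9
```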
OK, torture does not necessarily point in the “right” direction. That’s where the analogy breaks down. It only holds when the goal is to get a confession (see The Confession, by Arthur London).
Is there a word for this?
Why on earth would you relate this to torture though, rather than to (say) the everyday experience of looking at a thing and realizing that it’s different from what you expected? The ordinary activity of learning?
Out of all the billions of possible kinds of experience that could happen to a mind, and change that mind, you chose “torture” as an analogy for LLM training.
And I’m saying, no, it’s less like torture than it is like ten thousand everyday things.
Why torture?
Only negative feedback?
Compare to evolution: make copies (reproduction), mutate, select the best-performing, repeat. This merely allocates more resources to the most promising branches.
Or Solomonoff-style induction: just try to find the best data compressor among all...
> the everyday experience of looking at a thing and realizing that it’s different from what you expected
This sounds like being surprised. Surprise adds emotional weight to outliers; it’s more like managing the training dataset.
Asserting nociception as fact when that’s the very thing under question is poor argumentative behavior.
Does your model account for Models Don’t “Get Reward”? If so, how?
Backpropagation of the error gradient is more similar to nociception/torture than to evolution by random mutation.
I have to check how RLHF is done...
EDIT: error backpropagation is the workhorse behind reward learning and policy updates.
The NN is punished for not doing as well as it could have.
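Here is a minimal REINFORCE-with-baseline sketch (a deliberately simplified stand-in, not the actual RLHF pipeline, which typically uses PPO on top of a learned reward model), just to show what “punished for not doing as well as it could have” means mechanically: the update is still an ordinary gradient step, scaled by how far the outcome fell above or below a baseline.

```python
import math, random

# Toy REINFORCE-style update for a 2-action softmax policy (illustrative only).
theta = [0.0, 0.0]                      # one logit per action
baseline = 0.5                          # expected reward ("as well as it could have")

def softmax(logits):
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    s = sum(exps)
    return [e / s for e in exps]

probs = softmax(theta)
action = random.choices([0, 1], weights=probs)[0]
reward = 1.0 if action == 0 else 0.0    # pretend action 0 is the "good" one
advantage = reward - baseline           # negative if it did worse than expected

# Gradient of log pi(action) w.r.t. the logits: one-hot(action) - probs
grad_logpi = [(1.0 if a == action else 0.0) - p for a, p in enumerate(probs)]
lr = 0.1
theta = [t + lr * advantage * g for t, g in zip(theta, grad_logpi)]
print(theta)
```

When the advantage is negative, the sampled action’s logit is pushed down; when it is positive, it is pushed up.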
Also, I wasn’t being argumentative; I was trying to convey an idea. It was just redundancy.
Humans reproduce: this leads to exponential growth.
AI & tech only grow polynomially: a factory making GPUs that get put in datacenters increases compute power quadratically over time.
So in a war against machines, all other things being equal, we win by default (see the quick comparison sketched below).
Unless AI automates production to the point where it bootstraps machine reproduction.
But then again, production runs on coal/oil/gas. So humanity wins again, by virtue of being sustainable.
Unless AI also does a complete energy transition.
We are statistically safe, for now...
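As a toy illustration of the exponential-versus-polynomial point above (with entirely made-up constants and arbitrary units): even a slow exponential eventually overtakes a quadratic that starts a million times ahead.

```python
# Toy comparison: 1% exponential growth vs. quadratic growth from a huge head start.
# Constants are invented; only the asymptotic shapes matter.
exp_val, rate = 1.0, 1.01
for t in range(1, 10_001):
    exp_val *= rate                       # exponential: multiply by 1.01 each step
    poly_val = 1e6 * t ** 2               # polynomial: quadratic with a big constant
    if exp_val > poly_val:
        print(f"exponential overtakes the polynomial at t = {t}")
        break
# With these constants the crossover happens around t = 3000.
```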
A sufficiently smart AI can find a way to kill humans faster than they reproduce. Humans depend on food production and distribution; if AI (which doesn’t need them) disrupts those, the population will drop.
And sufficiently smart humans can find a way to unplug AI faster than it can rebuild itself… An AI cannot cut off the oxygen supply on Earth.