As has been observed by other commenters, the argument fails to take into account runtime limitations—in the real world programs can self improve by finding (provably) faster programs that (provably) perform the same inferences that they do, which most people would consider self improvement. However the argument may be onto something: it is indeed true that a program p cannot output a program q with K(q) > K(p) by more than a constant (there is a short program which simulates any input program and then runs its output). Here K(p) is the length of the shortest program with the same behavior as p—in this case we seem to require p to both output another program q and learn to predict a sequence. It is also true that a high level of Kolmogorov complexity is required to eventually predict all sequences up to a high level of complexity: https://arxiv.org/pdf/cs/0606070.pdf. The real world implications of this argument are probably lessened by the fact that predictors are embedded, and can improve their own hardware or even negotiate with other agents for provably superior software.
As has been observed by other commenters, the argument fails to take into account runtime limitations—in the real world programs can self improve by finding (provably) faster programs that (provably) perform the same inferences that they do, which most people would consider self improvement. However the argument may be onto something: it is indeed true that a program p cannot output a program q with K(q) > K(p) by more than a constant (there is a short program which simulates any input program and then runs its output). Here K(p) is the length of the shortest program with the same behavior as p—in this case we seem to require p to both output another program q and learn to predict a sequence. It is also true that a high level of Kolmogorov complexity is required to eventually predict all sequences up to a high level of complexity: https://arxiv.org/pdf/cs/0606070.pdf.
The real world implications of this argument are probably lessened by the fact that predictors are embedded, and can improve their own hardware or even negotiate with other agents for provably superior software.