Actually, I wonder if we could run an experiment along these lines:
1. Collect Wikipedia in English
2. Collect Wikipedia in some other language, e.g. Japanese
3. Train an LLM on these two languages
4. Try it out!
It’s true that there will be some amount of overlap, but this should put a ceiling on how well this approach could work.
I’m confused: your examples for the IVT reference “lumpy” functions. I’m not exactly sure what that means, but it seems like you mean functions with discrete, sharp steps. Such a function would be discontinuous, and the IVT only applies to functions that are continuous on the interval in question.
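
Here is a rough sketch of what I mean, assuming "lumpy" means something like a step function (that is my guess at your meaning, not something you stated). The functions `step`, `smooth`, and `bisect_for_value` are names I made up for illustration. Both functions go from 0 to 1 on [0, 1], but only the continuous one actually takes the intermediate value 0.3:

```python
def step(x: float) -> float:
    """Hypothetical 'lumpy' function: jumps from 0 to 1 at x = 0.5 (discontinuous)."""
    return 0.0 if x < 0.5 else 1.0


def smooth(x: float) -> float:
    """A continuous function with the same endpoint values on [0, 1]."""
    return x


def bisect_for_value(f, a: float, b: float, target: float, iters: int = 60) -> float:
    """Bisection search for f(x) = target on [a, b], assuming f(a) <= target <= f(b).

    For a continuous f, the IVT guarantees a solution exists inside the bracket.
    For a discontinuous f, the bracket can simply close in on a jump instead.
    """
    lo, hi = a, b
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


target = 0.3
x_smooth = bisect_for_value(smooth, 0.0, 1.0, target)
x_step = bisect_for_value(step, 0.0, 1.0, target)

# Continuous case: the search lands on a point where the function equals the target.
print(x_smooth, smooth(x_smooth))
# Step case: the search converges to the jump, and the function value there is
# still 0 or 1, never the target.
print(x_step, step(x_step))
```

The step function has f(0) = 0 and f(1) = 1 but never equals 0.3 anywhere, so the conclusion of the IVT fails for it. That is not a counterexample to the theorem, because the continuity hypothesis is not met in the first place.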