I think the products of optimisation for the task of minimising predictive loss on sufficiently large and diverse datasets (e.g. humanity’s text corpus) converge to general intelligence.
Could you expand on what you mean by general intelligence, and how it gets created or selected for by the task of minimising predictive loss on sufficiently large and diverse datasets like humanity’s text corpus?
This is the part I’ve not yet written up in a form I endorse.
I’ll try to get it done before the end of the year.