Many thanks for sharing your reflections. I found them very valuable and agree with most of them fully or in large part. You are much closer to LLM foundation-model development than I am, so take this as a humble impression from my side (I have been an intensive user in the data space since early 2022, mainly for understanding things and society/policy/economy developments, and we plan to integrate LLMs into a SaaS we want to develop). I was also surprised to read that you were already active in the 80s (which makes you even older than me, born in 1968 😉).
Anyway, I wanted to share some reflections back with you, inserted after the point numbers from your text (the form reformatted what I pasted in and dropped the quotations, my apologies). They mostly concern Sample-Efficient Learning, with a bit on other points. If you find them useful, please use them; I would also be happy about your feedback:
Sample-Efficient Learning
On 10. and 11.
Marc: I think it is a contributor: filtering out fluff. But intelligence is also, and even more so, about combining and transferring what we have learned in comprehensive and complex ways.
12.
Marc: Possibly even largely so (with the filtering and the interpreting & classifying during intake being a key part, I think).
13. and 14.
Marc: This fits my interpretation above: GPUs see all the pixels, but they don't interpret/filter the pattern; they store everything, and the pattern emerges (very inefficiently) from the sheer number of pixel “bags”.
15.
Marc: That is not the same, of course. It can obviously compensate for a lot, though, much as thinking does, very effectively and also rather efficiently; but hallucinations and easily losing context/track (more than smart humans would) appear to be the inherent price to pay.
16.
Marc: In my experience, humans can do this; I have worked across domains and found it to be true for myself.
18.
Marc: This fits my interpretation above: many more facts/data, but not explicitly filtered/interpreted & classified during ingestion.
19.
Marc: I think the causes are context pollution from inner reflection, the context window size, and then getting lost in complexity: too many different sub-contexts mixed up in a single “thinking”, where each wrong turn derails the whole effort, due to a lack of de-pollution measures and a lack of retro-inspection from outside the active context window. At least that is my understanding of how this is currently done, and it could even be changed rather easily! I wish I had moved into the AI field as a developer; I feel this fits my way of thinking and the kind of problems to solve.
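To make that concrete: here is a minimal sketch, under my own assumptions, of what “de-pollution” (compressing the accumulated trace) plus “retro-inspection” from outside the active context could look like. Every function name and the whole control flow are my illustration only, not how any current system actually works:

# Hypothetical sketch: a reasoning loop with periodic "de-pollution"
# (compressing the trace) and "retro-inspection" by a critic that sees
# a clean summary instead of the polluted active context.
# All functions below are illustrative stubs, not a real LLM API.

def generate_step(context: str) -> str:
    """Stub for one 'thinking' step given the current active context."""
    return f"<step derived from ...{context[-30:]}>"

def summarize(trace: list[str]) -> str:
    """Stub de-pollution: compress the full trace into a short summary."""
    return " | ".join(entry[:25] for entry in trace)

def looks_on_track(summary: str, goal: str) -> bool:
    """Stub critic outside the active context window: checks the summary
    against the goal (a trivial keyword test as a placeholder)."""
    return goal.split()[0] in summary

def solve(goal: str, max_steps: int = 9, check_every: int = 3) -> str:
    trace = [goal]
    context = goal
    for step in range(1, max_steps + 1):
        context = context + " " + generate_step(context)
        trace.append(context)
        if step % check_every == 0:
            summary = summarize(trace)
            if looks_on_track(summary, goal):
                # De-pollution: continue from the compressed summary
                # instead of the ever-growing mix of sub-contexts.
                context = summary
            else:
                # Retro-inspection caught a wrong turn: roll back so one
                # derailment does not poison every later step.
                trace = trace[:-check_every]
                context = trace[-1]
    return context

if __name__ == "__main__":
    print(solve("prove the lemma holds for n > 2"))

Nothing here depends on a particular model; the point is only that the inspection happens on a summary outside the active window, which is the part I believe could be changed rather easily.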
30.
Marc: I would have some ideas on how to improve this situation in LLMs; the tricks I am aware of today are arguably unsuitable (here humans are indeed better, but I think something similar can be achieved, even with a mix of LLM & ML).
36.
Marc: Indeed. There are even more reasons why this graphic does not really tell us much (yes, I should name them: perhaps CAPEX/OPEX ratios, available cash to invest, narrowness of topic, scale of economic expectations, who finances it, etc.; but you already named several, so that should suffice). Still, putting things in perspective is always compelling (clearly for me too, even if only by making me realise that other perspectives would be needed).
Another thought: I think AGI and ASI are not defined/understood as they should be; the current framing is overly anthropocentric. But why?
Note: No LLM was abused, or even used, in writing this feedback 😉