Thank you for writing this out.
Lacking any computer science background (I come from philosophy of mind, phenomenology, ethology (animal behaviour), psychology, and neuroscience), I think both that this perspective gives me a unique take and that anything I do on AI will be effectively worthless unless I gain an elementary technical understanding. I agree with the point on diminishing returns, and that technical expertise is clearly what I would profit from most here. I managed to get hosted as a visiting researcher and thesis supervisor for AI in computer science, and I have people close to me with a computer science background whom I could draw on for help, though they often lack the bird's-eye view needed to distinguish what is important from what is unnecessary detail.
I’m currently particularly interested in Large Language Models, and I think they might be the best entry point for me, insofar as I can interact with them competently without a programming background and review their training data. I would really like to understand how they work beyond the pop-science articles about statistical parroting; basically, I want enough of an understanding of their architecture to contrast it with the biological models I am more familiar with. Ideally, I could learn about them while interacting with them; LLMs can absolutely help you learn and debug code, for instance, as well as explain some things, but with a massive risk of them hallucinating and of me not having the expertise to spot it.
Do you have advice on where to start on this? Which skills and knowledge are absolutely non-skippable? Which simpler models might I start with to get a better intuition of what is going on? (I frankly do not get how LLMs can possibly do what they do, based on how their working mechanism has been explained to me.) Are there breakdowns for laypeople that get it right? I would be seriously grateful.