Great post! Very insightful, since I’m currently working on evaluating identity management; strong upvoted.

This seems focused on evaluating LLMs; what do you think about working with LLM cognitive architectures (LMCAs), i.e. wrappers like Auto-GPT, LangChain, etc.?

I’m currently operating under the assumption that this is a way we could get AGI “early”, so I’m focusing on researching ways to align LMCAs, which seems somewhat different from aligning LLMs in general.
It would be great to talk about LMCA evals :)