Most likely the OP means this quick take by Ivan Vendrov. Davidad has come to believe that LLMs have been grokking the Natural Abstract Goodness (NAG), which is unlikely to be captured by existing benchmarks. While I do buy the idea that the NAG exists, I don’t understand how one could check that LLMs have really understood it.
Davidad also announced the other day that he’s leaving ARIA to pursue a research agenda focused on working with AIs on moral philosophy.
https://x.com/davidad/status/2039390998694891816