How do you think we can prepare in advance to study the safety of such systems?
That’s kind of the overall purpose behind my project, which aims to study how machine learning and human learning overlap. The idea is to find where the theoretical overlap is, then combine that with empirical evidence to show how easy or difficult things like lie detection or value transfer are.