I'm studying ways to improve the sample efficiency of a supervised learner,
because I want to know how to reduce the number of calls to H in
‘Supervising strong learners by amplifying weak experts’
(https://www.lesswrong.com/s/EmDuGeRw749sD3GKd/p/xKvzpodBGcPMq7TqE),
in order to help my reader understand how we can adapt that
proof-of-concept for solving real-world tasks that require even more
training data.
- This doesn’t just mean achieving more with the samples we have. It can also mean
finding new kinds of samples that convey more information, and finding new
ways of extracting them from the human and conveying them to the learner.
rmoehn comments on Which of these five AI alignment research projects ideas are no good?