I'm studying the effects of importance sampling on the behaviour that an RL
agent learns,
because I want to find out whether importance sampling can lead to undesirable outcomes,
in order to help my reader understand whether it can
solve the problem of widely varying rewards in reward engineering.
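To make concrete what I mean by importance sampling here, below is a minimal sketch of ordinary importance sampling for off-policy evaluation in a toy two-action bandit. Everything in it (the policies, the reward values, the sample count) is an illustrative assumption of mine, not part of the project description; the point is just to show how a rare, very large reward inflates the variance of the importance-weighted estimate, which is the kind of "widely varying rewards" issue I want to investigate.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy two-action bandit: action 0 gives a small reward, action 1 a very
# large one, so the reward scale varies widely across samples.
rewards = np.array([1.0, 100.0])

# Target policy pi (what we want to evaluate) and behaviour policy b
# (what actually generated the data), given as action probabilities.
pi = np.array([0.1, 0.9])   # target policy favours the large-reward action
b  = np.array([0.9, 0.1])   # behaviour policy rarely takes it

n_samples = 10_000
actions = rng.choice(2, size=n_samples, p=b)

# Importance weights rho = pi(a) / b(a) re-weight each sampled return so
# that the average estimates the expected return under pi, not under b.
rho = pi[actions] / b[actions]
returns = rewards[actions]

is_estimate = np.mean(rho * returns)
true_value = np.sum(pi * rewards)

print(f"true value under pi:         {true_value:.2f}")
print(f"importance-sampled estimate: {is_estimate:.2f}")
print(f"variance of weighted terms:  {np.var(rho * returns):.2f}")
```

The estimate is unbiased, but because the large reward is reached only through a big importance weight on a rarely taken action, the weighted terms have huge variance, which is exactly the behaviour I want to study in the learning setting.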