″ This result is independently interesting as one solution to the problem of safe exploration with limited oversight in nonergodic environments, which [Amodei et al., 2016] discus ”
^ This wasn’t super clear to me.… maybe it should just be moved somewhere else in the text?
I’m not sure what you’re saying is interesting here. I guess it’s the same thing I found interesting, which is that you can get sufficient (and safe-as-a-human) exploration using the human-does-the-exploration scheme you propose. Is that what you mean to refer to?
Yeah that’s what I mean to refer to: this is a system which learns everything it needs to from the human while querying her less and less, which makes human-lead exploration viable from a capabilities standpoint. Do you think that clarification would make things clearer?
″ This result is independently interesting as one solution to the problem of safe exploration with limited oversight in nonergodic environments, which [Amodei et al., 2016] discus ”
^ This wasn’t super clear to me.… maybe it should just be moved somewhere else in the text?
I’m not sure what you’re saying is interesting here. I guess it’s the same thing I found interesting, which is that you can get sufficient (and safe-as-a-human) exploration using the human-does-the-exploration scheme you propose. Is that what you mean to refer to?
Yeah that’s what I mean to refer to: this is a system which learns everything it needs to from the human while querying her less and less, which makes human-lead exploration viable from a capabilities standpoint. Do you think that clarification would make things clearer?