It seems totally fixable somehow, but it’s been an open problem for a long time. From the current vantage point, it definitely looks like it’ll be one of the last things to fall. But maybe we’ll just accumulate enough unreliable workarounds that it’s no longer a severe limitation. I have ideas; hopefully they’re bad ones, because I’d rather this not get improved until we’ve figured out how to gain confidence in safety/alignment qualitatively faster than we can right now, enough that open-ended RL at test time can be assumed to be asymptotically safe.