I realized the reference to a “thin layer” is ambiguous in my post, so I just wanted to confirm whether you were referring to the general case, “thin model, fat services”, or to the specific safety question at the bottom: “is it possible to have a thin mapping layer on top of your Physics simulator that somehow subverts or obfuscates it”? My child reply assumed the former, but on re-reading I suspect the latter might be more likely?
Thinking about ways in which this safety margin could break: is it possible to have a thin mapping layer on top of your Physics simulator that somehow subverts or obfuscates it?
I suppose that a mapping task might fall under the heading of a mesa-optimizer, where what it is doing is optimizing for fidelity between the outputs of the language layer and the inputs of the physics layer. This would be in addition to the mesa-optimization going on just in the physics simulator. Working title:
CAIS: The Case For Maximum Mesa Optimizers