If it can assign meaning to states, then sure, why not? Currently this comes with plenty of caveats, so it kind of depends on how much of a stickler you want to be about principledness and effectiveness.
Sometimes “deciding” etc. is represented directly in the activations, which is kind of trivial. So you might also be asking about interpreting the parameters of the AI that transform one state into another. Keywords might be circuit interpretability, automated circuit discovery, or parameter decomposition.