Maybe models track which features are basic and enforce that these features be more salient
Couldn’t it just write derivative features more weakly, and therefore not need any tracking mechanism other than the magnitude itself?
Some features which are computed from other features should probably themselves be treated as basic and thus represented with large salience.
Couldn’t it just write derivative features more weakly, and therefore not need any tracking mechanism other than the magnitude itself?
Some features which are computed from other features should probably themselves be treated as basic and thus represented with large salience.