This seems to entirely ignore the actual point being made in the post. The point is that “IGF” is not a stable and contentful loss function; it is a misleadingly simple shorthand for “whatever traits are increasing their own frequency at the moment.” Once you see this, you notice two things:
1. In some weak sense, we are fairly well “aligned” to the “traits” that were selected for in the ancestral environment, in particular our social instincts.
2. All of the ways in which ML is disanalogous with evolution indicate that alignment will be dramatically easier and better for ML models. For starters, we don’t randomly change the objective function for ML models throughout training. See Quintin’s post for many more disanalogies.
The main problem I have with this type of reasoning is the arbitrarily drawn ontological boundaries. Why is IGF “not real” while the ML objective function is “real”? If we really zoom in on the training process, the training goal that is real in a brutally positivist, verifiable sense is “whatever direction in coefficient space decreases the loss on the current batch of data”, which seems to me to correspond pretty closely to “whatever traits are spreading in the current environment”.
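To make that operational claim concrete, here is a minimal sketch of a standard SGD loop (plain PyTorch, with hypothetical random data standing in for a real dataset). The only objective the optimizer ever interacts with is the loss on the current minibatch, so the “direction of decrease” in coefficient space shifts every time the batch shifts:

```python
import torch

# Toy model and optimizer; the point is only to make the per-batch
# objective explicit, not to train anything useful.
model = torch.nn.Linear(10, 1)
opt = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = torch.nn.MSELoss()

for step in range(100):
    # A fresh minibatch each step (hypothetical random data here).
    x = torch.randn(32, 10)
    y = torch.randn(32, 1)

    # The quantity actually being minimized at this step is the loss
    # *on this batch*. Its gradient -- the "direction in coefficient
    # space the loss decreases on the current batch of data" -- changes
    # whenever the batch changes, just as the traits that spread change
    # with the current environment.
    loss = loss_fn(model(x), y)
    opt.zero_grad()
    loss.backward()
    opt.step()
```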