I think having all of this in mind as you train is actually pretty important. That way, when something doesn’t work, you know where to look:
Am I exploring enough, or stuck always pulling the first lever? (free energy; see the sketch after this list)
Is it biased for some reason? (probably the metric)
Is it stuck not improving? (step or batch size)
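To make the free-energy point concrete, here's a minimal sketch on a toy 3-armed bandit (the arm payoffs, temperature, and horizon are all invented for illustration). A softmax policy is the maximizer of expected reward plus temperature × entropy, so the temperature knob is exactly the "am I exploring enough?" dial:

```python
import numpy as np

rng = np.random.default_rng(0)

true_means = np.array([0.3, 0.5, 0.8])  # hypothetical arm payoffs
q = np.zeros(3)                          # running value estimates
counts = np.zeros(3)
temperature = 0.5                        # entropy weight: lower => greedier

for t in range(2000):
    # Softmax policy: the maximizer of E[q] + temperature * entropy.
    logits = q / temperature
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    arm = rng.choice(3, p=probs)

    reward = rng.normal(true_means[arm], 0.1)
    counts[arm] += 1
    q[arm] += (reward - q[arm]) / counts[arm]  # incremental mean update

# If one fraction is ~1.0 while its value estimate is mediocre,
# you're stuck pulling a lever.
print("pull fractions:", counts / counts.sum())
print("value estimates:", q)
```

Dropping `temperature` toward zero recovers the stuck-on-the-first-lever failure; raising it trades reward for exploration.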
Weight initialization isn’t too helpful to think about yet (other than avoiding explosions at the very beginning of training, and maybe a little for transfer learning), but we’ll probably get hypernetworks within a few years.
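On the explosions parenthetical, a quick sketch (my construction, not from the comment) of why variance-preserving Kaiming-style init helps: push noise through a deep ReLU stack and compare activation scales under a naive unit-variance init versus std = sqrt(2 / fan_in). Depth and width are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)

def activation_scale(init_std, depth=20, width=256):
    """Mean |activation| after `depth` random ReLU layers with the given init std."""
    x = rng.normal(size=width)
    for _ in range(depth):
        W = rng.normal(scale=init_std, size=(width, width))
        x = np.maximum(0.0, W @ x)  # ReLU
    return float(np.abs(x).mean())

print("naive std=1.0    :", activation_scale(1.0))                # grows ~sqrt(width/2)x per layer
print("Kaiming sqrt(2/n):", activation_scale(np.sqrt(2 / 256)))   # stays O(1)
```

The naive run blows up by roughly a factor of sqrt(width/2) per layer, which is the explosion at the very beginning of training; a variance-preserving init removes it.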
That may be true[1]. But it doesn’t seem like a particularly useful answer?
“The optimization target is the optimization target.”
[1] For the outer optimiser that builds the AI.