I’m very confused by “effective horizon length”. I have at least two questions:
1) what are the units of “effective horizon length”?
The definition “how much data the model must process …” suggests it is in units of information, and this is the case in the supervised learning extended example.
It’s then stated that effective horizon length has units of subjective seconds [1].
Then H, in the estimate of total training FLOP as F·H·K·P^α, has units of subjective seconds per sample.
2) what is the motivation for requiring a definition like this?
Working through the Fermi decomposition into F·H·K·P^α, the quantity that intuitively needs to be estimated is something like “subjective seconds per sample for a TAI to use the datapoint as productively as a human”. This seems quite removed from the perturbation definition, so I’d love some more motivation.
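To make my units confusion concrete, here is a quick dimensional sanity check of the decomposition as I understand it. The interpretations of F, K, and P^α below (and all the numbers) are my own guesses for illustration, not taken from the post:

```python
# Hypothetical dimensional check of total training FLOP = F * H * K * P^alpha.
# All interpretations and values below are assumptions for illustration:
#   F : FLOP per subjective second of model experience
#   H : effective horizon length, in subjective seconds per sample
#   K * P**alpha : number of training samples, scaling with parameter count P
F = 1e15      # FLOP / subjective-second (assumed)
H = 1.0       # subjective-seconds / sample (assumed)
K = 1e3       # dimensionless scale factor (assumed)
P = 1e12      # parameters (assumed)
alpha = 0.8   # scaling exponent (assumed)

samples = K * P**alpha        # units: samples
total_flop = F * H * samples  # (FLOP/subj-s) * (subj-s/sample) * samples = FLOP

print(f"total training FLOP ≈ {total_flop:.3e}")
```

The units only cancel to plain FLOP if H carries subjective-seconds-per-sample, which is why the “amount of data” vs. “subjective seconds” framing matters to me.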
Separately, in [4 of 4], the “hardware bottlenecked” link in the responses section is broken.
-----
[1] I presume it’s possible to convert between “amount of data” and “subjective seconds” by measuring the number of seconds the brain requires to process that much data. To me, though, this conversion is an implicit leap of faith.