If the reward channel carries only one bit per day, I don’t think any agent can infer much about the authors. Their days, maybe. Some fundamental components of their preferences, possibly. But nothing like what a human could infer from all the bits of background he possesses. There are convergence rate results for classifiers showing that they need far too many samples to extract enough information, especially in the face of real-life feature vectors.
I’d assume there would be a reward for every story, that it would be on an ordinal scale with several options, and that it would include feedback/corrections about grammar and phrasing.
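To put rough numbers on the difference between the two channels, here is a back-of-the-envelope sketch. It only computes a noiseless upper bound (signals times log2 of the number of options), and the one-year horizon, the three stories per day, and the 5-point scale are all my own illustrative assumptions, not anything stated above. The helper name `max_bits` is likewise made up for this example.

```python
import math


def max_bits(num_signals: int, num_options: int) -> float:
    """Upper bound, in bits, on what num_signals rewards can carry
    when each reward takes one of num_options distinguishable values.
    Real rewards are noisy and correlated, so the true figure is lower."""
    return num_signals * math.log2(num_options)


# One binary reward per day, over an assumed horizon of one year.
binary_daily = max_bits(365, 2)

# Hypothetical richer channel: 3 stories per day, each rated on a
# 5-point ordinal scale (both numbers are assumptions for illustration).
ordinal_per_story = max_bits(365 * 3, 5)

print(f"binary, daily:      at most {binary_daily:.0f} bits per year")
print(f"ordinal, per story: at most {ordinal_per_story:.0f} bits per year")
```

Even the richer channel tops out at a few thousand bits per year under these assumptions, and the bound leaves out the grammar and phrasing corrections, which would carry additional information that is harder to count this way.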