′ newcom’, ‘slaught’, ‘senal’ and ‘volunte’
I think these could be a result of a simple stemming algorithm:
newcomer → newcom
volunteer → volunte
senaling → senal
Stemming can be used to preprocess text and to create indexes in information retrieval.
Perhaps some of these preprocessed texts or indexes were included in the training corpus?
I think these could be a result of a simple stemming algorithm:
newcomer → newcom
volunteer → volunte
senaling → senal
Stemming can be used to preprocess text and to create indexes in information retrieval.
Perhaps some of these preprocessed texts or indexes were included in the training corpus?