I can’t help but think this is machine-generated. Anyone know the link to that utility MIT concocted for detecting machine-generated text?
Hmm… I think this one at Indiana University is the one I was thinking of: Inauthentic paper detector
This post comes out as inauthentic, with a 35% chance of being authentic.
However, Robin Hanson’s most recent post comes out as inauthentic, with a 16% chance of being authentic, so maybe this doesn’t work as well as I remember.
That authenticity detector is partly based on article length, I believe. I tried testing some posts, and they came out as inauthentic. I then pasted a series of posts one after another, and the authenticity score increased significantly.
Yes—since it’s based partially on length and repetition, one could initially fool it by pasting the same machine-generated text twice in a row. They put in a cheap hack to prevent this by explicitly checking for it; I imagine it’s still easy to fool.
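For what it's worth, here is a rough guess at what a check like that might look like. This is only a sketch: the function name `is_doubled` and the word-level comparison are my own assumptions, not the detector's actual code, which I haven't seen.

```python
def is_doubled(text: str) -> bool:
    """Return True if the text is one passage pasted twice in a row.

    Hypothetical check: split on whitespace and compare the first
    half of the words with the second half.
    """
    words = text.split()
    half, remainder = divmod(len(words), 2)
    # Empty input or an odd word count cannot be an exact doubling.
    if half == 0 or remainder:
        return False
    return words[:half] == words[half:]
```

A check this literal would catch an exact copy-and-paste, but appending or changing a single word in the second copy gets past it, which is why I'd expect the real thing to be easy to fool too.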
I tried the three Less Wrong posts before this one, and it classified two of them as inauthentic and one of them as too short to test. I haven’t found anything that it considers authentic, so I’d call it a broken detector.
In fairness, it was designed specifically for scientific papers, so I’m not sure blog posts should be expected to have the same sort of structure. I tried some old philosophical academic papers of mine, and they came up in the 80% range (authentic).