We also went down a similar rabbit hole when trying to build off the paper “Language Models Learn to Mislead Humans via RLHF”, and for what it’s worth, it took far more work than 10-25 hours. If you’re interested, we ended up writing up our results in this post.
The original comment says 10-25, not 10-15, but to respond directly to the concern: my original estimate was for how long it would take to set everything up and get a sense of how robust a given paper’s findings are. Writing everything up, communicating back and forth with the original authors, and fact-checking would admittedly take more time.
Also, excited to see the post! Would be interested in speaking with you further about this line of work.