A possible longer-term issue with this: when future generations of models are pre-trained, this style of code will be a significant fraction of their training data, and that fraction will only grow over time. So just as it's been hard to get models out of "chatgpt-ese" for simulators-style reasons, it may also be hard to get models out of this messy-code basin, even before any code RL, once they realize they're chat models and are "supposed to" write like this.
I say issue because it does seem worse for the trend toward AI code being unreadable by humans to have momentum behind it, rather than just being a result of easily changeable RL fine-tuning.
On the plus side, it should be pretty easy to collect a lot of negative examples now of 'code that solves the problem, but in a gross way'. Having a large dataset of these is the first step toward using them to train models not to do this.
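As a rough illustration of what that first step could look like, here's a minimal sketch (all names, fields, and example entries are hypothetical) of pairing a readable solution with a messy one for the same task and writing the pairs out as a DPO-style preference dataset, where the gross-but-working code is the rejected sample:

```python
# Hypothetical sketch: build a preference dataset of (prompt, chosen, rejected)
# triples where "rejected" is code that solves the problem in a gross way.

import json
from dataclasses import dataclass, asdict


@dataclass
class PreferencePair:
    prompt: str     # the task the model was asked to solve
    chosen: str     # a readable solution
    rejected: str   # code that solves the problem, but in a gross way


def build_pairs(examples):
    """examples: iterable of (task, clean_code, gross_code) triples."""
    return [
        PreferencePair(prompt=task, chosen=clean, rejected=gross)
        for task, clean, gross in examples
    ]


if __name__ == "__main__":
    # Toy entry for illustration; real data would come from reviewed model outputs.
    examples = [
        (
            "Write a function that returns the n-th Fibonacci number.",
            "def fib(n):\n    a, b = 0, 1\n    for _ in range(n):\n        a, b = b, a + b\n    return a\n",
            "def fib(n):\n    return fib(n-1)+fib(n-2) if n>1 else n  # works, but exponential and redundant\n",
        ),
    ]
    with open("code_preferences.jsonl", "w") as f:
        for pair in build_pairs(examples):
            f.write(json.dumps(asdict(pair)) + "\n")
```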