Thank you for sharing! I was wondering how closely this mirrors human studies. This may be a learned tendency from pre-training.
Hey Christian, thank you for your thoughtful comment. Yes, I remember hearing about such studies as well, and you are correct: they would imply that engineers are more bullish on the usefulness of models than is merited, which is difficult to reconcile with the general tone of my comment (that the engineers around me seem to underplay how much they themselves use models and to downplay the models' capabilities).
I can think of a few possibilities, though, that might help reconcile the two:
1. Maybe the effects that I have observed are not representative of the field as a whole. I have noticed that junior developers and startup founders are generally much more open about their LLM usage and much more bullish on model capabilities than senior developers at large companies.
2. It is possible this is a social effect, i.e., engineers today do think that LLMs are capable but downplay this fact in public. If, for example, the studies that you cite are based on anonymous feedback, then that discrepancy would be consistent with this effect: engineers will reply anonymously that they are more productive with models but will publicly downplay it.
3. Models have gotten a lot better. I would be interested to see whether those studies would yield the same results today as they did then. I personally would agree that a year ago models were probably a net negative on productivity, especially for senior developers (they would introduce subtle bugs that are a pain to debug). Today, though, I think they are certainly a net productivity gain, and it would be interesting to see whether developers today are undervaluing the productivity gains from models.
These are just a few thoughts I had; I would be interested in hearing what you think about them. Ultimately this is only my experience, and I would be interested to know what others' experiences are.
I have seen a softer version of this in software engineering. Even though using AI would not count as "cheating" in software engineering, many engineers will still actively downplay the degree to which they rely on models for their work. I have also noticed a general reluctance to acknowledge when models perform well and an eagerness to highlight when they perform poorly. I think this is partially driven by job insecurity (if you acknowledge that a model is better at coding than you are, why should a company keep you on the payroll?) and partially by feelings of inferiority (people will see me as a fraud, etc.). While such behaviors and feelings are understandable, I do think they are fundamentally disempowering and come from a place of weakness and insecurity.
Hey Eliezer, I am a big fan of your work. I did have one idea about this and would be interested in hearing your thoughts. I liked your example about the Maginot Line: if one German division crosses into France, you do not have time to build a new, better Maginot Line before the next division arrives. This is both true and pretty funny. Here, the reason you cannot rely on the "gradualness of the problem" as a source of hope is that the intervention (building the Maginot Line) takes orders of magnitude more time to implement than the developing problem (German divisions moving into France). For AI takeoff, though, perhaps the unfolding problem takes long enough to develop that interventions (a worldwide ban on training frontier models, for example) actually can work mid-disaster. I would be interested in hearing your thoughts on this. Thank you!
Hey there! Thank you very much for your comment. Yep, I do think all the links you included are highly relevant to this post.
Interesting read, thank you for sharing. I think I would distinguish between two “optimization processes” that have molded the fates of living things on this planet (I don’t love the term “optimization process” but it will do).
1. Evolution by natural selection which selects for “fitness”
2. Human intentional selection, which selects for usefulness to humans.
As a general principle, I do think you are onto something with the idea of a "honeymoon phase". The general pattern is: a new niche opens up, one species moves into the niche and dominates it (the honeymoon phase), then other species move into the niche, making it much less comfortable and much more competitive than it used to be.
I do think, though, that with the emergence of humans on the planet, the second optimization process is dominant. As you mentioned with the example of the chickens, I do think it makes sense to translate the idea of a honeymoon phase to this optimization process as well, but we must account for the fact that, unlike with natural selection, what is driving the train now is human preferences. This makes me more optimistic than you are that, barring major cataclysms or the emergence of superhuman AIs, humanity will have a bright future.
One last note: if AI development continues and we do enter a post-human AI age, then the world might be molded by what the AIs want rather than what we want. In that world, humans might indeed enter a honeymoon phase where the AIs treat us well at first and then very poorly afterward (like the chickens).
Would be interested in hearing your thoughts on this, and thanks again for sharing!