I’m sympathetic to most prosaic alignment work being basically streetlighting. However, I think there’s a nirvana fallacy going on when you claim that the entire field has gone astray. It’s easiest to illustrate what I mean with an analogy to capabilities.
In capabilities land, there were a bunch of old school NLP/CV people who insisted that there’s some kind of true essence of language or whatever that these newfangled neural network things weren’t tackling. The neural networks are just learning syntax, but not semantics, or they’re ungrounded, or they don’t have a world model, or they’re not representing some linguistic thing, so therefore we haven’t actually made any progress on true intelligence or understanding etc etc. Clearly NNs are just progress on the surface appearance of intelligence while actually just being shallow pattern matching, so any work on scaling NNs is actually not progress on intelligence at all. I think this position has become more untenable over time. A lot of people held onto this view deep into the GPT era but now even the skeptics have to begrudgingly admit that NNs are pretty big progress even if additional Special Sauce is needed, and that the other research approaches towards general intelligence more directly haven’t done better.
It’s instructive to think about why this was a reasonable thing for people to have believed, and why it turned out to be wrong. It is in fact true that NNs are kind of shallow pattern matchy even today, and that literally just training bigger and bigger NNs eventually runs into problems. Early NNs—heck, even very recent NNs—often have trouble with relatively basic reasoning that humans have no problem with. But the mistake is assuming that no progress has been made on “real” intelligence just because no NN so far has perfectly replicated all of human intelligence. Oftentimes, progress towards the hard problem does not immediately look like tackling the meat of the hard problem directly.
Of course, there is also a lot of capabilities work that is actually just completely useless for AGI. Almost all of it, in fact. Walk down the aisles at NeurIPS and a minimum of 90% of the papers will fall into this category. A lot of it is streetlighting capabilities in just the way you describe, and does in fact end up completely unimpactful. Maybe this is because all the good capabilities work happens in labs nowadays, but it was true even at earlier NeurIPSes, back when all the capabilities work got published. Clearly, a field can be simultaneously mostly garbage and still make alarmingly fast progress.
I think this is true for basically everything—most work will be crap (often predictably so ex ante), due in part to bad incentives, and then there will be a few people who still do good work anyways. This doesn’t mean that any pile of crap must have some good work in there, but it does mean that you can’t rule out the existence of good work solely by pointing at the crap and the incentives for crap. I do also happen to believe that there is good work in prosaic alignment, but that goes under the object level argument umbrella, so I won’t hash it out here.
I think you have two main points here, which require two separate responses. I’ll take them in the opposite order from how you presented them.
Your second point, paraphrased: 90% of anything is crap, that doesn’t mean there’s no progress. I’m totally on board with that. But in alignment today, it’s not just that 90% of the work is crap, it’s that the most memetically successful work is crap. It’s not the raw volume of crap that’s the issue so much as the memetic selection pressures.
Your first point, paraphrased: progress toward the hard problem does not necessarily immediately look like tackling the meat of the hard problem directly. I buy that to some extent, but there are plenty of cases where we can look at what people are doing and see pretty clearly that it is not progress toward the hard problem, whether direct or otherwise. And indeed, I would claim that prosaic alignment as a category is a case where people are not making progress on the hard problems, whether direct or otherwise. In particular, one relevant criterion to look at here is generalizability: is the work being done sufficiently general/robust that it will still be relevant once the rest of the problem is solved (and multiple things change in not-yet-predictable ways in order to solve the rest of the problem)? See e.g. this recent comment for an object-level example of what I mean.
in capabilities, the most memetically successful things were for a long time not the things that actually worked. for a long time, people would turn their noses up at the idea of simply scaling up models because it wasn’t novel. the papers which are in retrospect the most important did not get much attention at the time (e.g. gpt2 was very unpopular among many academics; the Kaplan scaling laws paper went almost completely unnoticed when it came out; even the gpt3 paper went under the radar when it first came out).
one example of a thing within prosaic alignment that i feel has the possibility of generalizability is interpretability. again, if we take the generalizability criterion and map it onto the capabilities analogy, it would be something like scalability—is this a first step towards something that can actually do truly general reasoning, or is it just a hack that will no longer be relevant once we discover the truly general algorithm that subsumes the hacks? if it is on the path, can we actually shovel enough compute into it (or its successor algorithms) to get to agi in practice, or do we just need way more compute than is practical? and i think at the time of gpt2 these were completely unsettled research questions! it was actually genuinely unclear whether writing articles about ovid’s unicorn is a genuine first step towards agi, or just some random amusement that will fade into irrelevancy. i think interp is in a similar position where it could work out really well and eventually become the thing that works, or it could just be a dead end.
If you’re thinking mainly about interp, then I basically agree with what you’ve been saying. I don’t usually think of interp as part of “prosaic alignment”, it’s quite different in terms of culture and mindset and it’s much closer to what I imagine a non-streetlight-y field of alignment would look like. 90% of it is crap (usually in streetlight-y ways), but the memetic selection pressures don’t seem too bad.
If we had about 10x more time than it looks like we have, then I’d say the field of interp is plausibly on track to handle the core problems of alignment.
ok good that we agree interp might plausibly be on track. I don’t really care to argue about whether it should count as prosaic alignment or not. I’d further claim that the following (not exhaustive) are also plausibly good (I’ll sketch each out for the avoidance of doubt because sometimes people use these words subtly differently):
model organisms—trying to probe the minimal sets of assumptions needed to get various hypothesized spicy alignment failures seems good. what is the least spoonfed demonstration of deceptive alignment we can get that is analogous mechanistically to the real deal? to what extent can we observe early signs of the prerequisites in current models? which parts of the deceptive alignment arguments are most load-bearing?
science of generalization—in practice, why do NNs sometimes generalize and sometimes not? why do some models generalize better than others? In what ways are humans better or worse than NNs at generalizing? can we understand this more deeply without needing mechanistic understanding? (all closely related to ELK)
goodhart robustness—can you make reward models which are calibrated even under adversarial attack, so that when you optimize them really hard, you at least never catastrophically goodhart them?
scalable oversight (using humans, and possibly giving them a leg up with e.g. secret communication channels between them, and rotating in different humans when we need to simulate amnesia) - can we patch all of the problems with e.g. debate? can we extract higher quality work out of real-life misaligned expert humans for practical purposes (even if it’s maybe a bit cost-uncompetitive)?
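The goodhart-robustness item above can be made concrete with a toy simulation. This is my own sketch, not anything from the thread: it models a proxy reward as the true value plus heavy-tailed estimation error, then applies best-of-n selection on the proxy. Under mild selection the proxy is informative, but as optimization pressure grows the argmax candidate is increasingly a proxy-error outlier, so the selected candidate’s true value stalls while its proxy score keeps climbing—the regressional flavor of goodharting that calibrated-under-adversarial-pressure reward models are meant to prevent.

```python
# Toy sketch of regressional goodharting (illustrative only):
# proxy reward = true value + heavy-tailed estimation error.
import random

random.seed(0)

def selected_scores(n, trials=500):
    """Best-of-n selection on the proxy; returns (avg true value,
    avg proxy value) of the selected candidate."""
    true_sum = proxy_sum = 0.0
    for _ in range(trials):
        true_vals = [random.gauss(0.0, 1.0) for _ in range(n)]
        # cubing a gaussian gives occasional huge overestimates
        proxies = [t + random.gauss(0.0, 1.0) ** 3 for t in true_vals]
        i = max(range(n), key=proxies.__getitem__)
        true_sum += true_vals[i]
        proxy_sum += proxies[i]
    return true_sum / trials, proxy_sum / trials

for n in (2, 10, 100, 1000):
    t, p = selected_scores(n)
    print(f"best-of-{n:<4}  avg true {t:+.2f}   avg proxy {p:+.2f}")
```

As n grows, the gap between proxy and true value of the winner widens sharply: strong optimization against the proxy mostly selects for proxy error, which is exactly the failure mode a goodhart-robust (calibrated) reward model would have to avoid.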
All four of those I think are basically useless in practice for purposes of progress toward aligning significantly-smarter-than-human AGI, including indirectly (e.g. via outsourcing alignment research to AI). There are perhaps some versions of all four which could be useful, but those versions do not resemble any work I’ve ever heard of anyone actually doing in any of those categories.
That said, many of those do plausibly produce value as propaganda for the political cause of AI safety, especially insofar as they involve demoing scary behaviors.
EDIT-TO-ADD: Actually, I guess I do think the singular learning theorists are headed in a useful direction, and that does fall under your “science of generalization” category. Though most of the potential value of that thread is still in interp, not so much black-box calculation of RLCTs.
I think we would all be interested to hear you elaborate on why you think these approaches have approximately no value. Perhaps this will be in a follow-up post.
All four of those I think are basically useless in practice for purposes of progress toward aligning significantly-smarter-than-human AGI, including indirectly (e.g. via outsourcing alignment research to AI).
It’s difficult for me to understand how this could be “basically useless in practice” for:
scalable oversight (using humans, and possibly giving them a leg up with e.g secret communication channels between them, and rotating different humans when we need to simulate amnesia) - can we patch all of the problems with e.g debate? can we extract higher quality work out of real life misaligned expert humans for practical purposes (even if it’s maybe a bit cost uncompetitive)?
It seems to me you’d want to understand and strongly show how and why different approaches here fail, and in any world where you have something like “outsourcing alignment research” you want some form of oversight.
Thanks for the list! I have two questions:
1: Can you explain how generalization of NNs relates to ELK? I can see that it can help with ELK (if you know a reporter generalizes, you can train it on labeled situations and apply it more broadly) or make ELK unnecessary (if weak-to-strong generalization works perfectly and we never need to understand complex scenarios). But I’m not sure if that’s what you mean.
2: How is goodhart robustness relevant? Most models today don’t seem to use reward functions in deployment, and in training the researchers can control how hard they optimize these functions, so I don’t understand why they necessarily need to be robust under strong optimization.
there are plenty of cases where we can look at what people are doing and see pretty clearly that it is not progress toward the hard problem
There are plenty of cases where John can glance at what people are doing and see pretty clearly that it is not progress toward the hard problem.
Importantly, people with the agent foundations class of anxieties (which I embrace; I think John is worried about the right things!) do not spend time engaging on a gears level with prominent prosaic paradigms and connecting the high level objection (“it ignores the hard part of the problem”) with the details of the research.
“But Tsvi and John actually spend a lot of time doing this.”
No, they don’t! They paraphrase the core concern over and over again, often seemingly without reading the paper. I don’t think reading the paper would change your minds (nor should it!), but I think that there’s a culture problem tied to this off-hand dismissal of prosaic work that disincentivizes potential agent foundations researchers (or researchers in some similar new thing that shares the core concerns of agent foundations) from engaging, e.g., John.
Prosaic work is fraught, and much of it is doomed. New researchers over-index on tractability because short feedback loops are comforting (‘street-lighting’). Why aren’t we explaining why that is, on the terms of the research itself, rather than expecting people to be persuaded by the same high-level point getting hammered into them again and again?
I’ve watched this work in real-time. If you listen to someone talk about their work, or read their paper and follow up in person, they are often receptive to a conversation about worlds in which their work is ineffective, evidence that we’re likely to be in such a world, and even to shifting the direction of their work in recognition of that evidence.
Instead, people with their eye on the ball are doing this tribalistic(-seeming) thing.
Yup, the deck is stacked against humanity solving the hard problems; for some reason, folks who know that are also committed to playing their hands poorly, and then blaming (only) the stacked deck!
John’s recent post on control is a counter-example to the above claims and was, broadly, a big step in the right direction, but it had some issues, as raised by Redwood in the comments, which are a natural consequence of it being ~a new thing John was doing. I look forward to more posts like that in the future, from John and others, that help new entrants to empirical work (which has a robust talent pipeline!) understand, integrate, and even pivot in response to the hard parts of the problem.
[edit: I say ‘gears level’ a couple times, but mean ‘more in the direction of gears-level than the critiques that have existed so far’]
Big crux here: I don’t actually expect useful research to occur as a result of my control-critique post. Even having updated on the discussion remaining more civil than I expected, I still expect basically-zero people to do anything useful as a result.
As a comparison: I wrote a couple of posts on my AI model delta with Yudkowsky and with Christiano. For each of them, I can imagine changing ~one big piece in my model and ending up with a model which looks basically like theirs.
By contrast, when I read the stuff written on the control agenda… it feels like there is no model there at all. (Directionally-correct but probably not quite accurate description:) it feels like whoever’s writing, or whoever would buy the control agenda, is just kinda pattern-matching natural language strings without tracking the underlying concepts those strings are supposed to represent. (Joe’s recent post on “fake vs real thinking” feels like it’s pointing at the right thing here; the posts on control feel strongly like “fake” thinking.) And that’s not a problem which gets fixed by engaging at the object level; that type of cognition will mostly not produce useful work, so getting useful work out of such people would require getting them to think in entirely different ways.
… so mostly I’ve tried to argue at a different level, like e.g. in the Why Not Just… posts. The goal there isn’t really to engage the sort of people who would otherwise buy the control agenda, but rather to communicate the underlying problems to the sort of people who would already instinctively feel something is off about the control agenda, and give them more useful frames to work with. Because those are the people who might have any hope of doing something useful, without the whole structure of their cognition needing to change first.
I think the reason nobody will do anything useful-to-John as a result of the control critique post is that control is explicitly not aiming at the hard parts of the problem, and knows this about itself. In that way, control is an especially poorly selected target if the goal is getting people to do anything useful-to-John. I’d be interested in a similar post on the Alignment Faking paper (or model organisms more broadly), on RAT, on debate, on faithful CoT, on specific interpretability paradigms (circuits vs. SAEs vs. some coherentist approach vs. shards vs. ...), and would expect those to have higher odds of someone doing something useful-to-John. But useful-to-John isn’t really the metric I think the field should be using, either....
I’m kind of picking on you here because you are least guilty of this failing relative to researchers in your reference class. You are actually saying something at all, sometimes in detail, about how you feel about particular things. However, you wouldn’t be my first-pick judge for what’s useful; I’d rather live in a world where like half a dozen people in your reference class are spending non-zero time arguing about the details of the above agendas and how they interface with your broader models, so that the researchers working on those things can update based on those critiques (there may even be ways for people to apply the vector implied by y’all’s collective input, and generate something new / abandon their doomed plans).
A lot of people held onto this view deep into the GPT era but now even the skeptics have to begrudgingly admit that NNs are pretty big progress even if additional Special Sauce is needed
It’s a bit tangential to the context, but this is a topic I have an ongoing interest in: what leads you to believe that the skeptics (in particular NLP people in the linguistics community) have shifted away from their previous positions? My impression has been that many of them (though not all) have failed to really update to any significant degree. E.g., here’s a paper from just last month which argues that we must not mistake the mere engineering that is LLM behavior for language understanding or production.