Yes, that’s closer to it, although I feel like I’m in the unfortunate position of understanding just enough to notice, and be put off by, the inaccuracies in much of that description. (Also, “you’ve got to check its work” has become a meaningless phrase that produces no particular behavior in anybody, because it gets parroted disclaimer-style by people demoing products whose advertised benefits only exist if it’s perfectly fine to not, in fact, check its work.)
I also feel as though:
1) there are some things on the ‘how does it work’ side that non-programmers can usefully have some understanding of? [1]
Non-programmers are capable of understanding, using and crafting system prompts (there’s a small sketch of what a system prompt looks like just after this list). And I’ve definitely read articles about the jagged edge in intelligence/performance that I could have understood if all the examples hadn’t been programming tasks!
2) only telling people exactly what a given model in a given format can and can’t do, without going at all into why, leaves them vulnerable to the claim, “Ah, but this update means you don’t have to worry about that any more”… I think there are some things people can usefully understand in the vein of, e.g., “which current limitations represent genuinely hard problems, and which might be more to do with design choices?” Again, something that goes some way into the whys of things, but also something I think I’m capable of understanding as a non-programmer. [2]
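Since point 1 mentions system prompts, here’s a minimal sketch of where one actually sits, written in Python and using the OpenAI client purely as one example of the general pattern; the model name and the wording of the instructions are placeholders I made up, not a recommendation:

```python
from openai import OpenAI

client = OpenAI()  # assumes an API key is already set in the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[
        # The system prompt: persistent instructions that shape every reply.
        {
            "role": "system",
            "content": (
                "You are a cautious research assistant. Quote the source text "
                "when summarising, and say you are unsure rather than guess."
            ),
        },
        # The user's actual request follows it.
        {"role": "user", "content": "Summarise the key claims of this abstract: <paste abstract here>"},
    ],
)

print(response.choices[0].message.content)
```

The point is just that the ‘system’ message is an ordinary piece of text anyone can read and rewrite; deciding what it says involves no programming at all.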
For instance: when I explain that the programs being promoted to them for use in academic research have as their basis a sort of ‘predict-how-this-text-continues machine’, which, when trained on a large enough sample, winds up at least appearing to understand things because of how it has absorbed patterns in our use of language; that this base has lots of layers of additional structure/training etc. on top which further shape the sort of output it produces, in an increasingly good but not yet perfect attempt to get it to not make stuff up; and that it doesn’t ‘understand’ when it is making stuff up, because its base thing is patterns rather than facts and logic… I find that they then get that they should check its output, and are more likely to report actually doing so in future. I’m sure there are things about this explanation that you’ll want to take issue with and I welcome that—and I’m aware of the failure mode of ‘fancy autocorrect’, which is also not an especially helpful model of reality—but it does actually seem to help!
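To make the ‘predict-how-this-text-continues machine’ part concrete, here’s a rough sketch, assuming the Hugging Face transformers library and the small GPT-2 base model (which has none of those extra layers on top), of what ‘continuing the text’ actually looks like. The prompt is made up; the point is just the next-word probabilities.

```python
# Rough sketch of the base 'predict how this text continues' behaviour,
# using GPT-2 via the Hugging Face transformers library. GPT-2 is a small
# base model with no chat-style training on top, so it shows the raw mechanism.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "The main limitation of this study is"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # a score for every vocabulary token at every position

# Convert the scores for the *next* token into probabilities and show the top few.
next_token_probs = torch.softmax(logits[0, -1], dim=-1)
top = torch.topk(next_token_probs, k=5)
for prob, token_id in zip(top.values, top.indices):
    print(f"{tokenizer.decode(int(token_id))!r:>12}  {prob.item():.3f}")

# The model isn't checking facts; it's ranking which words tend to follow
# this kind of sentence in the text it absorbed. Chat assistants add further
# training on top of exactly this mechanism.
```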
Example: I initially parsed LLMs’ tendency to make misleading generalisations when asked to summarise scientific papers—which newer models were actually worse at—just as, ‘okay, so they’re bad at that then.’

But then I learned some more, did some more research—without becoming a programmer—and I now feel I can speculate that this is one of the limitations that could be quite fixable: all the big commercial LLMs we have are being designed as general-purpose tools, and a general-public definition of helpfulness trading off against scientific precision is a plausible reason the studied LLMs actually got worse in their newer versions. That does seem like a helpful thing to be able to do when the specifics of what these tools can and can’t do are likely to change pretty quickly. And what things were trained for, how ‘helpfulness’ was defined, the fact that a decision was made to aim them ultimately towards general intelligence rather than specific uses: none of that seems like stuff you need to code to understand.