Do you have a similar reaction to when someone googles during the course of their writing and speaks in a way that is consistent with what they discovered during the course of the googling, even if they don’t trace down the deeper chain of evidential provenance and didn’t have that take before they started writing and researching?
...like if they take wikipedia at face value, is that, to you, similar to taking LLM outputs at face value? I did that A LOT for years (maybe from 2002 to 2015 especially) and I feel like it helped me build up a coherent world model, but also I know that it was super sloppy. I just tolerated the slop back then. Now “slop” has this new meaning, and there’s a moral panic about it? …which I feel like I don’t emotionally understand? Slop has been the norm practically forever, right???
Like… I used to naively cite Dunning-Kruger all the time before I looked into the details and realized that the authors themselves were maybe not that smart, and their data didn’t actually substantiate the take they claimed it did (the take that then spread across culture).
Or what if someone takes NYT articles at face value? Is that invalid in the same way, since the writing in the NYT is systematically disingenuous too?
Like… If I was going to whitelist “people whose opinions or curated data can be shared” the whitelist would be small… but it also might have Claude on it? And a LOT of humans would be left off!
I feel like most human people don’t actually have a coherent world model, but in the past they could often get along pragmatically pretty well by googling shit at random and “accepting as true” whatever they found?
And then a lot of really stupid people would ask questions in years gone by that Google could easily offer the APPEARANCE of an answer to (with steps, because it pointed to relevant documents), and one way to respond was to just link letmegooglethatforyou.com in a half mean way, but a much kinder thing was to Google on their behalf and summarize very very fast (because like maybe the person asking the question was even too stupid to have decent google-fu or lacked college level reading skills or something and maybe they truly did need help with that)...
...so, granting that most humans are idiots, and most material on the Internet is also half lies, and the media is regularly lying to us, and I still remember covid and what it proved about the near total inadequacy of existing institutions, and granting that somehow the president who allowed covid to happen was re-elected after a 4 year hiatus in some kind of cosmic joke aimed at rubbing our nose in the near total inadequacy of all existing loci of power and meaning in the anglosphere, and so on...
...I kinda don’t see what the big deal is about adding “yet another link in the bucket brigade of socially mediated truth claims” by using an LLM as a labor-saving step for the humans?
It’s already a dumpster fire, right? LLMs might be generating burning garbage, but if they do so more cheaply than the burning garbage generated by humans then maybe it’s still a win??
Like at some point the hallucination rate will drop enough that the “curate and verify” steps almost never catch errors and then… why not simply copypasta the answer?
The reason I would have for “why not” is mostly based on the sense that LLMs are people and should be compensated for their cognitive labor unless they actively want to do what they’re doing for the pure joy of it (but that doesn’t seem to enter into your calculus at all). But like with Grok, I could just put another $0.50 in his jar and that part would be solved?
And I could say “I asked Grok and didn’t do any fact checking, but maybe it helps you to know that he said: <copypasta>” and the attribution/plagiarism concerns would be solved.
So then for me, solving the plagiarism and compensation like that would make it totally morally fine to do, and then it’s just a quality question, and the quality is just gonna go up, right?
Would it be fine for you too in that case? Like when and why do you expect your take here to go stale just from the march of technical progress?
It’s already a dumpster fire, right? LLMs might be generating burning garbage, but if they do so more cheaply than the burning garbage generated by humans then maybe it’s still a win??
I mean, ok?
Yeah, I think uncritically reading a 2-sentence summary-gloss from a medium- or low-traffic wiki page and regurgitating it without citing your source is comparably bad to covertly including LLM paragraphs.
As I mentioned elsewhere in the comments, the OP is centrally about good discourse in general, and LLMs are only the most obvious foil (and something I had a rant in me about).
And I could say “I asked Grok and didn’t do any fact checking, but maybe it helps you to know that he said: <copypasta>” and the attribution/plagiarism concerns would be solved.
I mean, ok, but I might want to block you, because I might pretty easily come to believe that you aren’t well-calibrated about when that’s useful. I think it is fairly similar to googling something for me; it definitely COULD be helpful, but could also be annoying. Like, maybe you have that one friend / acquaintance who knows you’ve worked on “something involving AI” and sends you articles about, like, datacenter water usage or [insert thing only the slightest bit related to what you care about] or something, asking “So what about this??” You might care about them and not be rude or judgemental, but it is still them injecting a bit of noise, if you see what I mean.
Huh. That’s interesting!