On being sort of back and sort of new here
So I’m “back” on Less Wrong, which is to say that I was surprised to find that I already had an account and had even apparently commented on some things. 11 years ago. A career change and a whole lot of changes in the world ago.
I’ve got a funny relationship to this whole community I guess.
I’ve been ‘adj’ since forever but I’ve never been a rat and never, until really quite recently, had much of an interest in the core rat subjects. I’m not even a STEM person, before or after the career change. (I was in creative arts—now it’s evidence-based medicine.)
I just reached the one-year anniversary of the person I met on rat-adj tumblr moving across the ocean for me, for love. So there was always going to be an enduring mark on my life left by this community, but I figured that would be it.
Only… I was in the splash radius of all your talk about AI.
And I loved Frank, and I loved reading what Nostalgebraist wrote about her, even if there were parts of it I didn’t fully understand.
And when one couldn’t get away from AI Discourse on tumblr, I found concepts I learned from you guys way back when bubbling up in my vocabulary—if only, in classic -adj fashion, by virtue of how completely wrong I thought you’d all been about it all. [1]
When suddenly everyone was fucking talking about it, like at work and shit, and it became apparent both that:
a) reading a few blog posts about and around Frank made me, relatively speaking, the expert in the room, and
b) that we really needed an expert in the room that knew more than fucking that...
...I sucked it up and got serious about learning as much as I’ve been able to.
It’s not been easy, coming from a non-programming background—few of the resources that can teach you this stuff are pitched at someone like me[2].
Which brings us to here:
I got to know you guys by arguing with you about how wrongheaded all this AI risk shit was, and now I am seriously concerned about AI risk, spend a fair amount of my time thinking about it, and to a certain extent work professionally on mitigating it.
Is this me saying you were right?
Well, no. Sorry.
I think I was right that there was a lot of wasted effort going into armchair reasoning about imaginary stuff that might in no way resemble the actual real programs that came along that we needed to worry about. I think I was right that doing all that, building a community and a shared system of concepts around that, ran the risk of getting you stuck in ways of thinking that’d turn out to be ill-suited to the actual situation when it came along, and I think that has happened to some of you, from the stuff I’ve read since I’ve come back here. (It certainly hasn’t helped your ability to create approachable/accessible content on the subject!)
I think I’m right that corporations are the real (or at least the immediate) alignment problem wrt AI, and in retrospect, I think there’s important stuff that we couldn’t talk about back then because profit-driven corporations as a big driver of risk would have sounded too much like Politics (The Mind-Killer). I think it was always naive to base a lot of planning on the assumption that a non-profit could be the one to clinch that first-mover advantage, and sure enough that non-profit is now two corporations.
But—you were talking about it. And maybe if I hadn’t been arguing with you I wouldn’t have been thinking about it. And I’m sure there can’t be nothing from the armchair reasoning years that has value, to say nothing of everything you must’ve come up with since then. And I know you know the actual tech, and the maths, better than I do. So I’m gonna need to start arguing with you again. You are good at making that educational.
I’ve also found my position on the possibility of something like consciousness arising from something LLM-like becoming more uncertain the more I talk to them. I know that if anyone is going to be able to take seriously the idea that we should perhaps be starting to think about what that would mean just in case, it’s going to be you.
So yeah. Hi. Let’s take some ideas seriously.
[1] My impression of the most central/loudest/commonest take from you guys on AI risk was that we needed to worry about what the programs might do, and my take was “what we need to worry about long before that is what people do with the programs.”
Of which more later. But, just real quick, I’ll be clear that I don’t even necessarily mean malicious actors. The real harms we’ve actually seen happen, and which look poised to happen in the near future, come of, essentially, delegating stuff to LLMs that they are not ready to do unsupervised.
[2] otoh—now that I do flatter myself that I know a thing or two, it seems to me that there’s a lot of really blinkered thinking going on that comes of the only people who are seriously looking into things like AI ethics and AI alignment being programmers who think like programmers.
If you were confused, like I was, about who “Frank” is: from 2019 to 2023 nostalgebraist ran an autoresponder language model trained on his writing that would answer people’s questions on Tumblr. Pretty cool.
Good catch—that was me doing exactly what I was complaining about re the few random posts I caught on here; referring to stuff that “everyone knows” in my particular corner of the ratsphere/adjacency without explanation!
I think you’ll find a lot of agreement (and even, if you look closely, decade-old-or-longer anticipation) that a lot of the armchair reasoning would and did turn out to be inapplicable to what actually happened. You’ll also find acceptance, mostly, that this is normal for complex, emerging fields, and can be in many ways useful, in that it created a context in which it was possible to discuss the real problems at all, instead of having to improvise everything at top speed with no shared vocabulary or time to reflect.
As for not creating up-to-date, accessible, approachable content… In many ways I agree, but I think the key questions are: written by whom, approachable by whom, and to achieve what goal? There frankly aren’t enough experts to come close to meeting the need for them, but with the pace of change it’s also hard to justify them spending their time writing content for a wider audience. Scott Alexander’s Ars Longa, Vita Brevis from 2017 was a fictionalized account of how making such content about a critical topic isn’t worth the highest experts’ time, and of how you instead need a chain of people, from topic experts to education and writing experts, to create it.
For the time being, I still get a surprisingly large amount of mileage from sharing Tim Urban’s 2015 Wait but Why series on AI. I’ve also done pretty well myself explaining things to individuals and small groups using known context about their starting points, but have no idea how I’d approach a larger audience. I’d run into exactly the problem EY ran into 15 years ago, that each thing he’d want to write had prerequisites that also didn’t have approachable, short content he could point to, so he would recurse, until it turned into thousands of pages on dozens of topics written over quite a few years. We do have a lot of very good content on critical topics written since then by many people, but it can be hard to find, and definitely isn’t concise.
As an example: If you go back 5 years, you’ll find a huge amount written here about covid, for obvious reasons. When you compare that to most other communities, it’s clear just how many people in society, and in relevant positions of power, have no real idea what ‘science’ actually is and does, what it means to know or understand things individually or collectively, and lots of other fundamental things that are in many ways prerequisites for really understanding what’s going on in AI. But people think they know these things, which makes it hard to communicate about without sounding condescending or out of touch.
actually to engage with more aspects of this particular response—I agree that you need a chain of people, and I guess I’m looking to become a part of it.
I think it would be incredibly unbecoming of the AI risk guys to conclude that this isn’t worth their time; this would seem to amount to closing one’s eyes to the fact that non-experts are, right now, needing to make decisions about AI that could result in harm if they’re made wrong.
I don’t think that’s what I’m saying. At least, it’s not what I’m trying to say. Rather, I’m saying that any given individual has finite time and needs to prioritize, that the number of people in the “AI risk guys” reference class is small relative to need, and that promoting broad, basic understanding to facilitate safety in the context of mundane utility is not necessarily the highest priority for many or most of them.
The risks around mundane utility seem to me to be the kind that actually can, mostly, be handled by people with a pretty basic understanding of AI (the kind you can actually get by trying things out and talking to people like you who know more than average) just adapting existing best practices and product development/implementation paths.
If you’re managing a hospital or nuclear power plant or airplane manufacturing plant, you don’t roll out a new technology until some company that has a highly credible claim to expertise demonstrates to a sufficiently conservative standards body that it’s a good idea and won’t get you sued or prosecuted into oblivion, and won’t result in you murdering huge numbers of people (on net, or at all). Even then you do a pilot first unless you’re sure there’s dozens of other successful implementations out there you can copy. If that takes another five years or ten or more, so be it. That’s already normal in your industry.
If you’re running a more typical company managing non-lethal or mostly-non-lethal-except-for-rare-edge-cases risks, management gets to decide how much to wager on how new and untested a technology, and if they bet the company and lose, well, so be it, that’s also normal. And if you want to adopt AI in some way and still be more sure you’re implementing it well, then you hire a professional—either a company, or a team, or a consultant, or something—just like you would if you were planning any other strategic change to how you do business. Someone in your position would not be that person, but I’d be confident you are able to spot when the situation calls for that kind of diligence, and could pretty reasonably evaluate which self-proclaimed professionals seem to have enough competence to give good advice.
In my personal example: I’m helping my employer decide what AI tools to use internally and how to roll them out and manage their use, but we are hiring an outside company to figure out how to incorporate AI into our actual product offering to clients. The latter is a higher business risk situation, and calls for deeper technical expertise. It does not call for an AI safety expert. In fact, I’d be disappointed if an AI safety expert were focused on something like this instead of focusing on either technical AI safety to try to ensure humanity’s survival, or policy advocacy to get people with real power to do useful things to promote the same.
Edit to add: I do also think you can get good mileage out of just having basic conversations with people you know in different positions, in the types of language they understand and care about. If you’re talking about AI to HR, then there’s headcount efficiency for some tasks more than others, the potential over the next couple of years for agents to automate some parts of some roles but less for others, and the need to start thinking about how you might need to reorganize roles iteratively over time. If you’re talking to an R&D director, then there’s all the different ways to use AI to act as a thought partner, improve understanding of the landscape or what’s known and available, or refine experimental design and improve data analysis. If you’re talking to sales, there’s ways to gain insight about what clients/customers want or respond to based on trends in a larger data set that a human would have a hard time finding. Sparking thought and curiosity while also highlighting general categories and intensities of risk is often enough to get the needed conversations going.
It would be really great if people were behaving about this in the way that is normal in my industry.
Unfortunately, they’re not. The hype has gotten to everyone, and in combination with the enormous pressure to cut costs in the NHS, the pressure on everyone’s time...
This isn’t like a new medical imaging device. A doctor can have it on their phone and use it without checking with anyone. Our ability to promote the stuff they’re supposed to be using for looking things up at point-of-care has been decimated by staff cuts, to add to the problem.
I am being put forward to produce training on what people should and shouldn’t be doing. If someone more qualified was coming along, that wouldn’t be happening. If everyone was just going to hang tight until the technology had matured and they got the official word to go ahead from someone who knew what they were about, I wouldn’t be worried.
Every country is different, but in the US, the natural course in those situations is that if it works, it gets normalized and brought into official channels, and if it doesn’t, a lot of people are going to get sued for malpractice. If you’re being put in that position, then you should insist on making sure the lawyers and compliance officers are consulted to refine whatever language the trainings use, or else refuse to put your name on any of it.
Some of this is pretty straightforward, if not simple to execute. For example, if you’re talking about LLMs, and you’re following the news, then you know that the US courts have ruled that the NY Times has the right to read every chat anyone, anywhere has with ChatGPT, no matter what OpenAI’s terms of service previously said, in order to look for evidence of copyright infringement. There seem (I think) to be exceptions for enterprise customers. But, this should be sufficient to credibly say, “If you’re using this through a personal account on an insecure device, you’re breaking the laws on patient confidentiality, and can be sanctioned for it just as you would be if you got caught talking about a patient’s condition and identity in public.”
Good trainings build on what the audience already knows. Maybe start with something like, “Used well, you can consult LLMs with maybe about as much trust as you would a first-day new resident.” You need to be very careful how you frame your questions—good prompting strategy, good system prompts, consistent anonymization practices, asking for and confirming references to catch hallucinations or other errors. Those are all things you can look up in many places, here and elsewhere, but you’ll have to pick through a lot of inapplicable and poor advice because nothing is rock-solid or totally stable right now. Frame it as a gradual rollout requiring iteration on all of those, plan for updated trainings every quarter or two, provide a mechanism for giving feedback on how it’s going. You can use feedback to develop a system prompt people can use with their employer-provided accounts.
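(For anyone reading this who does write code: purely to make the anonymization and “ask for references” points concrete, here’s a minimal sketch of the kind of pre-send check and system prompt a training exercise or internal tool could build on. The patterns, the prompt wording, and the function name are all illustrative assumptions of mine, not anything a real product does or that your trainings need to include.)

```python
import re

# Illustrative patterns only -- a real redaction step needs far more than this
# (names, addresses, free-text details that identify a patient, etc.).
IDENTIFIER_PATTERNS = {
    "possible NHS number": re.compile(r"\b\d{3}[ -]?\d{3}[ -]?\d{4}\b"),
    "possible date (could be a DOB)": re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b"),
    "possible UK postcode": re.compile(r"\b[A-Z]{1,2}\d[A-Z\d]?\s*\d[A-Z]{2}\b", re.IGNORECASE),
}

# A skeleton of the kind of system prompt discussed above: set the role, demand
# checkable references, and say what to do when the model is not certain.
SYSTEM_PROMPT = (
    "You are assisting a clinician with background reading, not giving "
    "patient-specific advice. Cite a checkable source (guideline, journal "
    "article, or textbook) for every factual claim, and say you are not "
    "certain rather than guessing when you cannot."
)

def check_prompt(prompt: str) -> list[str]:
    """Return warnings for anything that looks like patient-identifiable data."""
    return [
        f"Looks like it contains a {label}; remove it before sending."
        for label, pattern in IDENTIFIER_PATTERNS.items()
        if pattern.search(prompt)
    ]

if __name__ == "__main__":
    draft = "Pt, NHS no. 485 777 3456, DOB 12/03/1964, presenting with..."
    for warning in check_prompt(draft):
        print(warning)
```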
This may be harder, but you also need to make sure you’re able to collect accurate information on what people in various roles are actually, currently doing, and they need to be able to tell you that without fear of retroactive reprisal or social sanction, or you’re likely to drown in deception about the situation on the ground, and won’t know what goals or use cases you need to be thinking about.
One thing to add: if you find that the whole system is demanding you, an amateur, provide training while refusing to do the bare minimum necessary to even follow the law or maintain the standard of care, then think very carefully about what you want to do in response. Try to muddle through and do the best you can without incurring personal liability? Become a whistleblower? Find a new job? At that point you might well want to consult a lawyer yourself to help navigate the situation. And make sure you document, in writing, any requests made of you that you are uncomfortable with, and your objections to them.
I’ll definitely check that out! If you’re able to point me to any of the other material you’re talking about, I’d very much appreciate it—what I was saying was very much a surface-level take based on what I clicked to from the immediate front page.
To answer your questions:
-approachable by non-programmers who find themselves in the position of needing to make decisions about how and whether to use AI in their lives and work environments (which could be things that it’s very important run smoothly, like a hospital)
-with the goal of enabling them to make sensible decisions
-written by: I don’t know. Someone who doesn’t throw their hands in the air at the very prospect of teaching anything substantive to a non-programmer, and who is not employed in a sales or marketing capacity by a company trying to sell an AI-branded product.
So, this is mostly about mundane utility, not about x-risk and the like, and not about generally improving the quality of one’s own thinking. That’s very helpful.
For x-risk stuff, I would have said you might want to pre-order If Anyone Builds It, Everyone Dies: Why Superhuman AI Would Kill Us All.
For improving one’s own thinking, there’s no short path, and trying to make one without a guide who gives reliable, individual feedback can be worse than doing nothing.
For mundane utility, that’s a bit harder, especially for something general, because the specifics keep changing with constant new releases, and they differ a lot between fields. Earlier this year I was leading a team (of essentially the 3 main early adopters) doing exactly this for my own employer, and it amounted to us trialing a bunch of things, deciding it made the most sense to pick some low-hanging fruit while waiting for more ambitious adoption to get easier as tech improved, and assembling a bunch of our own training modules and best practices. But ours is a much lower pressure and less regulated environment than your hospital example. Even then, twice I went into one of those trainings and had to say, “This morning [CEO] tweeted that [company] just released [update]. If that works it seems like it really changes this step, but for the better. Expect that to happen a lot.”
Without knowing specifically what you do, what styles you like to read, or anything else, I’ll just say I’ve learned a lot and found it useful to read Zvi Mowshowitz’s weekly AI updates, either here or at https://thezvi.substack.com/. It’s more a weekly news roundup than anything else, and we’re like 124 weeks in, but you can skip around to bits you care about.
Thank you for helping me to clarify what I was looking for—this is very helpful.
To add more clarity—my focus/concern is the immediate-term risk of serious but probably not x-risk level harms caused by inappropriate, ignorance-, greed- and misinformation-driven implementation of AI (including current generation LLMs). I think we’re already seeing this happen and there’s potential for it to get quite a bit worse, what with OpenAI’s extremely successful media campaign to exaggerate LLM capabilities and make everyone think they have to use them for everything right now this minute.
It follows therefore that it’s very necessary to find ways for non-programmers to become as informed as they can be about what they really need to know instead of getting all their information from marketing hype.
I personally am a clinical librarian working to support evidence-based medical practice and research. I am the person clinicians come to asking questions about whether and how they should be using AI, and I am part of the team deciding if the largest NHS Trust adopts certain AI tools in its evidence support and research practices.
I very much hope that there are also doctors, nurses, administrators, and other relevant roles on that team. If not, or really regardless, any tool selection should involve a pilot process, and side-by-side comparisons of results from several options using known past, present, and expected future use cases. The outputs also should be evaluated independently by multiple people with different backgrounds and roles.
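(If it helps make “side-by-side comparisons” concrete, here’s a minimal sketch of one way to blind reviewers to which tool produced which answer before they rate them. Every name in it, the tools, the questions, the file names, is a made-up placeholder.)

```python
import csv
import random

# Hypothetical inputs: outputs already collected from each candidate tool for
# the same pilot questions (the questions should come from your real use cases).
pilot_outputs = {
    "Q1: first-line treatment question": {"Tool A": "answer text A1", "Tool B": "answer text B1"},
    "Q2: guideline-lookup question": {"Tool A": "answer text A2", "Tool B": "answer text B2"},
}

rows, key, counter = [], [], 0
for question, answers in pilot_outputs.items():
    shuffled = list(answers.items())
    random.shuffle(shuffled)  # hide which tool produced which answer
    for tool, text in shuffled:
        counter += 1
        answer_id = f"ans-{counter:03d}"
        key.append((answer_id, tool))  # kept aside for un-blinding later
        rows.append([question, answer_id, text, "", ""])

# One copy of this sheet per reviewer; they rate without seeing tool names.
with open("rating_sheet.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["question", "answer_id", "answer_text", "accuracy_1to5", "notes"])
    writer.writerows(rows)

# The un-blinding key stays with whoever is coordinating the pilot.
with open("unblinding_key.csv", "w", newline="", encoding="utf-8") as f:
    csv.writer(f).writerows([("answer_id", "tool")] + key)
```

The point of the blinding is just that nobody, including you, should know which vendor they’re scoring until all the ratings are in.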
I’m going to assume the tools you’re considering are healthcare-specific and advertise themselves as being compliant with any relevant UK laws. If so, what do the providers claim about how they can and should be used, and how they shouldn’t? Do the pilot results bear that out? If not, then you really do need to understand how the tools work, what data goes where, and the like.
I do think the value and potential are real here. As a non-professional private citizen, I have used and do use LLM chat interfaces to gather information and advice for my own, personal health, and that of my pets (or other loved ones when requested). It has been extremely useful at helping me learn things that many more hours of manual searching previously failed to find, and acting as an idea-generator and planning collaborator. But, I have put in a lot of effort to develop a system prompt and prompting strategies that reduce sycophancy, lying, hallucination, and fabrication, and promote clear, logical reasoning and explicit probability estimates of inferences and conclusions. I compare outputs of different LLMs to one another. I ask for and check sources. And I’m not doing anything in response that has any significant probability of being dangerous, or without consulting real people I trust to tell me if I’m being an idiot. But on the other hand, I’ve seen the results other people get when they don’t put in that much effort to using the LLMs well, and the results are either very generic, or in various ways just plain bad.
I suspect, though I don’t know, that the ceiling of what results a skilled user can achieve using a frontier LLM is probably higher than what most dedicated healthcare-focused tools can do, but the floor is very likely to be much, much worse.
This may or may not be useful, but in terms of training, I have a few phrases and framings I keep repeating to people that seem to resonate. I don’t remember where I first heard them.
“Think of AI as an infinite army of interns. Five years ago they were in elementary school. Now they’ve graduated college. But they have no practical experience and only have the context you explicitly give them. Reasoning models and Deep Research/Extended Thinking are like giving the interns more time to think with no feedback from you.”
“Don’t assume what you were doing and seeing with AI 3+ months ago has any bearing on what AI can or will do today. Models and capabilities change, even with the same designation.”
“You’re not talking to a person. Your prompt is telling the AI who to pretend to be as well as what kind of conversation it is having.” (In the personal contexts mentioned above, I would include things at the start of a prompt like “You are a metabolic specialist with 20 years of clinical and research experience” or “You are a feline ethologist.”)
Related: “If your prompt reads like a drunken text, the response will be the kind of response likely to be the reply to a drunken text.”
Also related: “Your system prompt tells the AI who you are, and how you want it to behave. This can be as detailed as you need, and should probably be paragraphs or pages long.” (Give examples)
“The big AI companies are all terrible at making their models into useful products. Never assume they’ve put reasonable effort into making features work the way you’d expect, or that they’ve fixed bugs that would require a few minutes of coding effort to fix.”
For some things it will. But for some things—tools coded as ‘research support’ or ‘point of care reference tools’ or more generally, an information resource—it’s up to the library, just like we make the decisions about what journals we’re subscribed to. I gather that before I started, there used to be more in the way of meaningful consultation with people in other roles—but as our staffing has been axed, these sorts of outreach relationships have fallen by the wayside.
It would be great if that were a reasonable assumption. Every one I’ve evaluated so far has turned out to be some kind of ChatGPT with a medical-academic research bow on it. Some of them are restricted to a walled garden of trusted medical sources instead of having the internet.
Part of the message I think I oughta promote is that we should hold out for something specific. The issue is that when it comes to research, it really is up to people what they use—there’s no real oversight, and there are no regulations to stop them, the way there would be if they were actually, provably putting patient info in there. But they’re still going to be bringing what they “learn” into practice, as well as polluting the commons (since we know at this point that peer review doesn’t do much and it’s mostly people’s academic integrity keeping it all from falling apart).
Part of what these companies with their GPTs are trying to sell themselves as able to replace is exactly the sort of checks and balances that stop the whole medical research commons from being nothing but bullshit—critical appraisal and evidence synthesis.
that’s about what I thought, yeah
Thank you for the phrases, they seem useful.