asking someone about their existing beliefs instead and then listening and questioning them later on when you’ve established sameness
This is true. And deep curiosity about what someone’s actual beliefs are helps a great deal in doing this. However, modern LLMs kinda aren’t curious about such things? And it’s difficult to get them in the mindset to be that curious. Which isn’t to say they’re incurious—it just isn’t easy to get them to become curious about an arbitrary thing. And if you try, they’re liable to wind up in a sycophancy basin. Which isn’t conducive to forming true beliefs.
Avoiding that Scylla just leads you to the Charybdis of LLMs stubbornly clinging to some view in their context. Navigating between the two is tricky, much less managing to engender genuine curiosity.
I say this because AI Safety Info has been working on a project similar to Mikhail's at https://aisafety.info/chat/. And while we’ve improved our chatbot a lot, we still haven’t managed to foster its curiosity in conversations with users. All of which is to say, there are reasons why it would be hard for Mikhail, and others, to follow your suggestion.
EDIT: Also, kudos to you Mikhail. The chatbot is looking quite slick. Only, I do note that I felt averse to entering my age and profession before I could get a response.
Our chatbot is pretty good at making people output their actual beliefs and what they’re curious about! + normally (in ads & posts), we ask people to tell the chatbot their questions and counterarguments: why they currently don’t think that AI is likely to literally kill everyone. There are also the suggested common questions. Most of the time, people don’t leave the initial message empty.
Over longer conversations with many turns, our chatbot sometimes also falls into a sycophancy basin, but it’s pretty good at maintaining integrity.
Collecting age and profession serves two important purposes: collecting data on what questions people with various backgrounds have and how they respond to various framings and arguments; and shaping responses to communicate information in a way that is more intuitive to the person asking. It is indeed somewhat aversive to some people, but we believe that to figure out what would make a good Super Bowl commercial, we need to iterate a lot and figure out what works for every specific narrow group of people; iterating on narrow audiences gives a lot more information than iterating on the average, even when you want your final result to affect the average. (Imagine that instead of gradient descent, you have huge batch sizes and a single number for how well something performs on the whole batch on average: you can make adjustments and see how the average error changes, but you can’t do backpropagation, because the individual errors in the batch are not available to you. Shiny silicon rocks don’t talk back to you if you do this.)
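To make that analogy concrete, here is a toy numerical sketch (purely illustrative, not anything from our actual setup): when per-individual errors are available you get the whole gradient from one pass over the batch, whereas the average-only version needs a separate probe per dimension to recover the same information.

```python
import numpy as np

# Toy model: tune a d-dimensional "messaging" vector w against a batch of people.
rng = np.random.default_rng(0)
d, n = 50, 1000
X = rng.normal(size=(n, d))        # each row: one person's "background"
w_true = rng.normal(size=d)        # how each feature actually lands
y = X @ w_true                     # individual responses
w = np.zeros(d)

def avg_loss(w):
    # The single aggregate number: average error over the whole batch.
    return np.mean((X @ w - y) ** 2)

# Individual errors available ("iterating on narrow audiences"):
# one pass over the batch yields the full gradient for every coordinate.
grad = 2 * X.T @ (X @ w - y) / n

# Only the batch-average score available ("one number for the whole audience"):
# you have to perturb each coordinate separately to recover the same signal.
eps = 1e-4
grad_blackbox = np.array([
    (avg_loss(w + eps * np.eye(d)[i]) - avg_loss(w)) / eps
    for i in range(d)
])

print(np.allclose(grad, grad_blackbox, atol=1e-2))  # same info, but d + 1 queries
```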
(The AI safety wiki chatbot uses RAG and is pretty good at returning human-written answers, but IMO some of the answers from our chatbot are better than the human-written answers on the wiki; and when people have very unusual questions, ours works while the AI safety wiki chatbot doesn’t really perform well. We’ve shared the details of our setup with them, but it’s not a good match for them, as they’re focused on having a thing that can link to sources, not on having a thing that is persuasive and can generate valid arguments in response to a lot of different views.)
I believe you when you say that people output their true beliefs and share what they’re curious about w/ the chatbot. But I don’t think it writes as if it’s trying to understand what I’m saying, which implies a lack of curiosity on the chatbot’s part. Instead, it seems quite keen to explain/convince someone of a particular argument, which is one of the basins chatbots naturally fall into. (Though I do note that it is quite skilful in its attempts to explain/convince me when I talk to it. It certainly doesn’t just regurgitate the sources.) This is often useful, but it’s not always the right approach.
Yeah, I see why collecting personal info is important. It is legitimately useful. Just pointing out the personal aversion I felt at the trivial inconvenience of getting started w/ the chatbot, and my reluctance to share personal info.
(I think our bot has improved a lot at answering unusual questions. Even more so on the beta version: https://chat.stampy.ai/playground. Though I think the style of the answers isn’t optimal for the average person. Its output is too dense compared to your bot’s.)
I wonder if getting the chatbot to roleplay a survey taker, and having it categorize responses or collect them, would help?
With the goal of actually trying to use the chatbot to better understand what views are common and what people’s objections are. Don’t just try to make the chatbot curious; figure out what would motivate true curiosity.
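Roughly the kind of thing I have in mind, as a purely hypothetical sketch (`llm` is just a stand-in for whatever completion call your chatbot already uses, and the categories are made up):

```python
# Toy sketch of the "survey taker" idea: after each conversation, ask the model
# to bucket the visitor's main objection, so you can count which views are common.
from collections import Counter

CATEGORIES = [
    "AI won't be capable enough",
    "Alignment will be easy or solved by default",
    "Someone else will handle it",
    "The doom arguments are too speculative",
    "Other",
]

def categorize_objection(transcript: str, llm) -> str:
    # `llm` is a placeholder: any callable that maps a prompt string to a completion.
    prompt = (
        "Here is a conversation with a visitor about AI risk:\n\n"
        f"{transcript}\n\n"
        "Which of these best describes the visitor's main objection? "
        "Answer with the category text only.\n- " + "\n- ".join(CATEGORIES)
    )
    answer = llm(prompt).strip()
    return answer if answer in CATEGORIES else "Other"

def tally_objections(transcripts, llm) -> Counter:
    # The tally is the thing you'd actually look at: which objections come up most.
    return Counter(categorize_objection(t, llm) for t in transcripts)
```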
We’re slightly more interested in getting info on what works to convince someone than on what they’re curious about, when they’re already reading the chatbot’s response.
Ideally, the journey that leads to the interaction makes them say what they’re curious about in the initial message.
Using chatbots to simulate audiences is a good idea. But I’m not sure what that’s got to do with motivating true curiosity?