Our chatbot is pretty good at getting people to output their actual beliefs and what they're curious about! Plus, normally (in ads & posts), we ask people to tell the chatbot their questions and counterarguments: why they currently don't think that AI is likely to literally kill everyone. There are also the suggested common questions. Most of the time, people don't leave the input empty.
Over longer conversations with many turns, our chatbot sometimes also falls into a sycophancy basin, but it’s pretty good at maintaining integrity.
Collecting age and profession serves two important purposes: gathering data on what questions people with various backgrounds have and how they respond to various framings and arguments; and shaping responses to communicate information in a way that's more intuitive to the person asking. It is indeed somewhat aversive to some people, but we believe that to figure out what would make a good Super Bowl commercial, we need to iterate a lot and figure out what works for every specific narrow group of people; iterating on narrow audiences gives a lot more information than iterating on the average, even when you want your final result to affect the average. (Imagine that instead of gradient descent, you have huge batch sizes and a single number for how well something performs on the whole batch on average: you can make adjustments and watch the average error change, but you can't do backpropagation because the individual errors in the batch aren't available to you. Shiny silicon rocks don't talk back to you if you do this.)
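To make that analogy concrete, here's a toy sketch (invented numbers, nothing to do with our actual pipeline): with per-person errors you can compute a gradient and improve quickly, whereas with only the batch-average score, each probe hands you a single scalar and you're reduced to perturb-and-remeasure.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(256, 3))        # "audience": 256 people, 3 features each
true_w = np.array([1.0, -2.0, 0.5])  # how the audience actually responds
y = X @ true_w

w = np.zeros(3)
for _ in range(200):
    per_person_error = X @ w - y     # narrow-audience feedback: one error each
    grad = X.T @ per_person_error / len(X)
    w -= 0.1 * grad                  # per-example errors make descent possible

print(np.round(w, 3))                # recovers ~[ 1. -2.  0.5]

# If you only ever see the batch average, each probe yields one scalar
# for the whole audience, and no gradient direction:
avg_loss = float(np.mean((X @ w - y) ** 2))
print(avg_loss)
```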
(The AI safety wiki chatbot uses RAG and is pretty good at returning human-written answers, but IMO some of the answers from our chatbot are better than the human-written answers on the wiki; and when people have very unusual questions, ours handles them while the AI safety wiki chatbot doesn't really perform well. We've shared the details of our setup with them, but it's not a good match for them: they're focused on having a thing that can link to sources, not on having a thing that is persuasive and can generate valid arguments in response to a lot of different views.)
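For readers unfamiliar with the term, RAG here means roughly the following pattern. This is a minimal sketch with hypothetical names, not the wiki bot's actual code: embed the question, pull the closest human-written answers, and return them with their source links.

```python
# Minimal RAG retrieval sketch (hypothetical names, not the wiki bot's code).
import numpy as np

def embed(text: str) -> np.ndarray:
    """Stand-in for a real embedding model (any sentence encoder)."""
    raise NotImplementedError

def retrieve(question: str, docs: list[dict], k: int = 3) -> list[dict]:
    """docs: [{"answer": str, "source_url": str, "embedding": np.ndarray}]"""
    q = embed(question)
    # Dot product ~ cosine similarity if embeddings are pre-normalized.
    ranked = sorted(docs, key=lambda d: float(q @ d["embedding"]), reverse=True)
    return ranked[:k]  # top answers, each still linked to its source
```

The design tradeoff mentioned above falls out of this: a retrieval bot can always cite where an answer came from, but it can only say things someone already wrote down.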
I believe you when you say that people output their true beliefs and share what they're curious about w/ the chatbot. But I don't think it writes as if it's trying to understand what I'm saying, which implies a lack of curiosity on the chatbot's part. Instead, it seems quite keen to explain/convince someone of a particular argument, which is one of the basins chatbots naturally fall into. (Though I do note that it is quite skilful in its attempts to explain/convince me when I talk to it. It certainly doesn't just regurgitate the sources.) This is often useful, but it's not always the right approach.
Yeah, I see why collecting personal info is important. It is legitimately useful. I'm just pointing out the personal aversion I felt at the trivial inconvenience of getting started w/ the chatbot, and my reluctance to share personal info.
(I think our bot has improved a lot at answering unusual questions. Even more so on the beta version: https://chat.stampy.ai/playground. Though I think the style of the answers isn't optimal for the average person. Its output is too dense compared to your bot's.)
I wonder if getting the chatbot to roleplay a survey taker, and having it categorize responses or collect them, would help?
With the goal of actually using the chatbot to better understand what views are common and what people's objections are. Don't just try to make the chatbot curious; figure out what would motivate true curiosity.
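One cheap way to operationalize this, as a sketch: have the model tag each incoming objection with a category and tally the counts across conversations. Everything here is hypothetical; the category list is made up and `call_llm` is a stand-in for whatever completion call the bot already makes.

```python
import json
from collections import Counter

# Hypothetical objection categories; in practice you'd derive these from data.
CATEGORIES = [
    "AI won't get capable enough",
    "someone will stop it in time",
    "alignment will be solved by default",
    "other",
]

def call_llm(prompt: str) -> str:
    """Stand-in for whatever completion API the bot already uses."""
    raise NotImplementedError

def categorize(message: str) -> str:
    prompt = (
        "Acting as a survey taker, classify the user's objection into exactly "
        f"one of these categories: {json.dumps(CATEGORIES)}.\n"
        f"User message: {message!r}\n"
        "Answer with the category string only."
    )
    answer = call_llm(prompt).strip()
    return answer if answer in CATEGORIES else "other"

def tally(messages: list[str]) -> Counter:
    """Aggregate objection counts across collected user messages."""
    return Counter(categorize(m) for m in messages)
```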