i made a thing!
it is a chatbot with 200k tokens of context about AI safety. it is surprisingly good (better than you'd expect current LLMs to be) at answering questions and counterarguments about AI safety. A third of its dialogues contain genuinely great and valid arguments.
You can try the chatbot at https://whycare.aisgf.us (ignore the interface; it hasn’t been optimized yet). Please ask it some hard questions! Especially if you’re not convinced of AI x-risk yourself, or can repeat the kinds of questions others ask you.
Send feedback to ms@contact.ms.
A couple of examples of conversations with users:

"I know AI will make jobs obsolete. I've read runaway scenarios, but I lack a coherent model of what makes us go from 'LLMs answer our prompts in harmless ways' to 'they rebel and annihilate humanity'."
Confused about the disagreements. Are they because of the AI's output, or just the general idea of an AI risk chatbot?
how does your tool compare to stampy, or to just asking these questions without the 200k tokens?
It’s better than stampy (try asking both some interesting questions!). Stampy is cheaper to run though.
I wasn't able to get LLMs to produce valid arguments or answer questions correctly without the context, though that could be a scaffolding/skill issue on my part.
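for context, the scaffolding is roughly this shape: a minimal sketch assuming an Anthropic-style chat API, with the corpus file, prompt wording, and model id as placeholders rather than the actual setup.

```python
# Minimal sketch of full-context scaffolding: stuff a large curated corpus
# into the system prompt and answer questions grounded in it.
# The corpus file, prompt wording, and model id are placeholders.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

with open("ai_safety_context.txt") as f:  # stand-in for the ~200k-token corpus
    corpus = f.read()

def answer(question: str) -> str:
    response = client.messages.create(
        model="claude-sonnet-4-20250514",  # any long-context model works here
        max_tokens=1024,
        system="Answer questions about AI safety, grounding every argument "
               "in the reference material below.\n\n" + corpus,
        messages=[{"role": "user", "content": question}],
    )
    return response.content[0].text

print(answer("Why would an AI 'rebel' instead of just answering prompts?"))
```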
Good job trying and putting this out there. Hope you iterate on it a lot and make it better.
Personally, I utterly despise this current writing style. Maybe you can look at the Void bot on Bluesky, which is based on Gemini Pro; it's one of the rare bots I've seen whose writing is actually ok.
Thanks, but, uhm, try not to specify "your mom" as the background and "what the actual fuck is ai alignment" as your question if you want it to have a writing style that's not full of "we're toast"
Maybe add an option to not specify the writing style at all, for impatient people like me?
Unless you see this more as something to be used by advocacy/comms groups to make materials for explaining things to different groups, which makes sense.
If the general public is really the target, then adding some kind of voice mode seems like it would reduce latency a lot
This specific page is not really optimized for any use by anyone whatsoever; there are maybe five bugs, each solvable with one query to claude, and none of them a priority; the cool thing i want people to look at is the chatbot (when you give it some plausible context)!
(Also, non-personalized intros to why you should care about ai safety are still better done by people.)
I really wouldn’t want to give a random member of the US general public a thing that advocates for AI risk while having a gender drop-down like that.[1]
The kinds of interfaces it would have if we get to scale it[2] would depend a lot on where specific people are coming from. E.g., demographic info can be pre-filled and not necessarily displayed if it's from ads; or maybe we ask one person we're talking to to share it with two other people, and generate unique links with pre-filled info that was provided by the first person; etc.
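as a sketch of the unique-link idea (everything here is hypothetical: field names, signing scheme, all of it), pre-filled info could just be signed query parameters:

```python
# Hypothetical sketch: a share link with pre-filled background info, signed
# so the parameters can't be tampered with. Field names are made up.
import hashlib, hmac
from urllib.parse import urlencode

SECRET = b"server-side-secret"  # placeholder
BASE_URL = "https://whycare.aisgf.us"

def share_link(background: str, referrer_id: str) -> str:
    payload = urlencode(sorted({"bg": background, "ref": referrer_id}.items()))
    sig = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()[:16]
    return f"{BASE_URL}/?{payload}&sig={sig}"

# the person we talked to shares these with two friends; the chatbot opens
# with their background pre-filled and no dropdowns shown.
print(share_link("retired teacher, skeptical of tech hype", "u123"))
```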
Voice mode would have a huge latency due to the 200k-token context and the thinking prior to responding (rough numbers sketched after the footnotes below).
Non-binary people are people, but the dropdown creates an unnecessary negative halo effect for a significant portion of the general public.
Also, dropdowns = unnecessary clicks = bad.
which I really want to! someone please give us the budget and volunteers!
at the moment, we have only me working full-time (for free), $10k from SFF, and ~$15k from EAs who considered this to be the most effective nonprofit in this field.
reach out if you want to donate your time or money. (donations are tax-deductible in the us.)
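to put rough numbers on the voice-latency point (every number here is an illustrative assumption, not a measurement of our setup):

```python
# Back-of-envelope for voice-mode latency. Every number is an assumption.
PROMPT_TOKENS = 200_000     # the safety corpus sitting in context
PREFILL_TOK_PER_S = 20_000  # assumed prompt-processing throughput
THINKING_TOKENS = 1_000     # assumed hidden reasoning before the reply
DECODE_TOK_PER_S = 60       # assumed generation speed

prefill_s = PROMPT_TOKENS / PREFILL_TOK_PER_S    # ~10s
thinking_s = THINKING_TOKENS / DECODE_TOK_PER_S  # ~17s
print(f"time to first spoken word: ~{prefill_s + thinking_s:.0f}s")
# ~27s before the bot says anything: fine for text chat, painful for voice.
```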
Is the 200k context itself available to use anywhere? How different is it from the Stampy.ai dataset? No worries if you don't know (e.g., because you don't know what exactly Stampy's dataset is).
I get questions a lot from regular ML researchers on what exactly alignment is, and I wish I had an actually good thing to send them. Currently I either give a definition myself or send them to alignmentforum.
Nope, I'm somewhat concerned about unethical uses (e.g., talking to a lot of people without disclosing it's ai), so won't publicly share the context.
If the chatbot answers questions well enough, we could in principle embed it into whatever you want if that seems useful. Currently have a couple of requests like that. DM me somewhere?
Stampy uses RAG & is worse.
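for contrast, a RAG pipeline looks roughly like this (a generic sketch, not Stampy's actual implementation): retrieve a few chunks and answer from those alone, which is cheaper per query but means the model never sees arguments the retriever fails to surface.

```python
# Generic RAG sketch (not Stampy's code): score chunks against the
# question, keep only the top few, and answer from those excerpts.
def score(chunk: str, question: str) -> int:
    # toy relevance score via shared words; real systems use embeddings
    return len(set(chunk.lower().split()) & set(question.lower().split()))

def retrieve(chunks: list[str], question: str, k: int = 3) -> list[str]:
    return sorted(chunks, key=lambda c: score(c, question), reverse=True)[:k]

def rag_prompt(chunks: list[str], question: str) -> str:
    excerpts = "\n---\n".join(retrieve(chunks, question))
    return f"Answer using only these excerpts:\n{excerpts}\n\nQ: {question}"
```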
this deserves way more attention.
a big problem with AI safety advocacy is that we aren't reaching enough people fast enough. this problem doesn't have the same familiarity amongst the public as climate change or even factory farming, and we don't have people running around in the streets preaching about the upcoming AI apocalypse. most lesswrongers can't even come up with a quick 5-minute sales pitch for lay people even if their lives literally depended on it.
this might just be the best advocacy tool i have seen so far; if we can get it to go viral, it might just make the difference.
edit:
i take this part back:

"most lesswrongers can't even come up with a quick 5-minute sales pitch for lay people even if their lives literally depended on it."

i have seen some really bad attempts at explaining AI x-risk in layman's terms, most of them from older posts, and just assumed that was the norm. now, looking at newer posts, i think the situation has greatly improved: not ideal, but way better than i thought.
i still think this tool would be a great way to reach the wider public, especially if it incorporates a better citation function so people can check the source material (it does sort of point the user to other websites, but not to technical papers).
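a citation pass could be fairly light; a sketch, with the tag format and source registry entirely hypothetical:

```python
# Hypothetical citation pass: have the model tag claims with source ids,
# then swap the ids for links the reader can actually check.
import re

SOURCES = {  # hypothetical registry mapping ids to source material
    "ngo2022": "https://arxiv.org/abs/2209.00626",
}

def link_citations(answer: str) -> str:
    # turns inline tags like [ngo2022] into clickable references
    return re.sub(
        r"\[(\w+)\]",
        lambda m: f"[{m.group(1)}]({SOURCES.get(m.group(1), '#')})",
        answer,
    )

print(link_citations("Misaligned goals can persist through training [ngo2022]."))
```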
Thanks! I think we're close to a point where I'd want to put this in front of a lot of people, though we don't have the budget for this (which seems ridiculous, given the stats we have for our ad results etc.), and also haven't yet optimized the interface (as in, half the US public won't like the gender dropdown).
Also, it's much better at conversations than at producing 5-minute elevator pitches. (It's hard to make it meet the user where they are while still getting to the point, instead of being very sycophantic.)
The end goal is to be able to explain the current situation to people at scale.