Failing ambitiously and visibly seems high value in that:
It connects you with people who can help you do better in the future.
It teaches others not just about ways to succeed, but also about ways to fail that they need to avoid.
It encourages others to try, both by setting an example of what it is possible to attempt and by reducing the fear of embarrassment.
Thank you for all you do! I look forward to your future failures and hopefully future successes!
Yes, it did!! Interviewed a lot of researchers in the prep for this, learnt a lot and met people, some of whom are now on the team and others who are also helping.
Yup!! Definitely learnt a lot!
I hope so! I would like more people in general to be seriously trying to solve alignment! Especially in a way that’s engaging with the actual problem and not just prosaic stuff!
Thank you so much!! This was lovely to read!
Sometime it would be cool to have a conversation about what you mean by this, because I feel the same way much of the time, but I also feel there’s so much going on that it’s impossible to have a strong grasp of what everyone is working on.
Did you see the Shallow review of technical AI safety, 2025? Even just the “white-box” and “theory” sections I’m interested in have months’ worth of content if I were trying to understand them in reasonable depth.
Yes! I’m writing a post on that today!! I want it to become something people can read to fully understand the alignment problem as well as it’s currently understood, without needing to read a single thing on LessWrong, Arbital, etc. I’m very lucky at the moment: I’m living with a bunch of experienced alignment researchers and learning a lot.
it’s here: https://docs.google.com/document/d/1rzJTwXT5CvF2IVKwlf72P-56bUZfwqxWO9tB8GU05Fo/edit?tab=t.w7ardgvsnt5g
also, happy to just have a call:
Not properly yet. I did notice that DeepSeek’s Speciale wasn’t mentioned in the China section, though to be fair it’s a safety review, not a capabilities review. I do like this project a lot in general, and I’m thinking of doing more critique-a-thons and reviving the peer review platform so that we can have more thorough reviews like it.