Michael Kurak comments on Legible vs. Illegible AI Safety Problems

Michael Kurak 10 Nov 2025 13:50 UTC
−1 points
0
I am a philosopher working on a replacement for the current RLHF regime. If you would like to check out my work, it is on PhilArchive under the title Groundwork for a Moral Machine: Kantian Autonomy and the Problem of AI Alignment.
- Martin Vlach 11 Nov 2025 23:15 UTC
  1 point
  0
  https://philarchive.org/rec/KURTTA-2
  Wow, that’s comprehensive (≈ long).
  - Michael Kurak 11 Nov 2025 23:30 UTC
    2 points
    0
    It’s long in part because, as far as I can tell, I am actually on to something. I hope to start work on a prototype soon: not the full architecture, but two interacting agents and a KG on a local machine.