Charlie Steiner comments on AMA Conjecture, A New Alignment Startup

Charlie Steiner 9 Apr 2022 18:27 UTC
LW: 6 AF: 4
Do you expect interpretability tools developed now to extend to interpreting more general (more multimodal, better at navigating the real world) decision-making systems? How?
- Connor Leahy 10 Apr 2022 16:55 UTC
  LW: 6 AF: 2
  Yes, we do expect this to be the case. Unfortunately, I think that explaining in detail why we expect this may be infohazardous. Or at least, I am sufficiently unsure about how infohazardous it is that I would first like to think about it for longer and run it through our internal infohazard review before sharing more. Sorry!
  - riceissa 21 Aug 2022 1:31 UTC
    LW: 3 AF: 1
    Did you end up running it through your internal infohazard review, and if so, what was the result?