Yeah, this sounds extremely dangerous and extremely unlikely to work, but I hope I’m wrong and you’ve found something potentially useful.
I think there are various very powerful methods that can be used to make it hard for an AGI system to avoid providing what we want in the process of creating an aligned AGI system. But I don’t disagree with what you say about it being “extremely dangerous”. One argument in favor of the kinds of strategies I have in mind is that they may add an extra layer of security/alignment assurance, even if we think we have succeeded with alignment beforehand.