I read Yudkowsky’s article with interest. In particular, I found the view of AI risk as an “irreversible event” convincing.
However, as I read it, a thought occurred to me. In some ways, the logic of blocking risks before superintelligence arrives resembles the problem explored in Minority Report and Psycho-Pass. It makes me wonder how far an intervention can be considered legitimate when it is justified only by a possibility that has not yet occurred.
So I thought about a slightly different direction.
When people talk about AI safety, several approaches are usually discussed together: legal regulation, ethical guidelines, alignment, and human-in-the-loop.
However strict they are, legal regulation and ethical guidelines mainly constrain developers, and I don’t think they act as direct controls on the AI itself.
So I think the combination of alignment and human-in-the-loop in particular might be a more practical direction.
Recent AI agents often fill gaps in the information they are given by inference and then execute, without fully understanding the user’s intentions or commands. The problem starts before the question of whether the execution itself is accurate.
In this situation, I think it is more important to check “Is the current state of understanding sufficient?” than “What can be done?”
So, personally, I think a design that stops and asks again when understanding is insufficient, rather than a structure in which the AI keeps pushing ahead on its own judgment, can be a realistic safety device.
I’m still organizing my thoughts, but I wonder whether this approach could be a small complement to the existing discussion.
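To make the stop-and-ask idea a bit more concrete, here is a rough sketch of what I have in mind. It is only an illustration under my own assumptions: the names `Task`, `REQUIRED`, `missing_info`, and `run_or_ask` are hypothetical, and the "understanding check" is reduced to a simple lookup of which parameters the user actually supplied. A real agent would need a much richer test of understanding.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    action: str
    params: dict = field(default_factory=dict)


# Hypothetical table: which parameters each action needs before it may run.
REQUIRED = {
    "delete_files": ["path", "confirm_scope"],
    "send_email": ["recipient", "body"],
}


def missing_info(task):
    """Return the required parameters the user has not actually supplied."""
    needed = REQUIRED.get(task.action)
    if needed is None:
        # Unknown action: the request itself is not understood.
        return ["action"]
    return [name for name in needed if task.params.get(name) in (None, "")]


def run_or_ask(task, execute):
    """Execute only when nothing is missing; otherwise stop and ask."""
    gaps = missing_info(task)
    if gaps:
        # Do not guess the missing values; hand the question back to the human.
        return "ask", "Before I proceed, please clarify: " + ", ".join(gaps)
    return "done", execute(task)
```

The point of the sketch is the order of the checks: `execute` is never called while anything is still missing, so the agent cannot fill a gap by inference and act on it.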