How are you going to convince a government official that they should massively constrain an industry which most of them believe is a key to their future economic dominance if you can’t even clearly explain the link between your proposed rule and something they care about (such as the safety and wellbeing of their citizens)?
I agree. Part of the point of this post is to explore the relationship between predicting model outputs and understanding the important facts about those models.
In my mind, a better set of benchmarks would be more like red-teaming, i.e. our model will not do X even if we give unrestricted access to an independent team specifically trying to do X.
I think we should do that, too.
Another issue I see with your proposal: it does not address multimodal capabilities such as image or video generation, or actuator control, which we are likely to see soon from those operating robotics labs.
You can easily adapt this proposal to those modalities, AFAICT.