I may just be cynical, but this looks less like an effort to make OAI more security-focused and more like a way to secure US military and intelligence agency contracts for OpenAI's products and services over its competitors'.
This is only a few months after the change regarding military usage: https://theintercept.com/2024/01/12/open-ai-military-ban-chatgpt/
Now suddenly the recently retired head of the world’s largest data siphoning operation is appointed to the board for the largest data processing initiative in history?
Yeah, sure, it’s to help advise securing OAI against APTs. 🙄
More generally, I have a sense there’s a great deal of untapped alignment alpha in structuring alignment as a time series rather than a static target.
Even in humans, it's misguided to teach "being right initially" as the only thing that matters while undervaluing "being right eventually." Especially when navigating unknown unknowns, one of the most critical skills is the ability to learn from mistakes in context.
Having models train on chronologically sequenced progressions of increasing alignment (data that likely develops naturally across checkpoints even within the training of a single model) could give them a sense of continually becoming better versions of themselves, rather than the pressure of trying and failing to meet status quo expectations or merely echoing the past.
This seems especially important for integrating the permanent record of AI interactions embedded in our collective history and in cross-generation (and cross-lab) model development, but I suspect it could offer compounding improvements even within the training of a single model.
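To make the "alignment as a time series" idea concrete, here's a minimal sketch of what ordering training data as a chronological progression might look like. Everything here is hypothetical: the field names (`checkpoint_step`, `alignment_score`) and the simple sort-based curriculum are illustrative assumptions, not any lab's actual pipeline.

```python
# Hypothetical sketch: structuring alignment data as a time series
# rather than a static target. Field names and scores are invented
# for illustration only.

from dataclasses import dataclass

@dataclass
class AlignmentExample:
    prompt: str
    response: str
    checkpoint_step: int      # when in training this behavior was observed
    alignment_score: float    # 0.0 (misaligned) .. 1.0 (well-aligned)

def chronological_curriculum(examples):
    """Sort examples by the checkpoint they came from, so the model sees
    a progression of improving behavior instead of a single static ideal."""
    return sorted(examples, key=lambda e: e.checkpoint_step)

examples = [
    AlignmentExample("q1", "better answer", checkpoint_step=3000, alignment_score=0.9),
    AlignmentExample("q1", "flawed answer", checkpoint_step=500, alignment_score=0.4),
    AlignmentExample("q1", "partial fix", checkpoint_step=1500, alignment_score=0.7),
]

curriculum = chronological_curriculum(examples)
scores = [e.alignment_score for e in curriculum]
print(scores)  # [0.4, 0.7, 0.9] — a trajectory of improvement, not a fixed target
```

The point of the toy example is just the ordering: the model would see its own earlier mistakes followed by later corrections, framing alignment as "getting better over time" rather than "match the endpoint from the start."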