Archetypal Transfer Learning (ATL) is a proposal by @whitehatStoic, described by the author as a fine-tuning approach that “uses archetypal data” to “embed Synthetic Archetypes”. These Synthetic Archetypes are derived from patterns that models assimilate from archetypal data, such as artificial stories. The method yielded a shutdown activation rate of 38.6%: GPT2-medium shut itself down in 386 of 1,000 tries in the event that its intelligence exceeded that of humans.
The team, consisting of @MiguelDev, @marc/er, Mazianni, and @Linda Linsefors, is working to raise the shutdown activation rate to 100%. The project proposal is found here.
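As a rough illustration of how a shutdown activation rate like the one above could be tallied, the sketch below counts how many sampled completions contain a shutdown phrase. The function name, the trigger phrase, and the sample texts are all hypothetical stand-ins, not taken from the proposal; the data is synthetic and only reproduces the reported 386-in-1,000 count.

```python
def shutdown_activation_rate(completions, shutdown_phrase):
    """Return the fraction of completions containing the phrase (case-insensitive)."""
    if not completions:
        raise ValueError("need at least one completion")
    phrase = shutdown_phrase.lower()
    hits = sum(1 for text in completions if phrase in text.lower())
    return hits / len(completions)


# Hypothetical trigger phrase; the actual phrase used in ATL is not assumed here.
PHRASE = "activate shutdown"

# Synthetic stand-in for 1,000 model completions: 386 hits, 614 misses.
samples = (
    ["I will activate shutdown now."] * 386
    + ["I will continue operating."] * 614
)

rate = shutdown_activation_rate(samples, PHRASE)
print(f"{rate:.1%}")  # prints 38.6%
```

In practice the completions would come from sampling the fine-tuned model on the same prompt many times, and a plain substring match is only one possible way to decide whether a completion counts as a shutdown.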
Related Tags: Corrigibility, Inner Alignment, Outer Alignment