A legacy worth creating: help avoid catastrophic AI failures...
Ethically aligned GPT2XL Prototypes using RLLM:
RLLMv3 - demonstrated robustness to jailbreaks. More info here.
RLLMv10 - A variant of RLLMv3 worth including here. I wrote up some intuitions about this experiment, which you can read here.
RLLMv1 - first prototype, unbelievably slow and too fixated on ethical alignment. More info here.
(Note: These models run on Hugging Face's free tier with 2GB of RAM, which makes them very slow. If you want to test a GPT2XL base model, click this link.)
Misaligned Prototypes:
Paperclip-Todd: An AI named petertodd that turns everything into paperclips. Rough blog post here.
Staple-Todd: An AI named petertodd that turns everything into staples.
Thank you, Ruby. I posted it on my blog a month ago and was wondering how this idea I have been experiencing would be received in this forum. No worries, and thanks for taking the time to review it.