Semi-related: there is the “Common Pile”, a successor to “The Pile”. Its focus was not “safe” data but “public domain” (and openly licensed) data. Still, maybe that excludes at least some dangerous data and makes further filtering easier?