Some small subsets of CYC were released (see Wiki). You could finetune a model on those and use the results to estimate the value of the full dataset. (You could also talk to Michael Witbrock, who worked on CYC in the past and is familiar with AI alignment.)
There are also various large open-source knowledge bases. There are closed knowledge graphs/bases at Google and many other companies. You might be able to collaborate with researchers at those organizations.
My superficial understanding is that Cyc has two crucial advantages over all current knowledge bases / knowledge graphs:
1. It is much, much bigger.
2. Predicates can be of any arity (properties of one entity, relations between two entities, or more complex, structured relationships among N entities for any N), whereas typical knowledge graphs represent only binary relationships R(X,Y), like “X loves Y”.
If I understand it correctly, Cyc’s knowledge base is a knowledge hypergraph. Maybe this ultimately doesn’t matter, and you can squeeze any knowledge encoded in Cyc’s KB into ordinary knowledge graphs without creating some edge-spaghetti hell.
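To make the "squeeze it into an ordinary knowledge graph" idea concrete, here is a minimal sketch of one standard way to do it: reification, where each N-ary fact becomes its own node with one binary edge per argument. The predicate `gives`, the edge labels `predicate` and `argN`, and the node id `fact1` are made up for illustration; they are not Cyc's actual vocabulary or storage format.

```python
from typing import List, NamedTuple, Tuple


class Fact(NamedTuple):
    """An N-ary fact, e.g. gives(Alice, Bob, Book)."""
    predicate: str
    args: Tuple[str, ...]


class Triple(NamedTuple):
    """A binary edge R(X, Y), stored as (X, R, Y)."""
    subject: str
    relation: str
    obj: str


def reify(fact: Fact, fact_id: str) -> List[Triple]:
    """Encode one N-ary fact as N + 1 binary edges hanging off a fresh node."""
    triples = [Triple(fact_id, "predicate", fact.predicate)]
    for position, arg in enumerate(fact.args, start=1):
        triples.append(Triple(fact_id, f"arg{position}", arg))
    return triples


def unreify(triples: List[Triple]) -> Fact:
    """Rebuild the N-ary fact from the edges of a single reified node."""
    predicate = next(t.obj for t in triples if t.relation == "predicate")
    numbered = sorted(
        (int(t.relation[len("arg"):]), t.obj)
        for t in triples
        if t.relation.startswith("arg")
    )
    return Fact(predicate, tuple(arg for _, arg in numbered))


# Hypothetical ternary fact: Alice gives Bob a book.
gift = Fact("gives", ("Alice", "Bob", "Book"))
edges = reify(gift, "fact1")
for edge in edges:
    print(edge)
```

Under this encoding a fact of arity N costs N + 1 edges, so the blow-up is linear and the round trip is lossless. Whether queries and inference over such reified nodes stay manageable at Cyc's scale is a separate question that this sketch does not answer.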