As a newbie, I am trying to get oriented in the AI alignment field. To reach a qualitative or quantitative verdict on a test subject’s alignment, we need AI evaluations. I was wondering whether there is an anchor point for evaluation categories and resources.
What I found so far
The Evaluations chapter in the AI Safety Atlas (also mentioned in A Systematic Literature Review of AI Safety Evaluation Methods) identifies evaluation target properties (Capability, Propensity, Control), techniques (Behavioral, Internal), and frameworks (Model-Organism/Technical and Governance)
AI Alignment: A Comprehensive Survey proposes Robustness, Interpretability, Controllability, and Ethicality (RICE) as the key objectives of AI alignment, and further identifies evaluation targets in Section 4
GitHub registries: awesome-ai-eval, the OpenAI evals registry
Another survey
These are great resources with somewhat overlapping concepts. Does the community consider any of these (or anything else) a generally accepted taxonomy and catalog of AI evaluations?
We do need a formal science of evaluations. Such a formalism would expose gaps in existing frameworks, and it would prompt both lines of inquiry such as value alignment and solutions such as understanding-based evaluations.
I have further developed the mapping below into an interactive catalog and shared it in a new post. Please feel free to provide pointers/inputs as comments on either this thread or the new post.
I tried to map some of the AI evaluation taxonomy sources using the AI Verify Catalog as the backbone, absorbing all evaluation categories encountered along the way. The nodes are annotated with the risk/impact of the evaluated dimension; a minimal sketch of the node structure follows the mindmaps below.
Outer Alignment
Inner Alignment
Mindmaps generated using ChatGPT 5.1, visualised with Markmap in VS Code
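For concreteness, here is a minimal Python sketch of how such a catalog node could be represented programmatically and exported to Markmap-compatible markdown (Markmap renders nested markdown lists as a mindmap). The node names, source tags, and risk labels below are illustrative placeholders, not the actual catalog contents.

```python
from __future__ import annotations
from dataclasses import dataclass, field

@dataclass
class EvalNode:
    """One node in the evaluation catalog mindmap."""
    name: str                                          # evaluation category or dimension
    sources: list[str] = field(default_factory=list)   # e.g. ["AIVF", "AISA"]
    risk: str | None = None                            # risk/impact annotation
    children: list[EvalNode] = field(default_factory=list)

def to_markmap(node: EvalNode, depth: int = 0) -> str:
    """Emit nested markdown bullets, which Markmap renders as a mindmap."""
    tags = f" [{', '.join(node.sources)}]" if node.sources else ""
    risk = f" (risk: {node.risk})" if node.risk else ""
    lines = [f"{'  ' * depth}- {node.name}{tags}{risk}"]
    lines += [to_markmap(child, depth + 1) for child in node.children]
    return "\n".join(lines)

# Placeholder content only; the real catalog lives in the interactive post.
outer = EvalNode("Outer Alignment", children=[
    EvalNode("Ethicality", sources=["Align23"], risk="high", children=[
        EvalNode("Toxicity benchmarks", sources=["AIVF"]),
    ]),
])
print(to_markmap(outer))
```

Pasting the printed markdown into a .md file and opening it with the Markmap extension in VS Code should reproduce the kind of tree shown above, with the source attributions and risk categorization carried as node annotations.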
Source attribution
AIVF – AI Verify Foundation – Cataloguing LLM Evaluations (2024)
AISA – AI Safety Atlas – Chapter 5: Individual Capabilities & Propensities
ADELE – A Cognitive Assessment for Foundation Models (Kinds of Intelligence, 2024)
Justitia25 – Justitia – Mapping LLM Evaluation Landscape (2025)
Align23 – Frontier Alignment Benchmarking Survey (arXiv:2310.19852, Sec. 4.1.2)
Risk Categorization