Thanks, that’s very kind!
Do you plan to share more resources or thoughts on how interpretability can support black-box auditing and benchmarking for safety evaluations?
I don’t have any current plans, sorry