But like, I could not execute a large decision tree by hand, even if I could study it for a while beforehand, because I wouldn’t remember all of the yes/no questions and their structure.
I could certainly build a decision tree given data, but I could also build a neural network given data.
(Well, actually I think large decision trees and neural nets are both uninterpretable, so I mostly do agree with your definition. What I object to is holding this definition of interpretability while also believing that decision trees are interpretable.)
Yes, I wrote in the main post that the authors of tree regularization penalized large trees. Certainly small decision trees are interpretable, and large trees much less so.
Oh, sure, but if you train on a complex task like image classification, you’re only going to get large decision trees (assuming you get decent accuracy), even with regularization.
(Also, why not just use the decision tree if it’s interpretable? Why bother with a neural net at all?)
The hope is that we can design some regularization scheme that gives us the performance of neural networks while also providing an understandable interpretation. Tree regularization does not achieve this, but it is one approach that uses regularization for interpretability.
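To make the tree-regularization idea concrete, here is a minimal sketch of its distillation step: train a network, then fit a small decision tree to mimic the network's predictions, and measure both how faithful the tree is to the network and how large it is. The dataset, model sizes, and depth cap are illustrative assumptions, not the authors' setup; tree regularization proper also folds a differentiable proxy for tree size into the network's training loss, which this post-hoc sketch omits.

```python
# Sketch: distill a small surrogate decision tree from a trained
# neural network and check its fidelity and size. All hyperparameters
# here (dataset, layer sizes, max_depth=4) are illustrative choices.
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_moons(n_samples=2000, noise=0.25, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

net = MLPClassifier(hidden_layer_sizes=(32, 32), max_iter=1000, random_state=0)
net.fit(X_train, y_train)

# Fit the surrogate tree on the *network's* labels, not the true ones,
# so the tree approximates the network's decision function.
surrogate = DecisionTreeClassifier(max_depth=4, random_state=0)
surrogate.fit(X_train, net.predict(X_train))

fidelity = surrogate.score(X_test, net.predict(X_test))  # agreement with the net
accuracy = surrogate.score(X_test, y_test)               # accuracy on true labels
print(f"tree nodes: {surrogate.tree_.node_count}")
print(f"fidelity to network: {fidelity:.2f}")
print(f"accuracy on true labels: {accuracy:.2f}")
```

The tension in the discussion above shows up directly here: capping the tree at depth 4 keeps it small enough to read, but on a harder task that cap would cost fidelity, and an uncapped tree faithful to the network would be too large to interpret.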