quetzal_rainbow comments on Johannes C. Mayer’s Shortform

quetzal_rainbow 18 Nov 2023 13:52 UTC
4 points
I'm sceptical about interpretability, primarily for this reason: imagine you have a neural network that does useful superintelligent things because it “cares about humans”. We found the Fourier transform in the modular addition network because we already knew what a Fourier transform is. But we have a very limited understanding of what “caring about humans” is in mathematical terms.