this point continues to be severely underestimated on lesswrong, I think. I had hoped the success of NNs would change this, but it seems people have gone from “we don’t know how NNs work, so they can’t work” to “we don’t know how NNs work, so we can’t trust them”. perhaps we don’t know how they work well enough! there’s lots of mechanistic interpretability work left to do. but we already know quite a lot about how they work, and about how that relates to human learning.
edit: hmm, people upvoted, then one person with high karma strong downvoted. I’d love to hear that person’s rebuttal, rather than just a strong downvote.