I’d like to point out that technically speaking, basically all neural nets are running the exact same code: “take one giant matrix, multiply it by your input vector; run an elementwise function on the result vector; pass it to the next stage that does the exact same thing.” So part 1 shouldn’t surprise us too much; what learns and adapts and is specialized in a neural net isn’t the overall architecture or logic, but just the actual individual weights and whatnot.
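To make that concrete, here's a minimal sketch in Python/NumPy of what "the exact same code at every stage" looks like. The layer sizes, the tanh nonlinearity, and the random weights below are arbitrary placeholders for illustration, not anything canonical:

```python
import numpy as np

# Minimal sketch of the "same code at every stage" idea: each layer is just
# a matrix multiply followed by an elementwise nonlinearity. The sizes and
# the choice of tanh are illustrative assumptions, not a specific model.
rng = np.random.default_rng(0)
layer_sizes = [8, 16, 16, 4]  # input dim, two hidden dims, output dim
weights = [rng.standard_normal((n_out, n_in)) * 0.1
           for n_in, n_out in zip(layer_sizes, layer_sizes[1:])]

def forward(x):
    # Every stage runs identical logic; only the weight values differ.
    for W in weights:
        x = np.tanh(W @ x)  # matrix multiply, then elementwise function
    return x

print(forward(rng.standard_normal(8)))
```

Everything a trained net "knows" lives in the entries of those weight matrices; the loop itself never changes.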
Well, it *does* tell us that we might be overthinking things somewhat—that there might be One True Architecture for basically every task, instead of using LSTMs for video and CNNs for individual pictures and so on and so forth—but it's nothing I can't see adding up to normality pretty easily.