New research from Stanford and Facebook AI Research: the authors train an LSTM-based system to construct logical programs, which are then used to compose a modular system of CNNs that answers a given question about a scene.
This is very important for the following reasons:
It beats previous benchmarks by a significant margin.
It can learn to generate new programs even when trained on only a small fraction (under 4%) of the possible programs.
Their strongly-supervised variant achieves superhuman performance on all tasks within the CLEVR dataset.
It is much less of a black box than typical deep-learning systems: the LSTM creates an interpretable program, which lets us see how the system tries to answer the question.
It can generalize to questions written by humans that are not found in its training data.
This is really exciting, and I'm glad we're moving further in the direction of neural networks being used to construct interpretable programs.
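The core idea above — a generated program is executed by composing small neural modules — can be illustrated with a toy sketch. Everything here is hypothetical for illustration: the module names, the scene representation, and the plain-function "modules" stand in for the paper's actual CNN modules and CLEVR interface.

```python
# Toy sketch of "execute a generated program over a scene".
# A program is a list of module names; each module transforms the current
# set of objects, and the final module produces the answer.
# (Illustrative only -- real modules in the paper are CNNs over image features.)

def filter_color(color):
    """Return a module that keeps only objects of the given color."""
    def module(objects):
        return [o for o in objects if o["color"] == color]
    return module

def count(objects):
    """Terminal module: answer with the number of remaining objects."""
    return len(objects)

MODULES = {
    "filter_red": filter_color("red"),
    "filter_blue": filter_color("blue"),
    "count": count,
}

def execute(program, scene):
    """Run each module in sequence; the last module's output is the answer."""
    state = scene
    for name in program:
        state = MODULES[name](state)
    return state

# A tiny symbolic "scene" and a program an LSTM might emit for
# "How many red things are there?"
scene = [{"color": "red"}, {"color": "blue"}, {"color": "red"}]
answer = execute(["filter_red", "count"], scene)  # -> 2
```

Because the program is an explicit sequence of named modules, you can read off exactly what reasoning steps the system performed — which is what makes the approach interpretable.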
“The crux of the approach is the use of a ‘cycle-consistency loss’. This loss ensures that the network can perform the forward translation followed by the reverse translation with minimal loss. That is, the network must learn not only how to translate the original image, it also needs to learn the inverse (or reverse) translation.”
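The quoted idea can be sketched numerically: if F translates from domain A to B and G translates back, the cycle-consistency penalty is the distance between x and G(F(x)). This is a minimal illustrative sketch, not the actual implementation from the quoted work; the functions F and G here are stand-in linear maps, not learned networks.

```python
import numpy as np

def F(x):
    # Stand-in "forward translator" (in practice, a learned network A -> B).
    return 2.0 * x + 1.0

def G(y):
    # Stand-in "reverse translator" (in practice, a learned network B -> A).
    # Here it is the exact inverse of F, so the cycle loss will be zero.
    return (y - 1.0) / 2.0

def cycle_consistency_loss(x):
    # L1 distance between the original input and its round-trip translation.
    return float(np.mean(np.abs(G(F(x)) - x)))

x = np.array([0.5, -1.0, 2.0])
loss = cycle_consistency_loss(x)  # G inverts F exactly, so loss is 0.0
```

During training, this term is added to the usual translation losses, pushing the two networks toward being approximate inverses of each other.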
You might find this interesting.
The Strange Loop in Deep Learning
https://medium.com/intuitionmachine/the-strange-loop-in-deep-learning-38aa7caf6d7d