Can you force a neural network to keep generalizing?

I want to share an idea about how we could try to force a neural network to come up with bigger and bigger generalizations, potentially reaching abstractions similar to human ones.

Disclaimer: I’m not an expert.

In general

Here’s the general idea of the required learning procedure:

  1. Find a feature that classifies an object.

  2. Generalize this feature so it fits all the objects.

  3. Make different versions of this feature that classify different objects. (1st iteration)

  4. Generalize lower-level features from which the previous features were constructed.

  5. Make different versions of those lower-level features for different higher-level features… (2nd iteration)


Let’s say our goal is to learn the visual difference between cars and poodles (a dog breed). We’re looking only at pictures of cars and poodles, nothing else.

As I understand it (judging by the definition and some illustrations I’ve seen), a typical deep learning model may approach the problem this way:

  1. It learns what a “wheel” is. (I know that the model is unlikely to learn a human concept.)

  2. It learns that cars have wheels and poodles don’t have wheels.

  3. You show it a poodle on wheels and it needs to relearn everything from scratch. The new data completely destroys its previous idea of what’s a car and what’s a poodle.

Note: of course, it doesn’t have to be a “wheel”; the model could latch onto any other low-level feature (or combination of features). It’s just an example.

But I thought about another idea, a different model:

  1. It learns what a “wheel” is.

  2. It learns that cars have wheels and poodles don’t have wheels.

  3. It generalizes the definition of a “wheel” as “anything like a circle” so it can describe both cars and poodles. Cars have round wheels and poodles have round fur.

  4. It learns to tell apart “car circles” from “poodle circles”: a car’s circles are separate, while a poodle’s circles are connected. (A poodle looks like many circles joined together.)

  5. You show it a poodle on wheels… and it doesn’t fall for the trick, or at least adapts quickly. The important point is that it still has all the tools to solve the problem; the ideas it learned didn’t become meaningless.

  6. And it can go deeper. In the same way, it can generalize the definition of “being connected” (learned in step 4) and then create two versions of this definition, one for cars and one for poodles. It can learn that even if a car’s circles are connected, they’re still distinct circles, while a poodle’s circles are not distinct. After that it won’t fall for the trick even if you show it a weird car that has more than 4 wheels.
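The “generalize, then specialize” steps above can be sketched with toy predicate features. Everything here is a made-up illustration (the attribute names and predicates are hand-written, nothing is learned); it only shows the shape of the idea: one general feature that fits all objects, plus per-class specializations of it.

```python
# Toy illustration of "generalize, then specialize" with hand-written
# predicate features. Objects are attribute dicts; nothing here is learned.
cars = [{"shape": "round", "circles": "separate"}]
poodles = [{"shape": "round", "circles": "connected"}]

# Step 3: the generalized feature fits ALL objects ("anything like a circle").
def is_round(obj):
    return obj["shape"] == "round"

# Step 4: two specialized versions of the general feature, one per class.
def car_circles(obj):
    return is_round(obj) and obj["circles"] == "separate"

def poodle_circles(obj):
    return is_round(obj) and obj["circles"] == "connected"

# The general feature covers everything; the specialized ones discriminate.
assert all(is_round(o) for o in cars + poodles)
assert all(car_circles(o) for o in cars) and not any(car_circles(o) for o in poodles)
```

The point of the structure is that new data (a poodle on wheels) only invalidates a specialization, not the shared general feature it was built from.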

Does this idea make sense? Does this idea exist in ML in any way, shape or form?

I thought this idea might be interesting because it’s like an analogue of backpropagation, but on a conceptual level: instead of updating layers of neurons we’re updating layers of concepts (features) themselves.

I think there should be some connection to the DeepDream idea, because if a network learned to recognize wheels, it could turn a poodle into many wheels melted together (after being run in reverse multiple times). DeepDream images can reveal additional information about the features a network learns, and maybe we could use this information to make the network understand the features better. For example, if a network learns to tell apart cars from clouds, then learning to tell apart cars from “clouds turned into cars” could teach the network that cars should normally be on the street, not flying in the sky; not gigantic; not vague in shape; and so on. (Well, I already wrote a similar example.)

Using DeepDream

Has anybody tried to use DeepDream images to train a neural network to understand images better? Here’s an example of how you could try to do this (it’s only an outline of the method):

  1. You make a dataset (A) with pictures of cars and dogs. A neural network learns to distinguish cars from dogs.

  2. You create a copy of the dataset (B). You modify the pictures by running the network in reverse, creating DeepDream images. Now cars look more like dogs and dogs look more like cars.

  3. The neural network learns to distinguish cars from dogs in both datasets. (After that the 1st iteration is over. You can stop or continue.)

  4. You create a third dataset (C). Modify the pictures by running the network (that solves A and B datasets) in reverse again.

  5. The neural network learns to distinguish cars from dogs in all three datasets. (After that the 2nd iteration is over. You can stop or continue.)
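One iteration of the outline above (steps 1–3) can be sketched on toy data. This is only a minimal stand-in: logistic regression plays the role of the “neural network”, and “running the network in reverse” is gradient ascent on the inputs toward the opposite class, so each class comes to look more like the other while keeping its original label. All names and numbers here are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dataset A: class 0 ("cars") clustered at -1, class 1 ("dogs") at +1.
X = np.concatenate([rng.normal(-1, 0.3, (50, 2)), rng.normal(+1, 0.3, (50, 2))])
y = np.array([0] * 50 + [1] * 50)

def train(X, y, steps=500, lr=0.1):
    # Plain logistic regression stands in for the "neural network".
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(steps):
        p = 1 / (1 + np.exp(-(X @ w + b)))
        w -= lr * X.T @ (p - y) / len(y)
        b -= lr * np.mean(p - y)
    return w, b

def dream(X, y, w, b, step=0.5, iters=10):
    # "Run the network in reverse": gradient-ascend each input toward the
    # OTHER class, so cars look more dog-like and vice versa.
    Xd, target = X.copy(), 1 - y
    for _ in range(iters):
        p = 1 / (1 + np.exp(-(Xd @ w + b)))
        Xd += step * np.outer(target - p, w)  # input-gradient of the target log-likelihood
    return Xd

# Step 1: learn dataset A.  Step 2: build the dreamed dataset B (original
# labels kept).  Step 3: retrain on A and B together.
w, b = train(X, y)
Xb = dream(X, y, w, b)
Xall, yall = np.concatenate([X, Xb]), np.concatenate([y, y])
w2, b2 = train(Xall, yall)
```

Note that the combined task is genuinely harder, which is the point: the dreamed “cars that look like dogs” force the model to rely on something other than the feature it first latched onto.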

You’re doing a form of bootstrapping, or something similar to a GAN.

A discussion on Reddit, No Participation link: “Has anybody tried to use DeepDream images to train a neural network?”.


Maybe if you forced a network to look at a DeepDream image and figure out what the original image was about (what elements still make sense and what are just random artifacts), it could learn some deeper concepts?

In the perfect scenario the network learns very abstract features in order to make sense of “weird” images. For example, it may learn that a real dog should have only 1 normal head, not 10 heads that are melted together (a thing that you can witness in some DeepDream images).

Some DeepDream images are simply fascinating to me. The DeepDream process effectively creates analogies (e.g. “clouds may look somewhat like dogs”), so why not try to learn something about the world through those analogies? I guess one could come up with various ways to estimate which “analogies” make more sense than others (and in which way) and use that for training. That’s one of the reasons I’m curious about the idea I described in this post.


Wikipedia mentions that at least in some way DeepDream images are indeed used in learning:

While dreaming is most often used for visualizing networks or producing computer art, it has recently been proposed that adding “dreamed” inputs to the training set can improve training times for abstractions in Computer Science.[18]

The paper: https://arxiv.org/abs/1511.05653.

Is it similar to what I described? (I can’t understand the paper.)


There are also CycleGANs (I didn’t find much about them on Wikipedia, so here’s a Numberphile video), and maybe they’re more similar to what I’m describing: a CycleGAN can learn to transform a picture of a horse into a picture of a zebra… and then to transform that generated picture of a zebra back into the original picture of the horse.

Iterated Distillation and Amplification

There’s also the Iterated Distillation and Amplification (IDA) method.

“AlphaGo Zero and capability amplification” (video by Robert Miles)

In the simplest form of iterated capability amplification, we train one function:

A “weak” policy A, which is trained to predict what the agent will eventually decide to do in a given situation.

Just like AlphaGo doesn’t use the prior p directly to pick moves, we don’t use the weak policy A directly to pick actions. Instead, we use a capability amplification scheme: we call A many times in order to produce more intelligent judgments. We train A to bypass this expensive amplification process and directly make intelligent decisions. As A improves, the amplified policy becomes more powerful, and A chases this moving target.

In IDA we try to squeeze the contents of a more complicated thing into a simpler thing.

It’s not the same idea as what I described, because it doesn’t have to work at the level of learning features. However, both ideas may turn out to be identical in practice, depending on how you implement IDA. So I’m curious: has IDA been tried in image recognition?
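The amplify-then-distill loop from the quote can be sketched with a toy stand-in. Here the weak policy A is just a lookup table and the “questions” are Fibonacci numbers; amplification calls A several times on subquestions, and distillation trains A to output the amplified answers directly. This only illustrates the control flow, not an actual IDA implementation.

```python
# Toy illustration of the IDA control flow. The weak policy A is a lookup
# table; amplification calls A on subquestions; distillation overwrites A
# with the amplified answers.
A = {}  # the weak policy: initially knows nothing

def amplify(n):
    # The expensive, more capable process built out of calls to A:
    # decompose the question and combine A's answers to the subquestions.
    if n < 2:
        return n
    return A.get(n - 1, 0) + A.get(n - 2, 0)

def distill(max_n=10):
    # "Train" A to directly output what the amplified process computes.
    return {n: amplify(n) for n in range(max_n + 1)}

# Each pass, the amplified policy reaches one step further, and A chases
# this moving target until it answers every question directly.
for _ in range(10):
    A = distill()
```

After ten passes, A directly returns the correct Fibonacci numbers up to n = 10, even though each amplification step only ever looked one level beyond the current A.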


The idea was inspired by my experience, and by a philosophical idea: what if every property is like a spectrum, and different objects have different “colors” of that property?

People can often feel different flavors of meaning, emotions, and experiences… My idea above is an attempt to apply this to image recognition.