Exploring Deep Neural Networks with Google DeepDream
Generative art — visually interesting phenomena produced with code — can take many forms. But few are as fascinating and insightful as Google’s DeepDream, which uses a neural network to reverse engineer images. This summarises discoveries made in a few hours’ playing around with the tool.
The theory
Neural network image classifiers work by training layers of neurons to take responsibility for identifying increasingly complex, higher-level features in images.
For example, a classifier that has been trained on images of dogs might look for extremely simple shapes in the first couple of layers, simple-ish collections of lines or curves starting to resemble fur or eyes in the next layers, and more sophisticated features (like an entire face or body) in the later layers. It puts these together to ‘recognise’ complete images of dogs in the output layer.
Google’s DeepDream does something weird (and fun). Instead of taking thousands (or millions) of images and training a model to recognise new images based on those it has learned from, it takes a pre-trained model and reverse engineers an input image, tweaking its pixels to maximise the activation of certain layers (or neurons) in the neural network.
For example, if we maximise the activation of the first layer, we should expect to see an image that starts to show simple shapes, like lines and curves. If we maximise the second layer, we could expect slightly more complex shapes. If we maximise the activation of the middle or later layers, we could expect to see more complex features appearing in the photo: things like eyes, other facial features, and silhouettes resembling body shapes.
Perhaps what’s most impressive about this is the intuition it gives us about the inner workings of the neural network: by tinkering with different layers (or individual neurons in those layers), we can see directly what they are responsible for identifying, and we can see those features becoming increasingly sophisticated the deeper into the network we go!
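To make that concrete, here's a minimal sketch of the core trick, assuming TensorFlow 2.x, Pillow, and the ImageNet-pre-trained InceptionV3 model that ships with Keras. It's my rough approximation of the idea rather than Google's exact implementation, and the filenames and the layer name "mixed3" are arbitrary choices of mine:

```python
# A minimal DeepDream-style sketch: gradient ascent on the pixels of an image
# to maximise the mean activation of one layer of a pre-trained network.
import numpy as np
import tensorflow as tf
from PIL import Image
from tensorflow.keras.applications import inception_v3

# Load a model pre-trained on ImageNet; crucially, we never retrain it.
base_model = inception_v3.InceptionV3(weights="imagenet", include_top=False)


def dream(image_path, layer_name="mixed3", steps=50, step_size=0.01):
    # Build a sub-model that exposes the activations of the chosen layer.
    layer_output = base_model.get_layer(layer_name).output
    extractor = tf.keras.Model(inputs=base_model.input, outputs=layer_output)

    def ascent_step(img):
        with tf.GradientTape() as tape:
            tape.watch(img)
            loss = tf.reduce_mean(extractor(img))  # mean activation of the layer
        grads = tape.gradient(loss, img)
        grads /= tf.math.reduce_std(grads) + 1e-8  # normalise for stable steps
        return img + step_size * grads

    # Load the image and map it into InceptionV3's expected [-1, 1] range.
    raw = np.array(Image.open(image_path).convert("RGB"), dtype=np.float32)
    img = tf.convert_to_tensor(inception_v3.preprocess_input(raw)[None, ...])

    # Gradient *ascent*: nudge the pixels to increase the chosen activation.
    for _ in range(steps):
        img = ascent_step(img)

    # Undo the preprocessing ([-1, 1] back to [0, 255]) and return an image.
    out = ((img[0].numpy() + 1.0) * 127.5).clip(0, 255).astype("uint8")
    return Image.fromarray(out)


dream("dog.jpg").save("dog_dreamed.jpg")
```

The full DeepDream recipe adds refinements on top of this (most notably running at several scales, or "octaves"), but the gradient-ascent loop above is the essence of it.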
Reverse engineering images to better understand the layers of a neural network
What’s so fascinating about this is that we can very clearly see how an image changes as it’s tuned to maximise the activation of various layers of the neural network.
Below is a standard image, along with the same image reverse engineered to maximise the activation of each of the first three layers of a neural network trained to recognise dogs. Note the interesting patterns. However, we don’t yet see anything particularly recognisable beyond simple lines, curves, and some simple shapes (nothing distinctly dog-like, anyway!):
But things change around the fourth layer and beyond, as the features become sufficiently complex combinations of lower-level features to be recognisable, if only in part, as distinctly dog-like!
As we zoom in, we can identify distinctly dog-like features starting to emerge. This is from optimisation of the sixth layer:
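To reproduce this kind of layer-by-layer comparison with the sketch above, it's just a matter of looping over layer names. Here the InceptionV3 blocks "mixed0" through "mixed5" stand in for the "first" to "sixth" layers discussed above, which is a simplification on my part:

```python
# Hypothetical usage of the dream() sketch above: one output image per block,
# so the increasing complexity of the features can be compared side by side.
for i, layer_name in enumerate(["mixed0", "mixed1", "mixed2",
                                "mixed3", "mixed4", "mixed5"]):
    dream("dog.jpg", layer_name=layer_name).save(f"dog_layer_{i + 1}.jpg")
```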
This works on SVGs and plots… Kinda!
SVG graphics are no obstacle — they need only be exported to jpg before they too can be used as inputs. The textures and simple shapes produced by the model’s third layer really suit the lil rex — the skin’s looking life-like!
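If you'd like to do the same, one way to rasterise an SVG first is with the cairosvg library. The filenames, and "mixed2" as a stand-in for the third layer, are my assumptions rather than the exact workflow behind the image above:

```python
# Rasterise the SVG, flatten any transparency onto white, and save as JPEG,
# since the model expects a plain 3-channel RGB input.
import cairosvg
from PIL import Image

cairosvg.svg2png(url="rex.svg", write_to="rex.png", output_width=1024)

png = Image.open("rex.png").convert("RGBA")
background = Image.new("RGB", png.size, (255, 255, 255))
background.paste(png, mask=png.split()[3])  # alpha channel as the paste mask
background.save("rex.jpg", quality=95)

dream("rex.jpg", layer_name="mixed2").save("rex_dreamed.jpg")
```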
The effects weren’t as impressive on some ggplots. Some looked like they’d been printed several decades ago…
And others as though they’d been put through a wash: