Art and AI

Posted: 06 Dec 2021

Categories: Data Science

“I wish I were van Gogh…”

Anything is possible with Artificial Intelligence!
Find out how with aiFineArt, a World Programming online tool.
It’s fun! It’s artistic! It’s scientific! It’s free!

By: Natasha Mashanovich, Lead Data Scientist at World Programming, UK

Art and Science

Some time ago a scientific paper with the title “A Neural Algorithm of Artistic Style” by Gatys et al. [1] caught my attention. The authors tried answering the research question: Can Artificial Intelligence produce a masterwork? They proved it can by developing the Neural Style Transfer (NST) algorithm – an AI system able to create a special “chemistry” between the content and the style of an image – to produce a masterpiece in a similar style to the great artists. The NST algorithm, based on a Deep Neural Network, is an example of great synergy between science and art.

In this blog you will find out more about the neural algorithm of artistic style, see some striking examples generated using the algorithm, and read a few interesting stories behind the famous paintings used as styling templates. Moreover, you will be able to generate your own fine art using aiFineArt, World Programming’s online tool.

Science Perspective: Neural Style Transfer

The central theme of the Gatys et al. paper [1] is the reconstruction of content and style from image data, for which the authors used Convolutional Neural Networks (ConvNets). ConvNets were inspired by the way neurons in the human visual system respond to stimuli in the visual field. They are Deep Neural Networks designed specifically for 2-dimensional images, and they process visual data through multiple filters, channels, and layers. The purpose of a filter is to extract a particular image feature. Filters applied to image data (i.e. pixels) produce feature maps, which together form a convolutional layer; each map is a differently filtered version of the input image. A convolutional network can apply hundreds of filters in parallel to a single image. Image channels represent colours and add a third dimension, depth, to the filters: typically, a filter applied to a colour image has 3 channels, one each for red, green, and blue. Convolutional networks also have layers, ordered hierarchically, which process visual information in a feed-forward manner. The ConvNet architecture consists of different types of layers.
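
To make the idea of a filter producing a feature map concrete, here is a minimal NumPy sketch. The function name `conv2d_single_filter`, the toy image, and the edge-detecting kernel are illustrative assumptions and are not taken from the paper or from any library; real ConvNets learn their filters and apply many of them at once.

```python
import numpy as np

def conv2d_single_filter(image, kernel):
    """Slide one filter over an H x W x C image (no padding, stride 1).

    The kernel has shape kH x kW x C, so it spans all colour channels;
    the result is a single 2-D feature map. As in deep learning
    frameworks, the kernel is not flipped (cross-correlation).
    """
    h, w, _ = image.shape
    kh, kw, _ = kernel.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            patch = image[i:i + kh, j:j + kw, :]
            out[i, j] = np.sum(patch * kernel)
    return out

# Toy 6 x 6 RGB image: left half dark, right half bright.
image = np.zeros((6, 6, 3))
image[:, 3:, :] = 1.0

# Hypothetical vertical-edge filter, repeated across the 3 colour channels.
edge_2d = np.array([[-1.0, 0.0, 1.0]] * 3)
kernel = np.repeat(edge_2d[:, :, np.newaxis], 3, axis=2)

feature_map = conv2d_single_filter(image, kernel)
print(feature_map.shape)  # (4, 4)
print(feature_map[0])     # [0. 9. 9. 0.]
```

The feature map is non-zero only where the filter straddles the dark-to-bright boundary, which is the sense in which a filter "extracts" a feature.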

Typically, the ConvNet configuration starts with a series of convolutional layers, each defined by an activation function, size, and number of outputs. Some of those layers are followed by a pooling layer, which extracts dominant features and suppresses noise. The architecture ends with fully connected layers, which bring the image into a form suitable for a Multi-layer Perceptron with a Softmax function for classification. Each layer can be visualised by reconstructing the image from the corresponding feature maps. Higher layers are of special interest here because they capture the context of the image and so represent it at a conceptual level.
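
The building blocks mentioned above (activation, pooling, a fully connected layer, and Softmax) can be sketched in a few lines of NumPy. This is a toy illustration only: the feature-map values are made up, and the fully connected weights `fc_weights` are random placeholders rather than trained parameters.

```python
import numpy as np

def relu(x):
    """Activation function: keep positive responses, zero out the rest."""
    return np.maximum(x, 0.0)

def max_pool_2x2(fmap):
    """Keep the strongest response in each non-overlapping 2 x 2 block."""
    h, w = fmap.shape
    trimmed = fmap[:h - h % 2, :w - w % 2]
    return trimmed.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    exp = np.exp(scores - np.max(scores))
    return exp / exp.sum()

# Made-up 4 x 4 feature map, standing in for the output of a convolution.
fmap = np.array([[ 1.0, -2.0,  3.0, 0.0],
                 [ 0.5,  4.0, -1.0, 2.0],
                 [-3.0,  1.0,  0.0, 0.0],
                 [ 2.0,  0.0, -5.0, 1.0]])

pooled = max_pool_2x2(relu(fmap))
print(pooled)  # [[4. 3.]
               #  [2. 1.]]

# Fully connected layer with random placeholder weights for 3 classes.
rng = np.random.default_rng(0)
fc_weights = rng.normal(size=(3, pooled.size))
probabilities = softmax(fc_weights @ pooled.ravel())
```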

There are several ConvNets available as pre-trained models with different architectures, including the VGG model [3] used in the NST algorithm. VGG is an object-recognition model built on ImageNet [4], a large-scale image database with over 14 million images organised into more than 20 thousand categories. The VGG model itself is trained on a subset of ImageNet with 1.3M images in a thousand categories [5]. Pre-trained VGG models are available in Python deep learning frameworks such as Keras and PyTorch. VGG has several configurations, of which VGG-16 (Figure 1) and VGG-19 are the most common; the numbers count the layers with trainable weights, that is, 13 convolutional plus 3 fully connected layers in VGG-16 and 16 convolutional plus 3 fully connected layers in VGG-19.
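
As a quick sanity check on what the numbers in VGG-16 and VGG-19 count, the sketch below tallies the layers from the published layer configurations. The list notation (an integer for a 3x3 convolution with that many filters, "M" for max pooling) and the helper `count_layers` are conventions chosen for this illustration.

```python
# Layer configurations of VGG-16 and VGG-19: an integer is a 3x3 convolution
# with that many output filters, "M" is a max-pooling layer.
VGG16 = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
         512, 512, 512, "M", 512, 512, 512, "M"]
VGG19 = [64, 64, "M", 128, 128, "M", 256, 256, 256, 256, "M",
         512, 512, 512, 512, "M", 512, 512, 512, 512, "M"]
FULLY_CONNECTED = 3  # both configurations end with three fully connected layers

def count_layers(config):
    """Count convolutional, pooling, and weight-carrying layers."""
    conv = sum(1 for item in config if item != "M")
    pool = sum(1 for item in config if item == "M")
    return {"conv": conv, "pool": pool, "weight_layers": conv + FULLY_CONNECTED}

print(count_layers(VGG16))  # {'conv': 13, 'pool': 5, 'weight_layers': 16}
print(count_layers(VGG19))  # {'conv': 16, 'pool': 5, 'weight_layers': 19}
```

Pooling layers carry no trainable weights, which is why they do not contribute to the 16 or 19.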

Figure 1. VGG16 Convolutional Network Architecture [6]

The NST algorithm exploits the ConvNet’s ability to visualise an image at each layer by reconstructing the image from the feature maps at that layer. The key finding of Gatys et al. is that the content and style representations in a ConvNet are separable, hence the content of one image can be combined with the style of another image to produce a child image that has traits of both (Figure 2).

Figure 2. Neural Style Transfer

The first part of the algorithm is the analysis of the content and style images. Content reconstruction uses the higher layers of the network, whereas style reconstruction relies on the correlations between filter responses at multiple layers. The second part of the algorithm is the synthesis of the child image, which is obtained by minimising a loss function that combines both terms, the content loss and the style loss. The relative weighting of the two terms indicates how much emphasis we give to content versus style. The best content-to-style ratio has to be found empirically: if the value is too low, only the style might be captured, and if it is too high, only the content.
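
The two loss terms and their weighting can be sketched as follows. This is a simplified NumPy illustration, not the authors' code: the feature maps are random stand-ins for VGG activations, the normalisation of the Gram matrix is one common choice among several, and the default weights in `total_loss` are illustrative values only.

```python
import numpy as np

def gram_matrix(features):
    """Correlations between filter responses within one layer.

    features has shape (channels, height, width); the result is a
    (channels, channels) matrix.
    """
    c, h, w = features.shape
    flat = features.reshape(c, h * w)
    return flat @ flat.T / (c * h * w)

def content_loss(gen_feat, content_feat):
    """Mean squared difference between feature maps at the content layer."""
    return np.mean((gen_feat - content_feat) ** 2)

def style_loss(gen_feats, style_feats):
    """Sum over the style layers of the squared Gram-matrix differences."""
    return sum(np.mean((gram_matrix(g) - gram_matrix(s)) ** 2)
               for g, s in zip(gen_feats, style_feats))

def total_loss(c_loss, s_loss, content_weight=1.0, style_weight=1e3):
    """Weighted combination; the ratio of the two weights sets the emphasis."""
    return content_weight * c_loss + style_weight * s_loss

# Random stand-ins for feature maps taken from two layers of a network.
rng = np.random.default_rng(0)
gen_layers = [rng.random((4, 8, 8)), rng.random((8, 4, 4))]
style_layers = [rng.random((4, 8, 8)), rng.random((8, 4, 4))]
content_layer = rng.random((8, 4, 4))

c = content_loss(gen_layers[1], content_layer)
s = style_loss(gen_layers, style_layers)
print(total_loss(c, s))
```

Because the Gram matrix discards where in the image a response occurred, it captures texture and colour statistics rather than layout, which is what makes it a reasonable summary of style.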

The NST algorithm uses only two types of layers from the VGG network architecture: convolutional and pooling layers. Specifically, conv4 is extracted for content representation, and convolutional layers 1 to 5 are used for style representation. The PyTorch implementation of the NST algorithm by Gatys et al. is publicly available [7].

Art Perspective: Famous Paintings and the Stories Behind Them

“The Starry Night”

It was simply impossible not to include one of the world’s most iconic paintings: The Starry Night [9], the magnum opus of Vincent van Gogh! He painted it in 1889, while hospitalised at an asylum in France, where he had been admitted after suffering a serious breakdown. The painting portrays the landscape with a cypress tree that was visible through the window of his hospital room. Observing the sky in the early morning, well before sunrise, he captured the stars, the Moon, and the planet Venus on a magical summer night. The most mysterious aspect is the twinkling night sky that seems to glow from the canvas. Artists, historians, and scientists have speculated whether those magical swirling brush strokes were a reflection of his turbulent state of mind, or a result of poisoning from the lead in his oil paints, which can cause swelling of the retinas and consequently the perception of circles of light around objects. Or maybe, as some argue, they are the result of his genius finding a way to represent the spiral of a galaxy or a comet. One recent theory, advocated by some astrophysicists, points to the mysterious and astonishing resemblance of the painting to illuminated stardust as seen through a NASA telescope. Physicists have also examined the correlation between van Gogh’s technique and fluid turbulence [10].

Portraits of Adele Bloch-Bauer

Adele Bloch-Bauer was the wife of a wealthy magnate who commissioned Gustav Klimt to paint her twice. Klimt completed her first and more famous portrait, also known as “The Lady in Gold” or “The Woman in Gold” [11], in 1907. It is an oil painting dominated by gold and silver leaf. He found the inspiration for this masterpiece in Byzantine mosaics and Egyptian art. Adele’s second portrait [12], oil on canvas, was completed in 1912. Both portraits were stolen by the Nazis during WWII. After the war the paintings were on display in a Viennese gallery until 2006, when they were returned to the legal owner after a long trial. Soon after the trial, both paintings were sold at Christie’s for record prices at that time. This remarkable story was recounted in three documentary films and a feature film, “Woman in Gold”.

“Girl With a Pearl Earring”

“Mona Lisa of the North”, as this seventeenth-century painting by Vermeer [13] is often called, has fascinated and intrigued admirers of fine art ever since. What is so captivating and mystifying about the painting? Vermeer’s mastery of dramatic lighting; the mystery of the girl’s gaze; her exotic dress with a blue and yellow turban; the enormous gleaming pearl (or possibly polished metal) earring; her sensual lips; the girl’s unknown identity; and the question of whether the painting is a portrait or a “tronie” of an imaginary person have all inspired many artists. A novel and a film of the same title, the latter starring Scarlett Johansson, were directly inspired by Vermeer’s painting, and it has been used on the covers of many art books and on other artefacts.

Frida Kahlo, the artist

Frida Kahlo was a Mexican artist well known for her 55 self-portraits painted in bold and vibrant colours. She said: “I paint self-portraits because I am so often alone, because I am the person I know best.” [18] Frida’s paintings reflect her tormented personal life, shaped by the polio she suffered as a child; by a bus accident in her teens that left her with lifelong pain and 30 operations; and by a turbulent relationship with her husband, whom she married twice. The Tate Modern considers Kahlo “one of the most significant artists of the twentieth century” [19]. Her personal life and oeuvre have inspired many artists in literature, music, and cinema.

A brain teaser

Make a guess! In each of the four corners of Figure 3 there is a self-portrait painted by Frida. The remaining images are photos of Frida, four of which were generated with the NST algorithm. Match each NST-generated photo with the corresponding corner painting.

Figure 3. Self-portraits by Frida Kahlo (top-left [14], top-right [15], bottom-left [16], bottom-right [17], centre [18])

Make it happen

PyTorch

There are numerous PyTorch implementations of the NST algorithm, for example by L. Gatys [7] and A. Jacq [8]. The VGG model, available in PyTorch as a pre-trained deep learning model, is used in the algorithm for feature extraction and visualisation. The NST algorithm [2] consists of the following steps:

 

  1. Set up the NST input parameters
    1. Pre-process content and style images by resizing them to the same dimensions and normalising the input values to be compatible with the VGG model.
    2. Create an instance of the VGG model with the pre-trained weights for the ImageNet dataset.
    3. Specify the number of iterations.
    4. Specify the relative weighting between content and style image representation.
  2. Create a model and calculate losses
    1. Reconstruct the VGG network to get access to the network’s intermediate layers (e.g. Conv2d, ReLU, MaxPool2d, AvgPool2d) and select the convolutional layers of interest.
    2. For content reconstruction, extract a single convolutional layer. A middle layer is recommended (e.g. conv_4 or conv_5). Calculate the content loss between the feature map of the generated image at that layer and the feature map of the original content image.
    3. For style reconstruction, extract multiple VGG layers of interest and compute the correlations between the feature maps within each layer. Calculate the total style loss as the sum of the losses at each convolutional layer (i.e. conv_1 to conv_5).
    4. Create a new model instance consisting of the extracted layers for style and content representation.
  3. Perform the neural style transfer
    1. Select a gradient descent optimisation algorithm from the PyTorch library.
    2. For each iteration specified in step 1.3:
      1. Run the new model (from step 2.4) on an input image that starts as a copy of the content image.
      2. Calculate the total loss as the sum of the content and style losses from steps 2.2 and 2.3.
      3. Weight the terms of the total loss with the relative weight from step 1.4.
      4. Compute the gradients of the weighted total loss using standard back-propagation; here the parameters the optimiser updates are the pixels of the input image, while the VGG weights stay fixed.
      5. Finally, update the input image with the computed gradients, repeating until it simultaneously matches the style of one image and the content of the other.
    3. Return the new (transformed) image.
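
The optimisation loop in step 3 can be illustrated with a deliberately tiny NumPy toy. Everything here is an assumption made for the sketch: a fixed random bank of 1x1 filters stands in for the frozen VGG layers, the gradients are derived by hand where PyTorch would use autograd, plain gradient descent replaces whichever PyTorch optimiser is chosen, and the weights, step size, and iteration count are arbitrary. It is meant to show the shape of the loop, not to produce a painting.

```python
import numpy as np

rng = np.random.default_rng(0)
C, K, H, W = 3, 4, 8, 8            # colour channels, filters, image height/width
N = H * W
filters = rng.normal(scale=0.5, size=(K, C))   # frozen stand-in for VGG weights

def features(img):
    """Apply the fixed 1x1 filter bank: (C, H, W) image -> (K, N) feature maps."""
    return filters @ img.reshape(C, N)

def gram(feat):
    """Correlations between the K feature maps."""
    return feat @ feat.T / N

def loss_and_grad(img, content_feat, style_gram, content_w, style_w):
    feat = features(img)
    diff_c = feat - content_feat
    diff_s = gram(feat) - style_gram
    loss = content_w * np.mean(diff_c ** 2) + style_w * np.mean(diff_s ** 2)
    # Hand-derived gradients with respect to the feature maps ...
    grad_feat = content_w * 2.0 * diff_c / diff_c.size
    grad_feat += style_w * 4.0 * (diff_s @ feat) / (diff_s.size * N)
    # ... propagated back to the pixels through the fixed filters.
    grad_img = (filters.T @ grad_feat).reshape(C, H, W)
    return loss, grad_img

content_img = rng.random((C, H, W))
# Reddish random "style" image so that its statistics differ from the content.
style_img = rng.random((C, H, W)) * np.array([1.0, 0.3, 0.1]).reshape(C, 1, 1)

content_feat = features(content_img)      # target for the content loss
style_gram = gram(features(style_img))    # target for the style loss

img = content_img.copy()                  # start from a copy of the content image
step_size, losses = 0.05, []
for _ in range(200):                      # number of iterations
    loss, grad = loss_and_grad(img, content_feat, style_gram, 1.0, 10.0)
    losses.append(loss)
    # Update the pixels (not the filters) and keep them in the valid range.
    img = np.clip(img - step_size * grad, 0.0, 1.0)

print(losses[0], losses[-1])
```

The point to notice is that the filters never change; the loop only moves the pixels of `img`, pulling its feature statistics towards the style target while the content term resists drifting away from the original image.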

WPS Analytics Hub

To execute the algorithm, the PyTorch code has been deployed as a web service using WPS Analytics Hub. Among other functions, the Hub provides facilities to deploy programs and models coded in several languages, including the SAS language, Python and R, as APIs for real-time and on-demand applications. WPS Analytics Hub integrates with GitHub for version control. Program packages can be deployed onto dedicated servers via a point-and-click interface, and once deployed, the auto-created API is available from the Hub directory.

When dreams come true

Let’s experiment!

A Venice House
photo by N Mashanovich
Year: 2018

If I were van Gogh or Monet or Klimt or Frida…

Starry Night
by Vincent van Gogh
Year: 1889
Collection: Museum of Modern Art (MoMA), New York
[9]

Portrait of Adele Bloch-Bauer I (Woman in Gold)
by Gustav Klimt
Year: 1907
Collection: Neue Galerie, New York
[11]

Portrait of Adele Bloch-Bauer II
by Gustav Klimt
Year: 1912
Collection: temporarily lent to Neue Galerie New York
[12]

Yellow Red Blue Composition
by Wassily Kandinsky
Year: 1925
Collection: National Museum of Modern Art, Paris
[20]

The Frame
by Frida Kahlo
Year: 1939
Collection: National Museum of Modern Art, Paris
[14]

Red Balloon
by Paul Klee
Year: 1922
Collection: The Guggenheim, New York
[21]

Girl with a Pearl Earring
by Johannes Vermeer
Year: 1665
Collection: Mauritshuis, The Hague
[13]

Water Lilies and Japanese Bridge
by Claude Monet
Year: 1899
Collection: Princeton University Art Museum, New Jersey
[22]

Composition with Red, Blue, and Yellow
by Piet Mondrian
Year: 1930
Collection: Kunsthaus Zürich
[23]

There is Always Hope (Banksy Girl and Heart Balloon)
by Banksy (photo by Dominic Robinson)
Year: 2004
Location: South Bank, London
[24]

WPS Analytics
World Programming
[25]

Try it yourself!

You can try it for free with aiFineArt, a World Programming online tool. The only limit is not your imagination but the computational cost of the NST algorithm, so we have capped the size of the input image to keep response times reasonably short.

Have fun!

References

[1] Gatys, L.A., Ecker, A.S. and Bethge, M., 2015. A Neural Algorithm of Artistic Style. arXiv preprint arXiv:1508.06576.
[2] Gatys, L.A., Ecker, A.S. and Bethge, M., 2016. Image Style Transfer Using Convolutional Neural Networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2414-2423.
[3] Simonyan, K. and Zisserman, A., 2014. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556.
[4] ImageNet database, https://www.image-net.org/
[5] Krizhevsky, A., Sutskever, I. and Hinton, G.E., 2012. ImageNet Classification with Deep Convolutional Neural Networks. NIPS, pp. 1106–1114, https://proceedings.neurips.cc/paper/2012/hash/c399862d3b9d6b76c8436e924a68c45b-Abstract.html
[6] Neurohive, VGG16 Architecture, https://neurohive.io/en/popular-networks/vgg16/
[7] PyTorch NST implementation by L. Gatys, the author of NST, https://github.com/leongatys/PytorchNeuralStyleTransfer
[8] Neural Style Transfer implementation using PyTorch, https://pytorch.org/tutorials/advanced/neural_style_tutorial.html
[9] “The Starry Night” by Vincent van Gogh, https://commons.wikimedia.org/w/index.php?curid=25498286
[10] TEDEd, https://ed.ted.com/lessons/the-unexpected-math-behind-van-gogh-s-starry-night-natalya-st-clair
[11] “Woman in Gold” by Gustav Klimt, https://commons.wikimedia.org/w/index.php?curid=153485
[12] “Portrait of Adele Bloch-Bauer II” by Gustav Klimt, https://commons.wikimedia.org/wiki/File:Gustav_Klimt_047.jpg
[13] “Girl with a Pearl Earring” by Johannes Vermeer, https://commons.wikimedia.org/w/index.php?curid=55017931
[14] Self-portrait “The Frame” by Frida Kahlo, https://www.fridakahlo.org/self-portrait-the-frame.jsp
[15] Self-portrait “Thorn Necklace and Hummingbird” by Frida Kahlo, https://en.wikipedia.org/wiki/File:Frida_Kahlo_(self_portrait).jpg
[16] Self-portrait “Dedicated to Dr Eloesser” by Frida Kahlo, https://www.fridakahlo.org/self-portrait-dedicated-to-dr-eloesser.jsp
[17] Self-portrait “Me and My Parrots” by Frida Kahlo, https://www.fridakahlo.org/me-and-my-parrots.jsp
[18] Frida Kahlo, https://www.fridakahlo.org/
[19] The Tate Modern exhibition of Frida Kahlo, https://www.tate.org.uk/whats-on/tate-modern/exhibition/frida-kahlo
[20] “Composition Yellow-Red-Blue” by Wassily Kandinsky, https://commons.wikimedia.org/w/index.php?curid=38658125
[21] “Red Balloon” by Paul Klee, https://commons.wikimedia.org/w/index.php?curid=64890629
[22] “Water Lilies and Japanese Bridge” by Claude Monet, https://commons.wikimedia.org/wiki/File:Claude_Monet_-_Water_Lilies_and_Japanese_Bridge_-_Google_Art_Project.jpg
[23] “Composition with Red, Blue, and Yellow” by Piet Mondrian, https://commons.wikimedia.org/w/index.php?curid=37642803
[24] “Banksy Girl and Heart Balloon” by Dominic Robinson (version in South Bank), https://commons.wikimedia.org/w/index.php?curid=73570221
[25] World Programming, https://www.worldprogramming.com
