This article explains how to implement image style transfer with Python. The method introduced here is simple, fast and practical, so let's walk through it step by step.
1. What is image style transfer?
Image style transfer means using an algorithm to learn the style of a famous painting and then applying that style to another picture.
For example, consider the image above. On the left is our original picture (also called the content image): a photo taken on a small bridge in the ancient town of Luzhi, Suzhou.
In the middle is our style image: "The Scream", the masterpiece of the Norwegian expressionist painter Edvard Munch.
On the right is the stylized result generated by applying the style of "The Scream" to the original picture. Look closely at how the result keeps the content of the original, the running water, the houses and their reflections in the water, even the distant trees, while rendering everything in the style of "The Scream". It is as if Edvard Munch had painted our scenery with his own brushwork!
The question is: what kind of neural network should we define to perform style transfer on images, and is it even possible?
The answer is yes. The next section briefly explains how style transfer works with neural networks.
2. Basic principles
Gatys et al. published the first deep-learning-based style transfer algorithm in 2015 (original paper: https://arxiv.org/abs/1508.06576); the work was later presented at CVPR 2016.
Interestingly, their style transfer algorithm does not require a new network architecture at all: it uses a slightly modified VGG19, with network parameters pre-trained (typically on ImageNet). Let's look at how it works.
We know that convolutional neural networks (CNNs) are very good at extracting image features (representations), as shown in the image above.
For the content image, deep layers (d and e) extract high-level features while discarding details, whereas shallow layers (a, b and c) extract low-level features and retain most of the image's details.
For the style image, computing feature correlations (Gram matrices) across multiple layers yields a multi-scale reconstruction of the image's style that captures its texture information while ignoring the specific details of the image.
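As an illustration (my own sketch, not code from the original article), the Gram matrix of one layer's feature maps can be computed with NumPy roughly as follows, assuming the activations are stored in an array of shape (channels, height, width):

import numpy as np

def gram_matrix(features):
    # features: activations of one CNN layer, shape (C, H, W)
    # the (C, C) result measures how strongly feature channels co-occur,
    # which captures texture/style while discarding spatial layout
    c, h, w = features.shape
    flat = features.reshape(c, h * w)      # one row per channel
    return flat @ flat.T / (c * h * w)     # normalized channel correlations

feats = np.random.rand(64, 32, 32).astype(np.float32)   # toy activations
print(gram_matrix(feats).shape)                          # -> (64, 64)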
To combine the content image with the style image (see below), we simultaneously minimize the distance between the features of the stylized result image (initialized as a white-noise image) and the features of the content image, and the distance between its Gram matrices and those of the style image. The result of this optimization is the stylized image we want.
Therefore, the loss function for generating the target image can be defined as
L_total = α · L_content + β · L_style
where α and β are the weights of the content and style terms respectively. Minimizing this loss function gives us the desired result. The optimization process can be visualized as the white-noise image gradually turning into the stylized result.
It is worth noting that the parameters being optimized here are no longer the network's weights ω and biases b, but the initially white-noise input image itself.
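To make this concrete, here is a minimal sketch of that optimization loop (my own illustration, not the code used later in this post). It assumes PyTorch and a recent torchvision are installed, that the content and style images are already loaded as (1, 3, H, W) tensors normalized with ImageNet statistics, and that the layer indices and alpha/beta values are illustrative choices rather than the paper's exact settings:

import torch
import torch.nn.functional as F
from torchvision.models import vgg19, VGG19_Weights

def gram(feat):
    # feat: (1, C, H, W) feature map -> (C, C) channel-correlation matrix
    _, c, h, w = feat.shape
    f = feat.view(c, h * w)
    return f @ f.t() / (c * h * w)

vgg = vgg19(weights=VGG19_Weights.IMAGENET1K_V1).features.eval()
for p in vgg.parameters():
    p.requires_grad_(False)                 # the network weights stay frozen

content_layers = {21}                       # conv4_2 (illustrative choice)
style_layers = {0, 5, 10, 19, 28}           # conv1_1 ... conv5_1 (illustrative)

def extract(x):
    content_feats, style_grams = [], []
    for i, layer in enumerate(vgg):
        x = layer(x)
        if i in content_layers:
            content_feats.append(x)
        if i in style_layers:
            style_grams.append(gram(x))
    return content_feats, style_grams

def stylize(content_img, style_img, steps=300, alpha=1.0, beta=1e4):
    # content_img / style_img: (1, 3, H, W) tensors normalized for VGG
    with torch.no_grad():
        target_content, _ = extract(content_img)
        _, target_grams = extract(style_img)
    x = torch.randn_like(content_img).requires_grad_(True)   # white-noise start
    optimizer = torch.optim.Adam([x], lr=0.05)                # optimizes the image, not the network
    for _ in range(steps):
        optimizer.zero_grad()
        c_feats, s_grams = extract(x)
        loss = (alpha * sum(F.mse_loss(f, t) for f, t in zip(c_feats, target_content))
                + beta * sum(F.mse_loss(g, t) for g, t in zip(s_grams, target_grams)))
        loss.backward()
        optimizer.step()
    return x.detach()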
Although this method produces very beautiful style transfer results, it is very slow.
In 2016, Johnson et al. built on the work of Gatys et al. and proposed a style transfer algorithm roughly three orders of magnitude faster. The trade-off is that you cannot choose the style image as freely as with Gatys' method: a separate network must be trained for each style image. Once that network is trained, however, it can be applied to any content image you like.
In this post we will use the method of Johnson et al. Its implementation and pre-trained models are available at https://github.com/jcjohnson/fast-neural-style.
3. Fast implementation based on OpenCV
Let's use OpenCV to perform fast style transfer on an image. I have wrapped it in a function called style_transfer(); see the comments inside the function for usage. At present only 11 pre-trained models are available.
## load the required libraries
import cv2
import time

def style_transfer(pathIn='',
                   pathOut='',
                   model='',
                   width=None,
                   jpg_quality=80):
    '''
    pathIn: path of the original image
    pathOut: path of the stylized output image
    model: path of the pre-trained model
    width: width of the stylized image; the default None keeps the original size
    jpg_quality: 0-100, quality of the output JPEG; default 80, larger means better quality
    '''
    ## read the original image, resize it to the desired size, then get its width and height
    img = cv2.imread(pathIn)
    (h, w) = img.shape[:2]
    if width is not None:
        img = cv2.resize(img, (width, round(width * h / w)), interpolation=cv2.INTER_CUBIC)
        (h, w) = img.shape[:2]

    ## load the pre-trained model from disk
    print('load the pre-trained model...')
    net = cv2.dnn.readNetFromTorch(model)

    ## turn the image into a blob: set the size and subtract the per-channel mean
    ## (the statistical means of the ImageNet training samples), then run a
    ## forward pass and report how long it takes
    blob = cv2.dnn.blobFromImage(img, 1.0, (w, h), (103.939, 116.779, 123.680), swapRB=False, crop=False)
    net.setInput(blob)
    start = time.time()
    output = net.forward()
    end = time.time()
    print("style transfer cost: {:.2f} seconds".format(end - start))

    ## reshape the output, add the subtracted means back, and reorder the channels
    output = output.reshape((3, output.shape[2], output.shape[3]))
    output[0] += 103.939
    output[1] += 116.779
    output[2] += 123.680
    output = output.transpose(1, 2, 0)

    ## clip to the valid range and convert to 8-bit before writing the stylized image
    output = output.clip(0, 255).astype('uint8')
    cv2.imwrite(pathOut, output, [int(cv2.IMWRITE_JPEG_QUALITY), jpg_quality])
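If you prefer to run the function from the command line, a small wrapper like the one below can be used. This is my own addition, not part of the original post: it assumes the function above is saved in a file named style_transfer.py, and the script name and argument names are illustrative.

# run_style_transfer.py (hypothetical helper script)
import argparse
from style_transfer import style_transfer   # assumes the function above lives in style_transfer.py

if __name__ == '__main__':
    parser = argparse.ArgumentParser(description='Stylize an image with an OpenCV DNN model')
    parser.add_argument('--input', required=True, help='path to the original image')
    parser.add_argument('--output', required=True, help='path to write the stylized image')
    parser.add_argument('--model', required=True, help='path to a .t7 pre-trained model')
    parser.add_argument('--width', type=int, default=None, help='optional output width')
    args = parser.parse_args()
    style_transfer(pathIn=args.input, pathOut=args.output, model=args.model, width=args.width)

# example:
#   python run_style_transfer.py --input ./img/img01.jpg --output ./result/result_img01.jpg \
#       --model ./models/instance_norm/the_scream.t7 --width 500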
Let's test it:
>>> import glob
>>> models = glob.glob('./models/*/*.t7')
>>> models    # list all available pre-trained models
['.\\models\\eccv16\\composition_vii.t7',
 '.\\models\\eccv16\\la_muse.t7',
 '.\\models\\eccv16\\starry_night.t7',
 '.\\models\\eccv16\\the_wave.t7',
 '.\\models\\instance_norm\\candy.t7',
 '.\\models\\instance_norm\\feathers.t7',
 '.\\models\\instance_norm\\la_muse.t7',
 '.\\models\\instance_norm\\mosaic.t7',
 '.\\models\\instance_norm\\starry_night.t7',
 '.\\models\\instance_norm\\the_scream.t7',
 '.\\models\\instance_norm\\udnie.t7']

>>> pathIn = './img/img01.jpg'
>>> pathOut = './result/result_img01.jpg'
>>> model = './models/instance_norm/the_scream.t7'
>>> style_transfer(pathIn, pathOut, model, width=500)
load the pre-trained model...
style transfer cost: 1.18 seconds

>>> pathIn = './img/img02.jpg'
>>> pathOut = './result/result_img02.jpg'
>>> model = './models/instance_norm/starry_night.t7'
>>> style_transfer(pathIn, pathOut, model, width=500)
load the pre-trained model...
style transfer cost: 3.17 seconds

>>> pathIn = './img/img03.jpg'
>>> pathOut = './result/result_img03.jpg'
>>> model = './models/instance_norm/the_scream.t7'
>>> style_transfer(pathIn, pathOut, model, width=500)
load the pre-trained model...
style transfer cost: 0.90 seconds

>>> pathIn = './img/img04.jpg'
>>> pathOut = './result/result_img04.jpg'
>>> model = './models/eccv16/the_wave.t7'
>>> style_transfer(pathIn, pathOut, model, width=500)
load the pre-trained model...
style transfer cost: 2.68 seconds

>>> pathIn = './img/img05.jpg'
>>> model = './models/instance_norm/mosaic.t7'
>>> style_transfer(pathIn, pathOut, model, width=500)
load the pre-trained model...
style transfer cost: 1.23 seconds
As the results show, transferring the style of an image takes only a few seconds on a CPU. With a GPU, you can even stylize video or a camera stream in real time.
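As an illustration of how the same model can be applied to a video stream, here is a sketch of a webcam loop (my own addition, not from the original post). It assumes a camera at index 0; the two commented-out lines require an OpenCV build compiled with CUDA support and would move inference to the GPU:

import cv2

net = cv2.dnn.readNetFromTorch('./models/instance_norm/the_scream.t7')
# if OpenCV was built with CUDA support, these calls run inference on the GPU:
# net.setPreferableBackend(cv2.dnn.DNN_BACKEND_CUDA)
# net.setPreferableTarget(cv2.dnn.DNN_TARGET_CUDA)

cap = cv2.VideoCapture(0)                      # webcam
while True:
    ret, frame = cap.read()
    if not ret:
        break
    frame = cv2.resize(frame, (500, round(500 * frame.shape[0] / frame.shape[1])))
    (h, w) = frame.shape[:2]
    blob = cv2.dnn.blobFromImage(frame, 1.0, (w, h),
                                 (103.939, 116.779, 123.680), swapRB=False, crop=False)
    net.setInput(blob)
    out = net.forward()
    out = out.reshape((3, out.shape[2], out.shape[3]))
    out[0] += 103.939
    out[1] += 116.779
    out[2] += 123.680
    out = out.transpose(1, 2, 0)
    cv2.imshow('stylized', out.clip(0, 255).astype('uint8'))
    if cv2.waitKey(1) & 0xFF == ord('q'):      # press q to quit
        break

cap.release()
cv2.destroyAllWindows()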
4. Recent progress
Since Gatys et al. first achieved style transfer with deep learning in 2015, the technique has kept developing and has improved greatly in both speed and quality. Some of the recent progress can be found at the following links:
https://github.com/jcjohnson/fast-neural-style
https://github.com/DmitryUlyanov/texture_nets
https://github.com/luanfujun/deep-painterly-harmonization
https://junyanz.github.io/CycleGAN/
Some examples of their work:
1. Style transfer
2. Blending foreign objects into pictures
3. Changing the season of a picture
4. Altering the background of a picture
5. Swapping roles or objects in a picture
At this point, I believe you have a deeper understanding of how to implement image style transfer with Python. You might as well try it out in practice.