How to Use Python for Image Data Augmentation

2025-07-06 Update From: SLTechnology News&Howtos

Shulou (Shulou.com) 06/03 Report

This article explains how to use Python for image data augmentation. The material is short and easy to follow, so let's work through it step by step.

1.1 Introduction

Deep neural networks generally need large amounts of training data to achieve good results. When data is limited, data augmentation (Data Augmentation) can increase the diversity of the training samples, improve the robustness of the model, and reduce overfitting.

In computer vision, typical data augmentation methods include flipping (Flip), rotation (Rotate), scaling (Scale), random cropping or zero padding (Random Crop or Pad), color jittering (Color Jittering), and adding noise (Noise).
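As a rough illustration of two of the methods listed above, the sketch below applies a horizontal flip and additive Gaussian noise to an image array using NumPy only. The function names `horizontal_flip` and `add_gaussian_noise`, the (height, width, channels) layout, and the noise level are assumptions made for this example and are not taken from the author's project.

```python
import numpy as np

def horizontal_flip(image):
    """Mirror an (H, W, C) image left to right."""
    return image[:, ::-1, :].copy()

def add_gaussian_noise(image, sigma=10.0, rng=None):
    """Add zero-mean Gaussian noise and clip back to the uint8 range."""
    rng = np.random.default_rng() if rng is None else rng
    noisy = image.astype(np.float64) + rng.normal(0.0, sigma, image.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)

# Small random test image (assumed uint8, values 0..255)
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(4, 6, 3), dtype=np.uint8)

flipped = horizontal_flip(image)
noisy = add_gaussian_noise(image, sigma=10.0, rng=rng)
```

Flipping twice returns the original image, which is a convenient sanity check when writing augmentation code.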

The author is working on a project on human pose estimation and keypoint tracking in videos and images (Human Pose Estimation and Tracking in Videos). The augmentations in this article are therefore limited to flipping (Flip), rotation (Rotate), and scaling (Scale).

2.1 Crop

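image: a frame from a video sequence; frame sizes differ before cropping. The source lists image.shape as ([3, width, height]), although the code below indexes the array as (height, width, channels).

bbox.shape: ([4,]), the human detection box used for cropping, in the order (x_min, y_min, x_max, y_max)

x.shape: ([1, 13]), the x coordinates of the 13 human body keypoints

y.shape: ([1, 13]), the y coordinates of the 13 human body keypoints

length: the side length of the square output image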

    import numpy as np

    def crop(image, bbox, x, y, length):
        # np.int was removed from recent NumPy releases; the built-in int works
        x, y, bbox = x.astype(int), y.astype(int), bbox.astype(int)

        x_min, y_min, x_max, y_max = bbox
        w, h = x_max - x_min, y_max - y_min

        # Crop image to bbox
        image = image[y_min:y_min + h, x_min:x_min + w, :]

        # Crop joints and bbox
        x -= x_min
        y -= y_min
        bbox = np.array([0, 0, x_max - x_min, y_max - y_min])

        # Scale to desired size (scale is the function from section 2.2,
        # called through the author's Transformer class)
        side_length = max(w, h)
        f_xy = float(length) / float(side_length)
        image, bbox, x, y = Transformer.scale(image, bbox, x, y, f_xy)

        # Pad: center the scaled image on a zero-filled square canvas
        new_w, new_h = image.shape[1], image.shape[0]
        cropped = np.zeros((length, length, image.shape[2]))

        dx = length - new_w
        dy = length - new_h
        x_min, y_min = int(dx / 2.), int(dy / 2.)
        x_max, y_max = x_min + new_w, y_min + new_h

        cropped[y_min:y_max, x_min:x_max, :] = image
        x += x_min
        y += y_min

        x = np.clip(x, x_min, x_max)
        y = np.clip(y, y_min, y_max)

        bbox += np.array([x_min, y_min, x_min, y_min])
        return cropped, bbox, x.astype(int), y.astype(int)
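The padding step at the end of crop can be looked at in isolation. The following is a minimal sketch, assuming an (H, W, C) image that already fits inside the target square; `pad_to_square` is a hypothetical helper name introduced here, not a function from the author's code.

```python
import numpy as np

def pad_to_square(image, x, y, length):
    """Center an (H, W, C) image on a zero-filled length x length canvas."""
    new_h, new_w = image.shape[:2]
    if new_h > length or new_w > length:
        raise ValueError("image must fit inside the target canvas")
    canvas = np.zeros((length, length, image.shape[2]), dtype=image.dtype)
    # Offsets that center the image; an odd leftover pixel goes to the bottom/right
    x_off = (length - new_w) // 2
    y_off = (length - new_h) // 2
    canvas[y_off:y_off + new_h, x_off:x_off + new_w, :] = image
    # Keypoints move by the same offsets as the pixels
    return canvas, x + x_off, y + y_off

image = np.full((4, 2, 3), 255, dtype=np.uint8)  # a tall 4x2 crop
x = np.array([0, 1])
y = np.array([0, 3])
padded, x_new, y_new = pad_to_square(image, x, y, length=4)
```

Here the 2-pixel-wide crop is shifted one pixel to the right, so the keypoint x coordinates shift by one as well, while y is unchanged.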

2.2 Scale

image.shape: ([3, 256, 256]), a frame from a video sequence; after cropping, the network input is 256×256

bbox.shape: ([4,]), the human detection box used for cropping

x.shape: ([1, 13]), the x coordinates of the 13 human body keypoints

y.shape: ([1, 13]), the y coordinates of the 13 human body keypoints

f_xy: the scale factor

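    import numpy as np
    # resize is assumed to be skimage.transform.resize, judging by its arguments
    from skimage.transform import resize

    def scale(image, bbox, x, y, f_xy):
        (h, w, _) = image.shape
        h, w = int(h * f_xy), int(w * f_xy)
        # Resize while keeping the original intensity range
        image = resize(image, (h, w), preserve_range=True,
                       anti_aliasing=True, mode='constant').astype(np.uint8)

        # Keypoints and box scale by the same factor as the image
        x = x * f_xy
        y = y * f_xy
        bbox = bbox * f_xy

        x = np.clip(x, 0, w)
        y = np.clip(y, 0, h)

        return image, bbox, x, y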
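To see how the keypoints follow the image when it is scaled, here is a small NumPy-only sketch. It swaps the interpolating resize for plain nearest-neighbor sampling so that it runs without extra dependencies; `scale_nearest` is a hypothetical name, and the 1-D keypoint arrays are an assumption of this example.

```python
import numpy as np

def scale_nearest(image, x, y, f_xy):
    """Resize an (H, W, C) image by f_xy and scale keypoints to match."""
    h, w = image.shape[:2]
    new_h, new_w = int(h * f_xy), int(w * f_xy)
    # Nearest-neighbor sampling: map each output pixel back to a source pixel
    rows = np.minimum((np.arange(new_h) / f_xy).astype(int), h - 1)
    cols = np.minimum((np.arange(new_w) / f_xy).astype(int), w - 1)
    resized = image[rows[:, None], cols[None, :], :]
    # Keypoints scale by the same factor, then are kept inside the new image
    x_new = np.clip(x * f_xy, 0, new_w)
    y_new = np.clip(y * f_xy, 0, new_h)
    return resized, x_new, y_new

image = np.arange(2 * 3 * 3, dtype=np.uint8).reshape(2, 3, 3)
x = np.array([0.0, 1.0, 2.0])
y = np.array([0.0, 1.0, 1.0])
resized, x_new, y_new = scale_nearest(image, x, y, f_xy=2.0)
```

Nearest-neighbor sampling is cruder than an anti-aliased resize, but the coordinate arithmetic for the keypoints is the same in both cases.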

2.3 Flip

Here the image is flipped left to right about its vertical axis. Because the human body is roughly symmetric, this helps prevent the model from overfitting in keypoint detection.

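    import numpy as np

    def flip(image, bbox, x, y):
        image = np.fliplr(image).copy()
        w = image.shape[1]
        x_min, y_min, x_max, y_max = bbox
        # The mirrored box swaps its left and right edges
        bbox = np.array([w - x_max, y_min, w - x_min, y_max])
        x = w - x
        # swap_joints is not shown here; presumably it exchanges left/right joints
        x, y = Transformer.swap_joints(x, y)
        return image, bbox, x, y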

Before flipping:

After flipping:
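The flip function relies on Transformer.swap_joints, whose body is not given in the article. One plausible shape for such a helper is sketched below. The joint order, the left/right index pairs, and the use of 1-D coordinate arrays are all hypothetical and chosen only for illustration; the author's actual keypoint layout may differ.

```python
import numpy as np

# Hypothetical joint order for illustration only; the article does not list it
JOINT_NAMES = ["head",
               "l_shoulder", "r_shoulder", "l_elbow", "r_elbow",
               "l_wrist", "r_wrist", "l_hip", "r_hip",
               "l_knee", "r_knee", "l_ankle", "r_ankle"]
# Index pairs (left, right) that trade places after a horizontal flip
LEFT_RIGHT_PAIRS = [(1, 2), (3, 4), (5, 6), (7, 8), (9, 10), (11, 12)]

def swap_joints(x, y):
    """Exchange left/right joint coordinates in 1-D arrays of length 13."""
    x, y = x.copy(), y.copy()
    for left, right in LEFT_RIGHT_PAIRS:
        x[[left, right]] = x[[right, left]]
        y[[left, right]] = y[[right, left]]
    return x, y

def flip_keypoints(x, y, width):
    """Mirror x coordinates about the vertical axis, then relabel the sides."""
    return swap_joints(width - x, y)

x = np.arange(13, dtype=float) * 10.0  # 0, 10, ..., 120
y = np.arange(13, dtype=float)
x_f, y_f = flip_keypoints(x, y, width=256)
```

The relabeling matters because, after mirroring, the pixel that used to show the left shoulder now shows what the model should call the right shoulder. Applying `flip_keypoints` twice returns the original coordinates.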

2.4 Rotate

angle: the rotation angle, in degrees

    import numpy as np
    # rotate is assumed to be skimage.transform.rotate, judging by preserve_range
    from skimage.transform import rotate as sk_rotate

    def rotate(image, bbox, x, y, angle):
        # image -- (256, 256, 3)
        # bbox -- (4,)
        # x, y -- keypoint coordinates
        # angle -- rotation angle in degrees, e.g. 8.165648811999333
        # Center of the image, about [128, 128] for a 256x256 input
        o_x, o_y = (np.array(image.shape[:2][::-1]) - 1) / 2.
        height = image.shape[0]

        image = sk_rotate(image, angle, preserve_range=True).astype(np.uint8)

        angle_rad = (np.pi * angle) / 180.0
        cos_a, sin_a = np.cos(angle_rad), np.sin(angle_rad)

        # Keypoints: flip y so it points up, rotate, then flip back
        x1 = x
        y1 = height - y
        c_y = height - o_y
        x = o_x + cos_a * (x1 - o_x) - sin_a * (y1 - c_y)
        y = height - (c_y + sin_a * (x1 - o_x) + cos_a * (y1 - c_y))

        # Box corners: same rotation written directly in image coordinates.
        # Read all four values first so no corner uses an already updated one.
        x0, y0, x2, y2 = bbox
        bbox = np.array([
            o_x + cos_a * (x0 - o_x) + sin_a * (y0 - o_y),
            o_y - sin_a * (x0 - o_x) + cos_a * (y0 - o_y),
            o_x + cos_a * (x2 - o_x) + sin_a * (y2 - o_y),
            o_y - sin_a * (x2 - o_x) + cos_a * (y2 - o_y),
        ])

        return image, bbox, x.astype(int), y.astype(int)

Before rotation:

After rotation:
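The coordinate update in rotate is a rotation about the image center, written in image coordinates where y points down. The sketch below isolates that arithmetic so it can be checked on a single point. `rotate_points` is a hypothetical helper, and the counter-clockwise convention is an assumption that matches an image rotation routine which rotates counter-clockwise for positive angles.

```python
import numpy as np

def rotate_points(x, y, angle_deg, center):
    """Rotate points counter-clockwise (as displayed) about center.

    Image coordinates have y pointing down, hence the sign pattern below.
    """
    theta = np.deg2rad(angle_deg)
    cx, cy = center
    dx, dy = x - cx, y - cy
    x_new = cx + np.cos(theta) * dx + np.sin(theta) * dy
    y_new = cy - np.sin(theta) * dx + np.cos(theta) * dy
    return x_new, y_new

center = (127.5, 127.5)   # (width - 1) / 2 for a 256x256 image
x = np.array([137.5])     # 10 px to the right of the center
y = np.array([127.5])
x_rot, y_rot = rotate_points(x, y, 90.0, center)
```

With a 90 degree rotation, the point that started to the right of the center ends up 10 pixels above it, that is, at a smaller y value.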

3 Results

Original image before data enhancement:

After data enhancement:

Thank you for reading. That covers how to use Python for image data augmentation; how well each transform works for a particular task still needs to be verified in practice.
