In this article, I will show you how to perform human pose estimation through machine learning in Python. Most people are not very familiar with the topic, so I am sharing this walkthrough for your reference; I hope you learn a lot from it. Let's get started!
What is pose estimation?
Pose estimation is a computer vision technique that tracks the motion of a person or object, usually by locating key points on the given object. Based on these key points, we can compare various movements and postures and draw conclusions. Pose estimation is actively used in augmented reality, animation, gaming, and robotics.
Several models can currently perform pose estimation. Here are some of them:
1. OpenPose
2. PoseNet
3. BlazePose
4. DeepPose
5. DensePose
6. DeepCut
Choosing one model over another may depend entirely on the application. Factors such as running time, model size, and ease of implementation can also be reasons to pick a particular model. It is therefore best to understand your requirements from the start and choose a model accordingly.
In this article, we will use BlazePose to detect human poses and extract key points. The model can be implemented easily through a very useful library known as MediaPipe.
MediaPipe: an open source, cross-platform framework for building multimodal machine learning pipelines. It can be used to implement cutting-edge models such as face detection, multi-hand tracking, hair segmentation, and object detection and tracking.
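If you want to follow along, the MediaPipe Python package can be installed from PyPI with pip install mediapipe; the examples in this article also rely on OpenCV (the opencv-python package), NumPy, pandas, and scikit-learn, which can be installed the same way.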
BlazePose detector: most pose detection models rely on the COCO topology, which consists of 17 key points, whereas the BlazePose detector predicts 33 human key points, covering the torso, arms, legs, and face. Including more key points is necessary for domain-specific pose estimation models, such as those for hands, faces, and feet, to succeed. Each key point is predicted with three degrees of freedom together with a visibility score. BlazePose is a sub-millisecond model that can be used in real-time applications, and its accuracy is better than that of most existing models. The model comes in two versions, BlazePose Lite and BlazePose Full, offering a trade-off between speed and accuracy.
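As a side note, in the Python mediapipe package this speed/accuracy trade-off is exposed through the model_complexity parameter of the Pose solution. A minimal sketch, where the values 0, 1, and 2 select increasingly heavy variants:

import mediapipe as mp

# model_complexity selects the variant: 0 (lite), 1 (full), 2 (heavy).
# Higher values are slower but more accurate.
pose_lite = mp.solutions.pose.Pose(static_image_mode=True, model_complexity=0)
pose_full = mp.solutions.pose.Pose(static_image_mode=True, model_complexity=1)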
BlazePose has a variety of applications, including fitness and yoga trackers. These can be implemented with an additional classifier on top of the detector, such as the one we will build in this article.
You can learn more about the BlazePose detector here: https://ai.googleblog.com/2020/08/on-device-real-time-body-pose-tracking.html
2D and 3D pose estimation
Pose estimation can be done in 2D or 3D. 2D pose estimation predicts the key points of the image in pixel coordinates, while 3D pose estimation predicts the three-dimensional spatial arrangement of the key points as its output.
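To make the distinction concrete, here is a small sketch. It assumes a hypothetical input file pose.jpg, and it uses the pose_world_landmarks output that recent MediaPipe versions expose alongside the normalized image-plane landmarks:

import cv2
import mediapipe as mp

img = cv2.imread("pose.jpg")  # hypothetical input image containing one person
with mp.solutions.pose.Pose(static_image_mode=True) as pose:
    results = pose.process(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))

# Assumes a person was detected; otherwise both outputs are None.
nose_2d = results.pose_landmarks.landmark[0]        # x, y normalized to image size (2D-style output)
nose_3d = results.pose_world_landmarks.landmark[0]  # x, y, z in meters, origin at the hips (3D output)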
Preparing a dataset for pose estimation
We learned in the previous section that the key points of the human pose can be used to compare different poses. In this section, we will prepare the dataset using the MediaPipe library itself. We will take images of two yoga poses, extract the key points from them, and store them in a CSV file.
You can download the dataset from Kaggle through this link
The dataset contains five yoga poses, but in this article I have used only two. You can use all of them if you wish; the program will remain the same.
import mediapipe as mp
import cv2
import time
import numpy as np
import pandas as pd
import os

mpPose = mp.solutions.pose
pose = mpPose.Pose()
mpDraw = mp.solutions.drawing_utils  # For drawing keypoints
points = mpPose.PoseLandmark  # Landmarks
path = "DATASET/TRAIN/plank"  # enter dataset path

data = []
for p in points:
    x = str(p)[13:]  # strip the "PoseLandmark." prefix, keeping only the landmark name
    data.append(x + "_x")
    data.append(x + "_y")
    data.append(x + "_z")
    data.append(x + "_vis")
data = pd.DataFrame(columns=data)  # Empty dataset
In the code snippet above, we first imported the libraries needed to create the dataset. Then, in the next four lines, we import the modules required to extract the key points, along with their drawing utilities.
Next, we create an empty Pandas DataFrame and enter the columns. The columns cover the 33 key points detected by the BlazePose detector. Each key point has four attributes: the x and y coordinates of the key point (normalized between 0 and 1), the z coordinate, which represents the landmark depth with the hips as origin and the same scale as x, and finally the visibility score. The visibility score indicates the probability that the landmark is visible in the image.
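As a quick illustration of that normalization, the hypothetical snippet below converts one landmark back to pixel coordinates; it assumes an image img and a results object produced by pose.process(), as in the loop shown next:

h, w = img.shape[:2]
nose = results.pose_landmarks.landmark[mpPose.PoseLandmark.NOSE]
print(int(nose.x * w), int(nose.y * h))  # pixel position of the nose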
count = 0
for img in os.listdir(path):
    temp = []
    img = cv2.imread(path + "/" + img)
    imageHeight, imageWidth = img.shape[:2]  # shape is (height, width, channels)
    imgRGB = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    blackie = np.zeros(img.shape)  # Blank image
    results = pose.process(imgRGB)
    if results.pose_landmarks:
        # mpDraw.draw_landmarks(img, results.pose_landmarks, mpPose.POSE_CONNECTIONS)  # draw landmarks on image
        mpDraw.draw_landmarks(blackie, results.pose_landmarks, mpPose.POSE_CONNECTIONS)  # draw landmarks on blackie
        landmarks = results.pose_landmarks.landmark
        for i, j in zip(points, landmarks):
            temp = temp + [j.x, j.y, j.z, j.visibility]
        data.loc[count] = temp
        count += 1
    cv2.imshow("Image", img)
    cv2.imshow("blackie", blackie)
    cv2.waitKey(100)
data.to_csv("dataset3.csv", index=False)  # save without the index so the CSV has exactly 132 feature columns
In the above code, we iterate over the pose images one at a time, use the BlazePose model to extract their key points, and store them in a temporary array temp.
After each iteration, we append this temporary array as a new record to our dataset. You can also view these landmarks using the drawing utility included in MediaPipe itself.
In the above code, I drew the landmarks both on the image and on a blank image, blackie, so that we can focus solely on the output of the BlazePose model. The blank image blackie has the same shape as the given image.
One thing to note is that the BlazePose model works on RGB images, not the BGR images that OpenCV reads.
After obtaining the key points of all the images, we must add a target value that acts as the label for the machine learning model. You can set the target value of the first pose to 0 and the other to 1. After that, we can save this data to a CSV file, which we will use to create a machine learning model in the next steps; a minimal sketch of this labeling step is shown below.
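Here is a minimal sketch of that labeling step, assuming the key points of each pose were exported to two separate CSV files (the file names plank.csv and goddess.csv are hypothetical):

import pandas as pd

plank = pd.read_csv("plank.csv")      # key points extracted from the first pose
goddess = pd.read_csv("goddess.csv")  # key points extracted from the second pose
plank["target"] = 0
goddess["target"] = 1
dataset = pd.concat([plank, goddess], ignore_index=True)
dataset.to_csv("dataset3.csv", index=False)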
You can observe the appearance of the dataset from the image above.
Create a pose estimation model
Now that we have created our dataset, we just need to choose a machine learning algorithm to classify the poses. In this step, we will take an image, run the BlazePose model (the same one we used while creating the dataset) to obtain the key points of the person in the image, and then run our model on that test case.
The model is expected to give correct results with high confidence. In this article, I will use the SVC (Support Vector Classifier) from the sklearn library to perform the classification task.
from sklearn.svm import SVC

data = pd.read_csv("dataset3.csv")
X, Y = data.iloc[:, :132], data['target']
model = SVC(kernel='poly')
model.fit(X, Y)

mpPose = mp.solutions.pose
pose = mpPose.Pose()
mpDraw = mp.solutions.drawing_utils
path = "enter image path"
img = cv2.imread(path)
imgRGB = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)  # keep the BGR original for display
results = pose.process(imgRGB)
if results.pose_landmarks:
    temp = []
    landmarks = results.pose_landmarks.landmark
    for j in landmarks:
        temp = temp + [j.x, j.y, j.z, j.visibility]
    y = model.predict([temp])
    if y == 0:
        asan = "plank"
    else:
        asan = "goddess"
    print(asan)
    cv2.putText(img, asan, (50, 50), cv2.FONT_HERSHEY_SIMPLEX, 1, (255, 255, 0), 3)
    cv2.imshow("image", img)
    cv2.waitKey(0)
In the code above, we first imported the SVC (Support Vector Classifier) from the sklearn library, then trained it on the dataset we built, with the target variable as the Y label.
Then we read the input image and extract its key points, just as we did while creating the dataset.
Finally, we feed the temporary variable to the model for prediction. You can now detect poses with a simple if-else condition; a small evaluation sketch follows below.
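Before trusting the classifier, it is worth checking it on held-out data. The original article does not include this step, but one possible sanity check with scikit-learn is:

from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn.svm import SVC

# X and Y are the features and labels loaded from dataset3.csv above.
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.2, random_state=42)
clf = SVC(kernel='poly')
clf.fit(X_train, y_train)
print("Accuracy:", accuracy_score(y_test, clf.predict(X_test)))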
Model results
From the image above, you can see that the model classified the pose correctly. You can also see the pose detected by the BlazePose model on the right.
In the first picture, if you look closely, some key points are not visible, yet the pose is still classified correctly. This is possible thanks to the visibility attribute that the BlazePose model assigns to each key point.
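If you want to inspect this yourself, a small sketch like the one below lists the landmarks the detector considers likely occluded; it assumes the landmarks list from the prediction code above and a hypothetical visibility threshold of 0.5:

occluded = [p.name for p, lm in zip(mpPose.PoseLandmark, landmarks)
            if lm.visibility < 0.5]
print("Low-visibility landmarks:", occluded)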
That is the full content of this article on how to estimate human pose through machine learning in Python. Thank you for reading! I hope it has given you a good understanding of the topic and proves helpful in your own work.