Opencv+SVM: how to achieve face recognition

How can face recognition be implemented with OpenCV and an SVM? This article works through the problem in detail, in the hope of helping readers who want to solve it find a simple and feasible method.
Preface
How to use OpenCV for face recognition? The pipeline has three steps.
First, faces are detected, and a 128-dimensional face embedding is extracted from each face using deep learning.
Second, a support vector machine (SVM) is trained on top of these embeddings to build the face recognition model.
Third, OpenCV is used to recognize faces in images and video streams.
Project structure
Coding
Create a new face_embeddings.py script and write the following code:
# import the necessary packages
import numpy as np
import pickle
import cv2
import os
Import the required packages. Then define several helper functions:
# file extensions treated as valid images (needed by list_images)
image_types = (".jpg", ".jpeg", ".png", ".bmp", ".tif", ".tiff")

def list_images(basePath, contains=None):
    # return the set of files that are valid
    return list_files(basePath, validExts=image_types, contains=contains)

def list_files(basePath, validExts=None, contains=None):
    # loop over the directory structure
    for (rootDir, dirNames, filenames) in os.walk(basePath):
        # loop over the filenames in the current directory
        for filename in filenames:
            # if the contains string is not none and the filename does not
            # contain the supplied string, then ignore the file
            if contains is not None and filename.find(contains) == -1:
                continue
            # determine the file extension of the current file
            ext = filename[filename.rfind("."):].lower()
            # check to see if the file is an image and should be processed
            if validExts is None or ext.endswith(validExts):
                # construct the path to the image and yield it
                imagePath = os.path.join(rootDir, filename)
                yield imagePath

def resize(image, width=None, height=None, inter=cv2.INTER_AREA):
    dim = None
    (h, w) = image.shape[:2]
    # if both the width and the height are None, return the image unchanged
    if width is None and height is None:
        return image
    # if the width is None, compute the ratio from the height and derive the width
    if width is None:
        r = height / float(h)
        dim = (int(w * r), height)
    # otherwise the height is None: compute the ratio from the width and derive the height
    else:
        r = width / float(w)
        dim = (width, int(h * r))
    resized = cv2.resize(image, dim, interpolation=inter)
    # return the resized image
    return resized
The list_images function yields the paths of the images under the dataset folder; a quick usage sketch follows.
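As a sanity check, the two helpers can be exercised on their own. A minimal sketch, assuming it is appended to the same script and that a local 'dataset' folder already contains at least one image:

    # sanity check: list the dataset images and resize the first one
    paths = list(list_images("dataset"))
    print("found {} images".format(len(paths)))
    if paths:
        img = cv2.imread(paths[0])
        small = resize(img, width=300)  # width fixed at 300; height scaled to keep the aspect ratio
        print("{} -> {}".format(img.shape, small.shape))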
The resize function resizes an image proportionally. Next, define some variables:
dataset_path = 'dataset'
embeddings_path = 'output/embeddings.pickle'
detector_path = 'face_dete_model'
embedding_model = 'nn4.small2.v1.t7'
confidence_low = 0.5
dataset_path: the path to the dataset (see the layout sketch after this list).
embeddings_path: the path to the output embeddings file.
detector_path: the path to the face detection model.
embedding_model: the face embedding (encoding) model.
confidence_low: the minimum confidence threshold.
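Because the extraction code below derives each person's name from the image's parent folder (imagePath.split(os.path.sep)[-2]), the dataset folder is expected to contain one sub-folder per person. An illustrative layout (the person and file names here are only examples, not from the project):

    dataset/
        person_a/
            0001.jpg
            0002.jpg
        person_b/
            0001.jpg
            0002.jpg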
Then there is the most important part of the code:
Print ("loading face detector...") protoPath = os.path.sep.join ([detector_path, "deploy.proto.txt"]) modelPath = os.path.sep.join ([detector_path, "res10_300x300_ssd_iter_140000_fp16.caffemodel"]) detector = cv2.dnn.readNetFromCaffe (protoPath ModelPath) # load serialized face coding model print ("loading face recognizer...") embedder = cv2.dnn.readNetFromTorch (embedding_model) # get the path of the input image in the dataset print ("quantifying faces...") imagePaths = list (list_images (dataset_path)) # initialize our extracted facial coding list and the corresponding name knownEmbeddings = [] knownNames = [] # Total number of faces initialized total = faces loop over the image pathsfor (I) ImagePath) in enumerate (imagePaths): # extract the person name from the image path print ("processing image {} / {}" .format (I + 1 os.path.sep (imagePaths) name = imagePath.split (os.path.sep) [- 2] # load the image Adjust it to a width of 600 pixels (while maintaining aspect ratio) Then grab the image size image = cv2.imread (imagePath) image = resize (image, width=600) (h, w) = image.shape [: 2] # construct a blob imageBlob = cv2.dnn.blobFromImage from the image (cv2.resize (image, (300,300), 1.0,( 300,300), (104.0, 177.0, 123.0), swapRB=False Crop=False) # use OpenCV's deep learning-based face detector to locate faces in input images detector.setInput (imageBlob) detections = detector.forward () # ensure at least one face was found if len (detections) > 0: # assuming there is only one face in each image So find the bounding box with the highest probability I = np.argmax (detections [0,0,:, 2]) confidence = detections [0,0, I, 2] # ensure that the maximum probability detection also means our minimum probability test (thus helping to filter out weak detection) if confidence > confidence_low: # calculate the (x) of the face bounding box Y) coordinate box = detections [0,0, I, 3:7] * np.array ([w, h, w, h]) (startX, startY, endX, endY) = box.astype ("int") # extract ROI and grab ROI dimension face = image [startY:endY, startX:endX] (fH FW) = face.shape [: 2] # make sure the width and height of the face are large enough, if fW
< 20 or fH < 20: continue # 为人脸 ROI 构造一个 blob,然后将 blob 通过我们的人脸嵌入模型来获得人脸的 128-d 量化 faceBlob = cv2.dnn.blobFromImage(face, 1.0 / 255, (96, 96), (0, 0, 0), swapRB=True, crop=False) embedder.setInput(faceBlob) vec = embedder.forward() # 将人名+对应的人脸嵌入添加到各自的列表中 knownNames.append(name) knownEmbeddings.append(vec.flatten()) total += 1# 保存编码文件print("serializing {} encodings...".format(total))data = {"embeddings": knownEmbeddings, "names": knownNames}f = open(embeddings_path, "wb")f.write(pickle.dumps(data))f.close() 加载人脸检测器和编码器: 检测器:使用基于Caffe的DL人脸检测器来定位图像中的人脸。 编码器:模型基于Torch,负责通过深度学习特征提取来提取人脸编码。 接下来,让我们抓取图像路径并执行初始化。 遍历 imagePaths。从路径中提取人名。 构造了一个 blob。 然后,通过将 imageBlob 通过检测器网络来检测图像中的人脸。 检测列表包含定位图像中人脸的概率和坐标。 假设我们至少有一个检测,将进入 if 语句的主体。 假设图像中只有一张脸,因此提取具有最高置信度的检测并检查以确保置信度满足用于过滤弱检测的最小概率阈值。 假设已经达到了这个阈值,提取面部 ROI 并抓取/检查尺寸以确保面部 ROI 足够大。 然后,我们将利用编码器 并提取人脸编码。 继续构建另一个 blob。 随后,将 faceBlob 通过编码器 。 这会生成一个 128 维向量 (vec) 来描述面部。 然后我们简单地将名称和嵌入 vec 分别添加到 knownNames 和 knownEmbeddings 中。 继续循环图像、检测人脸并为数据集中的每个图像提取人脸编码的过程。 循环结束后剩下的就是将数据转储到磁盘。 运行结果: loading face detector... loading face recognizer... quantifying faces... processing image 1/19 processing image 2/19 processing image 3/19 processing image 4/19 processing image 5/19 processing image 6/19 processing image 7/19 processing image 8/19 processing image 9/19 processing image 10/19 processing image 11/19 processing image 12/19 processing image 13/19 processing image 14/19 processing image 15/19 processing image 16/19 processing image 17/19 processing image 18/19 processing image 19/19 serializing 19 encodings... Process finished with exit code 0 训练人脸识别模型 已经为每张脸提取了 128 维编码--但是我们如何根据这些嵌入来识别一个人呢? 答案是我们需要在嵌入之上训练一个"标准"机器学习模型(例如 SVM、k-NN 分类器、随机森林等)。 今天我们使用SVM实现 打开 train_face.py 文件并插入以下代码: from sklearn.preprocessing import LabelEncoderfrom sklearn.svm import SVCimport pickleembeddings_path='output/embeddings.pickle'recognizer_path='output/recognizer.pickle'lable_path='output/le.pickle'# 加载编码模型print("[INFO] loading face embeddings...")data = pickle.loads(open(embeddings_path, "rb").read())# 给label编码print("[INFO] encoding labels...")le = LabelEncoder()labels = le.fit_transform(data["names"])# 训练用于接受人脸 128-d 嵌入的模型,然后产生实际的人脸识别recognizer = SVC(C=1.0, kernel="linear", probability=True)recognizer.fit(data["embeddings"], labels)# 保存模型f = open(recognizer_path, "wb")f.write(pickle.dumps(recognizer))f.close()# 保存lablef = open(lable_path, "wb")f.write(pickle.dumps(le))f.close() 导入包和模块。 我们将使用 scikit-learn 的支持向量机 (SVM) 实现,这是一种常见的机器学习模型。 定义变量。 embeddings_path:序列化编码。 recognizer_path:这将是我们识别人脸的输出模型。 它基于 SVM。 lable_path:标签编码器输出文件路径 加载编码。 然后初始化 scikit-learn LabelEncoder 并编码名称标签。 训练模型。本文使用的是线性支持向量机 (SVM),但如果您愿意,您可以尝试使用其他机器学习模型进行试验。 训练模型后,我们将模型和标签编码器保存到电脑上。 运行train_face.py 脚本。 识别图像中的人脸 新建脚本文件recognize_face.py,插入一下代码: import numpy as npimport pickleimport cv2import os 导入包,然后我们需要新增一个resize方法。 def resize(image, width=None, height=None, inter=cv2.INTER_AREA): dim = None (h, w) = image.shape[:2] # 如果高和宽为None则直接返回 if width is None and height is None: return image # 检查宽是否是None if width is None: # 计算高度的比例并并按照比例计算宽度 r = height / float(h) dim = (int(w * r), height) # 高为None else: # 计算宽度比例,并计算高度 r = width / float(w) dim = (width, int(h * r)) resized = cv2.resize(image, dim, interpolation=inter) # return the resized image return resized 等比例resize图像,定义变量: image_path = '11.jpg'detector_path = 'face_dete_model'embedding_path = 'nn4.small2.v1.t7'recognizer_path = 'output/recognizer.pickle'label_path = 'output/le.pickle'confidence_low = 0.5 这六个变量的含义如下: image_path :输入图像的路径。 detector_path:OpenCV 深度学习人脸检测器的路径。 
Recognizing faces in images
Create a new script file recognize_face.py and insert the following code:

import numpy as np
import pickle
import cv2
import os

Import the packages; we also need the resize function again.

def resize(image, width=None, height=None, inter=cv2.INTER_AREA):
    dim = None
    (h, w) = image.shape[:2]
    # if both the width and the height are None, return the image unchanged
    if width is None and height is None:
        return image
    # if the width is None, compute the ratio from the height and derive the width
    if width is None:
        r = height / float(h)
        dim = (int(w * r), height)
    # otherwise the height is None: compute the ratio from the width and derive the height
    else:
        r = width / float(w)
        dim = (width, int(h * r))
    resized = cv2.resize(image, dim, interpolation=inter)
    # return the resized image
    return resized

It resizes an image proportionally. Then define the variables:

image_path = '11.jpg'
detector_path = 'face_dete_model'
embedding_path = 'nn4.small2.v1.t7'
recognizer_path = 'output/recognizer.pickle'
label_path = 'output/le.pickle'
confidence_low = 0.5

These six variables mean the following:
image_path: the path to the input image.
detector_path: the path to OpenCV's deep learning face detector; we use this model to detect where the face ROIs are in the image.
embedding_path: the path to OpenCV's deep learning face embedding model; we use this model to extract a 128-d face embedding from the face ROI, which we then feed into the recognizer.
recognizer_path: the path to the recognizer model.
label_path: the path to the label encoder.
confidence_low: an optional threshold to filter out weak face detections.
Next comes the main body of the code:

# load the serialized face detector
print("[INFO] loading face detector...")
protoPath = os.path.sep.join([detector_path, "deploy.proto.txt"])
modelPath = os.path.sep.join([detector_path, "res10_300x300_ssd_iter_140000_fp16.caffemodel"])
detector = cv2.dnn.readNetFromCaffe(protoPath, modelPath)
# load our serialized face embedding model
print("[INFO] loading face recognizer...")
embedder = cv2.dnn.readNetFromTorch(embedding_path)
# load the actual face recognition model along with the label encoder
recognizer = pickle.loads(open(recognizer_path, "rb").read())
le = pickle.loads(open(label_path, "rb").read())
# load the image, resize it to a width of 600 pixels (while maintaining
# the aspect ratio), then grab the image dimensions
image = cv2.imread(image_path)
image = resize(image, width=600)
(h, w) = image.shape[:2]
# construct a blob from the image
imageBlob = cv2.dnn.blobFromImage(
    cv2.resize(image, (300, 300)), 1.0, (300, 300),
    (104.0, 177.0, 123.0), swapRB=False, crop=False)
# apply OpenCV's deep learning-based face detector to localize faces in the input image
detector.setInput(imageBlob)
detections = detector.forward()
# loop over the detections
for i in range(0, detections.shape[2]):
    # extract the confidence (i.e., probability) associated with the prediction
    confidence = detections[0, 0, i, 2]
    # filter out weak detections
    if confidence > confidence_low:
        # compute the (x, y)-coordinates of the face bounding box
        box = detections[0, 0, i, 3:7] * np.array([w, h, w, h])
        (startX, startY, endX, endY) = box.astype("int")
        # extract the face ROI
        face = image[startY:endY, startX:endX]
        (fH, fW) = face.shape[:2]
        # ensure the face width and height are sufficiently large
        if fW < 20 or fH < 20:
            continue
        # construct a blob for the face ROI, then pass the blob through our face
        # embedding model to obtain the 128-d quantification of the face
        faceBlob = cv2.dnn.blobFromImage(face, 1.0 / 255, (96, 96),
                                         (0, 0, 0), swapRB=True, crop=False)
        embedder.setInput(faceBlob)
        vec = embedder.forward()
        # perform classification to recognize the face
        preds = recognizer.predict_proba(vec)[0]
        j = np.argmax(preds)
        proba = preds[j]
        name = le.classes_[j]
        # draw the bounding box of the face along with the associated probability
        text = "{}: {:.2f}%".format(name, proba * 100)
        y = startY - 10 if startY - 10 > 10 else startY + 10
        cv2.rectangle(image, (startX, startY), (endX, endY), (0, 0, 255), 2)
        cv2.putText(image, text, (startX, y),
                    cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 0, 255), 2)
# display the result
cv2.imshow("Image", image)
cv2.waitKey(0)
We load three models in this block. At the risk of redundancy, I would like to explicitly remind you of the differences between models:
Detector: a pre-trained Caffe DL model used to detect the position of human faces in the image.
Embedder: a pre-trained Torch DL model used to calculate our 128-d face embeddings.
Recognizer: linear SVM face recognition model.
Models 1 and 2 are pre-trained, which means they are provided to you as-is by OpenCV.
Load the label encoder, which contains the names of the people our model can recognize.
Load the image into memory and build a blob.
Locate the face in the image through our detector.
You will recognize this block from step 1. Let me explain it again here:
Loop over the detections and extract the confidence of each one.
Compare the confidence with the minimum probability threshold (confidence_low) to ensure the computed probability is high enough to filter out weak detections.
Then extract the face ROI and make sure its spatial dimensions are sufficiently large.
The remaining lines of the loop perform the recognition on the face ROI:
First, construct a faceBlob and pass it through the encoder to generate a 128-d vector describing the face.
Then, pass vec through our SVM recognizer model; the result is our prediction of who is in the face ROI.
We take the index with the highest probability and query our label encoder to find the name.
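To make those two lines concrete, here is a toy illustration with made-up numbers (the class names and probabilities are invented for the example):

    import numpy as np

    classes = np.array(["person_a", "person_b", "person_c"])  # stand-in for le.classes_
    preds = np.array([0.10, 0.75, 0.15])  # stand-in for recognizer.predict_proba(vec)[0]

    j = np.argmax(preds)         # index of the highest probability, here 1
    print(classes[j], preds[j])  # prints: person_b 0.75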
In the loop, each face (including "unknown" people) is then annotated:
We construct a text string containing the name and the probability,
then draw a rectangle around the face and place the text above the box.
Finally, we visualize the results on the screen until we press a key.
As you can see, the accuracy of this classical machine learning approach is still relatively low, but it has the advantage of speed; one common mitigation is sketched below.
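The mitigation (an assumption on my part, not something the original code does) is to refuse to name a face whose top probability is too small, reporting it as "unknown" instead. Inside the detection loop, the name lookup would become:

    # example cutoff; 0.6 is arbitrary and should be tuned on your own data
    PROB_THRESHOLD = 0.6

    if proba < PROB_THRESHOLD:
        name = "unknown"
    else:
        name = le.classes_[j]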
Face recognition by camera
Here I use a video file instead of a camera; the steps are the same as for face recognition in images, so let's go straight to the code.
Create a new recognize_video.py script and insert the following code:
import numpy as np
import pickle
import time
import cv2
import os

def resize(image, width=None, height=None, inter=cv2.INTER_AREA):
    dim = None
    (h, w) = image.shape[:2]
    # if both the width and the height are None, return the image unchanged
    if width is None and height is None:
        return image
    # if the width is None, compute the ratio from the height and derive the width
    if width is None:
        r = height / float(h)
        dim = (int(w * r), height)
    # otherwise the height is None: compute the ratio from the width and derive the height
    else:
        r = width / float(w)
        dim = (width, int(h * r))
    resized = cv2.resize(image, dim, interpolation=inter)
    # return the resized image
    return resized

out_put = 'output.avi'
video_path = '1.mp4'
detector_path = 'face_dete_model'
embedding_path = 'nn4.small2.v1.t7'
recognizer_path = 'output/recognizer.pickle'
label_path = 'output/le.pickle'
confidence_low = 0.1

# load our serialized face detector from disk
print("[INFO] loading face detector...")
protoPath = os.path.sep.join([detector_path, "deploy.proto.txt"])
modelPath = os.path.sep.join([detector_path, "res10_300x300_ssd_iter_140000_fp16.caffemodel"])
detector = cv2.dnn.readNetFromCaffe(protoPath, modelPath)
# load our serialized face embedding model from disk
print("[INFO] loading face recognizer...")
embedder = cv2.dnn.readNetFromTorch(embedding_path)
# load the actual face recognition model along with the label encoder
recognizer = pickle.loads(open(recognizer_path, "rb").read())
le = pickle.loads(open(label_path, "rb").read())
# initialize the video stream, then allow the camera sensor to warm up
print("[INFO] starting video stream...")
# vs = cv2.VideoCapture(0)  # camera
vs = cv2.VideoCapture(video_path)  # video file
time.sleep(2.0)
# initialize the video writer
writer = None
# loop over frames from the video file stream
while True:
    # grab the frame from the video stream
    ret_val, frame = vs.read()
    if not ret_val:
        break
    # resize the frame to have a width of 600 pixels (while maintaining
    # the aspect ratio), then grab the image dimensions
    frame = resize(frame, width=600)
    (h, w) = frame.shape[:2]
    # construct a blob from the image
    imageBlob = cv2.dnn.blobFromImage(
        cv2.resize(frame, (300, 300)), 1.0, (300, 300),
        (104.0, 177.0, 123.0), swapRB=False, crop=False)
    # apply OpenCV's deep learning-based face detector to localize
    # faces in the input image
    detector.setInput(imageBlob)
    detections = detector.forward()
    # loop over the detections
    for i in range(0, detections.shape[2]):
        # extract the confidence (i.e., probability) associated with the prediction
        confidence = detections[0, 0, i, 2]
        # filter out weak detections
        if confidence > confidence_low:
            # compute the (x, y)-coordinates of the bounding box for the face
            box = detections[0, 0, i, 3:7] * np.array([w, h, w, h])
            (startX, startY, endX, endY) = box.astype("int")
            # extract the face ROI
            face = frame[startY:endY, startX:endX]
            (fH, fW) = face.shape[:2]
            # ensure the face width and height are sufficiently large
            if fW < 20 or fH < 20:
                continue
            # construct a blob for the face ROI, then pass the blob through our
            # face embedding model to obtain the 128-d quantification of the face
            faceBlob = cv2.dnn.blobFromImage(face, 1.0 / 255, (96, 96),
                                             (0, 0, 0), swapRB=True, crop=False)
            embedder.setInput(faceBlob)
            vec = embedder.forward()
            # perform classification to recognize the face
            preds = recognizer.predict_proba(vec)[0]
            j = np.argmax(preds)
            proba = preds[j]
            name = le.classes_[j]
            # draw the bounding box of the face along with the associated probability
            text = "{}: {:.2f}%".format(name, proba * 100)
            y = startY - 10 if startY - 10 > 10 else startY + 10
            cv2.rectangle(frame, (startX, startY), (endX, endY), (0, 0, 255), 2)
            cv2.putText(frame, text, (startX, y),
                        cv2.FONT_HERSHEY_SIMPLEX, 0.45, (0, 0, 255), 2)
    # initialize the video writer on the first frame if an output path is set
    if writer is None and out_put is not None:
        fourcc = cv2.VideoWriter_fourcc(*"MJPG")
        writer = cv2.VideoWriter(out_put, fourcc, 20,
                                 (frame.shape[1], frame.shape[0]), True)
    # if the writer is not None, write the recognized frame to disk
    if writer is not None:
        writer.write(frame)
    # show the output frame
    cv2.imshow("Frame", frame)
    key = cv2.waitKey(1) & 0xFF
    # if the `q` key was pressed, break from the loop
    if key == ord("q"):
        break

# do a bit of cleanup
cv2.destroyAllWindows()
vs.release()
if writer is not None:
    writer.release()
Running result:
This is the answer to the question of how to achieve face recognition with Opencv+SVM. I hope the above content has been of some help to you.