This article walks through using OpenCV to control the mouse with your eyes. The steps are simple and self-contained, so let's work through them one at a time.
Before we start the project, we need to import the third-party libraries.
# For monitoring the web camera and performing image manipulations
import cv2
# For performing array operations
import numpy as np
# For creating and removing directories
import os
import shutil
# For recognizing and performing actions on mouse presses
from pynput.mouse import Listener
First, let's take a look at how pynput's Listener works. pynput.mouse.Listener creates a background thread that records mouse movements and mouse clicks. Here is a simplified version that prints the coordinates of the mouse whenever you press a button:
from pynput.mouse import Listener

def on_click(x, y, button, pressed):
    """
    Args:
        x: the x-coordinate of the mouse
        y: the y-coordinate of the mouse
        button: which button was involved (a pynput.mouse.Button value)
        pressed: True if the button was pressed, False if it was released
    """
    if pressed:
        print(x, y)

with Listener(on_click=on_click) as listener:
    listener.join()
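Since button arrives as an enum rather than a number, you can branch on it directly. A small sketch (not part of the project code) that tells left clicks from right clicks:

from pynput.mouse import Button, Listener

def on_click(x, y, button, pressed):
    # button is an enum; compare against Button.left / Button.right
    if pressed and button == Button.left:
        print("left click at", x, y)
    elif pressed and button == Button.right:
        print("right click at", x, y)

with Listener(on_click=on_click) as listener:
    listener.join()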
Now, to work toward our goal, let's extend this framework. First, though, we need code that crops the bounding box of the eyes; we will call this function later inside on_click. We use Haar cascade object detection to find the bounding boxes of the user's eyes. You can download the detector file (haarcascade_eye.xml) here. Let's do a simple demonstration of how it works:
import cv2
import numpy as np

# Load the cascade classifier detection object
cascade = cv2.CascadeClassifier("haarcascade_eye.xml")
# Turn on the web camera
video_capture = cv2.VideoCapture(0)
# Read data from the web camera (get the frame)
_, frame = video_capture.read()
# Convert the image to grayscale
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
# Predict the bounding boxes of the eyes
boxes = cascade.detectMultiScale(gray, 1.3, 10)
# Filter out frames taken from a bad angle
# We want to make sure both eyes were detected, and nothing else
if len(boxes) == 2:
    eyes = []
    for box in boxes:
        # Get the rectangle parameters for the detected eye
        x, y, w, h = box
        # Crop the bounding box from the frame
        eye = frame[y:y + h, x:x + w]
        # Resize the crop to 32x32
        eye = cv2.resize(eye, (32, 32))
        # Normalize
        eye = (eye - eye.min()) / (eye.max() - eye.min())
        # Further crop to just around the eyeball
        eye = eye[10:-10, 5:-5]
        # Scale between [0, 255] and convert to int datatype
        eye = (eye * 255).astype(np.uint8)
        # Add the current eye to the list of 2 eyes
        eyes.append(eye)
    # Concatenate the two eye images into one
    eyes = np.hstack(eyes)
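If you would rather not download the XML by hand, most pip builds of opencv-python bundle the cascade files and expose their location through cv2.data.haarcascades. A small sketch, assuming your build includes that module:

import cv2

# cv2.data.haarcascades is the directory holding the bundled cascade XMLs
cascade = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_eye.xml")
# empty() is True when the classifier failed to load
assert not cascade.empty(), "Failed to load haarcascade_eye.xml"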
Now, let's use this knowledge to write a function that crops the eye images. First, we need a normalization helper function:
def normalize(x):
    minn, maxx = x.min(), x.max()
    return (x - minn) / (maxx - minn)
This is our eye-cropping function. If the eyes are found, it returns the image; otherwise, it returns None:
def scan(image_size=(32, 32)):
    _, frame = video_capture.read()
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    boxes = cascade.detectMultiScale(gray, 1.3, 10)
    if len(boxes) == 2:
        eyes = []
        for box in boxes:
            x, y, w, h = box
            eye = frame[y:y + h, x:x + w]
            eye = cv2.resize(eye, image_size)
            eye = normalize(eye)
            eye = eye[10:-10, 5:-5]
            eyes.append(eye)
        return (np.hstack(eyes) * 255).astype(np.uint8)
    else:
        return None
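It is worth sanity-checking scan in isolation before wiring it to the mouse listener. A minimal sketch, assuming cascade and video_capture were created as above (test_eyes.jpeg is just a throwaway output path):

eyes = scan()
if eyes is None:
    print("Did not detect exactly two eyes; adjust lighting or camera angle")
else:
    # Each 32x32 eye becomes 12x22 after the [10:-10, 5:-5] crop,
    # so the two stacked eyes form a 12x44x3 image
    print(eyes.shape)  # expected: (12, 44, 3)
    cv2.imwrite("test_eyes.jpeg", eyes)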
Now, let's write the automation that will run each time a mouse button is pressed (assume the variable root has already been defined as the directory, ending in a path separator, where we want to store the images):
def on_click(x, y, button, pressed):
    # If the action was a mouse PRESS (not a RELEASE)
    if pressed:
        # Crop the eyes
        eyes = scan()
        # If the function returned None, something went wrong
        if eyes is not None:
            # Save the image; the filename records the click position and button
            filename = root + "{} {} {}.jpeg".format(x, y, button)
            cv2.imwrite(filename, eyes)
Now we can bring back the pynput Listener and complete the implementation:
import cv2
import numpy as np
import os
import shutil
from pynput.mouse import Listener

root = input("Enter the directory to store the images: ")
# Make sure root ends with a separator so the images land inside it
if not root.endswith("/"):
    root += "/"
if os.path.isdir(root):
    resp = ""
    while not resp in ["Y", "N"]:
        resp = input("This directory already exists. If you continue, the contents of the existing directory will be deleted. If you would still like to proceed, enter [Y]. Otherwise, enter [N]: ")
    if resp == "Y":
        shutil.rmtree(root)
    else:
        exit()
os.mkdir(root)

# Normalization helper function
def normalize(x):
    minn, maxx = x.min(), x.max()
    return (x - minn) / (maxx - minn)

# Eye-cropping function
def scan(image_size=(32, 32)):
    _, frame = video_capture.read()
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    boxes = cascade.detectMultiScale(gray, 1.3, 10)
    if len(boxes) == 2:
        eyes = []
        for box in boxes:
            x, y, w, h = box
            eye = frame[y:y + h, x:x + w]
            eye = cv2.resize(eye, image_size)
            eye = normalize(eye)
            eye = eye[10:-10, 5:-5]
            eyes.append(eye)
        return (np.hstack(eyes) * 255).astype(np.uint8)
    else:
        return None

def on_click(x, y, button, pressed):
    # If the action was a mouse PRESS (not a RELEASE)
    if pressed:
        # Crop the eyes
        eyes = scan()
        # If the function returned None, something went wrong
        if eyes is not None:
            # Save the image
            filename = root + "{} {} {}.jpeg".format(x, y, button)
            cv2.imwrite(filename, eyes)

cascade = cv2.CascadeClassifier("haarcascade_eye.xml")
video_capture = cv2.VideoCapture(0)

with Listener(on_click=on_click) as listener:
    listener.join()
When you run this, every time you click the mouse (and both eyes are in view), it automatically crops the webcam frame and saves the image to the chosen directory. The file name encodes the mouse coordinates and whether the click was a right- or left-click.
This is a sample image. For this one, I left-clicked at coordinates (385, 686) on a monitor with a resolution of 2560x1440:
The cascade classifier is quite accurate, and so far I have not seen any mistakes in my data directory. Now, let's write the code that trains a neural network to predict the mouse position from an image of your eyes.
import numpy as np
import os
import cv2
import pyautogui
from tensorflow.keras.models import *
from tensorflow.keras.layers import *
from tensorflow.keras.optimizers import *
Now, let's add a cascade classifier:
cascade = cv2.CascadeClassifier("haarcascade_eye.xml")
video_capture = cv2.VideoCapture(0)
Normalize:
def normalize(x):
    minn, maxx = x.min(), x.max()
    return (x - minn) / (maxx - minn)
Capture the eyes:
def scan(image_size=(32, 32)):
    _, frame = video_capture.read()
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    boxes = cascade.detectMultiScale(gray, 1.3, 10)
    if len(boxes) == 2:
        eyes = []
        for box in boxes:
            x, y, w, h = box
            eye = frame[y:y + h, x:x + w]
            eye = cv2.resize(eye, image_size)
            eye = normalize(eye)
            eye = eye[10:-10, 5:-5]
            eyes.append(eye)
        return (np.hstack(eyes) * 255).astype(np.uint8)
    else:
        return None
Let's define the size of the monitor. You must change the following parameters according to the resolution of your computer screen:
# Note that there are actually 2560x1440 pixels on my screen
# I am simply recording one less, so that when we divide by these
# numbers, we will normalize between 0 and 1. Note that mouse
# coordinates are reported starting at (0, 0), not (1, 1)
width, height = 2559, 1439
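If you would rather not hard-code the resolution, pyautogui (already imported above for mouse control) can report it. A small sketch:

# pyautogui.size() returns the screen resolution in pixels;
# subtracting 1 keeps the division mapping into [0, 1]
screen_w, screen_h = pyautogui.size()
width, height = screen_w - 1, screen_h - 1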
Now, let's load the data (again, assuming you have defined root with a trailing separator). We don't care whether a click was a right or left click, because our goal is only to predict the mouse position:
filepaths = os.listdir(root)
X, Y = [], []
for filepath in filepaths:
    x, y, _ = filepath.split(' ')
    x = float(x) / width
    y = float(y) / height
    X.append(cv2.imread(root + filepath))
    Y.append([x, y])
X = np.array(X) / 255.0
Y = np.array(Y)
print(X.shape, Y.shape)
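A quick consistency check before training can save a debugging session later, since one misnamed file silently corrupts the arrays. A small sketch under the assumptions above (12x44 crops, 'x y button.jpeg' filenames):

# Every sample should be a 12x44 BGR crop with a label in [0, 1]^2
assert X.shape[1:] == (12, 44, 3), "Unexpected image shape: {}".format(X.shape)
assert Y.min() >= 0.0 and Y.max() <= 1.0, "Labels fell outside [0, 1]"
print("Loaded {} samples".format(len(X)))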
Let's define our model architecture:
model = Sequential()
model.add(Conv2D(32, 3, 2, activation='relu', input_shape=(12, 44, 3)))
model.add(Conv2D(64, 2, activation='relu'))
model.add(Flatten())
model.add(Dense(32, activation='relu'))
model.add(Dense(2, activation='sigmoid'))
model.compile(optimizer="adam", loss="mean_squared_error")
model.summary()
This is our summary:
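For reference, the shapes and parameter counts can be worked out by hand (derived here assuming Keras's default 'valid' padding, not copied from the original summary):

# Input: (12, 44, 3)
# Conv2D(32, kernel 3, stride 2): floor((12-3)/2)+1 = 5, floor((44-3)/2)+1 = 21
#   -> output (5, 21, 32), params 3*3*3*32 + 32 = 896
# Conv2D(64, kernel 2, stride 1): (5-2+1, 21-2+1) = (4, 20)
#   -> output (4, 20, 64), params 2*2*32*64 + 64 = 8,256
# Flatten: 4*20*64 = 5,120 units
# Dense(32): params 5,120*32 + 32 = 163,872
# Dense(2): params 32*2 + 2 = 66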
The next task is to train the model. To stretch the small dataset, we will also add some noise to the image data; the basic loop is below, and a noise-augmented variant follows it:
epochs = 200
for epoch in range(epochs):
    model.fit(X, Y, batch_size=32)
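A minimal sketch of the per-epoch noise augmentation; the Gaussian noise and its 0.05 scale are assumptions, since the text mentions noise without specifying the scheme:

epochs = 200
for epoch in range(epochs):
    # Fresh Gaussian noise each epoch, so the model never sees
    # exactly the same pixels twice
    X_noisy = X + np.random.normal(0.0, 0.05, X.shape)
    # Keep pixel values inside the valid [0, 1] range
    X_noisy = np.clip(X_noisy, 0.0, 1.0)
    model.fit(X_noisy, Y, batch_size=32)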
Now let's use the model to move the mouse in real time. Note that this needs a lot of data to work well. As a proof of concept, though, even with only about 200 images it will move the mouse toward the general region you are looking at. It is not precisely controllable until you have much more data.
while True:
    eyes = scan()
    if eyes is not None:
        eyes = np.expand_dims(eyes / 255.0, axis=0)
        x, y = model.predict(eyes)[0]
        pyautogui.moveTo(x * width, y * height)
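Raw predictions will jitter from frame to frame, so the cursor tends to shake. One common remedy, sketched below, is an exponential moving average over successive predictions; the smoothing factor alpha is an assumption, not something from the original article:

alpha = 0.3                    # lower = smoother but laggier (assumed value)
smooth_x, smooth_y = 0.5, 0.5  # start at the screen center

while True:
    eyes = scan()
    if eyes is not None:
        eyes = np.expand_dims(eyes / 255.0, axis=0)
        x, y = model.predict(eyes)[0]
        # Blend each new prediction into the running average
        smooth_x = alpha * x + (1 - alpha) * smooth_x
        smooth_y = alpha * y + (1 - alpha) * smooth_y
        pyautogui.moveTo(smooth_x * width, smooth_y * height)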
This is a proof-of-concept example. Note that we had trained on very little data before this screen recording; the video shows the mouse automatically moving to the terminal application window based on where I was looking. As I said, it is coarse because there is so little data. With just a few hundred images, you can only move the cursor to the general region of your gaze; with more data, it should become stable enough for finer control. Also, if you never capture images in a particular region of the screen (such as the edges) during data collection, the model is unlikely to predict into that region.
Thank you for reading. That covers the OpenCV eye-controlled mouse example; after working through it, you should have a deeper understanding of how it fits together, though real use still needs to be verified in practice.