How does Python handle videos with different numbers of frames

2025-07-09 Update. From: SLTechnology News&Howtos

Shulou(Shulou.com)06/01 Report--

This article explains how Python can handle videos with different numbers of frames, a question that comes up often in day-to-day work. The editor has consulted a range of materials and put together a simple, easy-to-use approach, which we hope answers that question. Let's walk through it.

One of the most important steps in training and testing an effective machine learning model is collecting a large amount of data and using it efficiently. Mini-batches help here: each iteration trains on only a small subset of the data.

However, many machine learning tasks now run on video datasets, which raises the problem of batching videos of unequal length. Most methods clip every video to the same length so that each iteration extracts the same number of frames. That is of little use when every frame carries information needed for the prediction, as in self-driving cars and action recognition.

We can create a processing method that can handle videos of different lengths.
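To make the trade-off concrete, here is a minimal sketch (not the author's code) that contrasts clipping with padding for two videos of unequal length. The helper names `clip_to_shortest` and `pad_to_longest` and the tiny array sizes are assumptions made up for illustration; whole videos are held in memory here, whereas the loader described in this article streams frames.

```python
import numpy as np

def clip_to_shortest(videos):
    """Clip every video to the length of the shortest one (frames are lost)."""
    n = min(len(v) for v in videos)
    return np.stack([v[:n] for v in videos])

def pad_to_longest(videos):
    """Pad shorter videos with blank (all-zero) frames so no real frame is dropped."""
    n = max(len(v) for v in videos)
    padded = []
    for v in videos:
        blank = np.zeros((n - len(v),) + v.shape[1:], dtype=v.dtype)
        padded.append(np.concatenate([v, blank], axis=0))
    return np.stack(padded)

# Two hypothetical "videos" of 3 and 5 frames; each frame is 4x4 pixels, 3 channels
short_video = np.ones((3, 4, 4, 3), dtype=np.uint8)
long_video = np.ones((5, 4, 4, 3), dtype=np.uint8)

clipped = clip_to_shortest([short_video, long_video])
padded = pad_to_longest([short_video, long_video])
print(clipped.shape)  # (2, 3, 4, 4, 3): two frames of the longer video are discarded
print(padded.shape)   # (2, 5, 4, 4, 3): the shorter video ends with two blank frames
```

Padding keeps every real frame at the cost of feeding some blank frames through the model, which is the choice the rest of this article makes.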

Using the LoadStreams class from Glenn Jocher's YOLOv3 as a basis, I created the LoadStreamsBatch class.

Class initialization

    import os
    import time
    from threading import Thread

    import cv2
    import magic  # python-magic, used to detect video files by MIME type
    import numpy as np

    # Method of the LoadStreamsBatch class
    def __init__(self, sources='streams.txt', img_size=416, batch_size=2, subdir_search=False):
        self.mode = 'images'
        self.img_size = img_size
        self.def_img_size = None

        videos = []
        if os.path.isdir(sources):
            if subdir_search:
                for subdir, dirs, files in os.walk(sources):
                    for file in files:
                        if 'video' in magic.from_file(subdir + os.sep + file, mime=True):
                            videos.append(subdir + os.sep + file)
            else:
                for elements in os.listdir(sources):
                    # Test the full path; the bare name would be resolved against the working directory
                    if not os.path.isdir(sources + os.sep + elements) and 'video' in magic.from_file(sources + os.sep + elements, mime=True):
                        videos.append(sources + os.sep + elements)
        else:
            with open(sources, 'r') as f:
                videos = [x.strip() for x in f.read().splitlines() if len(x.strip())]

        n = len(videos)
        curr_batch = 0
        self.data = [None] * batch_size
        self.cap = [None] * batch_size
        self.sources = videos
        self.n = n
        self.cur_pos = 0

        # Start a thread per batch slot to read frames from the video stream
        for s in videos:
            if curr_batch == batch_size:
                break
            print('%g/%g: %s... ' % (self.cur_pos + 1, n, s), end='')
            self.cap[curr_batch] = cv2.VideoCapture(s)
            try:
                assert self.cap[curr_batch].isOpened()
            except AssertionError:
                print('Failed to open %s' % s)
                self.cur_pos += 1
                continue
            w = int(self.cap[curr_batch].get(cv2.CAP_PROP_FRAME_WIDTH))
            h = int(self.cap[curr_batch].get(cv2.CAP_PROP_FRAME_HEIGHT))
            fps = self.cap[curr_batch].get(cv2.CAP_PROP_FPS) % 100
            frames = int(self.cap[curr_batch].get(cv2.CAP_PROP_FRAME_COUNT))
            # Index by curr_batch rather than the position in the list, so that
            # skipped videos neither leave gaps nor overflow the batch
            _, self.data[curr_batch] = self.cap[curr_batch].read()  # guarantee first frame
            thread = Thread(target=self.update, args=([curr_batch, self.cap[curr_batch], self.cur_pos + 1]), daemon=True)
            print(' success (%gx%g at %.2f FPS having %g frames).' % (w, h, fps, frames))
            curr_batch += 1
            self.cur_pos += 1
            thread.start()
            print('')  # new line

        if all(v is None for v in self.data):
            return
        # Check for a common shape (unfilled slots are skipped)
        s = np.stack([letterbox(x, new_shape=self.img_size)[0].shape for x in self.data if x is not None], 0)  # inference shapes
        self.rect = np.unique(s, axis=0).shape[0] == 1
        if not self.rect:
            print('WARNING: Different stream shapes detected. For optimal performance supply similarly-shaped streams.')

The __init__ function takes four parameters. img_size is the same as in the original version; the other three are defined as follows:

sources: a directory path or a text file, taken as input.

batch_size: the required batch size.

subdir_search: when a directory is passed as sources, enable this option to search all of its subdirectories for video files as well.

I first check whether the sources parameter is a directory or a text file. If it is a directory, I read every video file in it (including subdirectories if subdir_search is True); otherwise I read the video paths listed in the text file. The paths are stored in a list, and cur_pos tracks the current position in that list.
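The directory-or-text-file logic described above can be tried in isolation with a small sketch. Note the swap: the article's code inspects file contents with python-magic, whereas this version guesses the type from the file extension using the standard library's mimetypes module, purely so the example runs without extra dependencies. The function name `collect_videos` is hypothetical.

```python
import mimetypes
import os

def collect_videos(sources, subdir_search=False):
    """Return video paths from a directory or from a text file of paths.

    Assumption: the type is guessed from the file extension (mimetypes),
    not from the file contents as python-magic would do.
    """
    def is_video(path):
        mime, _ = mimetypes.guess_type(path)
        return mime is not None and mime.startswith('video')

    if os.path.isdir(sources):
        if subdir_search:
            # Walk the whole tree below sources
            candidates = [os.path.join(root, name)
                          for root, _dirs, files in os.walk(sources)
                          for name in files]
        else:
            # Only regular files directly inside sources
            candidates = [os.path.join(sources, name)
                          for name in os.listdir(sources)
                          if os.path.isfile(os.path.join(sources, name))]
        return sorted(p for p in candidates if is_video(p))

    # Otherwise treat sources as a text file with one video path per line
    with open(sources, 'r') as f:
        return [line.strip() for line in f.read().splitlines() if line.strip()]
```

Extension-based guessing is cheaper but easier to fool than content sniffing, so for untrusted folders the python-magic check used in the article is the safer choice.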

The loop walks through the list until batch_size videos have been opened, skipping any that are corrupt or missing. The first frame of each video is sent to the letterbox function to be resized. This is unchanged from the original version, except for the case where every video is faulty or unavailable.
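The resize-and-pad arithmetic that letterbox performs can be followed with plain numbers. The sketch below reproduces only the geometry, with no image involved; the function name `letterbox_geometry` and the `stride` parameter are made up for illustration, with the default of 64 mirroring the modulus that appears in the article's letterbox code.

```python
def letterbox_geometry(height, width, new_shape=416, auto=False, stride=64):
    """Return the resized (width, height) and the per-side (dw, dh) padding."""
    r = min(new_shape / height, new_shape / width)   # scale that fits both sides
    new_w, new_h = int(round(width * r)), int(round(height * r))
    dw, dh = new_shape - new_w, new_shape - new_h    # total padding needed
    if auto:                                         # minimum rectangle: keep only the remainder
        dw, dh = dw % stride, dh % stride
    return (new_w, new_h), (dw / 2, dh / 2)          # padding is split over two sides

# A hypothetical 1280x720 frame prepared for a 416-pixel network input
print(letterbox_geometry(720, 1280))             # ((416, 234), (0.0, 91.0))
print(letterbox_geometry(720, 1280, auto=True))  # ((416, 234), (0.0, 27.0))
```

With auto disabled, every frame ends up exactly 416x416, which is what allows frames from differently shaped videos to be stacked into one batch.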

    def letterbox(img, new_shape=(416, 416), color=(114, 114, 114), auto=True, scaleFill=False, scaleup=True):
        # Resize image to a 32-pixel-multiple rectangle https://github.com/ultralytics/yolov3/issues/232
        shape = img.shape[:2]  # current shape [height, width]
        if isinstance(new_shape, int):
            new_shape = (new_shape, new_shape)

        # Scale ratio
        r = min(new_shape[0] / shape[0], new_shape[1] / shape[1])
        if not scaleup:  # only scale down, do not scale up (for better test mAP)
            r = min(r, 1.0)

        # Compute padding
        ratio = r, r  # width, height ratios
        new_unpad = int(round(shape[1] * r)), int(round(shape[0] * r))
        dw, dh = new_shape[1] - new_unpad[0], new_shape[0] - new_unpad[1]  # padding
        if auto:  # minimum rectangle
            dw, dh = np.mod(dw, 64), np.mod(dh, 64)  # padding
        elif scaleFill:  # stretch
            dw, dh = 0.0, 0.0
            new_unpad = new_shape
            ratio = new_shape[0] / shape[1], new_shape[1] / shape[0]  # width, height ratios

        dw /= 2  # split the padding between the two sides
        dh /= 2

        if shape[::-1] != new_unpad:  # resize
            img = cv2.resize(img, new_unpad, interpolation=cv2.INTER_LINEAR)
        top, bottom = int(round(dh - 0.1)), int(round(dh + 0.1))
        left, right = int(round(dw - 0.1)), int(round(dw + 0.1))
        img = cv2.copyMakeBorder(img, top, bottom, left, right, cv2.BORDER_CONSTANT, value=color)  # add border
        return img, ratio, (dw, dh)

Function to retrieve frames at a fixed interval

The update function changes only slightly. We additionally store the default image size, which is needed once every video has been pulled from the list for processing but, because their lengths differ, one finishes earlier than another. This will become clearer with the next part of the code, the __next__ function.

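    # Method of the LoadStreamsBatch class
    def update(self, index, cap, cur_pos):
        # Read the next frame of the stream in a daemon thread
        n = 0
        while cap.isOpened():
            n += 1
            # _, self.imgs[index] = cap.read()
            if not cap.grab():
                # Added guard: at the end of the video, mark the slot as finished
                # and stop, so this thread cannot overwrite a later video's frames
                self.data[index] = None
                break
            if n == 4:  # read every 4th frame
                _, self.data[index] = cap.retrieve()
                if self.def_img_size is None and self.data[index] is not None:
                    self.def_img_size = self.data[index].shape
                n = 0
            time.sleep(0.01)  # wait for the iterator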
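The grab-then-retrieve pattern used above (advance on every frame, decode only every fourth one) can be tried without OpenCV. In this sketch `FakeCapture` is a hypothetical stand-in that imitates the `grab()` and `retrieve()` methods of cv2.VideoCapture and serves integers in place of frames; `sample_every` is likewise a made-up name.

```python
class FakeCapture:
    """Hypothetical stand-in for cv2.VideoCapture that serves integers as frames."""

    def __init__(self, n_frames):
        self.n_frames = n_frames
        self.pos = -1  # index of the most recently grabbed frame

    def grab(self):
        # Advance to the next frame without decoding it; False once the video ends
        self.pos += 1
        return self.pos < self.n_frames

    def retrieve(self):
        # Decode the most recently grabbed frame
        if 0 <= self.pos < self.n_frames:
            return True, self.pos
        return False, None


def sample_every(cap, step=4):
    """Grab every frame but retrieve (decode) only every step-th one."""
    kept = []
    n = 0
    while cap.grab():
        n += 1
        if n == step:
            ok, frame = cap.retrieve()
            if ok:
                kept.append(frame)
            n = 0
    return kept


print(sample_every(FakeCapture(10)))  # [3, 7]
```

Skipping the decode step for three frames out of four is what makes the interval cheap: grabbing only advances the stream, while retrieving does the expensive decoding.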

If a frame exists, it is passed to the letterbox function as usual. If the frame is None, the video in that slot has been fully processed, so we check whether every video in the list has been processed. If there are more videos to process, the cur_pos pointer gives the position of the next available video.

If no more videos can be pulled from the list but some are still being processed, a blank frame is passed to the downstream components in place of the finished video. In effect, the batch is padded dynamically according to the frames remaining in the other videos of the batch.
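The slot-refill behaviour described in the last two paragraphs can be simulated with frame counts alone. This is an idealised sketch rather than the loader itself: it ignores threading and the every-fourth-frame sampling, and the function name `simulate_batches` and the `'blank'` marker are assumptions made up for illustration.

```python
def simulate_batches(lengths, batch_size=2):
    """Simulate which frame each batch slot emits on every iteration.

    lengths[k] is the (hypothetical) frame count of video k. Each emitted item
    is a (video_index, frame_index) pair, or 'blank' for a padded slot.
    """
    queue = list(range(len(lengths)))  # videos not yet started (the role of cur_pos)
    slots = [None] * batch_size        # each slot holds [video_index, next_frame] or None

    batches = []
    while True:
        batch = []
        for i in range(batch_size):
            # Refill the slot while its video is finished (or it was never filled)
            while slots[i] is None or slots[i][1] >= lengths[slots[i][0]]:
                if not queue:
                    slots[i] = None    # nothing left to load: this slot goes blank
                    break
                slots[i] = [queue.pop(0), 0]
            if slots[i] is None:
                batch.append('blank')
            else:
                batch.append(tuple(slots[i]))
                slots[i][1] += 1
        if all(item == 'blank' for item in batch):
            return batches             # every video is exhausted: stop iterating
        batches.append(batch)


for batch in simulate_batches([2, 4, 1], batch_size=2):
    print(batch)
# [(0, 0), (1, 0)]
# [(0, 1), (1, 1)]
# [(2, 0), (1, 2)]
# ['blank', (1, 3)]
```

In the run above, slot 0 finishes video 0 after two frames, picks up video 2, and then emits a blank frame while slot 1 is still working through video 1.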

    # Method of the LoadStreamsBatch class
    def __next__(self):
        self.count += 1  # assumed to be initialised in __iter__, which is not shown here
        img0 = self.data.copy()
        img = []

        for i, x in enumerate(img0):
            if x is not None:
                img.append(letterbox(x, new_shape=self.img_size, auto=self.rect)[0])
            else:
                if self.cur_pos == self.n:
                    if all(v is None for v in img0):
                        cv2.destroyAllWindows()
                        raise StopIteration
                    else:
                        # No videos left to load: pad this slot with a blank frame
                        img0[i] = np.zeros(self.def_img_size)
                        img.append(letterbox(img0[i], new_shape=self.img_size, auto=self.rect)[0])
                else:
                    print('%g/%g: %s... ' % (self.cur_pos + 1, self.n, self.sources[self.cur_pos]), end='')
                    self.cap[i] = cv2.VideoCapture(self.sources[self.cur_pos])
                    fldr_end_flg = 0
                    while not self.cap[i].isOpened():  # was self.cap.isOpened(), but self.cap is a list
                        print('Failed to open %s' % self.sources[self.cur_pos])
                        self.cur_pos += 1
                        if self.cur_pos == self.n:
                            img0[i] = np.zeros(self.def_img_size)
                            img.append(letterbox(img0[i], new_shape=self.img_size, auto=self.rect)[0])
                            fldr_end_flg = 1
                            break
                        self.cap[i] = cv2.VideoCapture(self.sources[self.cur_pos])
                    if fldr_end_flg:
                        continue
                    # Query the capture of this slot (the bare name cap was undefined here)
                    w = int(self.cap[i].get(cv2.CAP_PROP_FRAME_WIDTH))
                    h = int(self.cap[i].get(cv2.CAP_PROP_FRAME_HEIGHT))
                    fps = self.cap[i].get(cv2.CAP_PROP_FPS) % 100
                    frames = int(self.cap[i].get(cv2.CAP_PROP_FRAME_COUNT))
                    _, self.data[i] = self.cap[i].read()  # guarantee first frame
                    img0[i] = self.data[i]
                    img.append(letterbox(self.data[i], new_shape=self.img_size, auto=self.rect)[0])
                    thread = Thread(target=self.update, args=([i, self.cap[i], self.cur_pos + 1]), daemon=True)
                    print(' success (%gx%g at %.2f FPS having %g frames).' % (w, h, fps, frames))
                    self.cur_pos += 1
                    thread.start()
                    print('')  # new line

        # Stack
        img = np.stack(img, 0)

        # Convert
        img = img[:, :, :, ::-1].transpose(0, 3, 1, 2)  # BGR to RGB, to bsx3x416x416
        img = np.ascontiguousarray(img)

        return self.sources, img, img0, None

That concludes this look at how Python handles videos with different numbers of frames. Combining theory with practice is the best way to learn, so go and try it!