Algorithm to save frames from a video file

Question:

Is it possible to save a desired number of images (DESIRED_FRAMES_TO_SAVE) from a video file, spread out evenly across the entire footage?

Hopefully this makes sense: I have a video file that is 8 seconds long, and I would like to save 60 frames from it in sequential order.

I'm trying to throw something together in OpenCV. The code works, but I know it should be improved. For example, the if count % 3 == 0: check comes from taking the total number of frames in the video file (223) and dividing by DESIRED_FRAMES_TO_SAVE (60), which gives about 3.72. In other words, I think I should save one frame about every 3 or 4 frames to end up with roughly DESIRED_FRAMES_TO_SAVE by the end of the video file. Sorry, this is an odd question, but would anyone have advice on how to rewrite this better, without two while loops?

import cv2

VIDEO_FILE = "hello.mp4"
DESIRED_FRAMES_TO_SAVE = 60


count = 0
cap = cv2.VideoCapture(VIDEO_FILE)

'''
# calculate total frames
while True:
  success, frame = cap.read()
  if success == True:
    count += 1

  else:
      break

total_frames = count
print("TOTAL_FRAMES",total_frames)
'''

count = 0

while True:
    success, frame = cap.read()
    name = './raw_images/' + str(count) + '.jpg'

    if success == True:

        if count % 3 == 0:
            cv2.imwrite(name, frame)
            print(count)

        elif count > DESIRED_FRAMES_TO_SAVE:
            break

        count += 1

    else:
        break
      
cap.release() 
cv2.destroyAllWindows() 
Asked By: bbartling


Answers:

OpenCV is for computer vision. It can read video files, but it’s not made for that. It has real limitations. It presents you with this broken abstraction of "frames have indices".

You have to be prepared for variable frame rate, which is a real thing. Many video files don’t keep track of frames by "index" but by timestamp. Video files also may not know how many frames they contain. Any frame count is just the duration (which is a real thing) multiplied by the average frame rate (which is guesswork).

The most reliable solution is to just read the entire video, look at each frame’s timestamp, and decide whether to keep it. That just involves a little bit of math and some counting. Pseudocode:

# basic definitions
import cv2 as cv

vid = cv.VideoCapture(...)  # path to the video file
fps = vid.get(cv.CAP_PROP_FPS)  # average frame rate
duration_msec = vid.get(cv.CAP_PROP_FRAME_COUNT) / fps * 1000  # an estimate

frames_to_save = 60

indices = range(frames_to_save)
# evenly spaced target times: 0, d/n, 2d/n, ...
timestamps_msec = [k / frames_to_save * duration_msec for k in indices]
# any monotonically increasing list, values from 0 .. duration_msec
# either:

frames_saved = 0
while frames_saved < frames_to_save:
    (ok, frame) = vid.read()
    if not ok: break
    timestamp_msec = vid.get(cv.CAP_PROP_POS_MSEC)
    scheduled_msec = timestamps_msec[frames_saved] # which one do we wait for?
    if timestamp_msec >= scheduled_msec:
        # ...save the frame...
        frames_saved += 1
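
To see how the loop above behaves on a variable frame rate file, the selection logic can be pulled out into a plain function and fed a list of timestamps. This is only a sketch: pick_frames and the sample timestamps are made up for illustration and are not part of OpenCV.

```python
def pick_frames(frame_timestamps_msec, duration_msec, frames_to_save):
    """Return the indices of the frames the timestamp schedule would keep.

    frame_timestamps_msec stands in for the values that CAP_PROP_POS_MSEC
    would report after each read (hypothetical input for this sketch).
    """
    # Evenly spaced target times: 0, d/n, 2d/n, ...
    schedule = [k / frames_to_save * duration_msec for k in range(frames_to_save)]
    kept = []
    for index, timestamp in enumerate(frame_timestamps_msec):
        if len(kept) == frames_to_save:
            break
        # Keep the first frame at or after the next scheduled time.
        if timestamp >= schedule[len(kept)]:
            kept.append(index)
    return kept


# Hypothetical 1-second clip whose frames are bunched up at the start.
timestamps = [0, 40, 80, 120, 400, 700, 960]
print(pick_frames(timestamps, 1000, 4))  # [0, 4, 5, 6]
```

The schedule here is 0, 250, 500 and 750 ms, so the three early frames at 40, 80 and 120 ms are skipped even though a fixed "every Nth frame" rule would have kept some of them.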

If you feel like gambling, you can try to "seek". First you calculate the ideal timestamps you want to keep, and then you "seek" to those positions.

# or:

for (k, ts_msec) in zip(indices, timestamps_msec):
    vid.set(cv.CAP_PROP_POS_MSEC, ts_msec)
    (ok, frame) = vid.read()
    if not ok: break
    # ...save frame... (you can use k for a number)

Seeking, in general (not just OpenCV), has two operating modes. In any case, a library will tell you exactly where you ended up, so it’s not "imprecise" (for some notions of "imprecise").

  • Either it puts you down at the closest possible frame (OpenCV’s choice), i.e. either one frame or its immediate neighbor, which is expensive because it has to decode many (but not necessarily all) frames to reach the target.

  • Or it puts you down at some keyframe near/ahead of where you really wanted to go, and you have to go the rest of the way on foot.
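
The second mode can be illustrated with a small simulation. This is a rough sketch under stated assumptions, not how any particular library implements seeking: seek_then_walk is a hypothetical helper, the seek is assumed to land on the last keyframe at or before the target, and keyframe_indices is assumed to be sorted.

```python
import bisect


def seek_then_walk(frame_timestamps, keyframe_indices, target_msec):
    """Simulate a keyframe seek followed by stepping forward to the target.

    Returns (index of the frame reached, number of frames decoded).
    """
    keyframe_times = [frame_timestamps[i] for i in keyframe_indices]
    # Last keyframe at or before the target; fall back to the first keyframe.
    pos = max(bisect.bisect_right(keyframe_times, target_msec) - 1, 0)
    index = keyframe_indices[pos]
    decoded = 1  # the keyframe itself
    # "Go the rest of the way on foot": decode frame by frame.
    while index + 1 < len(frame_timestamps) and frame_timestamps[index] < target_msec:
        index += 1
        decoded += 1
    return index, decoded


# Hypothetical clip: 10 frames, 40 ms apart, keyframes at frames 0 and 5.
timestamps = [i * 40 for i in range(10)]
print(seek_then_walk(timestamps, [0, 5], 300))  # (8, 4)
```

The returned frame count shows why the exact mode is expensive: the further the target is from the previous keyframe, the more frames have to be decoded to reach it.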

CAP_PROP_POS_FRAMES can be a lie. It merely takes the timestamp and divides by the average frame rate. The timestamp is the only truth in a video.

Even CAP_PROP_FRAME_COUNT can be a lie. It is calculated from the true duration and the average frame rate. Sadly, OpenCV doesn’t expose the video duration, but it is a fundamental value inside of OpenCV. Take the number of frames and divide it by the (average) frame rate to get back the duration.
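
The relationship between the three values is simple arithmetic, sketched below. The helper names are made up for illustration, and the frame rate of 27.875 is an assumption derived from the question's numbers (223 frames over what is taken to be exactly 8 seconds).

```python
def estimated_duration_msec(frame_count, average_fps):
    """Recover the duration from the reported frame count and average fps."""
    if average_fps <= 0:
        raise ValueError("average_fps must be positive")
    return frame_count / average_fps * 1000.0


def estimated_frame_count(duration_msec, average_fps):
    """The reverse direction: duration times average fps, rounded."""
    return round(duration_msec / 1000.0 * average_fps)


print(estimated_duration_msec(223, 27.875))  # 8000.0
print(estimated_frame_count(8000.0, 27.875))  # 223
```

Both directions go through the average frame rate, so for a variable frame rate file neither number says anything about where individual frames actually sit in time.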

All of this assumes the use of the ffmpeg backend, which is the most common (and default) choice for video files in OpenCV. Some other specialist backends, like the built-in AVI+MJPEG backend, do have total information, due to the file format.

Answered By: Christoph Rackwitz

Working full code…will have to look into ffmpeg as well:

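import os

import cv2

VIDEO_FILE = "hello.mp4"
DESIRED_FRAMES_TO_SAVE = 60

# cv2.imwrite returns False instead of raising if the folder does not exist
os.makedirs("./raw_images", exist_ok=True)

cap = cv2.VideoCapture(VIDEO_FILE)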

length = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
print("FRAMES TOTAL LEN: ", length)

fps = cap.get(cv2.CAP_PROP_FPS)
print("FPS: ", fps)

duration_msec = cap.get(cv2.CAP_PROP_FRAME_COUNT) / fps * 1000
print("DURATION M SECONDS: ", duration_msec)

indices = range(DESIRED_FRAMES_TO_SAVE)
timestamps_msec = [k / DESIRED_FRAMES_TO_SAVE * duration_msec for k in indices]
print("TIMESTAMPS M SECONDS: ", timestamps_msec)


frames_saved = 0
while frames_saved < DESIRED_FRAMES_TO_SAVE:
    name = f'./raw_images/{frames_saved}_.jpg'
    (ok, frame) = cap.read()
    if not ok:
        break

    timestamp_msec = cap.get(cv2.CAP_PROP_POS_MSEC)
    scheduled_msec = timestamps_msec[frames_saved]
    if timestamp_msec >= scheduled_msec:
        cv2.imwrite(name, frame)
        print("Processing ", frames_saved)
        frames_saved += 1

cap.release()
cv2.destroyAllWindows()
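
For comparison, the index-based idea from the question (223 / 60 ≈ 3.72) can be written without a fixed modulus by computing the target indices up front. This is a minimal sketch that only makes sense if the file has a constant frame rate and the reported frame count is accurate; evenly_spaced_indices is a hypothetical helper name.

```python
def evenly_spaced_indices(total_frames, frames_to_save):
    """Spread frames_to_save indices evenly over range(total_frames)."""
    if frames_to_save > total_frames:
        raise ValueError("cannot save more frames than the video contains")
    # Integer arithmetic avoids the drift of stepping by a rounded 3 or 4.
    return [k * total_frames // frames_to_save for k in range(frames_to_save)]


indices = evenly_spaced_indices(223, 60)
print(len(indices), indices[:4], indices[-1])  # 60 [0, 3, 7, 11] 219
```

The steps alternate between 3 and 4 frames, which is what the 3.72 average works out to, and the last saved index lands near the end of the clip instead of stopping early.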
Answered By: bbartling