How to decode a video (memory file / byte string) and step through it frame by frame in python?

According to this post, you can't use cv2.VideoCapture for decoding an in-memory stream.
You may decode the stream by piping it to FFmpeg.

That solution is a bit complicated; writing the data to disk first is much simpler and probably a cleaner solution.
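
For reference, here is a minimal sketch of that simpler approach (assuming the encoded data is already in a bytes object named in_bytes, and that your OpenCV build can decode the codec): write the blob to a temporary file and step through it with cv2.VideoCapture:

import os
import tempfile
import cv2

# Assumption: in_bytes already holds the encoded video blob received from the server.
with tempfile.NamedTemporaryFile(suffix='.webm', delete=False) as tmp_file:
    tmp_file.write(in_bytes)        # Dump the in-memory data to a temporary file.
    tmp_name = tmp_file.name

cap = cv2.VideoCapture(tmp_name)    # Let OpenCV decode the file from disk.

while True:
    ret, frame = cap.read()         # Step through the video frame by frame.
    if not ret:
        break
    cv2.imshow('frame', frame)
    if cv2.waitKey(100) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()
os.remove(tmp_name)                 # Remove the temporary file when done.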

I am posting a solution using FFmpeg (and FFprobe).
There are Python bindings for FFmpeg, but this solution executes FFmpeg as an external application using the subprocess module.
(The Python binding works well for piping to FFmpeg, but it does not support piping to FFprobe.)
I am using Windows 10, and I put ffmpeg.exe and ffprobe.exe in the execution folder (you may also place them somewhere on the execution path).
For Windows, download the latest statically linked stable build.
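
For completeness, here is a minimal sketch of the decoding step using the ffmpeg-python binding (assumptions on my side: the package is installed via pip install ffmpeg-python, in_bytes holds the encoded data, and width/height are already known). Note that this loads all decoded frames into memory at once rather than streaming them frame by frame:

import ffmpeg
import numpy as np

# Feed the encoded bytes to FFmpeg's stdin and capture the raw BGR frames from stdout.
out, _ = (
    ffmpeg
    .input('pipe:')
    .output('pipe:', format='rawvideo', pix_fmt='bgr24')
    .run(input=in_bytes, capture_stdout=True, capture_stderr=True)
)

# Reshape the flat byte buffer into an array of frames: (num_frames, height, width, 3).
frames = np.frombuffer(out, np.uint8).reshape([-1, height, width, 3])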

I created a standalone example that performs the following:

  • Generate a synthetic video and save it to a WebM file (used as input for testing).
  • Read the file into memory as binary data (replace this with the blob you receive from the server).
  • Pipe the binary stream to FFprobe to find the video resolution.
    In case the resolution is known in advance, you may skip this part.
    Piping to FFprobe makes the solution more complicated than it needs to be.
  • Pipe the binary stream to FFmpeg's stdin for decoding, and read the decoded raw frames from the stdout pipe.
    Writing to stdin is done in chunks from a Python thread.
    (The reason for using stdin and stdout instead of named pipes is Windows compatibility.)

Piping architecture:

 --------------------  Encoded      ---------  Decoded      ------------
| Input WebM encoded | data        | ffmpeg  | raw frames  | reshape to |
| stream (VP9 codec) | ----------> | process | ----------> | NumPy array|
 --------------------  stdin PIPE   ---------  stdout PIPE  ------------

Here is the code:

import numpy as np
import cv2
import io
import subprocess as sp
import threading
import json
from functools import partial
import shlex

# Build synthetic video and read binary data into memory (for testing):
#########################################################################
width, height = 640, 480
sp.run(shlex.split('ffmpeg -y -f lavfi -i testsrc=size={}x{}:rate=1 -vcodec vp9 -crf 23 -t 50 test.webm'.format(width, height)))

with open('test.webm', 'rb') as binary_file:
    in_bytes = binary_file.read()
#########################################################################


# https://stackoverflow.com/questions/5911362/pipe-large-amount-of-data-to-stdin-while-using-subprocess-popen/14026178
# https://stackoverflow.com/questions/15599639/what-is-the-perfect-counterpart-in-python-for-while-not-eof
# Write the in-memory stream to the child process's stdin in chunks of 1024 bytes.
def writer():
    try:
        for chunk in iter(partial(stream.read, 1024), b''):
            process.stdin.write(chunk)
        process.stdin.close()
    except BrokenPipeError:
        pass  # FFprobe reads only as much of the stream as it needs and then closes its stdin,
              # so later writes (and the close) may raise BrokenPipeError.


# Get the resolution of the video frames using FFprobe
# (in case the resolution is known, skip this part):
################################################################################
# Open In-memory binary streams
stream = io.BytesIO(in_bytes)

process = sp.Popen(shlex.split('ffprobe -v error -i pipe: -select_streams v -print_format json -show_streams'), stdin=sp.PIPE, stdout=sp.PIPE, bufsize=10**8)

pthread = threading.Thread(target=writer)
pthread.start()

pthread.join()

in_bytes = process.stdout.read()

process.wait()

p = json.loads(in_bytes)
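# Assumed (truncated) structure of FFprobe's parsed JSON output:
# {'streams': [{'codec_name': 'vp9', 'width': 640, 'height': 480, ...}]}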

width = (p['streams'][0])['width']
height = (p['streams'][0])['height']
################################################################################


# Decoding the video using FFmpeg:
################################################################################
stream.seek(0)

# FFmpeg input PIPE: WebM encoded data as stream of bytes.
# FFmpeg output PIPE: decoded video frames in BGR format.
process = sp.Popen(shlex.split('ffmpeg -i pipe: -f rawvideo -pix_fmt bgr24 -an -sn pipe:'), stdin=sp.PIPE, stdout=sp.PIPE, bufsize=10**8)

thread = threading.Thread(target=writer)
thread.start()


# Read decoded video (frame by frame), and display each frame (using cv2.imshow)
while True:
    # Read raw video frame from stdout as bytes array.
    in_bytes = process.stdout.read(width * height * 3)

    if not in_bytes:
        break  # Break loop if no more bytes.

    # Transform the byte read into a NumPy array
    in_frame = (np.frombuffer(in_bytes, np.uint8).reshape([height, width, 3]))

    # Display the frame (for testing)
    cv2.imshow('in_frame', in_frame)

    if cv2.waitKey(100) & 0xFF == ord('q'):
        break

if not in_bytes:
    # Wait for the writer thread to end (only when the loop ended naturally, not by pressing 'q').
    thread.join()

try:
    process.wait(1)
except (sp.TimeoutExpired):
    process.kill()  # In case 'q' is pressed.
################################################################################

cv2.destroyAllWindows()

Remark:

  • In case you are getting an error like “file not found: ffmpeg…”, try using the full path to the executable, as in the sketch below.
    For example (on Linux): '/usr/bin/ffmpeg -i pipe: -f rawvideo -pix_fmt bgr24 -an -sn pipe:'
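
A quick way to verify that the executables are reachable, and to build the command with an absolute path (a sketch using only the standard library):

import shutil

ffmpeg_exe = shutil.which('ffmpeg')    # Full path to ffmpeg, or None if it is not on the PATH.
ffprobe_exe = shutil.which('ffprobe')

if ffmpeg_exe is None or ffprobe_exe is None:
    raise FileNotFoundError('ffmpeg/ffprobe not found - add them to the PATH or pass the full path.')

# Build the decoding command as a list (pass it directly to sp.Popen, no shlex.split needed):
cmd = [ffmpeg_exe, '-i', 'pipe:', '-f', 'rawvideo', '-pix_fmt', 'bgr24', '-an', '-sn', 'pipe:']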
