JavaScript: Extract video frames reliably

[2021 update]: Since this question (and answer) was first posted, things have evolved in this area, and it is finally time for an update. The method originally presented here is now out of date, but luckily a few new or upcoming APIs make extracting video frames much easier:

The most promising and powerful one, but still under development, with a lot of restrictions: WebCodecs

This new API gives access to the browser's media decoders and encoders, so we can read the raw data of video frames (YUV planes), which for many applications is far more useful than rendered frames. For those who do need rendered frames, the VideoFrame interface that this API exposes can be drawn directly onto a <canvas> element or converted to an ImageBitmap, avoiding the slow route through a MediaElement.
However, there is a catch: apart from its currently low browser support, this API requires the input to be demuxed already.
Some demuxers are available online; for MP4 videos, for instance, GPAC's mp4box.js will help a lot.
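
To give an idea of what "already demuxed" means in practice: every sample the demuxer hands out has to be wrapped in an EncodedVideoChunk before the decoder accepts it. The helper below is only a sketch. Its name is made up, and the sample fields (is_sync, cts, duration, timescale, data) are assumed to match what mp4box.js reports, so check them against the demuxer you actually use.

```javascript
// Hypothetical helper: maps one demuxed sample to the init dictionary
// expected by the EncodedVideoChunk constructor.
// Assumes an mp4box.js-like sample: { is_sync, cts, duration, timescale, data }.
function sampleToChunkInit(sample) {
  // The container counts time in "timescale" ticks per second,
  // WebCodecs expects integer microseconds.
  const toMicroseconds = (ticks) => Math.round((ticks * 1e6) / sample.timescale);
  return {
    type: sample.is_sync ? "key" : "delta", // key frames can be decoded on their own
    timestamp: toMicroseconds(sample.cts),
    duration: toMicroseconds(sample.duration),
    data: sample.data, // the encoded bytes
  };
}

// In a browser that supports WebCodecs:
// decoder.decode(new EncodedVideoChunk(sampleToChunkInit(sample)));
```

Keeping the conversion in a plain function makes it easy to check the timestamp arithmetic without a browser.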

A full example can be found on the proposal’s repo.

The key part consists of:

const decoder = new VideoDecoder({
  output: onFrame, // the callback to handle all the VideoFrame objects
  error: e => console.error(e),
});
decoder.configure(config); // depends on the input file, your demuxer should provide it
demuxer.start((chunk) => { // depends on the demuxer, but you need it to return chunks of video data
  decoder.decode(chunk); // will trigger our onFrame callback  
})
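
The onFrame callback above is left open. One possible shape for it, as a sketch only: convert each VideoFrame to an ImageBitmap and close the frame right away, since a VideoFrame holds on to decoder resources until it is closed. The makeOnFrame name and the injectable toBitmap parameter are assumptions made for illustration, not part of WebCodecs.

```javascript
// Hypothetical factory for the `output` callback of a VideoDecoder.
// `toBitmap` defaults to the browser's createImageBitmap; it can be
// replaced so the logic can be exercised outside a browser.
function makeOnFrame(frames, toBitmap = (frame) => createImageBitmap(frame)) {
  let next = 0;
  return async function onFrame(videoFrame) {
    // Reserve the slot now: conversions may finish out of order.
    const index = next++;
    try {
      frames[index] = await toBitmap(videoFrame);
    } finally {
      // Release the frame even if the conversion failed.
      videoFrame.close();
    }
  };
}

// In a browser that supports WebCodecs:
// const frames = [];
// const decoder = new VideoDecoder({
//   output: makeOnFrame(frames),
//   error: (e) => console.error(e),
// });
```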

Note that we can even grab the frames of a MediaStream, thanks to MediaCapture Transform's MediaStreamTrackProcessor.
This means we should be able to combine HTMLMediaElement.captureStream() with this API to get our VideoFrames without needing a demuxer. However, this works only for a few codecs, and frames can only be extracted at playback speed…
Anyway, here is an example that works in the latest Chromium-based browsers with chrome://flags/#enable-experimental-web-platform-features switched on:

const frames = [];
const button = document.querySelector("button");
const select = document.querySelector("select");
const canvas = document.querySelector("canvas");
const ctx = canvas.getContext("2d");

button.onclick = async(evt) => {
  if (window.MediaStreamTrackProcessor) {
    let stopped = false;
    const track = await getVideoTrack();
    // the constructor takes an init dictionary, not the bare track
    const processor = new MediaStreamTrackProcessor({ track });
    const reader = processor.readable.getReader();
    readChunk();

    function readChunk() {
      reader.read().then(async({ done, value }) => {
        if (value) {
          const bitmap = await createImageBitmap(value);
          const index = frames.length;
          frames.push(bitmap);
          select.append(new Option("Frame #" + (index + 1), index));
          value.close();
        }
        if (!done && !stopped) {
          readChunk();
        } else {
          select.disabled = false;
        }
      });
    }
    button.onclick = (evt) => stopped = true;
    button.textContent = "stop";
  } else {
    console.error("your browser doesn't support this API yet");
  }
};

select.onchange = (evt) => {
  const frame = frames[select.value];
  canvas.width = frame.width;
  canvas.height = frame.height;
  ctx.drawImage(frame, 0, 0);
};

async function getVideoTrack() {
  const video = document.createElement("video");
  video.crossOrigin = "anonymous";
  video.src = "https://upload.wikimedia.org/wikipedia/commons/a/a4/BBH_gravitational_lensing_of_gw150914.webm";
  document.body.append(video);
  await video.play();
  const [track] = video.captureStream().getVideoTracks();
  video.onended = (evt) => track.stop();
  return track;
}

/* CSS */
video,canvas {
  max-width: 100%;
}

<!-- HTML -->
<button>start</button>
<select disabled>
</select>
<canvas></canvas>
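
The recursive readChunk() in the MediaStreamTrackProcessor example can also be written as a plain async loop, which makes it easier to guarantee that every VideoFrame gets closed and that the reader is cancelled when we stop early. This is a sketch under assumptions: readAllFrames, handleFrame and isStopped are hypothetical names, and `readable` is expected to behave like processor.readable.

```javascript
// Hypothetical helper: drains a stream of VideoFrames.
// `handleFrame(frame, index)` does the per-frame work,
// `isStopped()` lets the caller interrupt the loop.
async function readAllFrames(readable, handleFrame, isStopped = () => false) {
  const reader = readable.getReader();
  let count = 0;
  try {
    while (!isStopped()) {
      const { done, value } = await reader.read();
      if (done) break;
      try {
        await handleFrame(value, count++);
      } finally {
        value.close(); // always release the frame
      }
    }
  } finally {
    // Signal that we are no longer interested in the stream.
    await reader.cancel();
  }
  return count;
}

// In a browser that supports MediaStreamTrackProcessor:
// const { readable } = new MediaStreamTrackProcessor({ track });
// await readAllFrames(
//   readable,
//   async (frame, i) => { frames[i] = await createImageBitmap(frame); },
//   () => stopped
// );
```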

The easiest to use, but still with relatively poor browser support, and subject to the browser dropping frames: HTMLVideoElement.requestVideoFrameCallback

This method lets us schedule a callback that runs whenever a new frame is painted on the HTMLVideoElement.
It is higher level than WebCodecs and thus may have more latency; moreover, it only lets us extract frames at playback speed.

const frames = [];
const button = document.querySelector("button");
const select = document.querySelector("select");
const canvas = document.querySelector("canvas");
const ctx = canvas.getContext("2d");

button.onclick = async(evt) => {
  if (HTMLVideoElement.prototype.requestVideoFrameCallback) {
    let stopped = false;
    const video = await getVideoElement();
    // rVFC passes (now, metadata); the second argument is metadata, not a frame
    const drawingLoop = async(now, metadata) => {
      const bitmap = await createImageBitmap(video);
      const index = frames.length;
      frames.push(bitmap);
      select.append(new Option("Frame #" + (index + 1), index));

      if (!video.ended && !stopped) {
        video.requestVideoFrameCallback(drawingLoop);
      } else {
        select.disabled = false;
      }
    };
    // the last call to rVFC may happen before .ended is set but never resolve
    video.onended = (evt) => select.disabled = false;
    video.requestVideoFrameCallback(drawingLoop);
    button.onclick = (evt) => stopped = true;
    button.textContent = "stop";
  } else {
    console.error("your browser doesn't support this API yet");
  }
};

select.onchange = (evt) => {
  const frame = frames[select.value];
  canvas.width = frame.width;
  canvas.height = frame.height;
  ctx.drawImage(frame, 0, 0);
};

async function getVideoElement() {
  const video = document.createElement("video");
  video.crossOrigin = "anonymous";
  video.src = "https://upload.wikimedia.org/wikipedia/commons/a/a4/BBH_gravitational_lensing_of_gw150914.webm";
  document.body.append(video);
  await video.play();
  return video;
}

/* CSS */
video,canvas {
  max-width: 100%;
}

<!-- HTML -->
<button>start</button>
<select disabled>
</select>
<canvas></canvas>
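
Since requestVideoFrameCallback is subject to the browser dropping frames, it can help to know when that happened. The metadata object passed as second argument to the callback carries a presentedFrames counter, so a gap larger than one between two callbacks means some frames never reached us. The tracker below is a small sketch; createDropTracker is a made-up name.

```javascript
// Hypothetical helper: reports how many frames were presented by the
// browser between two callbacks without being delivered to us.
function createDropTracker() {
  let last = null;
  return function missedSince(metadata) {
    const missed = last === null
      ? 0
      : Math.max(0, metadata.presentedFrames - last - 1);
    last = metadata.presentedFrames;
    return missed;
  };
}

// In a browser that supports requestVideoFrameCallback:
// const missedSince = createDropTracker();
// video.requestVideoFrameCallback(function loop(now, metadata) {
//   const missed = missedSince(metadata);
//   if (missed) console.warn(missed + " frame(s) skipped before " + metadata.mediaTime);
//   video.requestVideoFrameCallback(loop);
// });
```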

For your Firefox users: Mozilla's non-standard HTMLMediaElement.seekToNextFrame()

As its name implies, this makes your <video> element seek to the next frame.
Combining this with the seeked event, we can build a loop that grabs every frame of our source, faster than playback speed (yeah!).
But this method is proprietary, available only in Gecko-based browsers, not on any standards track, and will probably be removed once Firefox implements the methods presented above.
For the time being, though, it is the best option for Firefox users:

const frames = [];
const button = document.querySelector("button");
const select = document.querySelector("select");
const canvas = document.querySelector("canvas");
const ctx = canvas.getContext("2d");

button.onclick = async(evt) => {
  if (HTMLMediaElement.prototype.seekToNextFrame) {
    let stopped = false;
    const video = await getVideoElement();
    const requestNextFrame = (callback) => {
      video.addEventListener("seeked", () => callback(video.currentTime), {
        once: true
      });
      video.seekToNextFrame();
    };
    // called by requestNextFrame with video.currentTime
    const drawingLoop = async(currentTime) => {
      if(video.ended) {
        select.disabled = false;
        return; // FF apparently doesn't like to create ImageBitmaps
                // from ended videos...
      }
      const bitmap = await createImageBitmap(video);
      const index = frames.length;
      frames.push(bitmap);
      select.append(new Option("Frame #" + (index + 1), index));

      if (!video.ended && !stopped) {
        requestNextFrame(drawingLoop);
      } else {
        select.disabled = false;
      }
    };
    requestNextFrame(drawingLoop);
    button.onclick = (evt) => stopped = true;
    button.textContent = "stop";
  } else {
    console.error("your browser doesn't support this API yet");
  }
};

select.onchange = (evt) => {
  const frame = frames[select.value];
  canvas.width = frame.width;
  canvas.height = frame.height;
  ctx.drawImage(frame, 0, 0);
};

async function getVideoElement() {
  const video = document.createElement("video");
  video.crossOrigin = "anonymous";
  video.src = "https://upload.wikimedia.org/wikipedia/commons/a/a4/BBH_gravitational_lensing_of_gw150914.webm";
  document.body.append(video);
  await video.play();
  return video;
}

/* CSS */
video,canvas {
  max-width: 100%;
}

<!-- HTML -->
<button>start</button>
<select disabled>
</select>
<canvas></canvas>
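
The seeked-event loop of the seekToNextFrame() example can also be expressed as an async generator, which keeps the "seek, wait, grab" sequence readable. This is only a sketch: waitForEvent and framesBySeeking are hypothetical names, and like the snippet above it assumes that a seeked event still fires when the end of the media is reached.

```javascript
// Resolves the next time `target` fires an event of the given type.
function waitForEvent(target, type) {
  return new Promise((resolve) =>
    target.addEventListener(type, resolve, { once: true })
  );
}

// Hypothetical generator: yields currentTime once per frame,
// only usable where video.seekToNextFrame() exists.
async function* framesBySeeking(video, isStopped = () => false) {
  while (!video.ended && !isStopped()) {
    // Register the listener before seeking so the event cannot be missed.
    const seeked = waitForEvent(video, "seeked");
    video.seekToNextFrame();
    await seeked;
    if (!video.ended) yield video.currentTime;
  }
}

// In Firefox, with `video` and `frames` as in the snippet above:
// for await (const time of framesBySeeking(video, () => stopped)) {
//   frames.push(await createImageBitmap(video));
// }
```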

The least reliable, which stopped working over time: HTMLVideoElement.ontimeupdate

The "pause, draw, play, wait for timeupdate" strategy used to be (in 2015) a quite reliable way to know when a new frame had been painted to the element. Since then, browsers have heavily throttled this event, which used to fire at a high rate, and there is now little information we can get from it…

I am not sure I can still advocate its use. I didn't check how Safari (currently the only browser without a solution) handles this event (its handling of media is very strange to me), and there is a good chance that a simple setTimeout(fn, 1000 / 30) loop is actually more reliable in most cases.
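
For completeness, here is what such a setTimeout(fn, 1000 / 30) loop could look like. It is a sketch rather than a recommendation: startPolling is a made-up name, the frame rate is a guess about the source, and nothing ties the timer to actual frame changes, so it may grab the same frame twice or miss some.

```javascript
// Hypothetical helper: calls `grab` roughly `fps` times per second
// and returns a function that stops the loop.
function startPolling(grab, fps = 30) {
  let running = true;
  let timer = null;
  const tick = () => {
    if (!running) return;
    grab();
    timer = setTimeout(tick, 1000 / fps);
  };
  timer = setTimeout(tick, 1000 / fps);
  return function stop() {
    running = false;
    clearTimeout(timer);
  };
}

// In a browser, while the video is playing:
// const stop = startPolling(async () => {
//   frames.push(await createImageBitmap(video));
// });
// video.addEventListener("ended", stop, { once: true });
```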
