FFmpeg: Extract foreground [moving] objects from video

This is a somewhat crude implementation, but given the right source material, an acceptable result can be generated. It is based on FFmpeg's 'maskedmerge' filter, which takes three input streams: a background, an overlay, and a mask (which is used to manipulate the pixels of the overlay layer).

ffmpeg \
   -i background.png \
   -i video.mkv \
   -filter_complex \
   "
      color=#00ff00:size=1280x720 [matte];
      [1:0] format=rgb24, split[mask][video];
      [0:0][mask] blend=all_mode=difference,
         curves=m='0/0 .1/0 .2/1 1/1',
         format=rgb24 [mask];
      [matte][video][mask] maskedmerge,format=rgb24
   " \
   -shortest \
   -pix_fmt yuv422p \
   output.mkv

[N.B. 'output.mkv' is a placeholder name for the result file.]

For this process, a still background image is needed. An extracted frame from the video will do, or if the background is constantly obscured, it may be necessary to manually create a clean image from multiple frames (stacking multiple frames may produce better results too).
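
A single frame can be pulled out with FFmpeg itself. This is a minimal sketch, assuming the background is unobstructed at some known timestamp. The helper name 'grab_background', the file names, and the 5-second timestamp are placeholders, not part of the original process.

```shell
# Hypothetical helper: save one frame of a video as a still image.
# $1 = input video, $2 = timestamp of a clean frame, $3 = output image
grab_background() {
   # -ss before -i seeks in the input; -frames:v 1 stops after one video frame
   ffmpeg -ss "$2" -i "$1" -frames:v 1 "$3"
}

# Example call (placeholder values):
# grab_background video.mkv 00:00:05 background.png
```

If no single frame is clean, several frames grabbed this way can be combined by hand in an image editor.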

The background image is 'difference' blended with the video, to produce the mask which will be used with the 'maskedmerge' filter. This video stream is then converted to grayscale and adjusted to maximise the contrast levels. [N.B. The video format changes multiple times with different filter effects, and so 'format=rgb24' is set in each filterchain for colour compatibility.]

The curves and equalisation filtering is a bit hard to explain and, due to the lack of a real-time preview, somewhat "hit and miss". Basically, a 'threshold' filter is being built, where just black and white areas are created. The eq/curve filters here progressively squeeze the tones together in such a way that only the wanted areas are solid white. This will change for each project, and the shown filter chain has been progressively "hacked together" for this specific video. [N.B. 'maskedmerge' interprets tonality as levels of pixel opacity in the overlay layer.]
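
Since the threshold points need changing for each project, it can help to generate the 'curves' string from two numbers. The sketch below is only an illustration: 'threshold_curve' is a made-up helper name, and the second pair of values is an arbitrary example. Tones below the first value end up roughly black, and tones above the second roughly white.

```shell
# Hypothetical helper: build a 'curves' threshold for the mask.
# $1 = level below which the mask should stay black (0..1)
# $2 = level above which the mask should become white (0..1)
threshold_curve() {
   printf "curves=m='0/0 %s/0 %s/1 1/1'\n" "$1" "$2"
}

threshold_curve .1 .2    # the values used in the command above
threshold_curve .05 .1   # a lower threshold, so smaller differences turn white
```

The printed string can be pasted into the filtergraph in place of the existing 'curves' entry.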

The first 'smartblur' filter fills out (dilates) the areas to create more solid structures in the mask. The second 'smartblur' filter blends the edges of the mask to create a softer cutout. Additional 'smartblur' filters can be used on the background and on the video stream it is blended with, which will act as a noise filter to cull stray momentary differences.

The final element is a new background for the extracted elements to sit upon. In this example, a simple green matte is generated. This, along with the created mask and the original video, is provided as input for the 'maskedmerge' filter.
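
To make the role of the mask concrete, here is the idea as per-pixel arithmetic for 8-bit values. It is an illustrative sketch only: 'merge_px' is a made-up name, and the rounding is not claimed to match FFmpeg's internal implementation.

```shell
# Illustrative per-pixel weighting for 8-bit values (0..255).
# $1 = base (matte) value, $2 = overlay (video) value, $3 = mask value
merge_px() {
   echo $(( ($1 * (255 - $3) + $2 * $3 + 127) / 255 ))
}

merge_px 0 200 0     # black mask: the base value is kept
merge_px 0 200 255   # white mask: the overlay value is kept
merge_px 0 200 128   # grey mask: a mix of the two
```

This is why a soft-edged mask gives a soft-edged cutout: grey mask pixels blend the matte and the video.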

There are many ways this can be implemented, adjusted, and improved. In the example above, everything is done within one filtergraph, but it can be separated out into multiple passes (this would be useful for manually fixing errors in the mask). [N.B. Timing can be an issue when running this all in a single filtergraph: the mask layer may not line up with the overlay. 29.97fps videos proved particularly troublesome. Repeated use of 'setpts=PTS' in the filtergraph might help, but in this case it was fixed by converting the video to 25fps beforehand.]
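
A multi-pass version might look something like the sketch below. It has not been run against the original material. The function name 'two_pass', the intermediate file names, the 'fps' filter for the 25fps conversion, and the lossless 'ffv1' codec for the mask are all assumptions made for illustration.

```shell
# Pass 1 graph: inputs are 0 = background image, 1 = video. Output is the mask.
mask_graph="[1:v] format=rgb24 [video];
            [0:v][video] blend=all_mode=difference,
               curves=m='0/0 .1/0 .2/1 1/1',
               format=rgb24"

# Pass 2 graph: inputs are 0 = video, 1 = mask video from pass 1.
merge_graph="color=#00ff00:size=1280x720 [matte];
             [0:v] format=rgb24 [video];
             [1:v] format=rgb24 [mask];
             [matte][video][mask] maskedmerge, format=rgb24"

# Hypothetical wrapper: defined here, but only runs FFmpeg when called.
two_pass() {
   # Conform the frame rate first, then build the mask, then merge.
   ffmpeg -i video.mkv -vf fps=25 video25.mkv &&
   ffmpeg -i background.png -i video25.mkv \
      -filter_complex "$mask_graph" -shortest -c:v ffv1 mask.mkv &&
   ffmpeg -i video25.mkv -i mask.mkv \
      -filter_complex "$merge_graph" -shortest -pix_fmt yuv422p output.mkv
}
```

Keeping the mask as its own file ('mask.mkv' here) means it can be inspected, or its frames touched up, before the final merge.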

ffmpeg maskedmerge: https://ffmpeg.org/ffmpeg-filters.html#maskedmerge
source video: ぷに (Puni) https://www.youtube.com/watch?v=B0o8cQa-Kd8

FFmpeg: Create a video composite of colourised macroblock motion-vectors

# Generate video motion vectors, in various colours, and merge together
# NB: Includes fixed 'curve' filters for issue outlined in blog post

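ffplay \
   -flags2 +export_mvs \
   -i video.mkv \
   -vf \
      "
         split=3 [original][original1][vectors];
         [vectors] codecview=mv=pf+bf+bb [vectors];
         [vectors][original] blend=all_mode=difference128,
            split=3 [yellow][pink][black];
         [yellow] curves=r='0/0 0.1/0.5 1/1':
                         g='0/0 0.1/0.5 1/1':
                         b='0/0 0.4/0.5 1/1' [yellow];
         [pink] curves=r='0/0 0.1/0.5 1/1':
                       g='0/0 0.1/0.3 1/1':
                       b='0/0 0.1/0.3 1/1' [pink];
         [original1][yellow] blend=all_expr=if(gt(X\,Y*(W/H))\,A\,B) [yellorig];
         [pink][black] blend=all_expr=if(gt(X\,Y*(W/H))\,A\,B) [pinkblack];
         [yellorig][pinkblack] blend=all_expr=if(gt(X\,W-Y*(W/H))\,A\,B)
      "

# NB: The last 'blend' line implements step 8 below; its mirrored-diagonal
#     expression is an assumed form. The brightness/contrast adjustment
#     mentioned in step 3 is not shown in this command.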

# Process:
# 1: Three copies of input video are made
# 2: Motion vectors are applied to one stream
# 3: The result of #2 is 'difference128' blended with an original video stream
#    The brightness and contrast are adjusted to improve clarity
#    Three copies of this vectors result are made
# 4: Curves are applied to one vectors stream to create yellow colour
# 5: Curves are applied to another vectors stream to create pink colour
# 6: Original video stream and yellow vectors are combined diagonally
# 7: Pink vectors stream and original vectors stream are combined diagonally
# 8: The results of #6 and #7 are combined diagonally (opposite direction)
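
The diagonal combination in steps 6 and 7 comes down to one comparison per pixel. In the 'blend' expression, A is the first input and B the second, so 'if(gt(X,Y*(W/H)),A,B)' takes the first input wherever the pixel lies to the upper-right of the top-left to bottom-right diagonal. The sketch below mimics that test in integer shell arithmetic. 'diag_pick' is a made-up name, and the comparison is rearranged as X*H > Y*W so that integer division does not truncate W/H.

```shell
# Which input does a pixel come from?
# $1 = X, $2 = Y, $3 = frame width W, $4 = frame height H
diag_pick() {
   if [ $(( $1 * $4 )) -gt $(( $2 * $3 )) ]; then
      echo A   # first input of the blend
   else
      echo B   # second input of the blend
   fi
}

diag_pick 1000 100 1280 720   # a pixel near the top-right corner
diag_pick 100 600 1280 720    # a pixel near the bottom-left corner
```

A mirrored test, for example comparing X against W-Y*(W/H), would split the frame along the opposite diagonal, which is the kind of split step 8 calls for.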

NB: At time of writing, the latest version of FFmpeg (N-81396-g0d8b6a1) has a bug (feature?) where the upper and lower bounds of the 'curves' filter must be set explicitly for accurate results. This is contrary to what's written in the official documentation.
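
In practice this means writing the '0/0' and '1/1' points into every curve, as the command above does. A small sketch of doing that automatically follows; 'with_endpoints' is a made-up helper name.

```shell
# Hypothetical helper: wrap interior key points with explicit end points.
# $1 = interior key points, e.g. "0.1/0.5"
with_endpoints() {
   printf '0/0 %s 1/1\n' "$1"
}

with_endpoints 0.1/0.5

# Assembling a 'curves' option for one channel from it:
printf "curves=r='%s'\n" "$(with_endpoints 0.1/0.5)"
```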

alternate version:

see related: http://oioiiooixiii.blogspot.com/2016/04/ffmpeg-display-and-isolate-macroblock.html
source video: 足太ぺんた (Asibuto Penta) https://www.youtube.com/watch?v=Djdm7NaQheU