May | 2017 | cascade∮stream

Motion estimation in video processing refers to finding a sub-block in a reference picture that most accurately resembles or predicts a block in a target picture. The horizontal and vertical displacement between the target block and the reference block is represented by a two dimensional motion vector.

Spatial domain block matching techniques are most commonly used for motion estimation in video compression encoders and other video processing applications. They work by computing a distance criterion value between a candidate reference block and the target block, according to some candidate motion vector. A common distance criterion function is sum of absolute differences (SAD). The computations need to be repeated for every candidate motion vector, and the one with the smallest distance criterion value is taken as the correct motion vector.

The phase correlation approach uses a frequency domain transformation to find the motion vector in a single iteration. In some applications it could obtain the motion vector with less computations than the spatial domain block matching approach. It may also find the motion vector more accurately when there are extraneous differences between target and reference, such as a different illumination levels or image noise.

We studied motion estimation by phase correlation and wrote a notebook article on it, which you can see on this Jupyter notebook page.

We started with a known test image from which we could see the true motion. Here are the first two frames of the pedestrian_area test clip:

pedestrian_area_frame0_frame1