The International Telecommunication Union publishes Recommendation ITU-R BT.709, which specifies how image data is coded for transmission as video. It includes a specification for the representation of color that applies to almost all modern video transmission, whether broadcast, cable/satellite, or over the internet (but not to emerging practices for high dynamic range/wide color gamut).
We conventionally understand any color to be a combination of red, green, and blue primaries, but the photoreceptors in the human eye do not function on the basis of these primaries, and BT.709 does not include the first principles needed to understand color representation. It only offers the following specification:
– Red (R)
– Green (G)
– Blue (B)
– Assumed chromaticity for equal primary signals
In this post, we take a deeper look at Publication No. 15 of the Commission Internationale de l’Eclairage, CIE 15:2004, which includes the 1931 chromaticity specification, still the de facto standard for quantifying color in a way that does not depend on the camera or display device.
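To illustrate why a chromaticity specification for three primaries and a reference white is enough to pin down a color space, here is a minimal sketch that derives a linear RGB to CIE XYZ matrix from such coordinates. The numeric values are the widely published BT.709 chromaticities and should be checked against the Recommendation itself; the function and variable names are ours, chosen for illustration.

```python
import numpy as np

# Chromaticity coordinates (x, y) as widely published for BT.709.
# Assumption: verify these against the Recommendation before relying on them.
PRIMARIES = {"R": (0.640, 0.330), "G": (0.300, 0.600), "B": (0.150, 0.060)}
WHITE = (0.3127, 0.3290)  # D65, the chromaticity for equal primary signals


def xy_to_XYZ(x, y):
    """Convert chromaticity (x, y) to tristimulus XYZ with luminance Y = 1."""
    return np.array([x / y, 1.0, (1.0 - x - y) / y])


def rgb_to_xyz_matrix(primaries, white):
    """Build the 3x3 matrix that maps linear RGB to XYZ."""
    # Columns are the XYZ values of each primary at unit luminance.
    P = np.column_stack([xy_to_XYZ(*primaries[c]) for c in "RGB"])
    # Scale each primary so that R = G = B = 1 reproduces the reference white.
    scale = np.linalg.solve(P, xy_to_XYZ(*white))
    return P * scale  # broadcasting scales column j by scale[j]


M = rgb_to_xyz_matrix(PRIMARIES, WHITE)
luminance_weights = M[1]  # the Y row: contribution of R, G, B to luminance
print(np.round(luminance_weights, 4))
```

The middle row of the matrix gives the relative luminance contributed by each primary, roughly 0.2126, 0.7152, and 0.0722 for red, green, and blue.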
Motion estimation in video processing refers to finding a sub-block in a reference picture that most accurately resembles or predicts a block in a target picture. The horizontal and vertical displacement between the target block and the reference block is represented by a two-dimensional motion vector.
Spatial domain block matching techniques are the ones most commonly used for motion estimation in video compression encoders and other video processing applications. They work by computing a distance criterion between the target block and the candidate reference block selected by a candidate motion vector. A common distance criterion is the sum of absolute differences (SAD). The computation has to be repeated for every candidate motion vector, and the candidate with the smallest distance criterion value is taken as the motion vector.
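The search described above can be sketched as an exhaustive (full-search) loop over candidate vectors. This is a minimal illustration rather than an encoder-grade implementation: the function name `sad_block_match`, the 8x8 block size, and the search range of 4 pixels are assumptions made for the example, and the test frames are synthetic.

```python
import numpy as np


def sad_block_match(target, reference, top, left, block=8, search=4):
    """Full-search SAD for the target block whose corner is (top, left).

    Returns ((dy, dx), sad), where the best reference block sits at
    (top + dy, left + dx) in the reference picture.
    """
    tgt = target[top:top + block, left:left + block].astype(np.int64)
    height, width = reference.shape
    best_mv, best_sad = (0, 0), None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            # Skip candidates that fall outside the reference picture.
            if y < 0 or x < 0 or y + block > height or x + block > width:
                continue
            cand = reference[y:y + block, x:x + block].astype(np.int64)
            sad = int(np.abs(tgt - cand).sum())
            if best_sad is None or sad < best_sad:
                best_mv, best_sad = (dy, dx), sad
    return best_mv, best_sad


# Synthetic frames: the target is the reference moved 2 rows down, 3 columns left.
rng = np.random.default_rng(0)
reference = rng.integers(0, 256, size=(32, 32), dtype=np.uint8)
target = np.roll(reference, shift=(2, -3), axis=(0, 1))
mv, sad = sad_block_match(target, reference, top=12, left=12)
print(mv, sad)
```

With these frames the content of the target block is found 2 rows up and 3 columns to the right in the reference, so the search returns the vector (-2, 3) with a SAD of zero. The cost grows with the square of the search range, which is the repeated computation the paragraph above refers to.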
The phase correlation approach uses a frequency domain transformation to find the motion vector in a single iteration. In some applications it can obtain the motion vector with fewer computations than the spatial domain block matching approach. It may also find the motion vector more accurately when there are extraneous differences between target and reference, such as different illumination levels or image noise.
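As a rough sketch of the idea, the following computes the normalized cross-power spectrum of two pictures and locates the peak of its inverse transform. It assumes a single, whole-picture, integer-pixel circular shift, which is a simplification; practical use typically works on windowed blocks and has to handle several peaks. The function name `phase_correlate` is ours, and this is not necessarily how the notebook mentioned below implements it.

```python
import numpy as np


def phase_correlate(target, reference):
    """Estimate (dy, dx) such that target[y, x] ~ reference[y + dy, x + dx]."""
    F_t = np.fft.fft2(target.astype(np.float64))
    F_r = np.fft.fft2(reference.astype(np.float64))
    cross = F_r * np.conj(F_t)
    # Keep only the phase: divide by the magnitude, guarding against zeros.
    cross /= np.maximum(np.abs(cross), 1e-12)
    surface = np.real(np.fft.ifft2(cross))
    peak = np.unravel_index(np.argmax(surface), surface.shape)
    # Peak indices wrap around; map them to signed displacements.
    dy, dx = (int(p) if p <= n // 2 else int(p) - n
              for p, n in zip(peak, surface.shape))
    return (dy, dx), surface


# Synthetic frames: the target is the reference moved 2 rows down, 3 columns
# left, with a gain and offset applied to mimic an illumination change.
rng = np.random.default_rng(0)
reference = rng.integers(0, 256, size=(32, 32)).astype(np.float64)
target = 1.5 * np.roll(reference, shift=(2, -3), axis=(0, 1)) + 20.0
pc_mv, pc_surface = phase_correlate(target, reference)
print(pc_mv)
```

Because the magnitude is discarded, the gain and offset applied to the target do not move the peak, and the estimate is (-2, 3) in the same target-to-reference convention as a block matching motion vector. The whole estimate costs two forward transforms and one inverse transform, regardless of how large the displacement is.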
We studied motion estimation by phase correlation and wrote a notebook article on it, which you can see on this Jupyter notebook page.
We started with a known test image from which we could see the true motion. Here are the first two frames of the pedestrian_area test clip: