Video compression is a computationally demanding stage in most digital media production and distribution pipelines. The goal is to deliver the best possible picture quality for the number of bits transmitted to the viewer. Practical solutions include the additional functions of image processing, quality modeling, and packaging for distribution. This is the domain of our work.
Solutions exist for these functions. Often what is needed is a new approach to integrate existing components, map them to a new hardware device, or optimize them in some key dimension required for a new application.
Dimensions for optimization
- picture quality
- latency (delay)
- improved control interface
- development time/maintenance cost
Modern encoding standards such as AVC/H.264, HEVC, VP9, or AV1 have many internal tools for finding and removing spatial and temporal redundancy in a sequence of images. It is the video compression encoder's job to decide coding modes and allocate bits as efficiently as possible. Coding elements are predicted from previously encoded elements, so the effect of each mode decision on quality and bit rate can be subtle. We have spent thousands of hours studying and working on the internal processes of video encoders, and have much to offer others seeking to make progress with them.
Low level compression tools in a modern encoder
- block decomposition
- intra prediction
- motion prediction (estimation + compensation)
- domain transform
- entropy coding
- rate-distortion optimization
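The rate-distortion optimization listed above can be sketched in a few lines: for each candidate coding mode, the encoder computes a Lagrangian cost J = D + λ·R and picks the minimum. This is a minimal illustration; the mode names, distortion values, and bit costs below are made up, not taken from any real encoder.

```python
# Hypothetical sketch of per-block rate-distortion optimization:
# choose the coding mode minimizing J = D + lambda * R.

def choose_mode(candidates, lam):
    """candidates: list of (mode_name, distortion, bits).
    Returns the mode with the lowest Lagrangian cost J = D + lam * R."""
    best_mode, best_cost = None, float("inf")
    for mode, distortion, bits in candidates:
        cost = distortion + lam * bits
        if cost < best_cost:
            best_mode, best_cost = mode, cost
    return best_mode, best_cost

# Illustrative candidate modes for one block (values are invented).
candidates = [
    ("intra_dc",   1200.0, 96),   # few bits, high distortion
    ("inter_skip",  900.0, 8),    # very cheap, moderate distortion
    ("inter_mv",    400.0, 160),  # many bits, low distortion
]
mode, cost = choose_mode(candidates, lam=5.0)  # -> ("inter_skip", 940.0)
```

Note how the choice flips with λ: a small λ (bits are cheap) favors the low-distortion motion-compensated mode, while a large λ favors the skip mode. Tuning λ against the quantizer is one of the subtle interactions mentioned above.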
The typical pipeline for an encoding system may include filters for image scaling, color correction, sharpening, and noise reduction. These improve quality for the viewer and make the compression function more efficient.
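One of those filters, image scaling, can be sketched very simply. Assuming grayscale frames stored as lists of pixel rows, the following downscales by 2x using block averaging, which also attenuates noise before the encoder sees the frame. This is an illustration only; production scalers use better resampling kernels.

```python
# Minimal sketch of a pre-encode scaling filter: 2x downscale by
# averaging each 2x2 block (assumes even frame dimensions).

def downscale_2x(frame):
    """Average each 2x2 block of pixels into one output pixel."""
    h, w = len(frame), len(frame[0])
    return [[(frame[y][x] + frame[y][x + 1] +
              frame[y + 1][x] + frame[y + 1][x + 1]) / 4.0
             for x in range(0, w, 2)]
            for y in range(0, h, 2)]

frame = [[10, 10, 50, 50],
         [10, 10, 50, 50],
         [ 0,  0, 100, 100],
         [ 0,  0, 100, 100]]
small = downscale_2x(frame)  # -> [[10.0, 50.0], [0.0, 100.0]]
```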
Other difficult image processing functions, once practical only for still pictures, are now emerging for video thanks to the ever-increasing computational capacity of processor devices. Object recognition has been performed on video using deep learning convolutional neural networks. Geometric transformations can simulate a view perspective from a different place than the actual camera location. Applying image processing algorithms such as these to video can lead to compelling new applications.
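The geometric transformation mentioned above is commonly expressed as a 3x3 homography applied in homogeneous coordinates. A minimal sketch, using an arbitrary illustrative matrix rather than a calibrated camera model:

```python
# Map a pixel coordinate (x, y) through a 3x3 homography H to simulate
# a shifted viewpoint. H below is an invented example matrix.

def apply_homography(H, x, y):
    """Project (x, y) through H using homogeneous coordinates."""
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    w  = H[2][0] * x + H[2][1] * y + H[2][2]
    return xh / w, yh / w

H = [[1.0, 0.2, 10.0],   # mild shear plus translation
     [0.0, 1.0,  5.0],
     [0.0, 0.0,  1.0]]
u, v = apply_homography(H, 100.0, 50.0)  # -> (120.0, 55.0)
```

Applying this per-pixel (with interpolation) over every frame of a video is exactly the kind of workload that only recently became feasible in real time.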
Bad video is plain for many viewers to see, but it is much harder for software to detect. Such an algorithm must effectively model the combined operation of the human visual system and perception by the brain. Video compression encoders have primitive quality models built in to quantify distortion and help make coding mode decisions. More accurate and elaborate models are a current area of research for a number of academic and commercial organizations.
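An example of the primitive built-in quality models mentioned above is PSNR computed from mean squared error. It is a pure signal metric, not a perceptual model, which is precisely why research continues into better ones. A minimal sketch for 8-bit samples:

```python
# PSNR over mean squared error: a simple signal-level distortion metric
# of the kind encoders use internally. Sample values are illustrative.
import math

def psnr(ref, dec, peak=255.0):
    """PSNR in dB between two equal-length lists of samples."""
    mse = sum((a - b) ** 2 for a, b in zip(ref, dec)) / len(ref)
    if mse == 0:
        return float("inf")  # identical signals
    return 10.0 * math.log10(peak * peak / mse)

ref = [100, 110, 120, 130]
dec = [101, 108, 121, 129]
quality = psnr(ref, dec)  # roughly 45.7 dB
```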
What is a true measure of video quality? It is your opinion, combined with the opinions of other observers. The International Telecommunication Union publishes a number of recommendations on how to assess video quality subjectively using real people to give their opinions. Cascade Stream is currently hosting such test sessions.
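Subjective tests of the kind the ITU recommendations describe typically reduce raw viewer ratings (for example on a 1-to-5 scale) to a mean opinion score (MOS) with a confidence interval. A sketch using a normal approximation for the interval; the scores below are invented, and this is not the procedure of any specific ITU recommendation:

```python
# Reduce a set of subjective ratings to a mean opinion score (MOS)
# with an approximate 95% confidence interval.
import math

def mos_with_ci(scores):
    """Return (mean, 95% CI half-width) using a normal approximation."""
    n = len(scores)
    mean = sum(scores) / n
    var = sum((s - mean) ** 2 for s in scores) / (n - 1)  # sample variance
    ci = 1.96 * math.sqrt(var / n)
    return mean, ci

scores = [4, 5, 4, 3, 4, 5, 4, 4]  # hypothetical ratings on a 1-5 scale
mos, ci = mos_with_ci(scores)      # MOS = 4.125
```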
Packaging for Distribution
Streaming content accounts for an increasing share of viewing compared to traditional broadcast, cable, and satellite program distribution. Adaptive bit rate (ABR) methods, such as HTTP Live Streaming (HLS) and MPEG-DASH, use segments of compressed video that are packaged according to a specification. The video compression encoder itself must be aware of some aspects of the packaging specifications for the system to work correctly.
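At its simplest, the packaging side amounts to emitting a playlist that lists fixed-duration segments. The sketch below generates a bare-bones HLS media playlist; segment file names and durations are illustrative, and real packagers derive segment boundaries from the encoder's keyframe placement, which is one reason the encoder must cooperate with the packager.

```python
# Generate a minimal HLS media playlist (.m3u8) for a VOD asset.
# Segment names and durations here are illustrative only.

def make_media_playlist(segment_durations, target=6):
    """Return .m3u8 text listing one .ts segment per duration given."""
    lines = ["#EXTM3U",
             "#EXT-X-VERSION:3",
             f"#EXT-X-TARGETDURATION:{target}",
             "#EXT-X-MEDIA-SEQUENCE:0"]
    for i, dur in enumerate(segment_durations):
        lines.append(f"#EXTINF:{dur:.3f},")
        lines.append(f"segment{i}.ts")
    lines.append("#EXT-X-ENDLIST")  # VOD: playlist is complete
    return "\n".join(lines)

playlist = make_media_playlist([6.0, 6.0, 4.2])
```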
Streaming media technology has evolved rapidly since the early days when Netflix started putting content online to supplement its mail-order DVD rental business. Today's state of the art, such as dynamic packaging systems running on the edge servers of content delivery networks (CDNs), will soon be obsolete as new approaches, like dynamic transcoding, are enabled by increasingly powerful and efficient processors.