Efficient Motion Estimation and Predictive Coding Methods for Compression of Spatio-temporal Sequences.

Shinde, Tushar Shankar
Tiwari, Anil Kumar
Indian Institute of Technology Jodhpur
The growing popularity of digital photography, driven by the easy availability of imaging devices and smartphones, calls for effective storage and transmission mechanisms for large numbers of images and videos. Video data contains significant redundancy and can be compressed substantially via motion estimation between successive frames. Because motion estimation is time-consuming, many researchers have worked to reduce its computational complexity, but typically at the expense of matching accuracy. The computational complexity of motion estimation becomes critical for real-time video compression applications such as surveillance video. Moreover, the growing importance of human skeleton body-joint motion information poses serious storage challenges. Hence, the Thesis aims to design and develop novel approaches for efficient motion search and compression of Spatio-temporal skeleton sequences.
To this end, we studied the effect of various block matching techniques to achieve a trade-off between computational complexity and block matching accuracy. A novel, efficient direction-oriented search algorithm is proposed in this Thesis to address the problem. The proposed algorithm first dynamically switches between search regions based on the location of the minimum distortion error. The dimension of the search region is also made adaptive for faster convergence. The computational complexity is then reduced by using a proposed diamond search pattern with horizontal and vertical wings and two 45°-inclined hexagon-shaped direction-oriented search patterns. For further speed-up of the search process, partial distortion calculations are employed. A method is presented for selecting the optimal threshold value for different partial distortion calculations based on the distortion statistics. The results indicate that a significant speed-up can be achieved while maintaining better matching performance. For video sequences with directional motion, the proposed method even outperforms the full search algorithm at a significantly lower computational cost.
The massive amount of data in surveillance video coding demands high compression rates with low computational requirements for efficient storage and archival. Hence, reducing computational complexity is a pressing task, especially for surveillance videos. The large proportion of background in surveillance videos makes them a special case for coding. Existing surveillance video coding methods use separate search mechanisms for background and foreground regions. However, they still suffer from misclassification and inefficient search strategies because they do not consider the inherent motion characteristics of the foreground regions. A background-foreground-boundary aware block matching algorithm is proposed in this Thesis to exploit the unique features of surveillance videos. A novel three-step framework is presented for the boundary-aware block matching process. First, the blocks are categorized into three classes: background, foreground, and boundary blocks. Second, the motion search is performed with a different search strategy for each category. A zero-motion-vector-based search is employed for background blocks, while eight rotating uni-wing diamond search patterns are presented to exploit the fast and directional motion characteristics of the boundary and foreground blocks. Third, speed-up is achieved through a novel region-based sub-sampled structure. The results demonstrate that this scheme achieves a two- to four-fold speed-up over existing methods while maintaining better matching accuracy for surveillance video coding applications.
Moreover, the increasing importance of skeleton information in the analysis of surveillance big data features demands significant storage space. Developing an effective and efficient storage solution remains a challenging task. To this end, we propose a new framework for the lossless compression of skeleton sequences by exploiting spatial and temporal prediction as well as coding redundancy. In this framework, we first propose a set of skeleton prediction modes: spatial differential-based, motion vector-based, relative motion vector-based, and trajectory-based. These modes effectively handle both the spatial and the temporal redundancies present in skeleton sequences. Second, we further enhance performance by introducing a novel approach to handle coding redundancy. Because the compression is lossless, the proposed scheme significantly reduces the size of skeleton data while preserving exactly the same skeleton quality. Experiments are conducted on standard surveillance and Posetrack action datasets containing challenging test skeleton sequences. The proposed skeleton coding method outperforms traditional direct coding methods by about 70%. In conclusion, the Thesis develops novel mechanisms for efficient motion search and compression of Spatio-temporal skeleton sequences.
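As a rough illustration of the lossless skeleton prediction idea described in the abstract, the following minimal Python sketch picks, for each frame, between a spatial differential prediction and a simple temporal prediction from the previous frame, and stores only the residuals. It is not the Thesis's method: the abstract does not define its prediction modes, so the toy joint hierarchy `PARENT`, the mode labels `"S"` and `"T"`, the sum-of-absolute-residuals cost, and all function names are assumptions made for this example. Integer joint coordinates are assumed so that reconstruction is exact.

```python
# Hypothetical toy skeleton: joint i is predicted from joint PARENT[i]
# (-1 marks the root). Parents must have a smaller index than their children.
PARENT = [-1, 0, 1, 1]


def spatial_residual(frame):
    """Spatial differential: each joint minus its parent joint (root kept raw)."""
    out = []
    for i, (x, y) in enumerate(frame):
        p = PARENT[i]
        if p < 0:
            out.append((x, y))
        else:
            px, py = frame[p]
            out.append((x - px, y - py))
    return out


def temporal_residual(frame, prev):
    """Temporal prediction: each joint minus the same joint in the previous frame."""
    return [(x - px, y - py) for (x, y), (px, py) in zip(frame, prev)]


def cost(residual):
    """Assumed cost: sum of absolute residuals as a crude proxy for coded size."""
    return sum(abs(dx) + abs(dy) for dx, dy in residual)


def encode(frames):
    """Return a list of (mode, residual) pairs, one per frame."""
    coded, prev = [], None
    for frame in frames:
        candidates = [("S", spatial_residual(frame))]
        if prev is not None:
            candidates.append(("T", temporal_residual(frame, prev)))
        coded.append(min(candidates, key=lambda c: cost(c[1])))
        prev = frame
    return coded


def decode(coded):
    """Invert encode(); reconstruction is exact for integer coordinates."""
    frames = []
    for mode, residual in coded:
        if mode == "S":
            frame = []
            for i, (dx, dy) in enumerate(residual):
                p = PARENT[i]
                if p < 0:
                    frame.append((dx, dy))
                else:
                    px, py = frame[p]  # parent already decoded (smaller index)
                    frame.append((px + dx, py + dy))
        else:
            prev = frames[-1]
            frame = [(px + dx, py + dy) for (dx, dy), (px, py) in zip(residual, prev)]
        frames.append(frame)
    return frames


frames = [
    [(100, 50), (100, 70), (90, 90), (110, 90)],
    [(101, 50), (101, 70), (91, 91), (111, 90)],
]
coded = encode(frames)
assert decode(coded) == frames  # lossless round trip
```

In this sketch the small residuals would still need an entropy coder to yield an actual size reduction; that stage, which corresponds to the coding redundancy mentioned in the abstract, is left out here.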
Shinde, Tushar Shankar. (2020). Efficient Motion Estimation and Predictive Coding Methods for Compression of Spatio-temporal Sequences (Doctor's thesis). Indian Institute of Technology Jodhpur, Jodhpur.