Adaptive integer kernels and dyadic approximation error analysis for state-of-the-art video codecs

Wang, Qiuwei

Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/83921

Title:	Adaptive integer kernels and dyadic approximation error analysis for state-of-the-art video codecs
Authors:	Wang, Qiuwei
Degree:	M.Phil.
Issue Date:	2010
Abstract:	In this thesis, new integer kernels are found and adaptive transform coding techniques are proposed to improve the coding efficiency of state-of-the-art video codecs, with detailed analyses. The nonorthogonality error analysis is extended and improved. An error caused by the dyadic fraction approximation that arises from the integerization of transform coding is defined and investigated in depth. The desire to remove the mismatch between encoder and decoder has been ever increasing. In the state-of-the-art video coding standard, H.264/AVC, the transform coding stage was therefore integerized to meet this desire. One of our objectives is to improve the coding efficiency of video codecs based on this "integer framework". We propose a new DCT-like integer kernel IK(5,7,3) and revitalize another DCT-like integer kernel IK(13,17,7) for the transform coding process of hybrid video coding. Making use of one of these kernels together with the H.264/AVC kernel IK(1,2,1), we are able to design new multiple-kernel schemes which give better coding performance than the conventional approaches. All these schemes make use of the Adaptive Kernel Mechanism (AKM) at macroblock level, which requires heavy computation during the encoding process. We subsequently discovered that a rate-distortion feature extracted from a pair of kernels gives an intrinsic property that can be used to select the better kernel for a two-kernel macroblock-level AKM system. This is a powerful tool of theoretical interest and practical use. In order to reduce computation substantially, we use this tool to analyze and design a frame-level AKM and arrive at a simple solution: the kernel IK(1,2,1) is used for I- and P-frames and the kernel IK(5,7,3) is used for B-frame coding. The performance of this proposed frame-level AKM is similar to, or even better than, that of the proposed macroblock-level AKM.
Furthermore, it substantially reduces computation and gives a good improvement in PSNR and bitrate compared with the H.264/AVC default arrangement and other macroblock-level AKM schemes available in the literature. Nowadays, the demand for large-size (e.g. 16×16) integer transform kernels is increasing due to the explosive growth in video resolution. However, the orthogonality constraint for designing 16×16 integer kernels is much stronger than that for designing 4×4 kernels. Hence, several kernel designs have been proposed that violate the constraint in a controlled manner while roughly preserving orthogonality. An error analysis by Dong et al. showed that the well-controlled nonorthogonality noise is approximately negligible compared with the quantization noise. In this thesis, we enhance the original analysis by pointing out three problems found in the derivations and giving two comments. Nevertheless, the problems are only defects and hence do not affect the overall justification of the nonorthogonality analysis. Although the integerization of the transform coding process ensures no mismatch between encoder and decoder, it also introduces a by-product, which we define as the "dyadic approximation error" and which can largely affect the visual quality of a reconstructed video sequence. We derive the analytical forms of the dyadic approximation error and compare the significance of the possible error terms (i.e. the quantization error, nonorthogonality error, and dyadic approximation error) using various transform kernels. We conclude that the dyadic approximation error is much larger than the nonorthogonality error, and that it is comparable to the quantization error for fine quantization. We point out that the existence of this error is equivalent to scaling each frequency component by a position-dependent scalar which is slightly larger or smaller than unity, and also quantizing the components with different step sizes.
Hence, in the reconstruction process, many distorted frequency components are used, and eventually a reconstructed frame with frequency artifacts is generated. The conditions for eliminating the effect of the dyadic approximation error for 16×16 transform kernels are found experimentally. On the whole, inspired by the establishment of the "integer framework" since the emergence of H.264/AVC, we carry out a comprehensive investigation of the old problems under the new constraint, ranging from the optimization of coding performance to the analysis of errors.
Subjects:	Hong Kong Polytechnic University -- Dissertations; Video compression; Coding theory; Digital video
Pages:	xiii, 86 leaves : ill. (some col.) ; 30 cm.
Appears in Collections:	Thesis
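
The integer kernels and the dyadic approximation named in the abstract can be illustrated with a minimal sketch. It rests on two assumptions that the record itself does not confirm: that IK(a,b,c) denotes the 4×4 kernel with rows (a,a,a,a), (b,c,-c,-b), (a,-a,-a,a), (c,-b,b,-c), a layout that reproduces the H.264/AVC core transform for IK(1,2,1); and that a dyadic approximation replaces a real scaling factor by the nearest fraction m/2^k. The function names and the choice of k are hypothetical, and the error measure shown is a generic relative error, not necessarily the thesis's analytical form.

```python
from fractions import Fraction


def integer_kernel(a, b, c):
    """4x4 DCT-like integer kernel (row layout is an assumption, see lead-in)."""
    return [
        [a, a, a, a],
        [b, c, -c, -b],
        [a, -a, -a, a],
        [c, -b, b, -c],
    ]


def gram(kernel):
    """Return K * K^T; zero off-diagonal entries mean orthogonal rows."""
    n = len(kernel)
    return [
        [sum(kernel[i][k] * kernel[j][k] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def is_orthogonal(kernel):
    """True if every pair of distinct rows has a zero inner product."""
    g = gram(kernel)
    n = len(g)
    return all(g[i][j] == 0 for i in range(n) for j in range(n) if i != j)


def row_energies(kernel):
    """Squared norm of each row (the diagonal of K * K^T)."""
    g = gram(kernel)
    return [g[i][i] for i in range(len(g))]


def dyadic_approx(value, bits):
    """Nearest fraction m / 2**bits to a real scaling factor (hypothetical helper)."""
    return Fraction(round(value * 2**bits), 2**bits)


def relative_error(value, bits):
    """Ratio of the dyadic approximation to the exact value, minus one."""
    return float(dyadic_approx(value, bits)) / value - 1.0


for params in [(1, 2, 1), (5, 7, 3), (13, 17, 7)]:
    kernel = integer_kernel(*params)
    print(params, is_orthogonal(kernel), row_energies(kernel))
    # Normalising a 2-D coefficient needs 1 / (row energy * column energy);
    # approximating it with 16 fractional bits leaves a small relative error.
    e0, e1 = row_energies(kernel)[0], row_energies(kernel)[1]
    print("  relative error of 1/(e0*e1):", relative_error(1.0 / (e0 * e1), 16))
```

Under this layout the rows are orthogonal for any a, b, c, but their squared norms differ unless 2a² = b² + c². Each transform coefficient therefore needs its own normalising factor, and in an integer-only design such factors are realised as integer multiplications followed by bit shifts, which is where a position-dependent deviation from unity of the kind the abstract describes can arise.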

Access

View full-text via https://theses.lib.polyu.edu.hk/handle/200/6068
