Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/89767
Title: Learning deep neural networks for image compression and enhancement
Authors: Cai, Jianrui
Degree: Ph.D.
Issue Date: 2020
Abstract: Digital cameras convert CCD/CMOS sensor data into displayable full-color images through a set of cascaded modules, often called the image signal processing (ISP) pipeline, and compress the generated images to save storage space. However, the in-camera ISP may not be effective enough to generate photographically pleasing images because of limited in-camera computational resources or poor imaging conditions, while commonly used image compression techniques such as JPEG and WEBP may sacrifice much of the image quality. To improve the perceptual quality of camera output images, this thesis develops new image enhancement and compression technologies by learning deep neural network models. Image enhancement aims to improve the perceptual quality of an image and can generally be divided into two parts: image restoration and image color mapping. Image restoration algorithms mainly focus on how to hallucinate high-frequency detail, while image color mapping methods aim to correct the low-frequency color tone. For image compression, we focus on the lossy image compression (LIC) task, which aims to reduce storage space while maintaining image quality. As one of the fundamental image restoration topics, image deblurring aims to remove the blur artifacts caused by camera shake, object motion, and defocus. In chapter 2, we propose a Dark and Bright Channel Priors embedded Network (DBCPeNet) that plugs the channel priors into a neural network for effective dynamic scene deblurring. A novel trainable dark and bright channel priors embedded layer (DBCPeL) is developed to aggregate both channel priors and blurry image representations, and a sparse regularization is introduced to regularize the DBCPeNet model learning.
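As background for the channel priors named above, the following is a minimal sketch of how dark and bright channels are conventionally computed from an RGB image: a minimum (or maximum) over the color channels followed by a minimum (or maximum) over a local patch. It is not the trainable DBCPeL layer from the thesis. The function name `channel_prior`, the 3x3 patch size, and the edge padding are assumptions made for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def channel_prior(image, patch=3, mode="dark"):
    """Return the dark or bright channel of an H x W x 3 float image.

    Hypothetical helper: `patch` must be odd; borders use edge padding.
    """
    reduce_fn = np.min if mode == "dark" else np.max
    # Step 1: reduce over the color channels at every pixel.
    per_pixel = reduce_fn(image, axis=2)
    # Step 2: reduce over a patch x patch neighborhood around every pixel.
    pad = patch // 2
    padded = np.pad(per_pixel, pad, mode="edge")
    windows = sliding_window_view(padded, (patch, patch))  # H x W x patch x patch
    return reduce_fn(windows, axis=(2, 3))
```

Calling `channel_prior(img, mode="dark")` and `channel_prior(img, mode="bright")` on the same image gives two H x W maps that could be stacked with the image as extra input features; how the thesis actually embeds them is not specified in this abstract.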
Furthermore, we present an effective multi-scale network architecture, namely image full scale exploitation (IFSE), which works in both coarse-to-fine and fine-to-coarse manners to better exploit information flow across scales. Single image super-resolution is another important image restoration problem. In chapter 3, we build a real-world super-resolution (RealSR) dataset in which paired low-resolution (LR) and high-resolution (HR) images of the same scene are captured by adjusting the focal length of a digital camera. An image registration algorithm is developed to progressively align the image pairs at different resolutions. With the newly constructed dataset, we can benchmark the real-world single image super-resolution problem. In addition, considering that the degradation kernels in our dataset are naturally non-uniform, we present a Laplacian pyramid based kernel prediction network (LP-KPN), which efficiently learns per-pixel kernels to recover the HR image.
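To make the idea of per-pixel kernels concrete, here is a minimal sketch of how a field of predicted kernels can be applied to a single-channel image, with each output pixel computed from its own k x k kernel. It illustrates only the kernel-application step. The Laplacian pyramid and the network that predicts the kernels are not shown, and the function name `apply_per_pixel_kernels`, the array layout, and the edge padding are assumptions.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def apply_per_pixel_kernels(image, kernels):
    """Filter an H x W image with a separate k x k kernel at every pixel.

    Hypothetical helper: `kernels` has shape H x W x k x k with odd k.
    Each kernel is applied as a correlation (no flipping).
    """
    k = kernels.shape[-1]
    pad = k // 2
    padded = np.pad(image, pad, mode="edge")
    # windows[i, j] is the k x k neighborhood centered on pixel (i, j).
    windows = sliding_window_view(padded, (k, k))
    # Weighted sum of each neighborhood with that pixel's own kernel.
    return np.einsum("ijkl,ijkl->ij", windows, kernels)
```

When every pixel shares the same kernel, this reduces to ordinary uniform filtering; letting the kernels vary per pixel is what allows spatially non-uniform degradation to be handled.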
As for the image color mapping problem, image contrast enhancement aims to adjust the contrast of an image, especially when the image is captured under bad lighting conditions (e.g., under- or over-exposure). Unlike multi-exposure fusion based solutions, single image contrast enhancement (SICE) improves the visibility of a photo given only a single low-contrast image. In chapter 4, we propose, for the first time to the best of our knowledge, to learn a SICE enhancer with a CNN. To achieve this goal, we construct a dataset of low-contrast and high-contrast image pairs. The SICE dataset contains 589 elaborately selected high-resolution multi-exposure sequences with 4,413 images. Thirteen representative multi-exposure image fusion and stack-based high dynamic range imaging algorithms are employed to generate contrast-enhanced images for each sequence, and subjective experiments are conducted to select the best-quality result as the reference image of each scene. With the constructed dataset, a CNN-based SICE enhancer is trained to improve the contrast of an under- or over-exposed image, and it demonstrates significantly better performance than previous SICE methods. Finally, in chapter 5, considering that commonly used LIC methods (e.g., JPEG, JPEG 2000 and WEBP) often introduce visible artifacts such as blurring and ringing, we develop a convolutional neural network (CNN) based lossy image compressor. Specifically, we learn a single CNN to perform LIC at multiple bits-per-pixel (bpp) rates. A simple yet effective Tucker Decomposition Network (TDNet) is developed, in which a Tucker decomposition layer (TDL) is introduced to decompose the latent image representation into a set of projection matrices and a core tensor. By changing the rank of the core tensor and its quantization, we can adjust the bpp rate of the latent image representation within a single CNN.
Furthermore, an iterative non-uniform quantization scheme is presented to optimize the quantizer, and a coarse-to-fine training strategy is introduced to reconstruct the decompressed images. In summary, this thesis presents two novel real-world image enhancement datasets, which provide good platforms for researchers to train and test their deep models, and develops several deep neural network models for image enhancement and compression, which demonstrate state-of-the-art performance.
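The decomposition of a tensor into projection matrices and a core tensor, as described for chapter 5, can be illustrated with a truncated higher-order SVD (HOSVD), one standard way to compute a Tucker decomposition. This is a hedged sketch only: the thesis uses a learned layer inside a CNN, whereas the code below is a fixed linear-algebra routine, and all function names are assumptions. It shows how lowering the core tensor's rank shrinks the representation at the cost of reconstruction error.

```python
import numpy as np


def unfold(tensor, mode):
    """Flatten a tensor into a matrix whose rows index the given mode."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)


def mode_product(tensor, matrix, mode):
    """Multiply a tensor by a matrix along one mode."""
    moved = np.moveaxis(tensor, mode, 0)
    result = np.tensordot(matrix, moved, axes=(1, 0))
    return np.moveaxis(result, 0, mode)


def tucker_hosvd(tensor, ranks):
    """Truncated HOSVD: return a core tensor and one factor matrix per mode."""
    factors = []
    for mode, rank in enumerate(ranks):
        # Leading left singular vectors of each unfolding span that mode.
        u, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        factors.append(u[:, :rank])
    core = tensor
    for mode, u in enumerate(factors):
        core = mode_product(core, u.T, mode)  # project onto each subspace
    return core, factors


def reconstruct(core, factors):
    """Expand a core tensor back to the original shape."""
    out = core
    for mode, u in enumerate(factors):
        out = mode_product(out, u, mode)
    return out
```

For a tensor of shape (4, 5, 6), `tucker_hosvd(x, (2, 3, 3))` yields a core of shape (2, 3, 3); only the core and the three factor matrices need to be stored, and smaller ranks mean fewer values to quantize and encode.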
Subjects: Image processing; Hong Kong Polytechnic University -- Dissertations
Pages: xix, 147 pages : color illustrations
Appears in Collections: Thesis

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.