Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/81842
Title: HEVC based screen content coding and transcoding using machine learning techniques
Authors: Kuang, Wei
Advisors: Chan, Yui-lam (EIE); Siu, Wan-chi (EIE)
Keywords: Digital video; Coding theory; Video compression; Machine learning
Issue Date: 2019
Publisher: The Hong Kong Polytechnic University
Abstract: Screen content video is an emerging video type that usually mixes natural image blocks (NIBs) with computer-generated screen content blocks (SCBs). Because High Efficiency Video Coding (HEVC) is optimized only for NIBs and SCBs exhibit different characteristics, new techniques are needed for SCBs. The Screen Content Coding (SCC) extension was developed on top of HEVC to provide new coding tools for screen content videos. SCC adds two intra-prediction coding modes: intra block copy (IBC) mode and palette (PLT) mode. However, the exhaustive mode search dramatically increases the computational complexity of SCC. This thesis therefore proposes novel machine learning based techniques to simplify both encoding and transcoding in SCC. First, a fast intra-prediction algorithm for SCC based on content analysis and dynamic thresholding is proposed. A scene change detection method selects a learning frame in each scene, and the learning frame is encoded by the original SCC encoder to collect learning statistics. The prediction models are then tailored to the following frames in the same scene according to the video content and quantization parameter (QP) of the learning frame. Simulation results show that the proposed scheme achieves remarkable complexity reduction while preserving the coded video quality. Next, a decision tree based framework for fast intra mode decision is proposed by investigating various features in training sets. To avoid the exhaustive mode search, decision trees are arranged sequentially so that a classifier is inserted before each mode and decides whether that mode is checked. Whereas previous approaches check both IBC and PLT modes for SCBs, the proposed framework is more flexible and allows only IBC or only PLT mode to be checked for SCBs, which further reduces computational complexity.
Simulation results show that the proposed scheme provides significant complexity savings with negligible loss of coded video quality. To avoid the need for hand-crafted features, a deep learning based fast prediction network, DeepSCC, is then proposed. It uses a convolutional neural network (CNN) and contains two parts, DeepSCC-I and DeepSCC-II. Before being fed to DeepSCC, incoming coding units (CUs) are divided into two categories: dynamic coding tree units (CTUs) and stationary CTUs. For dynamic CTUs, whose content differs from that of their collocated CTUs, DeepSCC-I takes raw sample values as input to make fast predictions. For stationary CTUs, whose content is the same as that of their collocated CTUs, DeepSCC-II additionally uses the optimal mode maps of the stationary CTU to further reduce computational complexity. Simulation results show that the proposed scheme further improves the complexity reduction. Finally, a fast HEVC to SCC transcoder is proposed. To migrate legacy screen content videos from HEVC to SCC and improve coding efficiency, a fast transcoding framework is built by analyzing features from four categories: features from the HEVC decoder, static features, dynamic features, and spatial features. First, the CU depth level collected from the HEVC decoder is used to terminate CU partitioning in SCC early. Second, a flexible encoding structure is proposed to make early mode decisions with the help of various features. Simulation results show that the proposed scheme dramatically shortens the transcoding time.
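Illustrative note: the sequential arrangement described in the abstract, in which a classifier is consulted before each mode is checked, might be sketched as follows. This is a minimal sketch and not the thesis implementation. The function name `check_modes_sequentially`, the feature names `distinct_colors` and `repeat_score`, the threshold rules standing in for trained decision trees, the fixed rate-distortion costs, and the fallback of evaluating every mode when all classifiers decline are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Optional, Tuple

Features = Dict[str, float]  # hypothetical per-CU feature vector
# (mode name, classifier deciding whether to check it, RD-cost evaluator)
Mode = Tuple[str, Callable[[Features], bool], Callable[[], float]]


def check_modes_sequentially(
    features: Features, modes: List[Mode]
) -> Tuple[Optional[str], List[str]]:
    """Gate each mode with its own classifier, then keep the cheapest checked mode."""
    # Consult one classifier per mode, in order, before any RD cost is computed.
    selected = [m for m in modes if m[1](features)]
    # Assumed fallback: if every classifier declines, evaluate all modes.
    if not selected:
        selected = modes

    checked: List[str] = []
    best_mode: Optional[str] = None
    best_cost = float("inf")
    for name, _, rd_cost in selected:
        checked.append(name)
        cost = rd_cost()  # the expensive step that skipped modes avoid
        if cost < best_cost:
            best_mode, best_cost = name, cost
    return best_mode, checked


# Toy classifiers (simple thresholds standing in for trained decision trees)
# and fixed RD costs; every value here is made up for illustration.
cu = {"distinct_colors": 4.0, "repeat_score": 0.1}
modes: List[Mode] = [
    ("Intra", lambda f: f["distinct_colors"] > 8, lambda: 120.0),
    ("IBC", lambda f: f["distinct_colors"] <= 8 and f["repeat_score"] > 0.5, lambda: 90.0),
    ("PLT", lambda f: f["distinct_colors"] <= 8, lambda: 70.0),
]
best, checked = check_modes_sequentially(cu, modes)
print(best, checked)
```

In this toy run only the PLT classifier accepts the block, so the Intra and IBC costs are never computed. That is the kind of saving the abstract attributes to checking either IBC or PLT rather than both.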
Description: xvi, 145 pages : color illustrations
PolyU Library Call No.: [THS] LG51 .H577P EIE 2019 Kuang
URI: http://hdl.handle.net/10397/81842
Rights: All rights reserved.
Appears in Collections: Thesis

Files in This Item:
File: 991022289514203411_link.htm; Description: For PolyU Users; Size: 168 B; Format: HTML
File: 991022289514203411.pdf; Description: For All Users; Size: 3.25 MB; Format: Adobe PDF