Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/91729
Title: Detection, analysis and vulnerabilities tracking of Android third-party libraries
Authors: Zhan, Xian
Degree: Ph.D.
Issue Date: 2021
Abstract: Third-party libraries (TPLs) have become a significant part of the Android ecosystem. Developers can employ various TPLs with different functionalities to facilitate their app development. Unfortunately, the popularity of TPLs also brings new challenges and even threats. TPLs may carry malicious or vulnerable code, which can infect popular apps and pose threats to mobile users. Besides, the code of third-party libraries can introduce noise into some downstream tasks (e.g., malware and repackaged app detection). Thus, researchers have developed various tools to identify TPLs. However, no existing work has studied these TPL detection tools in detail: different tools target different applications and perform differently, but little is known about how they compare. To better understand existing TPL detection tools and dissect TPL detection techniques, we conduct a comprehensive empirical study that evaluates and compares all publicly available TPL detection tools based on six criteria: accuracy of module decoupling, effectiveness, efficiency, version identification capability, resilience to code obfuscation, and ease of use. We reveal their advantages and disadvantages through this systematic and thorough empirical study. We also conduct a user study to evaluate the usability of each tool. The results show that LibScout outperforms the others in effectiveness, LibRadar takes less time than the others in TPL detection and is also regarded as the easiest to use, and ORLIS performs best against code obfuscation techniques. We further summarize the lessons learned from different perspectives, including users, tool implementation, and researchers. In addition, we enhance these open-source tools by fixing their limitations to improve their detection ability. Finally, we build an extensible framework that integrates all available TPL detection tools, providing an online service for the research community.
We make the evaluation dataset and the enhanced tools publicly available. We believe our work provides a clear picture of existing TPL detection techniques and also gives a roadmap for future research.
Based on the previous study, we summarize the advantages and disadvantages of these state-of-the-art tools. We find that existing tools usually cannot identify the exact TPL versions and that their recall is usually very low. Hence, we propose a new tool, ATVHUNTER, that can pinpoint the precise vulnerable in-app TPL versions and provide detailed information about the vulnerabilities and TPLs. We propose a two-phase detection approach to identify specific TPL versions. Specifically, we extract Control Flow Graphs (CFGs) as the coarse-grained feature to match potential TPLs in the TPL database, and then extract the opcodes in each basic block of the CFG as the fine-grained feature to identify the exact TPL versions. We build a comprehensive TPL database (189,545 unique TPLs with 3,006,676 versions) as the reference database, which has meaningful industrial value. Experimental results show that ATVHUNTER outperforms state-of-the-art TPL detection tools, achieving 90.55% precision and 88.79% recall within 10.27 s per app, and that it is also resilient to widely used obfuscation techniques and scalable to large-scale TPL detection.
TPLs are a double-edged sword: more and more severe vulnerabilities are reported in TPLs, which may cause privacy leakage or financial loss for users. Besides, various vulnerable TPLs are scattered across different apps, and the composition of an app is not transparent. This undoubtedly increases the security risks to app users. However, no existing research reveals the entire threat landscape of vulnerabilities in TPLs. Many research questions remain unsolved, and only a few previous studies present a handful of case studies. The community still lacks a comprehensive understanding of these vulnerable TPLs. To fill this gap, we leverage ATVHUNTER to conduct a large-scale analysis of open-source and commercial apps and reveal the whole threat landscape of these vulnerable TPLs.
To achieve this goal, we first collect 1,196 known vulnerabilities from authoritative vulnerability databases and identify their corresponding carriers: 983 different Android TPLs with 38,533 affected versions, from about 300 million Android TPLs. Next, we systematically study the dataset and present the full threat landscape to readers. We then use ATVHUNTER to conduct a large-scale analysis of these apps. We identify 30,293 apps with vulnerable TPL versions, which account for about 40.2% of all apps with TPLs in our dataset. Among the open-source apps, vulnerable apps account for about 20.7% of the apps with TPLs. We also measure these vulnerable apps from different perspectives and release the relevant data to facilitate future research. Finally, we investigate different parties' responses to the threats of vulnerable TPLs and formulate implications for developers and researchers. With these new insights and observations, this study sheds light on the vulnerabilities of TPLs and the security issues in the Android app ecosystem, which can benefit developers and researchers.
In sum, this thesis studies TPL-related work from three aspects. We first compare state-of-the-art TPL detection tools. Based on that study, we design a new TPL detection tool that achieves better performance than existing tools. Using our new tool, we explore the security problems of Android TPLs.
Subjects: Application software -- Development
Android (Electronic resource)
Open source software
Mobile computing
Hong Kong Polytechnic University -- Dissertations
Pages: xx, 154 pages : color illustrations
Appears in Collections: Thesis

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.