Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/91718
Title: Identifying privacy issues in mobile apps via synthesizing static analysis and NLP
Authors: Yu, Le
Degree: Ph.D.
Issue Date: 2021
Abstract: Recent years have witnessed a sharp increase in malicious apps that access or steal users' personal information. To address users' concerns about privacy risks, researchers have proposed a promising detection approach that looks for inconsistencies between an app's permissions and its description. Unfortunately, relying on descriptions and permissions alone leads to many false positives because descriptions often fail to declare all sensitive operations. Moreover, permissions are coarse-grained and cannot describe which type of personal information is accessed by the app itself or by a third-party library. In this thesis, we combine static analysis, which discovers the behaviors contained in bytecode, with natural language processing (NLP), which processes software artifacts (i.e., descriptions, privacy policies, and user reviews), so that we can identify privacy issues precisely. We propose to detect the privacy issues of mobile apps with the following steps: (1) We exploit the app's privacy policy and its bytecode to remove the false alerts raised when identifying inconsistencies between an app's permissions and its description. If users report bugs in user reviews, we locate these bugs in the app bytecode to help developers fix them. (2) If the app developers provide a privacy policy to notify users which types of personal information the app accesses, we propose a novel approach that automatically identifies five kinds of problems in the privacy policy, in order to determine whether the policy is trustworthy. (3) For apps that do not provide privacy policies, we develop a novel system that automatically constructs correct and readable descriptions to facilitate the generation of privacy policies for Android apps. (4) We propose a system to determine whether an app complies with privacy requirements.
For (1), to remove the false alerts of state-of-the-art systems that identify privacy issues of apps (e.g., AutoCog, CHABADA), we develop a system, TAPVerifier, which automatically analyzes the bytecode to discover over-claimed permissions and extracts the behaviors of accessing personal information from the privacy policy. The results show that our system can remove up to 59.4 percent of the false alerts of the state-of-the-art systems. If users describe function errors in user reviews, we propose a system, ReviewSolver, to locate these errors in app bytecode by exploiting the context information in user reviews and then correlating the reviews and bytecode through their semantic meanings. The results show that ReviewSolver outperforms ChangeAdvisor by correctly mapping more reviews to code. For (2), we develop a system, PPChecker, that employs NLP techniques to analyze privacy policies and adopts program analysis approaches to inspect apps, in order to identify five kinds of problems in privacy policies. Applying PPChecker to 2,500 popular apps, we find that 1,850 apps (i.e., 74.0%) have at least one kind of problem. For (3), to generate privacy policies for apps, we further propose a system, AutoPPG, that conducts static code analysis to characterize an app's behaviors related to users' personal information and then applies NLP techniques to generate correct and accessible sentences describing these behaviors. Experimental results indicate that AutoPPG creates correct and easy-to-understand descriptions for privacy policies, and that the privacy policies constructed by AutoPPG usually reveal more operations related to users' personal information than existing privacy policies. For (4), we first summarize existing privacy requirements from both governments and app markets. Then, we propose a system, PrivacyPromoter, to check whether the app bytecode and privacy policy comply with these privacy requirements. The experimental results show that our system can detect violations with high precision and recall.
Subjects: Privacy, Right of
Smartphones
Application software
Mobile computing -- Security measures
Hong Kong Polytechnic University -- Dissertations
Pages: xxvi, 251 pages : color illustrations
Appears in Collections: Thesis

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.