Abstract
Repackaged Android apps are the major source of Android malware, which not only compromise the pecuniary profit of original authors, but also pose threat to security and privacy of mobile users. Although a large number of birthmark based approaches have been proposed for Android repackaging detection, the majority of them heavily rely on the code instruction details, thus suffering from the following two limitations: (1) subject to code/resource obfuscation technologies; (2) fail to large scale repackaging detection. In this paper, we propose a novel behavior based approach for Android repackaging detection to meet scalability and obfuscation-resilience at the same time. As the repackaged app always keeps the basic functionalities of the original one for leveraging its popularity, they usually have similar behaviors. This observation inspires us to design the new behavior based birthmark for Android repackaging detection, namely, API dependency graph. To further improve the detection performance, we also introduce a system dependency summary graph based ADG extraction approach for high efficiency birthmark construction. We implement a prototype system named ACFinder and evaluate our system using 13,917 apps of 22 categories collected from APK-DL. Experiments show that ACFinder can extract behavior birthmark efficiently (average 52.9s per app), and that our behavior birthmark is resilient to complex code obfuscation technologies (average app similarity all are 1.0 for 11 code obfuscation algorithms) and capable to large scale detection (average 0.37s per app pair).