Android Malware Insights 2018

This page serves as an appendix: it provides information that complements the experiments page.

Features

Below are the features we extracted from the APKs and used in our classification experiments, along with some details about the tools and techniques we used to extract them.

Static Features

Static features, as their name implies, were extracted statically from the APKs in the datasets. In other words, the apps were NOT executed in any virtual environment. These features capture information about the app, its components, permissions, and source code. They were extracted using Aion's extractStaticFeatures method in the "data_inference.featureExtraction" module, with the help of the static analysis tool androguard. Here is the complete list of static features extracted from each app:

minSDKVersion: The minimum SDK version supported by the app (as declared in the AndroidManifest.xml file)
maxSDKVersion: The maximum SDK version supported by the app (as declared in the AndroidManifest.xml file)
# activities: The total number of activities declared by the app in the AndroidManifest.xml file
# services: The total number of services declared by the app in the AndroidManifest.xml file
# receivers: The total number of broadcast receivers declared by the app in the AndroidManifest.xml file
# providers: The total number of content providers declared by the app in the AndroidManifest.xml file
# permissions: The total number of permissions requested by the app
AOSP permissions/Total permissions: The share of Android (AOSP) permissions among the permissions requested by the app
Declared permissions/Total permissions: The share of custom (app-declared) permissions among the permissions requested by the app
Dangerous permissions/Total permissions: The share of dangerous permissions among the permissions requested by the app
# classes: The total number of classes found in the classes.dex file
# strings: The total number of strings found in the strings.xml file
# sensitive API calls: The number of calls to each API package in the sensitiveAPKCalls list (one feature per package). Total: 27
compiler: The compiler used to compile the app (as fingerprinted by APKiD). Added separately after extraction.
Total: 40 features
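
To make the manifest-based features above more concrete, here is a minimal sketch of how they could be computed. It is not Aion's extractStaticFeatures implementation. The function name manifest_features, the feature keys, and the DANGEROUS set (a small illustrative subset) are assumptions made for this example. The function accepts any object exposing the getters that androguard's APK class provides (get_activities, get_services, get_receivers, get_providers, get_permissions, get_declared_permissions). It also assumes that a "declared" permission is a requested permission that the app itself declares.

```python
from typing import Dict

# Assumption: a small illustrative subset of dangerous permissions,
# not the list used in the experiments.
DANGEROUS = frozenset({
    "android.permission.READ_SMS",
    "android.permission.SEND_SMS",
    "android.permission.READ_CONTACTS",
    "android.permission.ACCESS_FINE_LOCATION",
    "android.permission.RECORD_AUDIO",
})


def manifest_features(apk) -> Dict[str, float]:
    """Compute component counts and permission ratios for one app.

    `apk` is any object exposing the getters used below, such as an
    androguard APK instance.
    """
    # Deduplicate requested permissions while preserving their order.
    requested = list(dict.fromkeys(apk.get_permissions()))
    declared = set(apk.get_declared_permissions())
    total = len(requested)

    def ratio(count: int) -> float:
        # Guard against apps that request no permissions at all.
        return count / total if total else 0.0

    return {
        "num_activities": len(apk.get_activities()),
        "num_services": len(apk.get_services()),
        "num_receivers": len(apk.get_receivers()),
        "num_providers": len(apk.get_providers()),
        "num_permissions": total,
        "aosp_ratio": ratio(sum(p.startswith("android.permission.") for p in requested)),
        "declared_ratio": ratio(sum(p in declared for p in requested)),
        "dangerous_ratio": ratio(sum(p in DANGEROUS for p in requested)),
    }
```

With androguard installed, the object returned by its APK loader could be passed in directly. The remaining static features (class count, string count, sensitive API call counts, and compiler) need the DEX analysis and APKiD and are not covered by this sketch.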

Dynamic Features

Unlike their static counterparts, dynamic features are meant to represent the runtime behavior of apps. To extract such features, we deployed each app (malicious and benign) to a Genymotion Android Virtual Device (AVD) and started it. To simulate human interaction with the app, we used Droidutan, a tool we wrote ourselves. Droidutan is based on AndroidViewClient and randomly interacts with the UI elements of the app. For example, if it finds a button, it taps it.
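
The random interaction strategy described above can be illustrated with a small, device-free sketch. This is not Droidutan's code and does not use AndroidViewClient. The UiElement class, the ACTIONS table, and the choose_action function are hypothetical names introduced for illustration. Only the button-to-tap rule comes from the description above; the other mappings are assumptions.

```python
import random
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class UiElement:
    """Hypothetical stand-in for a view found on the current screen."""
    kind: str   # e.g. "button", "edittext"
    label: str


# Assumption: which action suits which element kind. Only "button" -> "tap"
# is stated in the text; the rest is illustrative.
ACTIONS = {"button": "tap", "edittext": "type", "checkbox": "tap"}


def choose_action(elements: List[UiElement],
                  rng: random.Random) -> Optional[Tuple[str, UiElement]]:
    """Pick a random element and the action to perform on it."""
    if not elements:
        return None  # nothing to interact with on this screen
    target = rng.choice(elements)
    # Unknown element kinds fall back to a plain tap.
    return ACTIONS.get(target.kind, "tap"), target


if __name__ == "__main__":
    screen = [UiElement("button", "OK"), UiElement("edittext", "Name")]
    rng = random.Random(2018)  # seeded so that a run can be repeated
    for _ in range(3):
        action, element = choose_action(screen, rng)
        print(action, element.label)
```

Passing a seeded random generator keeps a test run reproducible, which is convenient when the same app has to be exercised more than once.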

We define an app's runtime behavior in terms of the API calls it issues at runtime. To capture the API calls an app makes while being exercised by Droidutan, we relied on droidmon, which dumps the sensitive API calls made by an app to the system log. After execution, we gather the dumped calls and represent them as a trace (i.e., a sequence) of API calls. Dynamic features are, in essence, counts of every category of API call captured by droidmon and listed in its hooks.json file. The total number of dynamic features is therefore 37.
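
As a rough illustration of how a trace turns into a fixed-length vector of category counts, consider the sketch below. It does not parse droidmon's log output or its hooks.json file. The trace representation (a list of class/method pairs), the call-to-category mapping, and the category names are all assumptions made for this example.

```python
from collections import Counter
from typing import Dict, List, Sequence, Tuple

ApiCall = Tuple[str, str]  # (class name, method name)


def count_categories(trace: Sequence[ApiCall],
                     call_to_category: Dict[ApiCall, str],
                     categories: Sequence[str]) -> List[int]:
    """Count how often each category of API call occurs in a trace.

    The result follows the order of `categories`, so every app yields a
    vector of the same length, with zeros for categories it never used.
    """
    counts: Counter = Counter()
    for call in trace:
        category = call_to_category.get(call)
        if category is not None:  # skip calls that are not being tracked
            counts[category] += 1
    return [counts[c] for c in categories]


if __name__ == "__main__":
    # Hypothetical mapping and categories, for illustration only.
    mapping = {
        ("android.telephony.TelephonyManager", "getDeviceId"): "identity",
        ("android.telephony.SmsManager", "sendTextMessage"): "sms",
    }
    trace = [
        ("android.telephony.TelephonyManager", "getDeviceId"),
        ("android.telephony.SmsManager", "sendTextMessage"),
        ("android.telephony.TelephonyManager", "getDeviceId"),
    ]
    print(count_categories(trace, mapping, ["identity", "sms", "crypto"]))
```

In the setup described above, the list of categories would have 37 entries, one per dynamic feature.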

Citations

Zhou+2012 @inproceedings{jiang2012dissecting, title={Dissecting android malware: Characterization and evolution}, author={Jiang, Xuxian and Zhou, Yajin}, booktitle={2012 IEEE Symposium on Security and Privacy}, pages={95--109}, year={2012}, organization={IEEE}}
Hurier+2017 @inproceedings{hurier2017euphony, title={Euphony: Harmonious unification of cacophonous anti-virus vendor labels for Android malware}, author={Hurier, M{\'e}d{\'e}ric and Suarez-Tangil, Guillermo and Dash, Santanu Kumar and Bissyand{\'e}, Tegawend{\'e} F and Traon, Yves Le and Klein, Jacques and Cavallaro, Lorenzo}, booktitle={Proceedings of the 14th International Conference on Mining Software Repositories}, pages={425--435}, year={2017}, organization={IEEE Press}}
Li+2017 @article{li2017understanding, title={Understanding android app piggybacking: A systematic study of malicious code grafting}, author={Li, Li and Li, Daoyuan and Bissyand{\'e}, Tegawend{\'e} F and Klein, Jacques and Le Traon, Yves and Lo, David and Cavallaro, Lorenzo}, journal={IEEE Transactions on Information Forensics and Security}, volume={12}, number={6}, pages={1269--1284}, year={2017}, publisher={IEEE}}
Wei+2017 @inproceedings{wei2017deep, title={Deep ground truth analysis of current android malware}, author={Wei, Fengguo and Li, Yuping and Roy, Sankardas and Ou, Xinming and Zhou, Wu}, booktitle={International Conference on Detection of Intrusions and Malware, and Vulnerability Assessment}, pages={252--276}, year={2017}, organization={Springer}}
