αCyber: Enhancing Robustness of Android Malware Detection

Systems built on heterogeneous graph (HG) models have been successfully deployed in the anti-malware industry [8, 14, 25]. However, successful deployment may also motivate attackers to defeat HG-based models in order to bypass detection. Viewed rationally, and as the considerable profits suggest, malicious attackers operate within a complex and decentralized ecosystem, which makes them powerful: by exploiting vulnerabilities and/or using social engineering (e.g., inducing installation), an attacker can make a compromised device download applications from a command and control (C&C) server and execute them on demand. This mechanism makes both poisoning attacks and evasion attacks on HG-based models realistic: under the cover of injected applications that disrupt the relational (non-i.i.d.) nature of the HG data (i.e., a poisoning attack), the target application (i.e., new malware) can be better "protected" and bypass detection (i.e., an evasion attack).

In this paper, to explore the robustness of HG-based classifiers in Android malware detection, we build on previous work [14, 25] and first construct a general HG-based classification model. We extract the runtime sequences of API calls from Android applications to capture their behavior. We then analyze higher-level semantic relations: whether two applications behave similarly, whether they coexist on the same smartphone, identified by its unique International Mobile Equipment Identity (IMEI) number, and whether they are signed by the same developer or produced by the same company (i.e., affiliation). After that, we propose a structured HG to model these complex relations and use a metapath-based embedding method to learn node representations (i.e., representations of apps), which are fed to a downstream classifier. Based on the constructed HG classifier, we first consider the current capabilities and knowledge of Android malware attackers and present a novel and practical adversarial attack against the HG data model (HG-Attack). Then, to counter adversarial attacks on the HG effectively, we further propose a resilient yet elegant defense model (named Rad-HGC) that enhances the robustness of HG-based classifiers in Android malware detection. Large-scale experiments on real sample collections from Tencent Security Lab demonstrate the effectiveness of the developed system αCyber (Figure 2), which integrates our proposed defense model Rad-HGC, against the practical adversarial malware attacks performed by HG-Attack. The main contributions of this work are as follows:

A novel and practical adversarial attack on HG data: how to optimally inject poisoning nodes so that the classifier misclassifies the target node.

Defense model: how to find the poisoning nodes (those injected so that the target node, i.e., new malware, evades detection) and, without compromising detection accuracy, enhance the robustness of the HG-based model against the attack.

A practical system that makes HG-based Android malware detection robust against adversarial attacks: we obtained two large-scale real sample collections from Tencent Security Lab. (1) The first dataset is accumulated historical data, containing 1,389,408 apps uploaded by 70,184 users (i.e., IMEIs) and the resulting HG (denoted HG-1, comprising five entity types and six relation types, with 1,389,408 nodes and 20,576,125 edges). (2) The second dataset is generated on top of HG-1 and further incorporates 13,129 new apps uploaded by 2,817 mobile users (i.e., IMEIs). Based on these data collections, we developed a system named αCyber, which integrates our proposed defense model Rad-HGC to withstand the practical adversarial malware attacks performed by HG-Attack.

feature:

Features are extracted from both content and relations.

Content-based feature extraction: the runtime sequence of API calls is extracted to capture an app's behavior. For example, the API call sequence (startActivity, checkConnect, sendSMS, finishActivity) indicates that the app, a malicious Trojan (rendered in the source as "tiger eye"), sends SMS messages without regard to the user's intent.

Relation-based feature extraction:

(1) R1: app-invoke-API indicates whether an app invokes a given API call during its runtime execution.

(2) R2: app-exist-IMEI indicates whether an app is present (i.e., installed) on a smartphone (i.e., IMEI).

(3) R3: app-certify-signature indicates that an app is certified by a signature (every application running on the Android platform must be signed by its developer).

(4) R4: the package name (also known as the Google Play ID) is a unique name that identifies a specific application. Companies usually begin their package names with their reversed domain name (for example, com.tencent.mobileqq). We extract the domain name from the package name to relate an application (such as mobileqq) to its affiliation (such as tencent.com), and then generate the app-associate-affiliation relation to describe which affiliation an app is associated with.

(5) R5: to express that a smartphone holds a set of applications signed by a specific developer, we extract IMEI-have-signature, which indicates whether a device has a specific signature.

(6) R6: to express that a smartphone has installed a group of applications associated with a specific affiliation, we generate the IMEI-possess-affiliation relation, which describes whether a smartphone has a particular affiliation.
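The six relations above can be turned into typed edge sets of an HG. The following is a minimal sketch, not the authors' implementation: the record fields (`app`, `apis`, `imeis`, `signature`, `package`), the relation keys, and the helper names `affiliation` and `build_hg` are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical app records; every field name here is an illustrative assumption.
APPS = [
    {"app": "a1", "apis": ["startActivity", "sendSMS"], "imeis": ["d1"],
     "signature": "s1", "package": "com.tencent.mobileqq"},
    {"app": "a2", "apis": ["sendSMS"], "imeis": ["d1", "d2"],
     "signature": "s1", "package": "com.example.game"},
]


def affiliation(package):
    """Map a reverse-domain package name to an affiliation,
    e.g. com.tencent.mobileqq -> tencent.com."""
    parts = package.split(".")
    return ".".join(reversed(parts[:2])) if len(parts) >= 2 else package


def build_hg(apps):
    """Return {relation name: set of (source, target) edges} for R1-R6."""
    edges = defaultdict(set)
    for rec in apps:
        app, sig = rec["app"], rec["signature"]
        aff = affiliation(rec["package"])
        for api in rec["apis"]:
            edges["R1_app_invoke_api"].add((app, api))
        edges["R3_app_certify_signature"].add((app, sig))
        edges["R4_app_associate_affiliation"].add((app, aff))
        for imei in rec["imeis"]:
            edges["R2_app_exist_imei"].add((app, imei))
            # R5 and R6 are derived: a device inherits the signature and
            # affiliation of every app installed on it.
            edges["R5_imei_have_signature"].add((imei, sig))
            edges["R6_imei_possess_affiliation"].add((imei, aff))
    return edges


hg = build_hg(APPS)
```

Storing one edge set per relation type keeps the entity types (app, API, IMEI, signature, affiliation) separable, which is what a metapath-based method needs later on.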

 

attack:

The metapath-based embedding first generates a set of metapath-guided random walks that serve as the training corpus for a skip-gram model, so the learned embeddings depend largely on the generated corpus. By injecting poisoning nodes around the target node in a given G, the attacker inevitably changes the set of possible random walks, and with it the training corpus and the subsequent embeddings. Figure 5 shows that the more malicious neighbors an app has, the higher its probability of being classified as malicious. Based on this observation, to help the target node vt bypass detection, an attacker can design a strategy that subtly disrupts the relational nature of the data by injecting poisoning nodes (i.e., injected nodes that link the target node to benign apps). The attack problem is therefore: given G, optimally inject a set of poisoning nodes VP into G so that the predicted probability of vt being malicious is minimized. This converts the adversarial attack into maximizing the likelihood that the target node vt is connected to benign apps, which helps vt bypass detection. To solve this problem, we first assume that once a poisoning node vp is successfully injected into G together with vt, it should be able to connect vt with benign apps in G (i.e., vt should have high connectivity to benign apps after the injection).
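To make the dependence on the corpus concrete, here is a minimal sketch of metapath-guided random walks over a toy graph. It assumes a single cyclic metapath of the form app-IMEI-app; the toy data and the names `metapath_walk` and `build_corpus` are hypothetical and do not come from the paper.

```python
import random

# Toy undirected HG with apps a1, a2 and devices d1, d2 (hypothetical data).
NODE_TYPE = {"a1": "app", "a2": "app", "d1": "imei", "d2": "imei"}
NEIGHBORS = {"a1": ["d1"], "a2": ["d1", "d2"], "d1": ["a1", "a2"], "d2": ["a2"]}


def metapath_walk(neighbors, node_type, start, metapath, walk_len, rng):
    """Walk from `start`, moving only to neighbors whose type is the next
    one in the cyclic `metapath` (e.g. ["app", "imei"] for app-IMEI-app)."""
    walk = [start]
    for step in range(1, walk_len):
        wanted = metapath[step % len(metapath)]
        candidates = sorted(n for n in neighbors[walk[-1]] if node_type[n] == wanted)
        if not candidates:  # dead end: no neighbor of the required type
            break
        walk.append(rng.choice(candidates))
    return walk


def build_corpus(neighbors, node_type, metapath, walks_per_node, walk_len, seed=0):
    """Generate walks from every node whose type matches the metapath's first type."""
    rng = random.Random(seed)
    starts = sorted(n for n in neighbors if node_type[n] == metapath[0])
    return [metapath_walk(neighbors, node_type, s, metapath, walk_len, rng)
            for s in starts for _ in range(walks_per_node)]


corpus = build_corpus(NEIGHBORS, NODE_TYPE, ["app", "imei"],
                      walks_per_node=2, walk_len=5)
```

Each walk is a "sentence" of node IDs that a skip-gram model could be trained on. Adding an injected app node to `NEIGHBORS` on a device that also hosts the target changes which walks are possible, which is the lever the attack relies on.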

vp can only be injected into devices that lack protection.

 

Because a signature indicates ownership of an application (i.e., a signature can only be produced by the developer who holds the corresponding private key), we assume that the attacker has access to a device if an app with the same signature as the target node is installed on it.
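The access assumption above can be sketched as a simple filter over devices. The ranking step that follows it is only a naive heuristic added for illustration (prefer devices that host more known-benign apps); it is not the optimization used by HG-Attack, and all names and data below are hypothetical.

```python
def accessible_devices(installs, signature_of, target_signature):
    """Devices hosting at least one app signed with the target's signature."""
    return {imei for imei, apps in installs.items()
            if any(signature_of[a] == target_signature for a in apps)}


def rank_by_benign_neighbors(installs, label_of, devices):
    """Sort candidate devices by the number of known-benign apps they host,
    most first; ties are broken by device ID. Illustrative heuristic only."""
    def benign_count(imei):
        return sum(1 for a in installs[imei] if label_of.get(a) == "benign")
    return sorted(devices, key=lambda d: (-benign_count(d), d))


# Hypothetical data: device -> installed apps, app -> signature, app -> label.
INSTALLS = {"d1": ["a1", "a2"], "d2": ["a3"], "d3": ["a2", "a4", "a5"]}
SIGNATURE_OF = {"a1": "s_x", "a2": "s_att", "a3": "s_y", "a4": "s_z", "a5": "s_z"}
LABEL_OF = {"a1": "benign", "a3": "benign", "a4": "benign", "a5": "benign"}

candidates = accessible_devices(INSTALLS, SIGNATURE_OF, "s_att")
ranked = rank_by_benign_neighbors(INSTALLS, LABEL_OF, candidates)
```

In this toy data, d2 is excluded because no app signed with the attacker's signature is installed on it, even though it hosts a benign app.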

defense:

A poisoning node vp has the following properties: (1) it has high connectivity; (2) because it connects both malicious apps (i.e., vt) and benign apps, when label propagation is run on G, its probability of being considered either malicious or benign may be low. Therefore, we express the probability that a node vi in G is an injected poisoning node as:

 

In this section, we use four sets of experiments on large-scale real sample collections from Tencent Security Lab to evaluate the overall performance of αCyber: (1) we first evaluate the performance of our proposed adversarial attack model HG-Attack; (2) we then evaluate the effectiveness of the proposed defense model Rad-HGC against attacks on the HG; (3) we evaluate the parameter sensitivity, scalability, and stability of Rad-HGC; (4) finally, we compare the performance of Rad-HGC with that of other popular Android malware detection systems.


Origin www.cnblogs.com/yvlian/p/12469263.html