CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a continuation of U.S. patent application Ser. No. 16/456,637, filed Jun. 28, 2019, by Matthew E. Kelly et al., and entitled “DETECTING MALICIOUS THREATS VIA AUTOSTART EXECUTION POINT ANALYSIS,” which is incorporated herein by reference.
TECHNICAL FIELD

The present disclosure relates generally to malware detection. More particularly, in certain embodiments, the present disclosure is related to systems and methods for detecting malicious threats via Autostart Execution Point analysis.
BACKGROUND

The operating systems of computing devices generally include multiple startup commands, which may, for example, be associated with Autostart Execution Point (ASEP) entries, or other commands that are initiated automatically upon startup of a computing device. Startup commands generally cause applications and/or services to be automatically executed upon startup of the operating system of the computing device. Startup commands stored on the device (e.g., in a registry associated with the operating system or any other appropriate file location) determine which applications and/or services are automatically executed, and how they are executed, when the operating system is started (e.g., booted up or rebooted). In many cases, the applications that are automatically started by these startup commands have a helpful or benign effect on system performance and usability. However, certain startup commands can be put in place by malware or malicious attacks on a device, and these startup commands can cause malware to be automatically executed or reloaded upon startup of the device. Such malicious startup commands can result in malware that is persistent and that is reinstalled on the device upon each startup, even after the associated malware has seemingly been removed using a malware detection and removal tool such as antivirus software.
SUMMARY

Certain malware can be detected and removed using existing malware detection and removal tools when the malware is being executed by the infected device or when files known to be associated with the malware are detected on the device. However, in some cases, one or more startup commands related to the malware can persist in the memory of the device (e.g., in a registry entry or any other file location), and malware that was believed to have been removed may be re-installed and re-executed upon startup of the device. Such malicious startup commands result in persistent malware that may be automatically reinstalled even after attempts to remove the malware from the device. Conventional malware detection and removal tools might detect a limited number of malicious startup commands that are already known to be associated with malware (e.g., when a startup command includes a string of characters that matches a string of characters known to be associated with malware). However, malicious startup commands are increasingly designed to avoid such detection by mimicking legitimate startup commands that are associated with trusted startup processes. These malicious startup commands cannot be identified using conventional tools.
In one embodiment, the system described in the present disclosure collects startup commands associated with network-attached computing devices. Each startup command is generally a command that is automatically executed by a device on which the startup command is stored upon startup of the device, and each startup command is associated with a device identifier for the device on which the command is stored. The system determines, for each startup command, a corresponding command tag for the startup command using a verb list. The system determines, using the device identifier associated with each startup command and the command tag determined for each startup command, a proportion of the plurality of devices that are associated with each command tag. The system determines, based on the determined proportion of the plurality of devices that are associated with each command tag, a suspicious command tag. The suspicious command tag is generally associated with a relatively small proportion of the devices (e.g., the suspicious command tag may be associated with less than a threshold proportion of the plurality of devices). The system stores a report that includes the suspicious command tag, one or more suspicious startup commands associated with the suspicious command tag, and the device identifier associated with each suspicious startup command.
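The tag-collection and proportion-based flagging described above can be sketched as follows. This is a minimal illustration rather than the claimed implementation; the data shapes (pairs of device identifier and command tag) and the 1% default threshold are assumptions chosen for the example.

```python
from collections import defaultdict

def find_suspicious_tags(tagged_commands, total_devices, threshold=0.01):
    """tagged_commands: iterable of (device_id, command_tag) pairs.

    Returns a mapping of command tags observed in less than `threshold`
    of the devices to the proportion of devices in which each appears.
    """
    devices_per_tag = defaultdict(set)
    for device_id, command_tag in tagged_commands:
        # A set per tag, so a device counts once even if the same tag
        # is produced by several of its startup commands.
        devices_per_tag[command_tag].add(device_id)
    return {
        tag: len(devices) / total_devices
        for tag, devices in devices_per_tag.items()
        if len(devices) / total_devices < threshold
    }

# 200 monitored devices: "powershell" is widespread, while
# "rundll_javascript" appears on a single device and is flagged.
observations = [(f"dev{i}", "powershell") for i in range(190)]
observations.append(("dev3", "rundll_javascript"))
print(find_suspicious_tags(observations, total_devices=200))
# -> {'rundll_javascript': 0.005}
```

A report would then pair each flagged tag with its startup commands and device identifiers, as described above.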
The systems and methods described in the present disclosure provide technical solutions to the technical problems and challenges described above by first transforming startup commands associated with a plurality of network-attached computing devices into command tags, which have a standard format that is more amenable to further analysis. A model employing statistical analysis may be used to identify suspicious command tags based on the frequency with which each of the command tags is observed in the plurality of devices. Generally, command tags that are observed in a smaller proportion of the devices may be more likely to be associated with malware and are good candidates for further review.
The systems and methods described in the present disclosure also improve the underlying operation of computer systems used to detect malware. For example, the systems described in the present disclosure may detect persistent malware more efficiently and effectively while expending fewer processing resources than previous systems. By transforming the startup commands into command tags, the systems and methods can more effectively and efficiently identify suspicious startup commands. This is because, for example, relationships between related or similar startup commands may not be identifiable from the commands themselves, which may include intentionally misleading strings of text or commands that may appear safe based on inspection of the startup command alone. However, the command tags, which are based on the startup commands and generally reflect the underlying function of different key portions of the commands, are more amenable to the analysis (e.g., statistical frequency analysis) described herein for identifying suspicious command tags and the corresponding suspicious startup commands.
The systems and methods described in the present disclosure may be integrated into practical applications for monitoring and detecting malware in devices operated within the network of an entity such as a company, institution, or government agency. For example, the systems described in the present disclosure may facilitate detection of malware in company devices to prevent attacks that may compromise sensitive customer or client information.
Certain embodiments of the present disclosure may include some, all, or none of these advantages. These advantages and other features will be more clearly understood from the following detailed description taken in conjunction with the accompanying drawings and claims.
BRIEF DESCRIPTION OF THE DRAWINGS

For a more complete understanding of this disclosure, reference is now made to the following brief description, taken in connection with the accompanying drawings and detailed description, wherein like reference numerals represent like parts.
FIG. 1 is a schematic diagram of an example system for detecting suspicious startup commands;
FIG. 2 is a schematic diagram illustrating the determination of suspicious startup commands using the system of FIG. 1;
FIG. 3 is a flowchart of a method for operating the system of FIG. 1 in order to detect suspicious startup commands; and
FIG. 4 is an example of a device configured to implement the system of FIG. 1 in order to detect suspicious startup commands.
DETAILED DESCRIPTION

As described above, the systems and methods described in the present disclosure provide technical solutions to the technical problems discussed above by first transforming startup commands associated with a plurality of network-attached computing devices into more readily analyzable command tags and then using statistical analysis to identify suspicious command tags that have an increased probability of being associated with malicious startup commands that are associated with malware. The systems and methods described in the present disclosure are more efficient and effective than conventional methods of malware detection and can be used to identify startup commands that are associated with malware that was unknown to a user.
The present disclosure encompasses the recognition of a need to identify not only startup commands that are known to be malicious but also startup commands that display suspicious properties, which suggest that the commands are more likely to be associated with malware. As described above, conventional tools generally cannot account for the wide variety of techniques used to camouflage or mask malicious startup commands in order to make them appear to be legitimate startup commands. The present disclosure also encompasses the recognition that malicious startup commands, even when camouflaged to mimic legitimate startup commands, include artifacts that can be used to determine whether the commands are suspicious (e.g., suspected of being associated with malware). As described in this disclosure, these artifacts can be identified using statistical analysis.
FIG. 1 shows an example system 100 for detecting suspicious startup commands in a plurality of computing devices 102. The system 100 is generally configured to collect startup commands 104 and associated device identifiers 106 from a plurality of network-attached computing devices 102 and determine a subset of these startup commands 104 that correspond to suspicious startup commands 108 (i.e., with an increased probability of being malicious or associated with malware). In contrast to conventional malware detection systems, the system 100 is configured to perform functions that facilitate improved detection of previously undefined malicious startup commands, for example, before any symptoms of the associated malware are necessarily detected. These malicious startup commands may otherwise go undetected using conventional tools.
The example system 100 comprises devices 102, a collection server 110, a threat detection device 112, a malware analysis tool 114, and a downstream administration component 116. The system 100 may be configured as shown or in any other suitable configuration.
Devices 102 are generally any computing devices capable of storing and executing startup commands 104. For example, devices 102 may be operated on a network that is administrated via the administration component 116. The devices 102 are configured to allow collection of startup commands 104 for each device 102 (e.g., via extraction by the collection server 110 or by sending to the collection server 110). The devices 102 may also be configured to allow the malware analysis tool 114 to access files and other information stored on the devices 102. In the illustrative example of FIG. 1, device 102a stores a suspicious startup command 108 that is associated with malware.
The collection server 110 is generally a device that is configured to collect startup commands 104 associated with the plurality of network-attached computing devices 102 and device identifiers 106 that link each of the collected startup commands 104 to the corresponding device 102 from which they were collected. Each startup command 104 is a command that is automatically executed upon startup of the corresponding device 102 on which the command is stored. As described in greater detail below, each of the startup commands 104 generally includes one or more command strings, which may be used by the system 100 to generate a corresponding command tag 122 for each startup command 104. The collection server 110 may, for example, be configured to increase the efficiency and accuracy of collecting, extracting, or otherwise receiving startup commands 104 from devices 102 (e.g., for efficient interfacing with devices 102 and for storage and organization of the collected startup commands 104). The collection server 110 may be implemented using the hardware, memory, and interfaces of device 400 described with respect to FIG. 4 below.
The threat detection device 112 is generally a device that is configured to receive the startup commands 104 from the collection server 110 and to use these startup commands 104 to determine suspicious startup commands 108. The threat detection device 112 includes a memory to store a malicious verb list 118, a statistical threat analysis model 120, received startup commands 104, command tags 122 generated from the received startup commands 104, suspicious command tags 124 identified amongst the command tags 122 using the statistical model 120, and suspicious startup commands 108 associated with the suspicious command tags 124. The threat detection device 112 may also store one or more reports 128 generated by the threat detection device 112 and a database of known malicious command tags 126. In contrast to the collection server 110, the threat detection device 112 may be configured to facilitate efficient transformation of the startup commands 104 into command tags 122 and to facilitate statistical analyses associated with the statistical model 120 used to identify suspicious startup commands 108. The threat detection device 112 may be implemented using the hardware, memory, and interfaces of device 400 described with respect to FIG. 4 below.
For each of the startup commands 104, the threat detection device 112 determines a command tag 122 using the malicious verb list 118. The verb list 118 includes a predefined tag for each of a set of known command strings. A command string generally represents a portion of a known command. For example, a command string may be associated with an identifier of an executable application (e.g., powershell.exe or rundll.exe) or a storage location of files used by an application (e.g., “temp” for a temporary file storage location). As described in greater detail below, each command tag 122 generally includes one or more tags that correspond to predefined command strings that appear in the corresponding startup command 104.
The threat detection device 112 uses the statistical model 120 to determine, using the command tags 122, a subset of the startup commands 104 corresponding to suspicious startup commands 108. In some embodiments, the suspicious command tags 124 are determined based at least in part on the proportion of the devices 102 in which the same command tag 122 is identified. For example, if the same command tag 122 is identified in a small proportion of the devices 102 (e.g., in less than a threshold percentage of monitored devices 102), then the command tag 122 may be a suspicious command tag 124, and any startup commands 104 associated with the suspicious command tag 124 are suspicious startup commands 108. Each suspicious command tag 124 may be associated with more than one suspicious startup command 108 because different startup commands 104 can be associated with the same command tag 122. In some embodiments, suspicious command tags 124 are identified using the record of known malicious command tags 126 (e.g., by matching a text string of a command tag to a string of text in an entry found in the record of known malicious command tags 126). One or more reports 128 may be generated by the threat detection device 112. The report(s) 128 may include any information generated by and/or stored in the threat detection device 112. For example, the report(s) 128 may include a list of suspicious command tags 124 identified by the threat detection device 112 along with the associated suspicious startup commands 108. The report(s) 128 generally facilitate further analysis of suspicious startup commands 108.
The malware analysis tool 114 is generally a device that is configured to receive suspicious command tags 124 and/or suspicious startup commands 108 from the threat detection device 112 and determine whether suspect code 130 corresponding to the suspicious startup commands 108 is associated with malware. Suspect code 130 generally includes the suspicious startup command 108 along with any underlying code, data, and/or arguments used by the corresponding device 102 to execute the potentially malware-related processes associated with the suspicious startup command 108. As shown in the illustrative example of FIG. 1, suspect code 130 is identified on device 102a, and the malware analysis tool 114 uses information provided by the threat detection device 112 to retrieve suspect code 130 from the suspect computing device 102a and determine whether this code 130 is safe (i.e., not associated with malware) or malicious (i.e., associated with malware). The malware analysis tool 114 may be implemented using the hardware, memory, and interfaces of device 400 described with respect to FIG. 4 below.
The malware analysis tool 114 may further generate one or more reports 132, which may be transmitted to one or both of the downstream administration component 116 (e.g., to inform an administrator of infected device 102a) and the threat detection device 112 (e.g., to update information in the malicious verb list 118 and/or the threat analysis model 120). For example, the report 132 may be received by the downstream administration component 116 such that an administrator of the network associated with devices 102 may review the results to determine whether further action should be taken (e.g., to quarantine or disable the malware-infected device 102a). As another example, the report 132 from the malware analysis tool 114 may be received by the threat detection device 112 and used to identify additional terms or phrases to include in the malicious verb list 118. The report 132 may also or alternatively be used by the threat detection device 112 to update statistical information of the model 120 (e.g., to update a frequency at which a given startup command 104 or command tag 122 is observed in devices 102).
The downstream administration component 116 is generally any computing device operated by an administrative entity associated with devices 102. The downstream component 116 is configured to receive one or more alerts (e.g., alerts 134 and/or 136) and/or reports (e.g., reports 128 and/or 132) from each of the threat detection device 112 and the malware analysis tool 114. In some embodiments, the downstream administration component 116 may configure operating parameters of the threat detection device 112, malware analysis tool 114, and/or collection server 110. For example, an administrator may use the administration component 116 to update or otherwise modify the verb list 118, the list of known malicious tags 126, the statistical model 120, and any other operating parameters of the threat detection device 112 to adjust how suspicious command tags 124 are identified and how results of this identification are reported.
In an example operation of the system 100 shown in FIG. 1, startup commands 104 are collected from devices 102 by the collection server 110. The startup commands 104 include startup commands from malware-containing device 102a and malware-free devices 102b, such that at least one startup command 104 from device 102a includes some evidence of the presence of malware on the device 102a. The collection server 110, for example, may be configured to access devices 102 on a regular schedule (e.g., once daily) to collect startup commands 104 for review by the threat detection device 112. The collection server 110 may also perform further functions to organize and/or format the collected startup commands 104 in any appropriate manner for subsequent analysis by the threat detection device 112.
Threat detection device 112 receives the startup commands 104 and uses the verb list 118 and the statistical model 120 to identify suspicious startup commands 108. FIG. 2 shows an example of the determination of a suspicious startup command 224 by the threat detection device 112 based on a startup command 202 received from device 102a. In general, the threat detection device 112 uses malicious verb list 118 and threat analysis model 120 to identify the suspicious startup command 224.
As shown in FIG. 2, the example startup command 202 includes at least a first string 204, a second string 206, and a third string 208. The first string 204 may, for example, correspond to an executable application (e.g., “powershell.exe”) that is used to execute the command 202. The second string 206 corresponds to an action to perform in the executable application associated with the first string 204. The third string 208 corresponds to a file location of a file (e.g., a script-containing file) on which to perform the action associated with string 206 in the application associated with string 204.
The threat detection device 112 uses the verb list 118 to transform the startup command 202 into a command tag 210, which has a standard format that is amenable to analysis using the threat analysis model 120. A portion of an example verb list 118 is shown in TABLE 1. As demonstrated in the example of TABLE 1, the verb list 118 stores a predefined tag (third column) for each command string (first column) and for certain combinations of command strings (first and second columns). For example, certain command strings, such as “*IEX*”, may be associated with a related command string such as “powershell” (fifth row of TABLE 1), such that the string pair of “*IEX*” and “powershell” has a unique tag of “powershell_iex”. Certain command strings, such as “*Temp*”, may not have related command strings or applications, such that the associated tag is based on the command string alone.
TABLE 1
Portion of example verb list

| Command string        | Related string | Tag                                  |
| *Temp*                | none           | temp                                 |
| *\Temp\*              | none           | temp_path                            |
| *-Version*            | powershell     | powershell_version                   |
| *IEX*                 | powershell     | powershell_iex                       |
| *invoke-expression*   | powershell     | powershell_iex_2                     |
| *NoP*                 | powershell     | powershell_noprofile                 |
| *hidden*              | powershell     | powershell_hidden_window             |
| *net.webclient*       | powershell     | powershell_webclient_downloadstring  |
| *downloadfile*        | powershell     | powershell_downloadfile              |
| *downloadstring*      | powershell     | powershell_downloadstring            |
| *-Enc*                | powershell     | powershell_encoded                   |
| *new-process*         | powershell     | powershell_newprocess                |
| *frombase64string*    | powershell     | powershell_base64encoding            |
| *-ExecutionPolicy*    | powershell     | powershell_ep                        |
| *BitsTransfer*        | powershell     | powershell_bitstransfer              |
| *ShOpenVerbShortcut   | rundll         | rundll_ShOpenVerbShortcut            |
| *FileProtocolHandler* | rundll         | rundll_FileProtocolHandler           |
| *javascript           | rundll         | rundll_javascript                    |
| powershell            | none           | powershell                           |
| rundll                | none           | rundll                               |
Referring again to FIG. 2, the threat detection device 112 uses the verb list 118 to generate the command tag 210 with a first tag 212 of “powershell,” a second tag 214 of “temp,” and a third tag 216 of “powershell_iex” from the command 202 of “powershell.exe IEX ‘C:\temp\script.ps1’.” The command tag 210 is generally a simplified version of startup command 202. Portions of the command 202 that are not likely to be associated with malware generally do not have an associated tag (e.g., in the “Tag” column of TABLE 1).
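The transformation of command 202 into command tag 210 can be sketched as follows, using a three-entry subset of TABLE 1. Treating a “*X*” command string as a case-insensitive substring match is an assumption made for this illustration; the disclosure does not fix the exact matching semantics.

```python
# Three-entry subset of the verb list in TABLE 1:
# (command string, related string or None, tag)
VERB_LIST = [
    ("powershell", None, "powershell"),
    ("temp", None, "temp"),
    ("iex", "powershell", "powershell_iex"),
]

def tag_command(startup_command):
    """Transform one startup command 104 into its command tag 122."""
    lowered = startup_command.lower()
    tags = []
    for needle, related, tag in VERB_LIST:
        # A rule fires when its command string appears in the command
        # and, when a related string is defined, that string appears too.
        if needle in lowered and (related is None or related in lowered):
            tags.append(tag)
    return tags

# The FIG. 2 example: command 202 yields tags 212, 214, and 216.
print(tag_command(r"powershell.exe IEX 'C:\temp\script.ps1'"))
# -> ['powershell', 'temp', 'powershell_iex']
```

The resulting list of tags corresponds to the concatenated command tag described in the disclosure; ordering and de-duplication details are left unspecified there and are implementation choices here.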
After being generated, the command tag 210 is processed using the device identifiers 106 and the statistical model 120, which includes one or more of statistical frequency analysis 218, threat intelligence analysis 220, and/or a database 222 of command tags known to be malicious, in order to determine whether the command tag 210 is suspicious and has an increased probability of being associated with malware. The device identifiers 106 are used to determine in what proportion of the devices 102 each command tag 122 is observed. Statistical frequency analysis 218 involves evaluating the frequency at which different command tags 122 occur for the plurality of devices 102 shown in FIG. 1. In general, command tags 122 that occur more frequently (i.e., in greater than a threshold proportion of devices 102) have a lower probability of being associated with malware, while command tags 122 that occur less frequently (i.e., in less than or equal to a threshold proportion of devices 102) have an increased probability of being associated with malware, as described in greater detail below.
Statistical frequency analysis 218 may be used, for example, to determine whether the command tag 210 is very common (e.g., occurring in about 70% to about 100% of devices 102), moderately common (e.g., occurring in about 40% to about 70% of devices 102), uncommon (e.g., occurring in about 10% to about 40% of devices 102), very uncommon (e.g., occurring in about 1% to about 10% of devices 102), or rare (e.g., occurring in less than about 1% of devices 102). The threat detection device 112 may use these rankings of the frequency of command tag 210 in devices 102 to determine whether the associated startup command 202 is trusted or suspicious. For example, a suspicious startup command 224 may have a command tag 210 with a frequency that is in the uncommon range, while a trusted startup command may have a command tag with a frequency in at least the moderately common range.
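The rankings above can be expressed as a simple bucketing function. The exact handling of bucket boundaries is an assumption; the disclosure gives only approximate ranges.

```python
def frequency_rank(proportion):
    """Map a command tag's device proportion to a frequency ranking.

    Buckets follow the approximate ranges given above; endpoint
    assignment (e.g., exactly 0.70) is an implementation assumption.
    """
    if proportion >= 0.70:
        return "very common"
    if proportion >= 0.40:
        return "moderately common"
    if proportion >= 0.10:
        return "uncommon"
    if proportion >= 0.01:
        return "very uncommon"
    return "rare"

print(frequency_rank(0.85))   # very common
print(frequency_rank(0.015))  # very uncommon
```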
Statistical frequency analysis 218 may also be used to determine a proportion (e.g., percentage) of devices 102 in which the command tag 210 is observed, and if the command tag 210 occurs in less than a threshold percentage of devices 102, the command tag 210 is considered a suspicious command tag corresponding to a suspicious startup command 224. Different threshold proportions may be used as appropriate for a given application. For example, if it is desired for the system 100 to be more selective in the identification of suspicious startup commands 108, the threshold proportion may be set to a lower value (e.g., of less than 1%). For instance, in an example case, the threshold proportion value is set to a relatively selective value of 0.5%, and the suspicious startup command 224 is associated with a command tag 210 that is identified in 1.5% of devices 102. Since 1.5% is greater than the selective threshold proportion of 0.5%, the threat detection device 112 does not identify the command tag 210 as a suspicious command tag 124. Alternatively, if it is desired for the system 100 to be more inclusive in the identification of suspicious startup commands 108, a higher threshold proportion may be used. For instance, in an example case, the threshold proportion value may be set to a relatively inclusive value of 2% for the same suspicious startup command 224, which is associated with a command tag 210 that is identified in 1.5% of devices 102. Since 1.5% is less than the more inclusive threshold proportion of 2%, the threat detection device 112 identifies the command tag 210 as a suspicious command tag 124 when the more inclusive threshold value is used.
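The two worked threshold cases above reduce to a single comparison, sketched here for concreteness:

```python
def is_suspicious(observed_proportion, threshold_proportion):
    """A command tag is suspicious when it occurs in less than the
    threshold proportion of devices."""
    return observed_proportion < threshold_proportion

observed = 0.015  # the command tag is identified in 1.5% of devices
print(is_suspicious(observed, 0.005))  # selective 0.5% threshold -> False
print(is_suspicious(observed, 0.02))   # inclusive 2% threshold -> True
```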
Threat intelligence analysis 220 generally involves a comparison of the command tag 210 and/or the corresponding startup command 202 to known malware-related command tags and startup commands, respectively. For example, threat intelligence analysis 220 may include determining whether the command tag 210 matches a tag that is known to be malicious using database 222 of known malicious command tags. For example, the model 120 may compare strings of text in the command tag 210 to strings of text stored in the database 222 of known malicious command tags. Based on a determination of an approximate or exact match, the model determines that the command tag 210 is associated with a suspicious startup command 224. Different matching criteria may be used as appropriate for a given application. For example, an approximate match may correspond to 80% or greater of the strings of text in the command tag 210 matching text of known malicious command tags stored in the database 222. In some embodiments, an exact (i.e., 100%) match or near-exact (e.g., greater than 99%) match between the strings of text in the command tag 210 and the text of known malicious command tags stored in the database 222 is used to determine that the command tag 210 is associated with the suspicious startup command 224. If the command tag 210 is sufficiently uncommon (e.g., with a statistical frequency that has an uncommon or unique ranking and/or that is observed in less than a threshold proportion or number of devices 102), the tag 210 may be flagged for further review by system 100 (e.g., using malware analysis tool 114) or by an administrator associated with downstream component 116. The threat detection device 112 may also determine the corresponding suspicious startup command 224 for the tag 210. Suspicious startup commands 108 of FIG. 1 include suspicious startup command 224 of FIG. 2 along with any other suspicious startup commands 108 identified by the threat detection device 112.
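One way to sketch the approximate-match comparison against the database of known malicious command tags is with a string-similarity ratio. Using `difflib` and a 0.80 cutoff here is an assumption standing in for the unspecified matching criteria; the 80%-or-greater figure mirrors the example above.

```python
import difflib

def matches_known_malicious(command_tag, known_malicious_tags, min_ratio=0.80):
    """Return True if the command tag approximately matches any entry
    in the database of known malicious command tags.

    difflib's similarity ratio is one possible matching measure; the
    disclosure leaves the exact criteria open.
    """
    return any(
        difflib.SequenceMatcher(None, command_tag, known).ratio() >= min_ratio
        for known in known_malicious_tags
    )

known = ["powershell_iex_2", "rundll_FileProtocolHandler"]
print(matches_known_malicious("powershell_iex", known))  # True (near match)
print(matches_known_malicious("temp", known))            # False
```

Raising `min_ratio` toward 1.0 approaches the exact or near-exact matching described above.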
Returning to FIG. 1, once the threat detection device 112 determines command tags 122 and suspicious startup commands 108, the threat detection device 112 may send an alert 134 and/or a report to the downstream administration component 116. For example, the alert 134 may be sent if at least one of the command tags 122 is determined to have a high probability of being associated with malware on at least one of the devices 102. The threat detection device 112 determines that a command tag has a high probability of being associated with malware if the command tag 122 approximately or exactly matches (e.g., according to matching criteria that are the same as or similar to those described above) a predefined command tag known to be malicious from the known malicious command tags 126. In order to prevent or reduce the number of unnecessary alerts 134, an alert 134 may not be transmitted for other suspicious startup commands 108 that are not associated with predefined command tags 126 that are known to be malicious. Instead, as shown in FIG. 1, these suspicious startup commands may be transmitted to the malware analysis tool 114 for further evaluation before an alert 134 is sent.
The malware analysis tool 114 receives the suspicious startup commands 108 and uses these startup commands 108 to determine suspect code 130 stored on the devices 102. For example, the malware analysis tool may use information in an internal database and/or access devices 102 to determine suspect code 130 associated with each of the suspicious startup commands 108. In some embodiments, each suspicious startup command 108 corresponds to one or more instances of suspect code 130 on one of devices 102. In other embodiments, an instance of suspect code 130 may be determined from a combination of suspicious startup commands (i.e., two or more startup commands may be associated with the same single instance of suspect code 130).
The malware analysis tool 114 sends a request for the suspect code 130 from devices 102 and, responsive to this request, the malware analysis tool 114 receives the suspect code 130. The malware analysis tool 114 then evaluates whether the suspect code 130 corresponds to the presence of malware. For example, the malware analysis tool 114 may test an instance of suspect code 130 by executing the code 130 in a controlled environment (e.g., a secure processing space of the malware analysis tool 114). If the suspect code 130 displays known behaviors of malware (e.g., attempting to access security-sensitive applications or services), the suspicious startup command 108 associated with the suspect code 130 is determined to be a malicious startup command.
Based on this analysis, the malware analysis tool 114 may generate one or more alerts 136 and/or a report 132 that includes the results of the malware analysis. For example, an alert 136 may be transmitted to the downstream administration component 116 to inform an administrator of malicious startup commands identified on one or more of the devices 102. The report 132 is generally transmitted to the administration component 116 to inform the administrator of the results of any analysis performed. The report 132 may, for example, include a list of one or more startup commands 104 that should be flagged for additional review or monitoring by the administrator.
Analysis results and/or other related data from the malware analysis tool 114 may also be received by the threat detection device 112, where this information may be used to further improve the identification of suspicious startup commands 108 by updating one or both of the verb list 118 and the statistical model 120. For example, if a new command tag 122 is determined to be associated with malware and this command tag 122 has never before been identified by the threat detection device 112, then the malicious verb list 118 and/or the list of known malicious command tags 126 may be updated to include appropriate entries for identifying this command tag 122 in the future and determining that the tag 122 is associated with the presence of malware. Moreover, the statistical model 120 may also be updated to include statistical information about this new command tag 122 (e.g., a proportion or percentage of the devices 102 in which the command tag 122 is identified). The threat detection device 112 may also be configured to monitor statistical information about this new command tag 122 during ongoing operation so that historical information about the relative frequency of this command tag 122 can be monitored over time.
FIG. 3 is a flowchart of a method 300 for detecting one or more suspicious startup commands 108 using the system 100 of FIG. 1. The system 100 may implement method 300 to identify and report suspicious startup commands 108 associated with one or more of the devices 102. In general, method 300 facilitates the efficient and effective identification of suspicious startup commands 108 in devices 102, while also allowing for further evaluation of these suspicious startup commands 108 to determine whether these startup commands are malicious (i.e., associated with malware) or safe (i.e., not associated with malware).
At step 302, startup commands 104 are collected or received from the plurality of computing devices 102 of FIG. 1. For example, devices 102 may be configured to automatically transmit these startup commands 104 to the collection server 110 (e.g., on a predetermined schedule). As another example, each of devices 102 may receive a request from the collection server 110 for startup commands 104 and, responsive to the request, the devices transmit their startup commands 104 to the collection server 110. Alternatively, the collection server 110 may be configured to automatically access memory of the devices 102 and extract copies of the startup commands 104. The collection server 110 then generally stores the startup commands 104 in any appropriate format, for example, in one or more tables for processing the startup commands 104 in subsequent steps of method 300. After the startup commands 104 are collected, the startup commands 104 may be reformatted as needed (e.g., segmented into two or more startup command portions) for further processing.
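The pull model described above, in which the collection server 110 requests startup commands 104 from each device and stores them in a table, might be sketched as follows. The Device class and its get_startup_commands method are hypothetical stand-ins for the monitored devices 102, not an interface defined in this disclosure.

```python
class Device:
    """Hypothetical stand-in for a monitored device 102; a real device
    would answer the collection server's request over the network."""

    def __init__(self, device_id, commands):
        self.device_id = device_id
        self._commands = commands

    def get_startup_commands(self):
        # Return a copy so the server's table cannot mutate device state.
        return list(self._commands)


def collect_startup_commands(devices):
    """Request each device's startup commands 104 and store them in a
    table keyed by device id, as the collection server 110 might."""
    return {device.device_id: device.get_startup_commands()
            for device in devices}
```

The same table structure would also suit the push model, with devices writing their own entries on a predetermined schedule.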
At step 304, the received startup commands 104 are transformed into command tags 122 using the verb list 118 (e.g., as described with respect to FIGS. 1 and 2 above). For example, the threat detection device 112 of FIG. 1 may receive the startup commands 104 from the collection server 110, access the verb list 118, and compare portions (e.g., command strings) of each startup command to predefined command portions (e.g., the “Command strings” of TABLE 1) and/or related strings associated with the portions (e.g., the “Related strings” of TABLE 1) to look up the corresponding tag for the startup command portion (e.g., in the “Tag” column of TABLE 1). Startup command portions (e.g., strings) that do not have a corresponding tag are typically flagged for review by an administrator and included in report(s) generated in method 300. This process is generally repeated for all portions of each of the startup commands 104 to identify all relevant tags for the startup command 104. These tags are then appropriately combined (e.g., concatenated) to generate the command tag 122 for the startup command 104.
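A minimal sketch of this transformation follows. The verb-list entries and tag names below are invented for illustration (TABLE 1's actual contents are not reproduced here), and splitting on whitespace is only one possible way to segment a startup command into portions.

```python
# Illustrative verb list in the spirit of TABLE 1: command-string
# portions (and related strings) mapped to short tags.
VERB_LIST = {
    "powershell": "PS",
    "pwsh": "PS",              # related string mapped to the same tag
    "-encodedcommand": "ENC",
    "rundll32": "DLL",
    "regsvr32": "REG",
}


def to_command_tag(startup_command):
    """Split a startup command 104 into portions, look each portion up
    in the verb list 118, and concatenate the matched tags into a single
    command tag 122. Portions with no corresponding tag are returned
    separately so they can be flagged for administrator review."""
    tags, unmatched = [], []
    for portion in startup_command.lower().split():
        if portion in VERB_LIST:
            tags.append(VERB_LIST[portion])
        else:
            unmatched.append(portion)
    return "-".join(tags), unmatched
```

For instance, `to_command_tag("powershell -EncodedCommand SQBFAFgA")` would yield the tag `"PS-ENC"` with the unrecognized payload portion set aside for review.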
At step 306, the threat detection device 112 determines, for a given command tag, whether the command tag 122 corresponds to a known malicious startup command 126. For example, the device 112 may access a record of predefined command tags 126 that are known to be malicious, stored in a database (e.g., database 260 of FIG. 2), and determine whether the command tag 122 approximately or exactly matches (e.g., according to matching criteria that are the same as or similar to those described above) one of the predefined command tags 126. The threat detection device 112 may determine that a command tag 122 has a high probability of being associated with malware if the command tag 122 matches a predefined command tag 126 known to be malicious.
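One way to realize exact-or-approximate matching is sketched below. The known-malicious tags are made up, and the string-similarity ratio is one possible matching criterion, not necessarily the criteria this disclosure refers to.

```python
import difflib

# Illustrative record of command tags 126 known to be malicious, as
# might be stored in database 260; these entries are invented.
KNOWN_MALICIOUS_TAGS = {"PS-ENC-DLL", "REG-SCT-HTTP"}


def matches_known_malicious(command_tag, threshold=0.9):
    """Return True when the command tag 122 exactly matches a known
    malicious tag 126, or approximately matches one above a similarity
    threshold (difflib's ratio is used as a stand-in criterion)."""
    if command_tag in KNOWN_MALICIOUS_TAGS:
        return True
    return any(
        difflib.SequenceMatcher(None, command_tag, known).ratio() >= threshold
        for known in KNOWN_MALICIOUS_TAGS
    )
```

The approximate branch lets the device catch near-miss tags (e.g., a single substituted portion) that an exact lookup would miss.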
At step 308, if the command tag 122 matches a command tag 126 known to be associated with malware, an alert 134 is transmitted by the threat detection device 112. Generally, the alert 134 is not transmitted for suspicious startup commands 108 that are not determined to have a high probability of being associated with malware in step 306. Instead, as shown in FIG. 3, method 300 proceeds to step 310 to use the threat analysis model 120 to identify suspicious startup commands 108 that may be transmitted to the malware analysis tool 114 for further evaluation. While the alert 134 is generally transmitted to the downstream component 116 to inform an administrator of suspicious startup command(s) 108, the alert 134 may also or alternatively be transmitted to the infected device 102a to inform a user of the device 102a of the presence of a malware-related startup command.
At step 310, the threat detection device 112 uses the threat analysis model 120 to identify one or more command tags 122 that may be associated with malware. The threat analysis model 120 generally involves statistical frequency analysis, which is used to identify command tags 122 that may correspond to malicious startup commands (i.e., that correspond to suspicious startup commands). The suspicious startup commands 108 are flagged as candidates for further analysis (e.g., in steps 312 and 314, described below). Statistical frequency analysis generally involves determining the frequency at which different command tags 122 occur (e.g., the proportion of the devices 102 in which the command tag is observed) and, based at least in part on these frequencies (or proportions), identifying suspicious command tags 124 (e.g., as described with respect to FIGS. 1 and 2 above). In general, command tags 122 that occur more frequently (e.g., in a larger percentage of devices 102) are considered to have a lower probability of being associated with malware or other malicious processes. Meanwhile, command tags 122 that occur less frequently (e.g., in a smaller percentage of devices 102) are considered to have an increased probability of being associated with malware or other malicious processes.
For example, statistical frequency analysis may be used by the threat analysis model 120 to determine in what proportion (or percentage) of the devices 102 being monitored each command tag 122 is observed and to flag certain command tags 122 as suspicious if these command tags 122 occur in less than a threshold proportion (or percentage) of the devices 102. As an example, statistical frequency analysis may be used to determine that a first command tag is observed in 50% of devices 102 (e.g., such that the first command tag is considered “moderately common”). Meanwhile, statistical frequency analysis may be used to determine that a second command tag is observed in 0.5% of devices 102 (e.g., such that the second command tag is considered “rare”). The threat detection device 112 compares these calculated proportions to a predetermined threshold proportion below which a command tag 122 is considered suspicious. For instance, if the threshold is 5%, the first, “moderately common” command tag is not determined to be suspicious, while the second, “rare” command tag is determined to be suspicious. The threat detection device 112 then determines one or more suspicious startup commands 108 that have the suspicious second command tag.
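The frequency analysis above reduces to a per-tag device count compared against a threshold proportion. The following is a minimal sketch under the assumption that the input is simply a list, per monitored device, of the command tags observed on that device.

```python
from collections import Counter


def find_suspicious_tags(tags_per_device, threshold=0.05):
    """Compute the proportion of devices 102 on which each command tag
    122 occurs and flag tags seen below the threshold proportion (5%
    here, matching the example in the text) as suspicious."""
    device_count = len(tags_per_device)
    # Use set() so a tag repeated on one device counts that device once.
    occurrences = Counter(tag for tags in tags_per_device for tag in set(tags))
    return {tag for tag, count in occurrences.items()
            if count / device_count < threshold}
```

With the 5% threshold, a tag seen on 50% of devices passes while a tag seen on 0.5% of devices is flagged, mirroring the "moderately common" versus "rare" example above.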
At step 312, the threat detection device 112 may transmit a suspicious startup command report 128 to the downstream administration component 116. This report 128 generally includes the suspicious startup commands 108 identified by the threat detection device 112. For example, the threat detection device 112 may be configured to determine which suspicious startup commands 108 to include in a report 128 (e.g., based on predetermined report generation parameters) and in which order the suspicious startup commands 108 should be presented in the report 128 (e.g., based on the frequency of command tags associated with the suspicious startup commands). For example, the threat detection device 112 may generate a list that includes at least a portion of the suspicious startup commands 108 that are determined, reorganize the list based on the frequency of each startup command 108 in devices 102 (or the frequency of a command tag 122 associated with each startup command), and transmit the report 128 to the downstream administration component 116.
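One plausible report-ordering rule is rarest-first, since lower-frequency tags carry a higher estimated malware probability. The (command, tag) pairs and the frequency table in this sketch are illustrative; the actual report generation parameters are not specified here.

```python
def build_report(suspicious_commands, tag_frequency):
    """Order suspicious startup commands 108 rarest-first by the
    observed proportion of their associated command tag 122.

    suspicious_commands: list of (command_string, command_tag) pairs.
    tag_frequency: mapping of command_tag -> proportion of devices 102."""
    return sorted(suspicious_commands,
                  key=lambda item: tag_frequency[item[1]])
```

A report parameter could just as easily cap the list length or invert the ordering; the sort key is the only piece the frequency analysis itself dictates.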
At step 314, the malware analysis tool 114 receives the suspicious command tags 124 and/or the suspicious startup commands 108 and retrieves corresponding suspect code 130 from the devices 102 for evaluation. The malware analysis tool 114 may use information in an internal database and/or may access information in devices 102 to determine suspect code 130 for each of the suspicious startup commands 108. For example, a database may store (e.g., in one or more tables) identifiers of code and/or code locations for a set of known startup commands. The malware analysis tool 114 may use the information in the database to determine what code 130 to access and where to access the code 130 in a given device 102. Generally, the malware analysis tool 114 sends a request for the suspect code 130 from devices 102 and, responsive to this request, the malware analysis tool 114 receives the suspect code 130. To evaluate the suspect code 130, the malware analysis tool 114 may test an instance of suspect code 130 by executing the code 130 in a controlled environment (e.g., a secure processing space of the malware analysis tool 114). If the suspect code 130 displays known behaviors of malware (e.g., attempting to access security-sensitive applications or services), the suspicious startup command 108 associated with the suspect code 130 is determined to be a malicious startup command. The malware analysis tool 114 may also or alternatively employ other methods of malware detection as appreciated by those skilled in the art.
At step 316, the malware analysis tool 114 determines, based on the results of the evaluation performed in step 314, whether the suspect code 130 is malicious code (i.e., determines whether the suspicious startup commands correspond to the presence of malware in one or more of devices 102). At step 318, if malicious code is detected, an alert 136 is transmitted by the malware analysis tool 114. The alert 136 may be transmitted, for example, to the downstream administration component 116 to inform an administrator of the presence of malware. The alert 136 may also be transmitted to the infected device 102a.
At step 320, the malware analysis tool 114 may generate an analysis report 132. The analysis report 132 may include, for example, a list of one or more startup commands that should be flagged for additional review or monitoring by the administrator. The analysis report may be transmitted to the downstream administration component 116 to inform the administrator of the results of analyses performed.
At step 322, the analysis report 132 may be used by the threat detection device 112 to update the malicious verb list 118, the threat analysis model 120, and/or the list of known malicious command tags 126 associated with the threat detection device 112, as described above. For example, if a new command tag 122 is determined to be associated with malware by the malware analysis tool 114 and this command tag 122 has never before been identified by the threat detection device 112, then the malicious verb list 118 may be updated to include appropriate entries for identifying this command tag 122 and determining that the command tag 122 is associated with the presence of malware. Moreover, the threat analysis model 120 may also be updated to include statistical information about this new command tag (e.g., a frequency at which this command tag occurs in devices 102).
FIG. 4 illustrates an embodiment of a device 400 configured to implement one or more of the components of system 100 illustrated in FIG. 1, such as the collection server 110, the threat detection device 112, and the malware analysis tool 114. The device 400 comprises a processor 402, a memory 404, and a network interface 406. The device 400 may be configured as shown or in any other suitable configuration.
The processor 402 comprises one or more processors operably coupled to the memory 404. The processor 402 is any electronic circuitry including, but not limited to, state machines, one or more central processing unit (CPU) chips, logic units, cores (e.g., a multi-core processor), field-programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), or digital signal processors (DSPs). The processor 402 may be a programmable logic device, a microcontroller, a microprocessor, or any suitable combination of the preceding. The processor 402 is communicatively coupled to and in signal communication with the memory 404. The one or more processors are configured to process data and may be implemented in hardware or software. For example, the processor 402 may be 8-bit, 16-bit, 32-bit, 64-bit, or of any other suitable architecture. The processor 402 may include an arithmetic logic unit (ALU) for performing arithmetic and logic operations, processor registers that supply operands to the ALU and store the results of ALU operations, and a control unit that fetches instructions from memory and executes them by directing the coordinated operations of the ALU, registers, and other components.
The one or more processors implement various instructions. For example, the one or more processors are configured to execute instructions to implement the collection server 110, the threat detection device 112, and the malware analysis tool 114. In this way, processor 402 may be a special-purpose computer designed to implement the functions disclosed herein, such as some or all of method 300. In an embodiment, the collection server 110, the threat detection device 112, and the malware analysis tool 114 are each implemented using logic units, FPGAs, ASICs, DSPs, or any other suitable hardware. The collection server 110, the threat detection device 112, and the malware analysis tool 114 are configured as described in FIG. 1 above.
The memory 404 stores startup commands 104, the malicious verb list 118, the threat analysis model 120, command tags 122, suspicious startup commands 124, suspect code 130, report parameters 406, alert parameters 408, malware analysis utilities 410, and/or any other data or instructions. The startup commands 104, malicious verb list 118, threat analysis model 120, command tags 122, suspicious startup commands 124, suspect code 130, report parameters 406, alert parameters 408, and malware analysis utilities 410 may comprise any suitable set of information, instructions, logic, rules, or code operable to execute the functions described herein. The memory 404 comprises one or more disks, tape drives, or solid-state drives, and may be used as an overflow data storage device, to store programs when such programs are selected for execution, and to store instructions and data that are read during program execution. The memory 404 may be volatile or non-volatile and may comprise read-only memory (ROM), random-access memory (RAM), ternary content-addressable memory (TCAM), dynamic random-access memory (DRAM), and static random-access memory (SRAM).
As described above, each of the startup commands 104 is generally a command that is automatically executed when the device on which the command is stored is started up (e.g., turned on, booted up, restarted, etc.). The malicious verb list 118 includes predefined tags for commonly observed portions of startup commands (e.g., as shown in the example of TABLE 1). The malicious verb list 118 is used by device 400 to transform the startup commands 104 into command tags 122.
As described above, the threat analysis model 120 is used by device 400 to identify suspicious startup commands 124, and the suspicious startup commands 124 are used to determine suspect code 130, which may be accessed by the device to determine whether the suspect code 130 is associated with malware using the malware analysis utilities 410 (e.g., which may be used to implement functions of the malware analysis tool 114 of FIG. 1). Results generated by the malware analysis utilities 410 generally include a risk level or a probability that a given instance of suspect code 130 is associated with malware. The results may also be binary, such that a given startup command 104 is determined, for example, to be either “malicious” or “not malicious.”
The report parameters 406 generally provide information and rules for generating and/or formatting reports generated by device 400 (e.g., reports 128 and 132 of FIG. 1). The reports may be based on the suspicious startup commands 124, the suspect code 130, and/or any results generated by the malware analysis utilities 410. For example, the report parameters may be used to configure the device 400 to determine which suspicious startup commands to include in a report and in which order to present the startup commands (e.g., based on the frequency of command tags associated with the suspicious startup commands).
The alert parameters 408 generally provide information for configuring alerts sent by device 400. For example, the alert parameters 408 may include one or more alert thresholds, which are used to determine whether an alert 134 and/or 136 should be transmitted for a given suspicious startup command. For example, if a first suspicious startup command has a command tag that exactly matches a predefined command tag known to be associated with malware, the first suspicious startup command may be given a risk ranking of 100%. If a second suspicious startup command has a command tag that shares 80% of the tags in a predefined command tag known to be associated with malware, the second suspicious startup command may be given a risk ranking of 80%. If an example alert threshold is 95%, an alert 134 and/or 136 would be transmitted by the device 400 for the first suspicious startup command (i.e., because the risk ranking exceeds the alert threshold), but an alert 134 and/or 136 would not be transmitted for the second suspicious startup command (i.e., because the risk ranking is less than the alert threshold). The alert threshold can generally be set to any appropriate value according to the needs of the administrators. In typical embodiments, however, the alert threshold is high (e.g., exceeding 90%) to ensure that an excessive number of unnecessary alerts 134 and/or 136 are not generated by the device 400.
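The 100%/80%/95% worked example above can be sketched directly. Computing the risk ranking as the fraction of shared tag parts is one simplified reading of "shares 80% of the tags"; the actual matching criteria may differ.

```python
def risk_ranking(command_tag, known_malicious_tag):
    """Fraction of the known-malicious tag's parts that also appear in
    the command tag 122 -- a simplified stand-in risk measure. An exact
    match yields 1.0 (i.e., 100%)."""
    known_parts = known_malicious_tag.split("-")
    shared = sum(1 for part in known_parts
                 if part in command_tag.split("-"))
    return shared / len(known_parts)


def should_alert(risk, alert_threshold=0.95):
    """Transmit an alert 134 and/or 136 only when the risk ranking
    exceeds the alert threshold (95% here, as in the example)."""
    return risk > alert_threshold
```

With a five-part known-malicious tag, a command tag matching all five parts ranks at 1.0 and triggers an alert, while one matching four of five ranks at 0.8 and does not, matching the two commands in the example.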
The network interface 406 is configured to enable wired and/or wireless communications. The network interface 406 is configured to communicate data between the device 400 and other network devices, systems, or domain(s). For example, the network interface 406 may comprise a WIFI interface, a local area network (LAN) interface, a wide area network (WAN) interface, a modem, a switch, or a router. The processor 402 is configured to send and receive data using the network interface 406. The network interface 406 may be configured to use any suitable type of communication protocol as would be appreciated by one of ordinary skill in the art.
While several embodiments have been provided in the present disclosure, it should be understood that the disclosed systems and methods might be embodied in many other specific forms without departing from the spirit or scope of the present disclosure. The present examples are to be considered as illustrative and not restrictive, and the intention is not to be limited to the details given herein. For example, the various elements or components may be combined or integrated in another system or certain features may be omitted, or not implemented.
In addition, techniques, systems, subsystems, and methods described and illustrated in the various embodiments as discrete or separate may be combined or integrated with other systems, modules, techniques, or methods without departing from the scope of the present disclosure. Other items shown or discussed as coupled or directly coupled or communicating with each other may be indirectly coupled or communicating through some interface, device, or intermediate component whether electrically, mechanically, or otherwise. Other examples of changes, substitutions, and alterations are ascertainable by one skilled in the art and could be made without departing from the spirit and scope disclosed herein.
To aid the Patent Office, and any readers of any patent issued on this application in interpreting the claims appended hereto, applicants note that they do not intend any of the appended claims to invoke 35 U.S.C. § 112(f) as it exists on the date of filing hereof unless the words “means for” or “step for” are explicitly used in the particular claim.