This post is an overview of basic static analysis techniques that malware analysts commonly use in their workflow. There are dozens, if not hundreds, of utilities that ease the process of malware analysis, and every investigator has a preferred method or tool which they swear works best. Many of these utilities perform very similar functions through slightly different techniques, or present the results in a different manner. Deciding which tool to use is often a matter of the overall goal and how the investigator wishes to achieve it.
Malicious software, especially on Windows systems, often comes in the Portable Executable (PE) file format (https://en.wikipedia.org/wiki/Portable_Executable). A binary of this type may be an executable, a DLL or another PE-based file type. PE file structures can be quite complicated and will not be the focus of this post, other than to discuss how they can be used to learn about different portions of a specific file. Sophisticated malware may attempt to hide information about itself by obscuring portions of the PE file. Examples include hiding the main entry point, encrypting run-time information within the resource section or other areas, and adding dummy code and data so that it is extremely difficult to tell what is important during static analysis.
An initial inspection of a given PE file can be performed with a wide variety of tools, including PE Insider, CFF Explorer, PEStudio, LordPE, PEView, PE Explorer and PEiD. Many of these are very similar, with the main differences lying in how the data is represented and in the GUI. For this post, we will mainly be using free / open-source utilities to perform all activities. As such, let's first use PEiD to learn some basic information about a PE file and see if we can derive anything useful from the inspected data.
The file we will be utilizing for this initial static analysis is labelled as ‘Potao_1stVersion_0C7183D761F15772B7E9C788BE601D29’ when retrieved from https://github.com/ytisf/theZoo/tree/master/malwares/Binaries/PotaoExpress. All analysis is performed within an isolated VM which lacks a network interface in order to prevent any external communication attempts. PEiD is a relatively simple utility designed to give investigators a first-look into a specific binary, with an image of the results against the specified Potao binary presented below.
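Before going further it can be worth confirming that the downloaded sample is the file you expect. The sketch below assumes that the 32-character hexadecimal suffix in the file name is the MD5 hash of the sample, which is an assumption based on the naming pattern and is not stated by the repository listing quoted above. The function names are hypothetical.

```python
import hashlib


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_name_hash(path, expected_hex):
    """Compare the file's MD5 with an expected value, ignoring letter case."""
    return md5_of_file(path) == expected_hex.lower()


# Example (assumes the sample sits in the current directory under this name):
# matches_name_hash("Potao_1stVersion_0C7183D761F15772B7E9C788BE601D29",
#                   "0C7183D761F15772B7E9C788BE601D29")
```

Only hash the file inside the isolated VM; the script reads the bytes and never executes the sample.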
We can immediately observe from the line at the bottom that the binary has been packed with UPX, a common executable packer that compresses a file and, as a side effect, obscures its contents and hinders investigation. It should also be noted that the signature database which accompanies base versions of PEiD can easily be updated to recognize additional signatures. If we click the arrow beside ‘EP Section’ on the right side of the window, we can take a look at the PE sections which are detected, shown below.
The lack of standard PE file sections such as .text, .rdata, .data or .idata tends to indicate that the binary is either encrypted or otherwise obfuscated, and it is likely that running the application will unpack or decrypt further data stored at various locations in the file. Fortunately, the Ultimate Packer for eXecutables (UPX) can also unpack files that it has packed, which is especially convenient when it is immediately obvious which version was used. This would be more difficult if a custom packing or encryption routine had been used, but the packing appears to be generic in this case. So let's open up UPX and see what we can do in terms of unpacking. Since UPX is a command-line utility, it is necessary to run it within a command window as shown below.
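The section check described above can also be scripted. The following is a minimal sketch using only the Python standard library: it reads the PE section table directly and flags a file whose sections are either named like UPX output (UPX typically names its sections UPX0, UPX1 and so on) or lack all of the usual names. The function names and the heuristic itself are illustrative assumptions, not a replacement for PEiD's signature matching.

```python
import struct


def read_sections(data):
    """Return (name, virtual_size, raw_size) for each section of a PE image."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # e_lfanew at offset 0x3C holds the file offset of the "PE\0\0" signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    coff = pe_offset + 4
    # COFF header: NumberOfSections at +2, SizeOfOptionalHeader at +16.
    (num_sections,) = struct.unpack_from("<H", data, coff + 2)
    (opt_size,) = struct.unpack_from("<H", data, coff + 16)
    table = coff + 20 + opt_size  # section table follows the optional header
    sections = []
    for index in range(num_sections):
        entry = table + index * 40  # each section header is 40 bytes
        name = data[entry:entry + 8].rstrip(b"\0").decode("ascii", "replace")
        virtual_size, _vaddr, raw_size = struct.unpack_from("<III", data, entry + 8)
        sections.append((name, virtual_size, raw_size))
    return sections


def looks_packed(sections):
    """Heuristic: UPX-style section names, or none of the usual section names."""
    names = {name for name, _, _ in sections}
    has_upx_names = any(name.startswith("UPX") for name in names)
    lacks_standard = not names & {".text", ".rdata", ".data", ".idata"}
    return has_upx_names or lacks_standard
```

Typical use would be `looks_packed(read_sections(open(path, "rb").read()))`. A positive result is only a hint; many legitimate installers are packed as well.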
UPX is a relatively simple program, but it serves malware authors and analysts alike: authors use it to pack their binaries, and analysts use it to unpack them. Below is the command I utilized in order to achieve decompression of the packed malware sample.
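As a hedged illustration, and not necessarily the exact command used for this post, decompression with UPX is done with the -d switch, and -o writes the result to a separate file so the packed original is left untouched. The sketch below wraps that invocation in Python; the file names in the example are hypothetical and the wrapper functions are not part of UPX.

```python
import shutil
import subprocess


def build_upx_unpack_command(packed_path, output_path, upx="upx"):
    """Build the argument list for 'upx -d -o <output> <packed>'."""
    # -d decompresses; -o names the output file instead of overwriting the input.
    return [upx, "-d", "-o", output_path, packed_path]


def unpack(packed_path, output_path):
    """Run UPX to unpack a file; raises if UPX is missing or reports failure."""
    if shutil.which("upx") is None:
        raise FileNotFoundError("upx was not found on PATH")
    subprocess.run(build_upx_unpack_command(packed_path, output_path), check=True)


# Example with hypothetical file names:
# unpack("potao_packed.bin", "potao_unpacked.bin")
```

Keeping the packed copy makes it possible to compare both versions later, for example by re-running the section listing on each.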
UPX successfully unpacked the given file to ‘file’. Now let's open it in PEiD once more and see whether additional information can be observed, shown below.
We can see that although the specific compiler is still not detected, PEiD is now able to detect the various expected sections of the binary, in addition to what is potentially the correct entry point, to be examined in more detail later. Now that we have successfully unpacked the sample, let's try opening it in PEView and see if we can learn any additional information about it.
It’s possible to learn quite a bit of information from viewing the raw PE file in a utility such as this. This information can include any function exports, the resources used, various metadata such as the date and time of compilation and, typically most important, the DLLs and functions imported by the binary. Knowing these can provide some context for further analysis in terms of what to expect from malware execution. Knowing that a specific binary imports from WININET.dll or CRYPT32.dll can indicate that it will attempt to communicate over the internet or use cryptographic functions for purposes yet to be determined. Often a malicious executable may store additional code in the resource section and use Windows functions such as LoadResource to load it later at run-time. Software such as ‘Resource Hacker’ can help to determine if that is the case by performing a detailed inspection of the resources contained in a particular binary file. This malware sample does not appear to have any interesting information within the .rsrc section other than a Microsoft Word icon, which is presented to the user instead of a standard executable icon, presumably to trick users into thinking the file is a Word document rather than an application. An example of this is shown below.
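To make the header fields mentioned above more concrete, here is a rough sketch that pulls the compile timestamp and the list of imported DLL names straight out of the PE headers using only the Python standard library. It assumes a well-formed, unpacked file and does no bounds checking, so it is a learning aid rather than a robust parser; the function names are hypothetical.

```python
import struct
from datetime import datetime, timezone


def _headers(data):
    """Return (timestamp, optional_header_offset, sections) for a PE image."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    (pe,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe:pe + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    coff = pe + 4
    # NumberOfSections at +2, TimeDateStamp at +4, SizeOfOptionalHeader at +16.
    num_sections, timestamp = struct.unpack_from("<HI", data, coff + 2)
    (opt_size,) = struct.unpack_from("<H", data, coff + 16)
    opt = coff + 20
    sections = []
    for index in range(num_sections):
        entry = opt + opt_size + index * 40
        vsize, vaddr, raw_size, raw_ptr = struct.unpack_from("<IIII", data, entry + 8)
        sections.append((vaddr, max(vsize, raw_size), raw_ptr))
    return timestamp, opt, sections


def compile_time(data):
    """Return the COFF TimeDateStamp as a UTC datetime (it can be forged)."""
    timestamp, _, _ = _headers(data)
    return datetime.fromtimestamp(timestamp, tz=timezone.utc)


def _rva_to_offset(rva, sections):
    """Translate a relative virtual address into a file offset."""
    for vaddr, size, raw_ptr in sections:
        if vaddr <= rva < vaddr + size:
            return raw_ptr + (rva - vaddr)
    raise ValueError(f"RVA {rva:#x} is not inside any section")


def imported_dlls(data):
    """Return the DLL names listed in the import directory."""
    _, opt, sections = _headers(data)
    (magic,) = struct.unpack_from("<H", data, opt)
    # Data directories start at +96 for PE32 (0x10B) and +112 for PE32+ (0x20B).
    dirs = opt + (112 if magic == 0x20B else 96)
    (import_rva,) = struct.unpack_from("<I", data, dirs + 8)  # index 1 = imports
    if import_rva == 0:
        return []
    names = []
    desc = _rva_to_offset(import_rva, sections)
    while True:
        fields = struct.unpack_from("<5I", data, desc)  # 20-byte import descriptor
        if not any(fields):  # an all-zero descriptor terminates the table
            break
        name_off = _rva_to_offset(fields[3], sections)  # fourth field is the Name RVA
        end = data.index(b"\0", name_off)
        names.append(data[name_off:end].decode("ascii", "replace"))
        desc += 20
    return names
```

Seeing WININET.dll or CRYPT32.dll in the returned list would be the scripted equivalent of spotting them in PEView. The timestamp should be treated with caution, since it is trivially altered by the author of a binary.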
Another useful utility in assessing a particular sample is ‘Strings’, which scans a binary for runs of printable characters and prints them. The Sysinternals version linked at the end of this post searches for both ASCII and Unicode strings and defaults to a minimum length of 3 characters, while GNU strings defaults to runs of 4 or more; in both cases the minimum can be changed with the -n switch. This is another command-line utility, and execution is as simple as running ‘strings’ followed by the name of the file you wish to scan. Unfortunately, many of the results are garbage data, but sometimes interesting strings appear. A portion of the strings scan for this particular sample is shown below.
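The core idea behind a strings scan is simple enough to sketch in a few lines. The example below is not how the Strings utility is implemented; it is a minimal approximation that finds runs of printable ASCII bytes, plus the same characters stored as UTF-16LE (the common Windows "Unicode" encoding), with a configurable minimum length. The function name and the default of 4 are illustrative choices.

```python
import re


def extract_strings(data, min_len=4):
    """Return printable ASCII and UTF-16LE strings of at least min_len characters."""
    # Runs of printable ASCII bytes (space through tilde).
    ascii_re = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    # The same characters encoded as UTF-16LE: each followed by a zero byte.
    utf16_re = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)
    found = [m.group().decode("ascii") for m in ascii_re.finditer(data)]
    found += [m.group().decode("utf-16-le") for m in utf16_re.finditer(data)]
    return found


# Example (hypothetical file name):
# for text in extract_strings(open("potao_unpacked.bin", "rb").read()):
#     print(text)
```

Raising `min_len` is the quickest way to cut down on garbage results, at the cost of missing short but meaningful strings.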
We can observe some random ASCII strings as well as some data which references false company names and other fake information. This isn’t particularly useful on its own, but understanding the strings a piece of malware contains can be a useful first step in analysis. The strings can also be used as reference points in debuggers and disassemblers such as OllyDbg and IDA for further research.
Another useful tool for performing basic static analysis is Dependency Walker, which allows analysts to determine which DLLs the binary depends on and which functions it imports from each of them. This can be helpful in providing further context about a specific binary, and also in understanding the type of operating system and host it was designed to execute on. An image example from this utility is shown below.
Unfortunately, development of Dependency Walker ended around 2006, but we are in luck; a developer on GitHub has rewritten its core functionality and upgraded it to handle modern Windows dependency features such as API sets. This rewrite is available at the following link (https://github.com/lucasg/Dependencies). The old version of Dependency Walker can throw lots of errors because of how modern systems handle nested dependencies and API sets, so using the more modern version makes it easier to tell what is a true error and what is a false positive. An image of this software is shown below.
Dependencies / Dependency Walker can help both developers and analysts understand the behavior of a particular binary and what it relies upon within the host OS. Knowing which DLLs and functions are being called gives information about the intended target and functionality.
When used together, these tools provide analysts with a good amount of context for further advanced static or dynamic analysis. Learning about imported functions, stored strings and known dependencies, as well as PE information such as entry points, compile time and image base, helps to frame additional analysis performed on the binary sample. This is by no means an exhaustive list of static analysis tools, only some of the most commonly used utilities. Behavioral analysis and further discussion will be continued in additional postings.
PEiD : http://www.softpedia.com/get/Programming/Packers-Crypters-Protectors/PEiD-updated.shtml
PEView : http://wjradburn.com/software/
Resource Hacker : http://www.angusj.com/resourcehacker/
Strings : https://docs.microsoft.com/en-us/sysinternals/downloads/strings
UPX : https://upx.github.io/
Dependency Walker : http://www.dependencywalker.com/
Dependencies : https://github.com/lucasg/Dependencies