Published on February 19th, 2023
FLARE IDA Pro Script Series: MSDN Annotations Plugin for Malware Analysis
The FireEye Labs Advanced Reverse Engineering (FLARE) Team continues
to share knowledge and tools with the community. We started this blog
series with a script for Automatic
Recovery of Constructed Strings in Malware. As always, you can
download these scripts at the following location: https://github.com/fireeye/flare-ida.
We hope you find all these scripts as useful as we do.
Motivation
During my summer internship with the FLARE team, my goal was to
develop IDAPython plug-ins that speed up the reverse engineering
workflow in IDA Pro. While analyzing malware samples with the team, I
realized that a lot of time is spent looking up information about
functions, arguments, and constants at the Microsoft Developer Network
(MSDN) website. Frequently switching to the developer documentation
can interrupt the reverse engineering process, so we thought about
ways to integrate MSDN information into IDA Pro automatically. In this
blog post we will release a script that does just that, and we will
show you how to use it.
Introduction
The MSDN Annotations plug-in integrates information about functions,
arguments and return values into IDA Pro’s disassembly listing in the
form of IDA comments. This allows the information to be integrated as
seamlessly as possible. Additionally, the plug-in is able to
automatically rename constants, which further speeds up the analyst
workflow. The plug-in relies on an offline XML database file, which is
generated from Microsoft’s documentation and IDA type library files.
Features
Table 1 shows what benefit the plug-in provides to an analyst. On
the left you can see IDA Pro’s standard disassembly: seven arguments
get pushed onto the stack and then the CreateFileA function is called.
Normally an analyst would have to look up function, argument and
possibly constant descriptions in the documentation to understand what
this code snippet is trying to accomplish. To obtain readable constant
values, an analyst would be required to research the respective
argument, import the corresponding standard enumeration into IDA and
then manually rename each value. The right side of Table 1 shows the
result of running our plug-in and the support it offers to an analyst.
The most obvious change is that constants are renamed automatically.
In this example, 40000000h was automatically converted to
GENERIC_WRITE. Additionally, each function argument is renamed to a
unique name, so the corresponding description can be added to the disassembly.
Table 1: Automatic labelling of standard
symbolic constants
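The renaming step can be pictured as a lookup from a raw operand value to a symbolic name. The following is a minimal sketch and not the plug-in's code: the table and the `symbolic_name` helper are hypothetical, and only the four generic access-right values come from the Windows headers.

```python
# Hypothetical lookup table for the dwDesiredAccess argument.
# The real plug-in takes this information from its XML database and
# IDA's standard enumerations; this dict is only an illustration.
DESIRED_ACCESS = {
    0x80000000: "GENERIC_READ",
    0x40000000: "GENERIC_WRITE",
    0x20000000: "GENERIC_EXECUTE",
    0x10000000: "GENERIC_ALL",
}


def symbolic_name(value, table):
    """Return a readable name for value, OR-ing flag names when needed."""
    if value in table:
        return table[value]
    parts = []
    remaining = value
    # Try the largest flags first so the output order is stable.
    for bit, name in sorted(table.items(), reverse=True):
        if bit and remaining & bit == bit:
            parts.append(name)
            remaining &= ~bit
    # Keep any bits we could not name, so no information is lost.
    if remaining or not parts:
        parts.append(hex(remaining))
    return " | ".join(parts)


print(symbolic_name(0x40000000, DESIRED_ACCESS))  # GENERIC_WRITE
```

Unknown bits are kept as a hexadecimal remainder rather than dropped, so a partially recognized value still shows everything the sample passed.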
In Figure 1 you can see how the plug-in enables you to display
function, argument, and constant information right within the
disassembly. The top image shows how hovering over the CreateFileA
function displays a short description and the return value. In the
middle image, hovering over the hTemplateFile argument displays the
corresponding description. And in the bottom image, you can see how
hovering over dwShareMode, the automatically renamed constant, displays
descriptive information.
Figure 1 (panels: Functions, Arguments, Constants): Hovering over
function names, arguments, and constants displays the respective descriptions
How it works
Before the plug-in makes any changes to the disassembly, it creates
a backup of the current IDA database file (IDB). This file gets stored
in the same directory as the current database and can be used to
revert to the previous markup in case you do not like the changes or
something goes wrong.
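The idea of keeping a backup next to the database can be sketched in plain Python. This is an illustration under assumptions: the `backup_database` helper and the timestamped file name are hypothetical, and inside IDA the plug-in would more likely ask IDA itself to save a copy of the open database than copy the file directly.

```python
import shutil
import time
from pathlib import Path


def backup_database(idb_path):
    """Copy the database to a timestamped file in the same directory.

    The naming scheme below is an assumption made for this sketch,
    not the plug-in's actual convention.
    """
    src = Path(idb_path)
    stamp = time.strftime("%Y%m%d_%H%M%S")
    dst = src.with_name(f"{src.stem}_backup_{stamp}{src.suffix}")
    shutil.copy2(src, dst)  # copy2 also preserves file timestamps
    return dst
```

Calling `backup_database("sample.idb")` would return the path of the copy, which you could later rename back to restore the earlier markup.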
The plug-in is designed to run once on a sample before you start
your analysis. It relies on an offline database generated from the
MSDN documentation and IDA Pro type library (TIL) files. For every
function reference in the import table, the plug-in annotates the
function’s description and return value, adds argument descriptions,
and renames constants. An example of an annotated import table is
depicted in Figure 2. It shows how a descriptive comment is added to
each API function call. In order to identify addresses of instructions
that position arguments prior to a function call, the plug-in relies
on IDA Pro’s markup.
Figure 2: Annotated import table
Figure 3 shows the additional .msdn segment the plug-in creates in
order to store argument descriptions. This only impacts the IDA
database file and does not modify the original binary.
Figure 3: The additional segment added
to the IDA database
The .msdn segment stores the argument descriptions as shown in
Figure 4. The unique argument names and their descriptive comments are
sequentially added to the segment.
Figure 4: Names and comments inserted
for argument descriptions
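The sequential layout of the segment can be modeled without IDA as a list of (address, name, comment) entries. This is a rough sketch: the base address, the one-byte slot size, and the numeric suffix used to keep names unique are all assumptions, and the plug-in's real naming scheme may differ.

```python
from collections import Counter


def layout_arguments(arguments, base_ea=0x1000, slot_size=1):
    """Assign consecutive addresses and unique names to argument descriptions.

    arguments is a list of (name, description) pairs. base_ea, slot_size,
    and the "_N" suffix are hypothetical choices for this illustration.
    """
    seen = Counter()
    entries = []
    for index, (name, description) in enumerate(arguments):
        # A repeated argument name gets a numeric suffix to stay unique.
        unique = f"{name}_{seen[name]}" if seen[name] else name
        seen[name] += 1
        entries.append((base_ea + index * slot_size, unique, description))
    return entries


for ea, name, comment in layout_arguments([
    ("lpFileName", "Name of the file or device to open."),
    ("dwShareMode", "Requested sharing mode."),
    ("lpFileName", "Name of the file or device to open."),
]):
    print(hex(ea), name, comment)
```

Unique names matter because a name can label only one address in an IDA database, while many API functions share argument names such as lpFileName.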
To allow the user to see constant descriptions by hovering over
constants in the disassembly, the plug-in imports IDA Pro’s relevant
standard enumeration and adds descriptive comments to the enumeration
members. Figure 5 shows this for the MACRO_CREATE enumeration, which
stores constants passed as dwCreationDisposition to CreateFileA.
Figure 5: Descriptions added to the
constant enumeration members
Preparing the MSDN database file
The plug-in’s graphical interface requires you to have the Qt
framework and Python scripting installed. Both are included with the
IDA Pro 6.6 release. You can also set them up for IDA 6.5 as described
here (http://www.hexblog.com/?p=333).
As mentioned earlier, the plug-in requires an XML database file
storing the MSDN documentation. We cannot distribute the database file
with the plug-in because Microsoft holds the copyright for it.
However, we provide a script to generate the database file. It can be
cloned from the git repository at https://github.com/fireeye/flare-ida
together with the annotation plug-in.
You can take the following steps to set up the database file. You
only have to do this once.
- Download and install an offline version of the MSDN documentation.
  You can download the Microsoft Windows SDK MSDN documentation. The
  standalone installer is available at
  http://www.microsoft.com/en-us/download/details.aspx?id=18950.
  Although it is not the newest SDK version, it includes all the
  needed information, and data extraction is straightforward. As shown
  in Figure 6, you can choose to install only the help files. By
  default they are located in
  C:\Program Files\Microsoft SDKs\Windows\v7.0\Help\1033.
  Figure 6: Installing a local copy of the MSDN documentation
- Extract the files with an archive manager like 7-zip to a directory
  of your choice.
- Download and extract tilib.exe from Hex-Rays’ download page at
  https://www.hex-rays.com/products/ida/support/download.shtml.
  To allow the plug-in to rename constants, it needs to know which
  enumerations to import. IDA Pro stores this information in TIL files
  located in %IDADIR%/til/. Hex-Rays provides a tool (tilib) to show
  TIL file contents via their download page for registered users.
  Download the tilib archive and extract the binary into %IDADIR%. If
  you run tilib without any arguments and it displays its help
  message, the program is running correctly.
- Run MSDN_crawler/msdn_crawler.py.
  With these prerequisites fulfilled, you can run the msdn_crawler.py
  script, located in the MSDN_crawler directory. It expects the path
  to the TIL files you want to extract (normally %IDADIR%/til/pc/) and
  the path to the extracted MSDN documentation. After the script
  finishes execution, the final XML database file should be located in
  the MSDN_data directory.
You can now run our plug-in to annotate your disassembly in IDA.
Running the MSDN annotations plug-in
In IDA, use File - Script file... (ALT + F7) to open the script
named annotate_IDB_MSDN.py. This will display the dialog box shown in
Figure 7 that allows you to configure the modifications the plug-in
performs. By default, the plug-in annotates functions and arguments and
renames constants. If you change the settings and execute the plug-in
by clicking OK, your settings get stored in a configuration file in
the plug-in’s directory. This allows you to quickly run the plug-in on
other samples using your preferred settings. If you do not choose to
annotate functions and/or arguments, you will not be able to see the
respective descriptions by hovering over the element.
Figure 7: The plug-in’s configuration
window showing the default settings
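Persisting the chosen options between runs could look roughly like the sketch below. The JSON format, the option names, and both helper functions are assumptions made for illustration; the source only states that settings are stored in a configuration file in the plug-in's directory and that all three annotations are enabled by default.

```python
import json
from pathlib import Path

# Defaults as described in the text; the key names are hypothetical.
DEFAULTS = {
    "annotate_functions": True,
    "annotate_arguments": True,
    "rename_constants": True,
}


def load_settings(path):
    """Return stored settings, falling back to the defaults."""
    settings = dict(DEFAULTS)
    config = Path(path)
    if config.is_file():
        settings.update(json.loads(config.read_text(encoding="utf-8")))
    return settings


def save_settings(path, settings):
    """Write the settings so the next run starts from them."""
    Path(path).write_text(json.dumps(settings, indent=2), encoding="utf-8")
```

Starting from the defaults and overlaying the stored values means an older settings file keeps working if a new option is added later.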
When you choose to use repeatable comments for function name
annotations, the description is visible in the disassembly listing, as
shown in Figure 8.
Figure 8: The plug-in’s preview of
function annotations with repeatable comments
Similar Tools and Known Limitations
Parts of our solution were inspired by existing IDA Pro plug-ins,
such as IDAScope and IDAAPIHelp. A special thank you goes out to Zynamics for their MSDN crawler and the IDA importer which greatly supported our development.
Our plug-in has mainly been tested on IDA Pro for Windows, though it
should work on all platforms. Due to the structure of the MSDN
documentation and limitations of the MSDN crawler, not all constants
can be parsed automatically. When you encounter missing information
you can extend the annotation database by placing files with
supplemental information into the MSDN_data directory. In order to be
processed correctly, they have to be valid XML following the schema
given in the main database file (msdn_data.xml). However, if you only
want to extend the information for a function that already exists in
the database, you only have to add the additional fields. Name tags are
mandatory for this, as they are used to identify the respective element.
For example, if the parser did not recognize a commonly used
constant, we could add the information manually. For the CreateFileA
function’s dwDesiredAccess argument the additional information could
look similar to Listing 1.
Listing 1: Additional information enhancing the dwDesiredAccess
argument for the CreateFileA function
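To illustrate why name tags are needed, here is a minimal sketch of merging a supplemental file into existing data by matching on names. The tag names in the two XML snippets and the `merge_by_name` helper are assumptions for this example; consult msdn_data.xml for the actual schema. Only the GENERIC_ALL value is taken from the Windows headers.

```python
import xml.etree.ElementTree as ET

# Hypothetical excerpts; the real schema is defined by msdn_data.xml.
MAIN = """<msdn><functions><function>
  <name>CreateFileA</name>
  <arguments><argument>
    <name>dwDesiredAccess</name>
    <description>The requested access to the file or device.</description>
  </argument></arguments>
</function></functions></msdn>"""

EXTRA = """<msdn><functions><function>
  <name>CreateFileA</name>
  <arguments><argument>
    <name>dwDesiredAccess</name>
    <constants><constant>
      <name>GENERIC_ALL</name>
      <value>0x10000000</value>
    </constant></constants>
  </argument></arguments>
</function></functions></msdn>"""


def merge_by_name(base, extra):
    """Merge extra into base, matching elements by tag and <name> text."""
    for child in extra:
        name = child.findtext("name")
        if name is not None:
            match = next((c for c in base.findall(child.tag)
                          if c.findtext("name") == name), None)
        else:
            match = base.find(child.tag)
        if match is None:
            base.append(child)           # new information: add it
        elif len(child):
            merge_by_name(match, child)  # same element: merge its fields
        # An existing leaf (such as <name>) is left unchanged.


root = ET.fromstring(MAIN)
merge_by_name(root, ET.fromstring(EXTRA))
print(ET.tostring(root, encoding="unicode"))
```

Without the name tags, the merge could not tell which function or argument the extra fields belong to, and it would have to append a duplicate entry instead.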
Conclusion
In this post, we showed how you can generate an MSDN database file
used by our plug-in to automatically annotate information about
functions, arguments and constants into IDA Pro’s disassembly.
Furthermore, we talked about how the plug-in works, and how you can
configure and customize it. We hope this speeds up your analysis process!
Stay tuned for the FLARE Team’s next post where we will release
solutions for the FLARE On Challenge (www.flare-on.com).