This article provides an overview of ten Python libraries that are useful for malware analysis and reverse engineering, along with code examples for each library.
Abstract
The article begins by highlighting the versatility of Python in the field of cybersecurity, noting that it is a language that can be used for a wide range of tasks, from automation to app development. The author then introduces ten Python libraries that are particularly useful for malware analysis and reverse engineering. These libraries include PEfile, Lief, Capstone, Unicorn, Frida-Python, Ctypes, Yara-Python, Struct, Qiling, and Rz-pipe. Each library is described briefly, and code examples are provided to demonstrate how they can be used. The article concludes by encouraging readers to explore the author's Jupyter collection for more details on using Python for malware analysis and threat intelligence.
Opinions
The author believes that Python is a versatile language that is particularly useful in the field of cybersecurity.
The author recommends using the ten Python libraries described in the article for malware analysis and reverse engineering.
The author encourages readers to explore their Jupyter collection for more details on using Python for malware analysis and threat intelligence.
The author suggests that readers can support their work by buying them a coffee or subscribing via the Medium affiliate link.
The author invites readers to share their own favorite Python tools for malware analysis on Twitter.
10 Python Libraries for Malware Analysis and Reverse Engineering
With code examples!
It is no secret that Python is probably one of the most versatile languages in cybersecurity. It can help you build automation, small snippets, and even larger applications. Even if you don’t like to code, at some point in your cybersecurity career you will find Python useful. I may be biased, but there isn’t a single workday where I don’t use Python, even for small tasks. 🤓
The good thing about Python is that it is easy to use, and its community is very good at providing tools. Whatever you do, whether web development, system administration, electronics, or malware analysis, there is a Python library to cover you!
In this article, we will review ten very useful Python libraries for malware analysis and reverse engineering, with sample code that you can easily reuse. Stay until the end, you won’t regret it!
1 — PEfile
PEfile is a library for parsing the Portable Executable (PE) format. It is very useful for malware analysis because it lets you extract information about a file, such as the import table, header fields, and more. It also offers some packer detection based on PEiD signatures.
After installing PEfile, you can start using it with the following piece of code. Once the PE is loaded, its structure is parsed and you can extract information from it. The screenshot below shows the output.
pip install pefile
Pefile
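Here is a minimal sketch of that workflow, assuming a sample path of your own. The helper names `section_name` and `summarize_pe` are made up for illustration; the attributes they read (`FILE_HEADER`, `OPTIONAL_HEADER`, `sections`, `DIRECTORY_ENTRY_IMPORT`) come from pefile itself.

```python
def section_name(raw: bytes) -> str:
    """Decode a PE section name, which is NUL-padded to 8 bytes."""
    return raw.rstrip(b"\x00").decode("utf-8", errors="replace")


def summarize_pe(path: str) -> None:
    """Print a few headers, the sections, and the imports of a PE file."""
    import pefile  # third-party: pip install pefile

    pe = pefile.PE(path)
    print(f"Machine:     {hex(pe.FILE_HEADER.Machine)}")
    print(f"Entry point: {hex(pe.OPTIONAL_HEADER.AddressOfEntryPoint)}")
    print(f"Image base:  {hex(pe.OPTIONAL_HEADER.ImageBase)}")

    for section in pe.sections:
        print(
            section_name(section.Name),
            hex(section.VirtualAddress),
            section.SizeOfRawData,
            round(section.get_entropy(), 2),
        )

    # DIRECTORY_ENTRY_IMPORT only exists when the file has an import table.
    for entry in getattr(pe, "DIRECTORY_ENTRY_IMPORT", []):
        print(entry.dll.decode())
        for imp in entry.imports:
            # Functions imported by ordinal have no name.
            name = imp.name.decode() if imp.name else f"ordinal {imp.ordinal}"
            print(f"    {name}")
```

Calling `summarize_pe("sample.exe")` (a hypothetical file name) prints the headers first, then one line per section, then each imported DLL with its functions.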
2 — Lief
Lief is another Portable Executable parser that can also parse Mach-O and ELF files. Lief has more features than PEfile, which makes it a great addition to your toolbox for analyzing malicious executables.
To load an ELF or PE file, install and import the library, then simply call its parse function.
pip install lief
Lief
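A short sketch of loading a file with `lief.parse` could look like this. The function names and the entropy threshold of 7.0 are assumptions for illustration (high entropy is only a rough hint that a section may be packed or encrypted), and the exact attributes available can vary between Lief versions.

```python
def looks_packed(entropy: float, threshold: float = 7.0) -> bool:
    """Rough heuristic (an assumption, not a Lief feature): flag high-entropy sections."""
    return entropy >= threshold


def describe_binary(path: str) -> None:
    """Print the format, entry point, and sections of a PE, ELF, or Mach-O file."""
    import lief  # third-party: pip install lief

    binary = lief.parse(path)
    if binary is None:  # recent versions return None for unrecognized input
        raise ValueError(f"{path} could not be parsed by Lief")

    print(f"Format:      {binary.format}")
    print(f"Entry point: {hex(binary.entrypoint)}")
    for section in binary.sections:
        flag = "  <- high entropy" if looks_packed(section.entropy) else ""
        print(f"{section.name:<12} size={section.size:<10} entropy={section.entropy:.2f}{flag}")
```

The same `describe_binary` call works for any of the supported formats, because these attributes belong to the abstract binary object that `lief.parse` returns.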
3 — Capstone
Capstone Engine is a framework for binary disassembly, and I have been using it for malware analysis for a while. It is a powerful library that lets you disassemble binaries from your own scripts. It is particularly useful if you want to automate parts of your reverse engineering or identify known patterns, such as evasion techniques.
The example below shows how to load and disassemble an executable file.
pip install capstone
Capstone
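As a minimal sketch, the snippet below disassembles a few raw x86-64 bytes rather than a whole executable. In practice the bytes would come from a code section of your sample (for example, read with PEfile or Lief). The `CODE` bytes, the base address, and the `disassemble` helper are illustrative choices.

```python
# x86-64 function prologue: push rbp; mov rbp, rsp; sub rsp, 0x20
CODE = bytes.fromhex("55" "4889e5" "4883ec20")


def disassemble(code: bytes, base: int = 0x1000) -> list:
    """Return one formatted line per instruction decoded from `code`."""
    from capstone import Cs, CS_ARCH_X86, CS_MODE_64  # pip install capstone

    md = Cs(CS_ARCH_X86, CS_MODE_64)
    return [
        f"0x{insn.address:x}:\t{insn.mnemonic}\t{insn.op_str}"
        for insn in md.disasm(code, base)
    ]
```

Printing each line of `disassemble(CODE)` gives a small listing with the address, the mnemonic, and the operands of every instruction.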
4 — Unicorn
Unicorn Engine is a multi-architecture CPU emulator. It lets you emulate a piece of code at the CPU level, which is particularly useful for shellcode analysis, for example.
I personally use it to analyse some values at the CPU level, but the possibilities are endless. The code below is taken from the official website and is a really good starting point.
pip install unicorn
Unicorn Engine
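The sketch below follows the same pattern as the introductory sample in the Unicorn documentation: map some memory, write two x86 instructions, set registers, and run. Wrapping it in an `emulate` function is my own choice so the result can be reused.

```python
X86_CODE32 = b"\x41\x4a"  # INC ecx; DEC edx
ADDRESS = 0x1000000       # arbitrary address where the code is mapped


def emulate() -> tuple:
    """Run X86_CODE32 in a 32-bit x86 emulator and return (ecx, edx)."""
    from unicorn import Uc, UC_ARCH_X86, UC_MODE_32  # pip install unicorn
    from unicorn.x86_const import UC_X86_REG_ECX, UC_X86_REG_EDX

    mu = Uc(UC_ARCH_X86, UC_MODE_32)
    mu.mem_map(ADDRESS, 2 * 1024 * 1024)  # map 2 MB for this emulation
    mu.mem_write(ADDRESS, X86_CODE32)

    # Initial register values.
    mu.reg_write(UC_X86_REG_ECX, 0x1234)
    mu.reg_write(UC_X86_REG_EDX, 0x7890)

    mu.emu_start(ADDRESS, ADDRESS + len(X86_CODE32))
    return mu.reg_read(UC_X86_REG_ECX), mu.reg_read(UC_X86_REG_EDX)
```

After the run, ECX has been incremented to 0x1235 and EDX decremented to 0x788f, which is an easy way to check that the emulation did what you expected.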
5 — Frida-Python
Frida is a dynamic binary instrumentation framework. Binary instrumentation is very powerful for malware analysis: it can be used to extract useful information such as IP addresses, registry keys, and mutexes. It can also be useful for unpacking malware.
The example below shows how to use it for mutex extraction.
pip install frida
Frida Python
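One possible way to extract mutex names is to hook `CreateMutexW` and `OpenMutexW` in kernel32.dll and report their third argument (`lpName`). This is only a sketch for a Windows target: the choice of these two APIs, the helper names, and the message layout are assumptions, and the JavaScript part may need small adjustments depending on your Frida version.

```python
import sys

# JavaScript injected into the target process by Frida.
HOOK_SCRIPT = """
const kernel32 = Process.getModuleByName('kernel32.dll');
for (const name of ['CreateMutexW', 'OpenMutexW']) {
    Interceptor.attach(kernel32.getExportByName(name), {
        onEnter(args) {
            // lpName is the third argument of both functions and may be NULL.
            const lpName = args[2];
            send({ api: name, mutex: lpName.isNull() ? null : lpName.readUtf16String() });
        }
    });
}
"""


def format_message(message: dict) -> str:
    """Turn a Frida message into one printable line."""
    if message.get("type") == "send":
        payload = message["payload"]
        return f"{payload['api']}: {payload['mutex']}"
    return str(message)  # errors raised inside the injected script


def on_message(message, data):
    print(format_message(message))


def trace_mutexes(target: str) -> None:
    """Spawn `target` suspended, install the hooks, then let it run."""
    import frida  # third-party: pip install frida

    pid = frida.spawn([target])
    session = frida.attach(pid)
    script = session.create_script(HOOK_SCRIPT)
    script.on("message", on_message)
    script.load()
    frida.resume(pid)
    sys.stdin.read()  # keep the script alive until input is closed
```

Spawning the process suspended before attaching means the hooks are in place before the sample executes its first instruction, so early mutex creation is not missed.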
6 — Ctypes
Ctypes is a module from the Python standard library that lets a Python program call functions located in a compiled library, generally written in C (.dll on Windows and .so on Linux).
Basically, it can be used to load compiled functions and use C code from Python. On Windows, it is very useful for calling the Windows API, for example, but also for rewriting pseudocode.
The example below shows how to start a process in suspended mode.
No installation is needed: ctypes ships with the Python standard library.
Ctypes
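A sketch of starting a suspended process with `CreateProcessW` and the `CREATE_SUSPENDED` flag might look like the following. The structures mirror the Windows `STARTUPINFOW` and `PROCESS_INFORMATION` layouts using plain ctypes types; the `create_suspended` helper is a name chosen for this example, and the call itself only works on Windows.

```python
import ctypes
import sys

CREATE_SUSPENDED = 0x00000004  # process creation flag from the Windows API


class STARTUPINFOW(ctypes.Structure):
    _fields_ = [
        ("cb", ctypes.c_uint32),
        ("lpReserved", ctypes.c_wchar_p),
        ("lpDesktop", ctypes.c_wchar_p),
        ("lpTitle", ctypes.c_wchar_p),
        ("dwX", ctypes.c_uint32),
        ("dwY", ctypes.c_uint32),
        ("dwXSize", ctypes.c_uint32),
        ("dwYSize", ctypes.c_uint32),
        ("dwXCountChars", ctypes.c_uint32),
        ("dwYCountChars", ctypes.c_uint32),
        ("dwFillAttribute", ctypes.c_uint32),
        ("dwFlags", ctypes.c_uint32),
        ("wShowWindow", ctypes.c_uint16),
        ("cbReserved2", ctypes.c_uint16),
        ("lpReserved2", ctypes.c_void_p),
        ("hStdInput", ctypes.c_void_p),
        ("hStdOutput", ctypes.c_void_p),
        ("hStdError", ctypes.c_void_p),
    ]


class PROCESS_INFORMATION(ctypes.Structure):
    _fields_ = [
        ("hProcess", ctypes.c_void_p),
        ("hThread", ctypes.c_void_p),
        ("dwProcessId", ctypes.c_uint32),
        ("dwThreadId", ctypes.c_uint32),
    ]


def create_suspended(command_line: str) -> PROCESS_INFORMATION:
    """Start `command_line` with its main thread suspended (Windows only)."""
    if sys.platform != "win32":
        raise OSError("CreateProcessW is only available on Windows")

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.CreateProcessW.restype = ctypes.c_int
    kernel32.CreateProcessW.argtypes = [
        ctypes.c_wchar_p, ctypes.c_wchar_p, ctypes.c_void_p, ctypes.c_void_p,
        ctypes.c_int, ctypes.c_uint32, ctypes.c_void_p, ctypes.c_wchar_p,
        ctypes.POINTER(STARTUPINFOW), ctypes.POINTER(PROCESS_INFORMATION),
    ]

    startup = STARTUPINFOW()
    startup.cb = ctypes.sizeof(startup)
    info = PROCESS_INFORMATION()
    # CreateProcessW may modify the command line, so pass a writable buffer.
    cmd = ctypes.create_unicode_buffer(command_line)

    ok = kernel32.CreateProcessW(
        None, cmd, None, None, False, CREATE_SUSPENDED,
        None, None, ctypes.byref(startup), ctypes.byref(info),
    )
    if not ok:
        raise ctypes.WinError(ctypes.get_last_error())
    return info
```

On Windows, `create_suspended("notepad.exe")` returns a structure whose `dwProcessId` identifies the new process, which stays paused until its main thread is resumed.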
7 — Yara-Python
Yara is one of the most widely used tools in malware research. It is used to write detection signatures and is very useful for malware hunting. The Python library lets you use Yara in your scripts with your own set of rules.
If you are not familiar with Yara, I recommend having a look at my cheat sheet here.
pip install yara-python
Yara Python
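As a minimal sketch, the snippet below compiles a rule from a string and scans a byte buffer with it. The rule `Demo_Marker` and its marker string are invented for this example, not a real detection signature.

```python
# A toy rule: it matches any data containing the marker string (ASCII or UTF-16).
RULE_SOURCE = """
rule Demo_Marker
{
    strings:
        $marker = "DEMO-MALWARE-MARKER" ascii wide
    condition:
        $marker
}
"""


def scan_bytes(data: bytes) -> list:
    """Return the names of the rules in RULE_SOURCE that match `data`."""
    import yara  # third-party: pip install yara-python

    rules = yara.compile(source=RULE_SOURCE)
    return [match.rule for match in rules.match(data=data)]
```

To scan a file on disk instead of a buffer, the same compiled rules can be used with `rules.match(filepath=...)`, and rules stored in a file can be loaded with `yara.compile(filepath=...)`.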
8 — Struct
Python’s struct module is used to convert native Python data types such as strings and numbers to a byte string and vice versa. It is mainly used to manage binary data stored in files or from network connections, among other sources.
It is very useful in malware analysis when you have to deal with several data types and conversions between them.
To use it, you need to know a few basics that will help you convert the data: how to represent byte order, size, and alignment, as well as the format characters.
Byte Order, Size, and Alignment
Format Characters
Struct Example
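A small sketch of packing and unpacking binary data with struct is shown below. The 10-byte record layout (magic, version, length) is made up for illustration; the byte order prefixes and format characters in the comments are the ones defined by the struct module.

```python
import struct

# Byte order prefixes: "<" little-endian, ">" big-endian, "!" network (big-endian),
# "=" native order with standard sizes, "@" native order, size, and alignment (default).
# Format characters used here: "4s" 4 raw bytes, "H" unsigned 16-bit, "I" unsigned 32-bit.

# Hypothetical record header: 4-byte magic, 16-bit version, 32-bit payload length.
HEADER = struct.Struct("<4sHI")


def build_header(magic: bytes, version: int, length: int) -> bytes:
    """Serialize the three fields into 10 little-endian bytes."""
    return HEADER.pack(magic, version, length)


def parse_header(blob: bytes) -> tuple:
    """Read the three fields back from the start of `blob`."""
    return HEADER.unpack_from(blob)


raw = build_header(b"CFG1", 2, 0x1000)
print(raw.hex())          # 43464731020000100000
print(parse_header(raw))  # (b'CFG1', 2, 4096)

# Four big-endian bytes (here 192.168.1.10) read as one unsigned integer.
print(struct.unpack(">I", bytes.fromhex("c0a8010a"))[0])  # 3232235786
```

Using an explicit prefix such as "<" avoids the padding that native alignment would otherwise insert between fields, which matters when you parse structures extracted from a sample.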
9 — Qiling
Qiling is a nice binary emulation framework. It can be used for malware analysis by emulating a Portable Executable, an ELF file, or even shellcode.
Once you have obtained the Windows DLLs and the registry, you can start emulating an executable.
To emulate an executable, all you need to do is provide the path to your PE and the path to the Windows DLLs. You can also find one of my scripts here that extracts APIs resolved dynamically through GetProcAddress.
pip install qiling
Qiling Calc.exe Emulation
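A minimal emulation sketch, assuming you already have a rootfs directory that contains the Windows DLLs and registry, could look like this. The `emulate_pe` helper and the example paths are hypothetical; `Qiling([...], rootfs)` followed by `run()` is the basic pattern from the Qiling documentation.

```python
import os


def emulate_pe(sample_path: str, rootfs: str) -> None:
    """Emulate the executable at `sample_path` using the files under `rootfs`."""
    if not os.path.isfile(sample_path):
        raise FileNotFoundError(sample_path)
    if not os.path.isdir(rootfs):
        raise NotADirectoryError(rootfs)

    from qiling import Qiling  # third-party: pip install qiling
    from qiling.const import QL_VERBOSE

    # The first argument is the argv of the emulated program.
    ql = Qiling([sample_path], rootfs, verbose=QL_VERBOSE.DEFAULT)
    ql.run()
```

A call such as `emulate_pe("rootfs/x86_windows/bin/calc.exe", "rootfs/x86_windows")` (both paths are placeholders) logs the emulated API calls while the program runs.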
10 — Rz-pipe
Finally, Rz-pipe is a Python wrapper for Rizin, an open-source disassembler forked from Radare2. The wrapper lets you run the usual commands from Python, which is very handy for automation and analysis. It requires Rizin to be installed and available in your PATH. Once it is installed, you can easily use it to disassemble any binary.
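Here is a short sketch of driving Rizin from Python, assuming the wrapper is installed from PyPI as `rzpipe` and that the `rizin` binary is reachable. The `disassemble_entry` helper is a name chosen for this example; `aaa` and `pdf @ entry0` are the usual analysis and disassembly commands.

```python
import os


def disassemble_entry(path: str) -> str:
    """Analyze the binary at `path` and return the disassembly of its entry function."""
    if not os.path.isfile(path):
        raise FileNotFoundError(path)

    import rzpipe  # third-party: pip install rzpipe (needs rizin in PATH)

    pipe = rzpipe.open(path)
    try:
        pipe.cmd("aaa")                   # run the automatic analysis
        return pipe.cmd("pdf @ entry0")   # print the function at the entry point
    finally:
        pipe.quit()                       # always stop the rizin process
```

Any command you would type in the Rizin shell can be sent through `cmd` in the same way, and `cmdj` returns the parsed result of commands that produce JSON.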
Well done! Thank you for reading this article. I hope it has inspired you for your daily work and helps you sharpen your Python skills for cybersecurity.
If you read it all, it probably means you want to learn more about Python applied to cybersecurity. For a quick start, I recommend checking out my Jupyter collection, where you’ll get more details on using Python for malware analysis and threat intelligence. In addition, you will find a notebook containing all the code samples in this article.
In the meantime, if you want to stay up to date with my content and future publications, you can subscribe to my newsletter. More Python content for cybersecurity is coming. 😉