Malware Bypass Research using Reinforcement Learning

Overview

MalwareRL

Background

This is a malware manipulation environment built on OpenAI's gym. The core idea is based on the paper "Learning to Evade Static PE Machine Learning Malware Models via Reinforcement Learning" (paper). I am extending the original repo because:

  1. It is no longer maintained
  2. It uses Python2 and an outdated version of LIEF
  3. I wanted to integrate new Malware gym environments and additional manipulations

Over the past three years, several breakthrough open-source projects have been published in the security ML space. In particular, Ember (Endgame Malware BEnchmark for Research) (paper) and MalConv: Malware detection by eating a whole exe (paper) have given security researchers the ability to develop sophisticated, reproducible models that emulate features and techniques found in next-generation antivirus (NGAV) products.

MalwareRL Gym Environment

MalwareRL exposes gym environments for both Ember and MalConv so that researchers can develop Reinforcement Learning agents to bypass malware classifiers. Actions comprise a variety of non-breaking modifications (i.e. the binary will still execute) to the PE header, sections, imports, and overlay; they are listed below.

Action Space

ACTION_TABLE = {
    'modify_machine_type': 'modify_machine_type',
    'pad_overlay': 'pad_overlay',
    'append_benign_data_overlay': 'append_benign_data_overlay',
    'append_benign_binary_overlay': 'append_benign_binary_overlay',
    'add_bytes_to_section_cave': 'add_bytes_to_section_cave',
    'add_section_strings': 'add_section_strings',
    'add_section_benign_data': 'add_section_benign_data',
    'add_strings_to_overlay': 'add_strings_to_overlay',
    'add_imports': 'add_imports',
    'rename_section': 'rename_section',
    'remove_debug': 'remove_debug',
    'modify_optional_header': 'modify_optional_header',
    'modify_timestamp': 'modify_timestamp',
    'break_optional_header_checksum': 'break_optional_header_checksum',
    'upx_unpack': 'upx_unpack',
    'upx_pack': 'upx_pack'
}
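
As a rough sketch of how an agent's integer action could be turned into one of the manipulations above: the snippet assumes (this is not stated in the text) that a discrete action index selects the key at that position in ACTION_TABLE. Only a few entries are copied here to keep it short, and action_name is a hypothetical helper, not part of the project.

```python
# A few entries copied from the ACTION_TABLE above (illustrative subset only).
ACTION_TABLE = {
    "modify_machine_type": "modify_machine_type",
    "pad_overlay": "pad_overlay",
    "append_benign_data_overlay": "append_benign_data_overlay",
    "upx_pack": "upx_pack",
}

# Assumption: integer action i selects the i-th key in insertion order.
ACTION_LOOKUP = dict(enumerate(ACTION_TABLE.keys()))


def action_name(index: int) -> str:
    """Translate a discrete action index into its manipulation name (hypothetical helper)."""
    if index not in ACTION_LOOKUP:
        raise ValueError(
            f"action index {index} is outside 0..{len(ACTION_LOOKUP) - 1}"
        )
    return ACTION_LOOKUP[index]
```

With this subset, action_name(1) yields 'pad_overlay'; an out-of-range index raises ValueError rather than silently picking a manipulation.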

Observation Space

The observation_space of each gym environment is an array representing the feature vector. For Ember this is a numpy.array of length 2381; for MalConv it is a numpy.array of length 1024**2 (1,048,576). The MalConv gym presents an opportunity to try RL techniques that generalize learning across large state spaces.
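
To give a feel for the fixed observation sizes, here is a minimal sketch of squeezing a binary of arbitrary size into a fixed-length vector. It is an assumption, not the project's documented preprocessing: the text only gives the lengths, so the truncate-and-pad strategy, the PAD_VALUE of 256, and the to_fixed_length helper are all hypothetical. A plain list is used instead of numpy to keep the sketch dependency-free.

```python
MALCONV_INPUT_LEN = 1024 ** 2  # 1,048,576 entries, the MalConv length given above
EMBER_FEATURE_LEN = 2381       # the Ember feature-vector length given above
PAD_VALUE = 256                # assumption: a value outside 0..255 marks padding


def to_fixed_length(raw: bytes, length: int = MALCONV_INPUT_LEN,
                    pad: int = PAD_VALUE) -> list:
    """Truncate or right-pad raw bytes so every observation has the same length."""
    values = list(raw[:length])                     # keep at most `length` bytes
    values.extend([pad] * (length - len(values)))   # pad short inputs on the right
    return values
```

For example, to_fixed_length(b"MZ\x90", length=8) keeps the three byte values and fills the remaining five slots with the padding value.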

Agents

A baseline agent, RandomAgent, is provided to demonstrate how to interact with the gym environments and what output to expect. This agent attempts to evade the classifier by randomly selecting an action. The process is repeated up to the length of a game (e.g. 50 modifications). If the modified binary scores below the classifier threshold, we register it as an evasion. In many ways the RandomAgent acts as a fuzzer, trying a bunch of actions with no regard for minimizing the modifications to the resulting binary.
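
The loop described above can be sketched roughly as follows. StubEnv is a hypothetical stand-in with toy dynamics, not the real MalwareRL environment: it only mimics the classic gym reset/step shape so that the random-action loop, the game-length cap, and the evasion bookkeeping can run without models or malware samples. The threshold, reward value, and score decay are arbitrary assumptions.

```python
import random

MAXTURNS = 50  # game length, following the "e.g. 50" in the text above


class StubEnv:
    """Hypothetical stand-in for a MalwareRL gym environment (not the real API)."""

    def __init__(self, n_actions=16, threshold=0.5, seed=0):
        self.n_actions = n_actions
        self.threshold = threshold
        self.rng = random.Random(seed)
        self.score = 1.0

    def reset(self):
        self.score = 1.0  # the sample starts out scored as fully malicious
        return [self.score]

    def step(self, action):
        # Toy dynamics: every modification lowers the score by a random amount.
        self.score = max(0.0, self.score - self.rng.uniform(0.0, 0.2))
        evaded = self.score < self.threshold
        reward = 10.0 if evaded else 0.0  # arbitrary reward for an evasion
        return [self.score], reward, evaded, {}


def run_random_agent(env, episodes=20, seed=0):
    """Play episodes with uniformly random actions; return (evasion_rate, avg_ep_len)."""
    rng = random.Random(seed)
    evasions, total_steps = 0, 0
    for _ in range(episodes):
        env.reset()
        for turn in range(1, MAXTURNS + 1):
            action = rng.randrange(env.n_actions)
            _, _, done, _ = env.step(action)
            if done:  # score fell below the threshold: count an evasion
                evasions += 1
                break
        total_steps += turn  # equals MAXTURNS when the episode never evades
    return evasions / episodes, total_steps / episodes
```

Calling run_random_agent(StubEnv()) returns the two quantities reported in Table 1, evasion rate and average episode length, although the numbers from this toy environment say nothing about real classifiers.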

Additional agents will be developed and made available (both model and code) in the coming weeks.

Table 1: Evasion Rate against Ember Holdout Dataset*

gym      agent        evasion_rate  avg_ep_len
ember    RandomAgent  89.2%         8.2
malconv  RandomAgent  88.5%         16.33

* 250 random samples

Setup

To get malware_rl up and running you will need the following external dependencies:

  • LIEF
  • Ember, MalConv and SOREL-20M models. All of these need to be placed in the malware_rl/envs/utils/ directory.

    The SOREL-20M model must be downloaded with the aws-cli. In the AWS S3 bucket, look in the sorel-20m-model/checkpoints/lightGBM folder and pick any of the models in the seed folders. Rename the model file to sorel.model and place it in malware_rl/envs/utils alongside the other models.

  • UPX has been added to support pack/unpack modifications. Download the binary here and place it in the malware_rl/envs/controls directory.
  • Benign binaries - a small set of "trusted" binaries (e.g. grabbed from a base Windows installation). You can download some via the MSFT website (example). Store these binaries in malware_rl/envs/controls/trusted
  • Run the strings command on those binaries and save the output as .txt files in malware_rl/envs/controls/good_strings
  • Download a set of malware from VirusShare or VirusTotal. I just used a list of hashes from the Ember dataset.
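
If the strings utility is not available (for example on Windows), the strings step above can be approximated in Python. This is a sketch under assumptions: it extracts runs of at least four printable ASCII characters, which roughly matches the default behaviour of strings but is not identical to it, and both extract_strings and dump_strings are hypothetical helpers rather than part of the project.

```python
import re
from pathlib import Path

# Runs of at least four printable ASCII characters (roughly what `strings` reports).
PRINTABLE_RUN = re.compile(rb"[\x20-\x7e]{4,}")


def extract_strings(data: bytes) -> list:
    """Return the printable ASCII runs found in a blob of bytes."""
    return [match.decode("ascii") for match in PRINTABLE_RUN.findall(data)]


def dump_strings(trusted_dir: str, out_dir: str) -> int:
    """Write one <name>.txt per file in trusted_dir; return the number of files written."""
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    written = 0
    for binary in sorted(Path(trusted_dir).iterdir()):
        if not binary.is_file():
            continue  # skip sub-directories
        strings = extract_strings(binary.read_bytes())
        target = out_path / f"{binary.name}.txt"
        target.write_text("\n".join(strings), encoding="ascii")
        written += 1
    return written
```

A call such as dump_strings("malware_rl/envs/controls/trusted", "malware_rl/envs/controls/good_strings") would then fill the good_strings directory from the trusted binaries.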

Note: The helper script download_deps.py can be used as a quickstart to get most of the key dependencies setup.

I used a conda environment set up for Python 3.7:

conda create -n malware_rl python=3.7

Finally, install the Python 3 dependencies listed in requirements.txt:

pip3 install -r requirements.txt

References

There are a bunch of good papers and blog posts on manipulating binaries to evade ML classifiers. I compiled a few that inspired portions of this project below. I have inevitably left out other pertinent research, so if something should be in here, let me know in a GitHub issue or hit me up on Twitter (@filar).

Papers

  • Demetrio, Luca, et al. "Efficient Black-box Optimization of Adversarial Windows Malware with Constrained Manipulations." arXiv preprint arXiv:2003.13526 (2020). (paper)
  • Demetrio, Luca, et al. "Adversarial EXEmples: A Survey and Experimental Evaluation of Practical Attacks on Machine Learning for Windows Malware Detection." arXiv preprint arXiv:2008.07125 (2020). (paper)
  • Song, Wei, et al. "Automatic Generation of Adversarial Examples for Interpreting Malware Classifiers." arXiv preprint arXiv:2003.03100 (2020). (paper)
  • Suciu, Octavian, Scott E. Coull, and Jeffrey Johns. "Exploring adversarial examples in malware detection." 2019 IEEE Security and Privacy Workshops (SPW). IEEE, 2019. (paper)
  • Fleshman, William, et al. "Static malware detection & subterfuge: Quantifying the robustness of machine learning and current anti-virus." 2018 13th International Conference on Malicious and Unwanted Software (MALWARE). IEEE, 2018. (paper)
  • Pierazzi, Fabio, et al. "Intriguing properties of adversarial ML attacks in the problem space." 2020 IEEE Symposium on Security and Privacy (SP). IEEE, 2020. (paper/code)
  • Fang, Zhiyang, et al. "Evading anti-malware engines with deep reinforcement learning." IEEE Access 7 (2019): 48867-48879. (paper)

Blog Posts

Talks

  • 42: The answer to life the universe and everything offensive security by Will Pearce, Nick Landers (slides)
  • Bot vs. Bot: Evading Machine Learning Malware Detection by Hyrum Anderson (slides)
  • Trying to Make Meterpreter into an Adversarial Example by Andy Applebaum (slides)
Owner
Bobby Filar
Security Data Science @ Elastic