Part 3 - YARA Rule Engineering and Key Modules.

Approach

To understand the concept of rule engineering, we can break it down into three key phases.

First, we start by identifying the type of target file we’re dealing with. Next, we look for interesting artifacts that can be used to craft the detection rules. Finally, we convert these identified artifacts into patterns.

Building blocks for Rules

A well-written YARA rule usually contains several sections, and some of them can be skipped when necessary. The only mandatory section is “condition”. However, I strongly recommend also including the “meta” and “strings” sections whenever they apply.

Believe me, if someone reviews your rule later, they will appreciate having the complete context and rationale behind it.
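
To make the building blocks concrete, here is a minimal skeleton. It is only a sketch: the rule name, the meta keys and values, and the marker string are all hypothetical placeholders, and meta keys are free-form, so use whatever fields your team finds useful.

```yara
rule Example_Rule_Skeleton
{
    meta:
        // Free-form key/value context for whoever reviews the rule later.
        author      = "analyst-name"          // hypothetical value
        description = "Illustrative skeleton showing the common sections"
        date        = "2024-01-01"            // hypothetical value
        reference   = "internal-ticket-1234"  // hypothetical value

    strings:
        // Patterns to look for; each one is referenced by its $identifier.
        $s1 = "example-marker-string"

    condition:
        // The only mandatory section: a boolean expression that decides the match.
        $s1
}
```

The meta section has no effect on matching; it exists purely to document the rule.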

Here are some key concepts I’ll cover in this section. However, it’s important to note that covering every nuance of rule engineering is not feasible. For a more comprehensive understanding, I recommend the official YARA documentation.

Define Patterns

Patterns are the core of any rule, and the effectiveness of your rules largely depends on your expertise in file analysis. There are three primary ways to define patterns within a rule:

  • ASCII Strings: Define patterns based on plain text strings found within the file.
  • Hexadecimal Byte Sequences: Define patterns based on specific sequences of bytes (this is one of the most common approaches).
  • Regular Expressions: Define complex patterns using regex.

Hex patterns are written inside { }, and regex patterns are written inside / /.
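
As a quick illustration, the sketch below declares one pattern of each kind. The rule name and the patterns are made up for the example; the hex bytes are simply the ASCII encoding of the word "malware".

```yara
rule Example_Pattern_Types
{
    strings:
        $text  = "malware"                  // text string, in double quotes
        $hex   = { 6D 61 6C 77 61 72 65 }   // the same bytes ("malware") as hex
        $regex = /mal[a-z]{1,4}e/           // regular expression, between slashes

    condition:
        any of them
}
```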

It’s good to know how hex patterns are written in the rule. I have covered a few commonly used patterns that may be useful when you get started with defining the rules.

Wildcards: { 6D 61 6C ?? ?? 72 65 }
Alternatives: { ( 6D | 7D ) 61 6C 77 61 72 65 }
Jumps: { 6D 61 6C [1-3] 65 }

Wildcards are used when you don’t know the exact value. You can use placeholder characters ?? to represent unknown bytes.

Alternatives can be used when a position may hold one of several known values. For example, ( AA | BB ) matches either the byte AA or the byte BB at that position.

Jumps are very useful when writing rules. There are different types of jumps you can define when creating rules as shown below.

[1] -> Jump exactly one byte, then look for the pattern defined after the jump.
[1-3] -> Jump between 1 and 3 bytes, then look for the pattern defined after the jump.
[40-80] -> Jump between 40 and 80 bytes, then look for the pattern defined after the jump.
[40-] -> Jump 40 or more bytes (no upper bound), then look for the pattern defined after the jump.
[-] -> Jump any number of bytes, from 0 to unbounded, then look for the pattern defined after the jump.
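
The sketch below puts wildcards, alternatives, and jumps into one rule. It assumes the target bytes spell "malware" in ASCII (6D 61 6C 77 61 72 65); the rule and string names are hypothetical.

```yara
rule Example_Hex_Pattern_Features
{
    strings:
        // "mal", any two bytes, then "re"
        $wildcard    = { 6D 61 6C ?? ?? 72 65 }
        // first byte is 0x6D or 0x7D, followed by "alware"
        $alternative = { ( 6D | 7D ) 61 6C 77 61 72 65 }
        // "mal", then 1 to 3 arbitrary bytes, then "e"
        $jump        = { 6D 61 6C [1-3] 65 }
        // "mal", then 4 or more arbitrary bytes, then "re"
        $open_jump   = { 6D 61 6C [4-] 72 65 }

    condition:
        any of them
}
```

Unbounded jumps such as [4-] can match across large parts of a file, so prefer a bounded range when you know roughly how far apart the two fragments are.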

Sometimes it can be challenging to write a pattern without using regular expressions, and crafting effective regex patterns has a significant learning curve. AI tools like ChatGPT can help draft regex patterns for you. Just remember to review any patterns generated by these tools before using them in your YARA rules.

Define Modifiers

To define a string pattern within a rule, the string must be declared as a variable whose identifier begins with $. You can then use special keywords, a.k.a. modifiers, to instruct the YARA engine on how to handle these string patterns. Some of the common modifiers I’ve used when writing rules are listed below. As mentioned multiple times in this series, refer to the official YARA documentation for more detailed and up-to-date information.

| Keyword | String Types | Notes |
| --- | --- | --- |
| ascii | Text, Regex | Match ASCII characters |
| nocase | Text, Regex | Ignore case; text strings in YARA are case-sensitive by default |
| wide | Text, Regex | Match UTF-16 characters, typical in many executable binaries |
| fullword | Text, Regex | Match only if delimited by non-alphanumeric characters |
| xor | Text | Search for strings with a single-byte XOR applied to them |
| base64 | Text | Search for strings that have been base64 encoded |
| base64wide | Text | Search for strings that have been base64 encoded, with the encoded result converted to wide |
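
Here is a small sketch of how these modifiers look in practice. The strings are illustrative choices of mine, not indicators from a real sample.

```yara
rule Example_String_Modifiers
{
    strings:
        $a = "powershell" nocase         // PowerShell, POWERSHELL, ...
        $b = "cmd.exe" ascii wide        // both the ASCII and the wide form
        $c = "admin" fullword            // "admin" but not "administrator"
        $d = "This program cannot" xor   // the string under any single-byte XOR key
        $e = "http://" base64            // base64-encoded forms of the string
        $f = /https?:\/\/[a-z0-9.\-]{4,64}/ nocase

    condition:
        any of them
}
```

Note that wide on its own replaces the ASCII search; combine it with ascii, as in $b, when you want both forms.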

Define conditions

The condition section is mandatory and, in practice, quite straightforward. The condition section defines the criteria for the rule to trigger a successful match. I’ve included a few examples below, which are self-explanatory. More complex examples will be covered in the upcoming section, where I’ll explain their use cases.

all of them
any of them
2 of ($a,$b,$c)
3 of them
4 of ($a*)
$a and not $b
(not $a) and (filesize > 0)
math.entropy(0, filesize) >= 7.0
filesize < 60KB and ( 1 of ($x*) or all of ($s*) )
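
To show how these fragments fit into a complete rule, here is a sketch that combines a size limit, string counting, and an entropy check. The rule name and all four strings are hypothetical; the one hard requirement is that math.entropy needs the math module to be imported.

```yara
import "math"   // required for math.entropy

rule Example_Combined_Condition
{
    strings:
        $x1 = "hypothetical-strong-indicator-1"
        $x2 = "hypothetical-strong-indicator-2"
        $s1 = "hypothetical-weak-indicator-1"
        $s2 = "hypothetical-weak-indicator-2"

    condition:
        filesize < 60KB and
        ( 1 of ($x*) or all of ($s*) ) and
        math.entropy(0, filesize) >= 7.0
}
```

Placing the cheap checks (filesize, string matches) before the entropy calculation is a common habit, since the expensive function then only has to run on files that already look interesting.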

Define Loops

Logical operators and loops can be used within the rule. Frankly speaking, I haven’t used loops much in most of my rules.

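for all of them : ( # > 2 )          // illustrative body: every string occurs more than twice
for all of ($a*) : ( @ < 1024 )      // illustrative body: every $a* string first appears before offset 1024
for any section in pe.sections : ( section.name == ".text" )
for any i in (0..pe.number_of_sections-1) : ( pe.sections[i].name == ".text" )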
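
A loop is only a fragment of a condition, so here is a sketch of a complete rule that uses both loop styles. The strings are hypothetical, and the thresholds are arbitrary values chosen for the example.

```yara
import "pe"

rule Example_Loops
{
    strings:
        $a1 = "hypothetical-marker-1"
        $a2 = "hypothetical-marker-2"

    condition:
        pe.is_pe and
        // for..of: every $a* string must occur at least twice
        for all of ($a*) : ( # >= 2 ) and
        // for..in: at least one section is named ".text"
        for any i in (0..pe.number_of_sections - 1) : (
            pe.sections[i].name == ".text"
        )
}
```

Inside a for..of loop, # stands for the number of occurrences of the current string and @ for the offset of its first occurrence.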

Define Scope

We can use a few keywords, such as global, private, and include, to define the scope of the rules.

The global keyword helps to enforce restrictions across all rules simultaneously. For example, if you want to apply a rule for files that are less than 1 MB, are Windows PE files, and are NOT signed, you can define these conditions using the global keyword instead of specifying them in each individual rule.

import "pe"

// Every other rule in this rule set can match only if this global rule matches.
global rule GlobalRule
{
    condition:
        filesize < 1MB and
        pe.is_pe and
        not pe.is_signed
}

rule rule1 { …. }
rule rule2 { …. }
rule rule3 { …. }

The private keyword suppresses reporting. A private rule is still evaluated and can be referenced by other rules, but YARA does not report it when it matches a file, which keeps the output uncluttered.

private rule rule1 {}
private rule rule2 {}
private rule rule3 {}
rule rule4{ …. }
rule rule5 { …. }
rule rule6 { …. }
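
Private rules are most useful as building blocks that other rules reference by name. The sketch below assumes a hypothetical helper, Is_Small_PE, and a hypothetical marker string; only the second rule would appear in the scan output.

```yara
private rule Is_Small_PE
{
    condition:
        uint16(0) == 0x5A4D and filesize < 1MB
}

rule Example_Uses_Private_Rule
{
    strings:
        $s1 = "hypothetical-marker"

    condition:
        // Reference the private rule by name, like a boolean variable.
        Is_Small_PE and $s1
}
```

A rule must be defined before the rule that references it, so keep helper rules at the top of the file.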

The include keyword helps organize rules across multiple rule files. For example, a webshell.yara file can include multiple webshell rule files, as shown below.

include "/sftp//yara/includes/c99.yar"
include "/sftp//yara/includes/chinaC.yar"
include "/sftp//yara/includes/asp.yar"
include "/sftp//yara/includes/php.yar"
include "/sftp//yara/includes/iis.yar"
include "/sftp//yara/includes/other.yar"

Define YARA Modules

Modules provide functions and data structures that can be used when writing a YARA rule. They often do the heavy lifting, so we can write less code when developing rules. Think of them like the modules we import in programming languages such as Python to reuse existing code. Here are some of the key modules available when writing YARA rules. YARA developers continuously add new features to existing modules and create new ones, so be sure to check the official documentation for the most up-to-date details.

There is one caveat, though. Modules are highly dependent on the YARA version, so it is important to check which version the target application runs. If the target runs an older YARA version than the one you tested your rules with, the rules may fail to compile or may not behave as expected. This is one reason people often complain that rules work perfectly in the test environment but not in production. Also note that some tools do not implement every YARA feature in their tech stack, often for performance reasons. Read the documentation of the target tool before writing rules, and include only the modules it supports.

Here is the link to all available modules. Have a look before writing your next YARA rule; there may be something already there.

PE Module

The PE module exposes most of the fields present in the header of a Microsoft Windows PE file. Here are some of the commonly used fields when writing rules for .exe and .dll files:

pe.is_pe
pe.timestamp
pe.signatures[i].*
pe.signatures[i].serial
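
Here is a sketch of how these fields might be combined. The timestamp cutoff and the issuer name are hypothetical values picked for illustration; 1262304000 is the Unix timestamp for 2010-01-01 00:00:00 UTC.

```yara
import "pe"

rule Example_PE_Fields
{
    condition:
        pe.is_pe and
        // compile timestamp earlier than 2010-01-01 00:00:00 UTC
        pe.timestamp < 1262304000 and
        pe.number_of_signatures > 0 and
        // signatures is an array, so each entry is accessed by index
        for any i in (0..pe.number_of_signatures - 1) : (
            pe.signatures[i].issuer contains "Hypothetical Issuer CA"
        )
}
```

Keep in mind that the PE timestamp is trivially forged, so treat it as a supporting signal rather than a primary one.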

Console Module

The console module helps analysts write and debug rules by logging information during execution, such as PE header details. I primarily use this module for debugging rules or for file analysis itself.
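
A minimal debugging sketch is shown below. It assumes a YARA build that ships the console module (I believe it arrived in the 4.2 release line, so check your version); the messages are my own choice.

```yara
import "console"
import "pe"

rule Example_Console_Debug
{
    condition:
        pe.is_pe and
        // console functions return true, so they can be chained with "and"
        console.log("sections: ", pe.number_of_sections) and
        console.hex("entry point: ", pe.entry_point) and
        console.hex("first four bytes: ", uint32(0))
}
```

Because the logging calls evaluate to true, they do not change whether the rule matches; remove them once the rule behaves as expected.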

VT Module

The VT (VirusTotal) module is a significant topic and is covered in greater detail in another section of this series. This module offers a wide range of features that can be utilized on the VirusTotal platform for both live and retrospective hunts.

Other commonly used modules are listed below:

math.entropy
dotnet.number_of_resources
hash.sha256
magic.mime_type
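
As one example from this list, the hash module can pin a rule to an exact file. In the sketch below the digest is a placeholder of 64 zeros, not the hash of any real sample; replace it with the SHA-256 of the file you are tracking.

```yara
import "hash"

rule Example_Known_Hash
{
    condition:
        // cheap size check first, so large files are never hashed
        filesize < 5MB and
        // placeholder digest (64 zeros), not a real sample hash
        hash.sha256(0, filesize) == "0000000000000000000000000000000000000000000000000000000000000000"
}
```

The comparison is a plain string comparison, so write the digest in lowercase hex to match what the module returns.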

Identify File Types

When working with YARA, you may encounter many different types of files, and identifying the file type can sometimes be daunting. Don’t worry, though: YARA provides multiple ways to identify the file type, particularly based on byte sequences and their locations.

We can write rule conditions that depend on data stored at a certain file offset or memory virtual address, using the following functions. Each one takes the offset as its argument, for example uint16(0).

| Function | Reads |
| --- | --- |
| int8 / int16 / int32 | 8-, 16-, and 32-bit signed integers, little-endian |
| uint8 / uint16 / uint32 | 8-, 16-, and 32-bit unsigned integers, little-endian |
| int8be / int16be / int32be | 8-, 16-, and 32-bit signed integers, big-endian |
| uint8be / uint16be / uint32be | 8-, 16-, and 32-bit unsigned integers, big-endian |

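In little-endian format, the least significant byte is stored first, so a multi-byte value appears byte-reversed when you read it off a hex dump. For example, a file that starts with the bytes 4D 5A is read by uint16(0) as 0x5A4D and by uint16be(0) as 0x4D5A.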
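
The sketch below shows both byte orders applied to the same two bytes, plus the common trick of following the pointer stored at offset 0x3C to the PE signature. The rule name is hypothetical.

```yara
rule Example_MZ_Endianness
{
    condition:
        // The file begins with the bytes 4D 5A ("MZ").
        uint16(0) == 0x5A4D and      // little-endian read: bytes appear reversed
        uint16be(0) == 0x4D5A and    // big-endian read: same order as the hex dump
        // The 32-bit value at 0x3C is the offset of the "PE\0\0" signature (50 45 00 00).
        uint32(uint32(0x3C)) == 0x00004550
}
```

The first two checks are equivalent, so a real rule needs only one of them; they are both here purely to show the byte order difference.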

Here are some of the most commonly used byte sequences, also known as magic numbers, that I have come across while writing rules:

| Magic Number | Description |
| --- | --- |
| uint16(0) == 0x5a4d | MZ signature at offset 0 |
| uint16be(0) == 0x4D5A | MZ signature at offset 0 |
| uint16(0) == 0x457f | Linux ELF signature at offset 0 |
| uint32be(0) == 0x7f454c46 | Linux ELF signature at offset 0 |
| uint32(0) == 0xfeedface | macOS Mach-O, 32-bit |
| uint32(0) == 0xfeedfacf | macOS Mach-O, 64-bit |
| uint32(0) == 0xcefaedfe | macOS Mach-O, 32-bit, byte-swapped |
| uint32(0) == 0xcffaedfe | macOS Mach-O, 64-bit, byte-swapped |
| uint16(0) == 0xcfd0 | Word/Office document (OLE2 compound file) |
| uint32(0) == 0x74725C7B | rtf signature at offset 0 |
| uint32be(0) == 0x52617221 | rar signature at offset 0 |
| uint32(0) == 0x04034b50 | zip signature at offset 0 |
| uint16be(0) == 0x1f8b | gzip signature at offset 0 |
| uint32be(0) == 0x377abcaf | 7zip signature at offset 0 |
| uint32be(257) == 0x75737461 | tar ("ustar") signature at offset 257 |
| uint16(0) == 0x004c | Windows lnk signature at offset 0 |
| uint32be(0) == 0x25504446 | pdf signature at offset 0 |
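
A magic-number check usually serves as a cheap gate in front of the rest of the condition. The sketch below uses the zip signature together with a hypothetical archive member name and an arbitrary size limit.

```yara
rule Example_Zip_With_Marker
{
    strings:
        // hypothetical file name expected inside the archive
        $name = "hypothetical/payload.bin"

    condition:
        uint32(0) == 0x04034B50 and   // "PK\x03\x04" read as a little-endian uint32
        filesize < 10MB and
        $name
}
```

This works for member names because zip stores them uncompressed in its headers; the compressed file contents themselves would not match a plain string.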

I’ve tried to cover some key approaches to rule engineering and provided real-life examples. However, this is a broad and evolving topic. I highly recommend referring to the official documentation and other resources available online to learn more about rule engineering.

This post is licensed under CC BY 4.0 by the author.
