https://kennethghartman.com - Getting Started with YARA

YARA is an open-source tool primarily used in malware research and digital forensics to identify and classify malicious files and activities. It achieves this through a rule-based language that lets you define patterns—whether they’re textual strings or binary sequences—to match against files, memory, or processes. Here are some key points:

Rule Structure: YARA rules typically include a meta section (descriptive information), a strings section (text, hexadecimal, or regular-expression patterns), and a condition section (the logic that decides whether a file matches).
Flexibility and Precision: The language allows for the use of Boolean operators and complex matching conditions, enabling you to craft highly specific signatures. This precision is invaluable when distinguishing between benign and malicious content.
Wide Adoption in Security: Due to its effectiveness, YARA is a go-to tool for incident response teams, threat intelligence analysts, and malware researchers. It’s often integrated into automated scanning workflows and forensic toolkits to help quickly sift through large datasets and pinpoint potential threats.
Efficiency: Its design is optimized for performance, allowing rapid scanning of files and data, which is essential in environments where speed and accuracy are crucial.

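The name YARA is usually expanded as either “YARA: Another Recursive Acronym” or “Yet Another Ridiculous Acronym”; the project offers both and leaves the choice to the reader.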

It was created by Victor Alvarez while working at Panda Security as a tool for malware classification and pattern matching. Despite its humorous name, YARA has become a critical tool in malware analysis, threat hunting, and digital forensics.

The best way to run YARA rules on a host depends on the use case, whether it’s a one-time scan, continuous monitoring, remote scanning, or part of a forensic investigation. Here are some approaches based on different scenarios:

Running YARA

1. One-Time or Periodic Scans

If you need to run YARA manually or periodically on a system, use the command-line YARA scanner:

Installation

Linux/macOS:

  sudo apt install yara   # Debian-based
  sudo yum install yara   # RHEL-based (typically requires the EPEL repository)
  brew install yara       # macOS

Windows:
Download the prebuilt binaries from the releases page of the VirusTotal/yara repository on GitHub.

Basic Usage

yara -r rules.yar /path/to/scan/

-r scans recursively through directories.
/path/to/scan/ specifies the directory or file to scan.

Example scanning for malware:

yara -r malware_rules.yar /home/user/
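
For periodic scans it can help to wrap the command above in a small script. The following is a minimal sketch, assuming the yara binary is on PATH and prints its default one-match-per-line output ("RULE_NAME TARGET"). The function names build_yara_command, parse_matches, and run_scan are hypothetical helpers, not part of YARA.

```python
import shutil
import subprocess


def build_yara_command(rules_path, target, recursive=True):
    """Assemble the argument list, same shape as `yara -r rules.yar /path`."""
    cmd = ["yara"]
    if recursive:
        cmd.append("-r")  # recurse into subdirectories
    cmd += [rules_path, target]
    return cmd


def parse_matches(stdout):
    """Split default yara output lines into (rule_name, target) pairs."""
    matches = []
    for line in stdout.splitlines():
        rule, _, target = line.partition(" ")
        if rule and target:
            matches.append((rule, target))
    return matches


def run_scan(rules_path, target, recursive=True):
    """Run the yara CLI once and return the parsed matches."""
    if shutil.which("yara") is None:
        raise RuntimeError("yara binary not found on PATH")
    proc = subprocess.run(
        build_yara_command(rules_path, target, recursive),
        capture_output=True, text=True, check=False,
    )
    return parse_matches(proc.stdout)
```

A scheduler such as cron could call run_scan("malware_rules.yar", "/home/user/") and alert when the returned list is not empty.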

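2. Continuous Monitoring (Live Process Scanning)

The YARA command-line scanner runs once and exits; it has no built-in continuous-monitoring mode. For ongoing coverage, schedule repeated scans or use an endpoint tool that embeds YARA. To scan the memory of a running process, pass its process ID (PID) as the target instead of a path. This usually requires root or administrator privileges:

yara rules.yar 1234

Or look up the PID by process name (pidof -s returns a single PID, because yara takes one target per invocation):

yara rules.yar $(pidof -s target_process)

Note that the -p option sets the number of scanning threads; it does not select processes.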

Understanding YARA Rules with Examples

YARA rules define patterns that identify files, processes, or memory artifacts related to malware, exploits, or specific content. A rule can contain meta, strings, and condition sections; only the condition section is required.

Basic YARA Rule Syntax

rule ExampleRule {
    meta:
        author = "Kenneth G. Hartman"
        description = "Detects example binary"
        version = "1.0"
    
    strings:
        $a = "malicious_string"         // ASCII string
        $b = { 6A 40 68 00 30 00 00 }  // Hex pattern (byte sequence)
        $c = /Trojan:Win32\/Agent/      // Regular expression

    condition:
        any of them  // At least one string must match
}

Explanation of Components

meta (Metadata Section)
- Holds descriptive fields such as author, description, version, reference links, etc.
- Does not affect matching but is useful for tracking rules.
strings (Search Patterns)
- Defines text strings, hex byte sequences, or regular expressions to identify malware or artifacts.
- $a, $b, $c are variable names referencing specific patterns.
  - ASCII String: "malicious_string"
  - Hex Sequence: { 6A 40 68 00 30 00 00 }
  - Regex: /Trojan:Win32\/Agent/
condition (Logical Matching Condition)
- Defines how the rule triggers.
- In this case, any of them means if at least one string matches, the rule is triggered.
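
To make the three pattern types concrete, here is a rough Python emulation of what ExampleRule checks against a byte buffer. It is only a sketch of the matching logic under simplified assumptions (plain ASCII, no modifiers), not how YARA is implemented; the function name example_rule_matches is hypothetical.

```python
import re


def example_rule_matches(data: bytes) -> bool:
    """Emulate ExampleRule: true if at least one of $a, $b, $c occurs in data."""
    a = b"malicious_string" in data                        # $a: literal text
    b = bytes.fromhex("6A 40 68 00 30 00 00") in data      # $b: exact byte sequence
    c = re.search(rb"Trojan:Win32/Agent", data) is not None  # $c: regular expression
    return a or b or c  # "any of them"
```

Replacing the final `a or b or c` with `a and b and c` would correspond to the condition `all of them`.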

Different Types of YARA Rules with Examples

1. Detecting a Malware Signature via Multiple String Conditions

rule Malware_Sample {
    meta:
        author = "LTT Security"
        malware_family = "ExampleMalware"
    
    strings:
        $payload1 = "suspicious_function_call"
        $payload2 = "malware.exe"
        $hex_pattern = { E8 ?? ?? ?? ?? 83 C4 04 }

    condition:
        all of ($payload*) and $hex_pattern
}

Explanation

The rule matches if all strings (payload1, payload2) and the hex pattern appear in the file.
The * wildcard allows matching all variables starting with $payload.
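
The ?? tokens in the hex pattern each stand for one arbitrary byte. The sketch below, an illustration only and not YARA's actual engine, converts such a pattern into a Python regular expression and then applies the same logic as the condition above. The names hex_pattern_to_regex and malware_sample_matches are hypothetical, and only full-byte ?? wildcards are handled (no nibble wildcards, jumps, or alternatives).

```python
import re


def hex_pattern_to_regex(pattern: str):
    """Turn a hex pattern such as 'E8 ?? ?? 83' into a compiled bytes regex."""
    parts = []
    for token in pattern.split():
        if token == "??":
            parts.append(b".")  # wildcard: any single byte
        else:
            parts.append(re.escape(bytes([int(token, 16)])))
    # DOTALL so that '.' also matches the newline byte 0x0A
    return re.compile(b"".join(parts), re.DOTALL)


HEX_PATTERN = hex_pattern_to_regex("E8 ?? ?? ?? ?? 83 C4 04")
PAYLOADS = (b"suspicious_function_call", b"malware.exe")


def malware_sample_matches(data: bytes) -> bool:
    """Emulate: all of ($payload*) and $hex_pattern."""
    all_payloads = all(p in data for p in PAYLOADS)
    return all_payloads and HEX_PATTERN.search(data) is not None
```

The four wildcard bytes after E8 cover the 32-bit relative offset of an x86 call instruction, which differs from build to build, while the surrounding opcode bytes stay fixed.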

2. Using Regular Expressions

rule Suspicious_PowerShell_Command {
    strings:
        $powershell = /powershell\.exe\s+-[eE]nc\s+[A-Za-z0-9+\/=]+/
    
    condition:
        $powershell
}

Explanation

This detects Base64-encoded PowerShell commands (often used in attacks).
The regex matches:
- powershell.exe -enc <base64>
- powershell.exe -Enc <base64>
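
Before deploying a regex-based rule, it is worth checking the pattern against a few sample command lines. The sketch below uses Python's re module with the same pattern (the dot in powershell.exe escaped so it matches only a literal dot). Python and YARA regex dialects differ in places, so treat this as an approximation; the sample strings and the name is_encoded_powershell are made up for illustration.

```python
import re

ENCODED_PS = re.compile(r"powershell\.exe\s+-[eE]nc\s+[A-Za-z0-9+/=]+")


def is_encoded_powershell(cmdline: str) -> bool:
    """True if the command line contains the -enc/-Enc form the rule looks for."""
    return ENCODED_PS.search(cmdline) is not None


samples = [
    "powershell.exe -enc SQBFAFgA",             # matches
    "powershell.exe -Enc SQBFAFgA",             # matches
    "powershell.exe -ENC SQBFAFgA",             # missed: only the first letter may vary in case
    "powershell.exe -EncodedCommand SQBFAFgA",  # missed: long form of the switch
    "powershell.exe -NoProfile -enc SQBFAFgA",  # missed: another switch comes first
]
for s in samples:
    print(is_encoded_powershell(s), s)
```

The misses suggest that a production rule would likely need a looser pattern, for example case-insensitive matching and tolerance for other switches before the encoded one.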

3. Matching on File Properties (File Size Constraints)

rule LargeExecutable {
    condition:
        filesize > 10MB and filesize < 50MB
}

Explanation

Detects files between 10MB and 50MB.
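
YARA's KB and MB suffixes multiply by 1024 and 2^20 respectively, and both comparisons in the rule are strict. A small Python sketch of the same window check follows; in_size_window and file_in_size_window are hypothetical names used only for this illustration.

```python
import os

KB = 1024
MB = 1024 * 1024  # YARA's MB suffix multiplies by 2**20


def in_size_window(size_bytes: int, low: int = 10 * MB, high: int = 50 * MB) -> bool:
    """Emulate: filesize > 10MB and filesize < 50MB (both bounds exclusive)."""
    return low < size_bytes < high


def file_in_size_window(path: str) -> bool:
    """Apply the same check to a file on disk."""
    return in_size_window(os.path.getsize(path))
```

A file of exactly 10MB (10,485,760 bytes) falls outside the window because the lower bound is exclusive.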

4. Searching for Malicious Functions in Memory

rule Memory_Allocation_API {
    strings:
        $heap_alloc = "VirtualAlloc"
        $heap_protect = "VirtualProtect"

    condition:
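        // PE magic at offset 0 and at least one of the API names present
        uint16(0) == 0x5A4D and any of ($heap_*)
}

Explanation

Looks for the Windows API names VirtualAlloc and VirtualProtect, which are often used for process injection but also appear in many legitimate programs.
uint16(0) == 0x5A4D checks that the scanned data starts with the “MZ” magic of a PE executable.
A loop of the form for any of ($heap_*) : (uint16(0) == 0x5A4D) would not work here: its body never refers to the strings, so it would match any PE file whether or not the API names are present.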
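
The constant 0x5A4D can look backwards at first: the file starts with the bytes 4D 5A ("MZ"), but uint16 reads them as a little-endian integer. A short Python sketch of the same read follows; uint16_at and looks_like_pe are hypothetical helper names, and returning None stands in for YARA's undefined value when the offset is out of range.

```python
import struct

PE_MAGIC = 0x5A4D  # bytes 4D 5A ("MZ") read as a little-endian 16-bit integer


def uint16_at(data: bytes, offset: int = 0):
    """Return the little-endian unsigned 16-bit value at offset, or None if out of range."""
    if offset < 0 or offset + 2 > len(data):
        return None
    return struct.unpack_from("<H", data, offset)[0]


def looks_like_pe(data: bytes) -> bool:
    """Emulate: uint16(0) == 0x5A4D."""
    return uint16_at(data, 0) == PE_MAGIC
```

This only checks the two magic bytes; a stricter test would also follow the header to the "PE" signature.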

5. Detecting XOR-Encoded Malware

import "pe"

rule XOR_Encoded {
    strings:
        $xor_encoded = { EB ?? ?? ?? ?? 31 C9 83 E9 }  // Example XOR decryption stub

    condition:
        // pe.entry_point replaces the deprecated entrypoint keyword
        $xor_encoded at pe.entry_point
}

Explanation

Matches a malware XOR decryption stub at the binary’s entrypoint.

Combining Conditions for More Advanced Rules

rule AdvancedMalware {
    meta:
        family = "AdvancedMalware"
        reference = "https://malware-research.com"
    
    strings:
        $s1 = "malicious_string"
        $s2 = { 68 65 6C 6C 6F }  // "hello" in hex
        $s3 = /\/cmd\/exec/

    condition:
        (all of ($s*) and filesize < 1MB) or (uint16(0) == 0x5A4D and $s3)
}

Explanation

Matches if:
- All strings are found AND the file is smaller than 1MB, OR
- The file is a PE executable (0x5A4D = “MZ” header) and contains $s3.

Key Takeaways

✅ YARA rules are flexible—they support ASCII, hex, regex, and logical conditions.
✅ Optimizing conditions helps reduce false positives.
✅ Combining YARA with forensic tools (Velociraptor, GRR, Volatility) enhances malware detection.

How to Determine Which IOCs to Include in a YARA Rule

To create an effective YARA rule, you need to identify Indicators of Compromise (IOCs) that accurately detect the target threat while minimizing false positives. Here’s how to choose the best IOCs for YARA rules.

1. Identify IOC Types for YARA Rules

YARA primarily works with the following IOC types:

a) Unique Strings from Malware Samples

✔ Hardcoded strings within the malware binary
✔ Malware C2 domains, API keys, or campaign identifiers
✔ Commands, error messages, and function names

🔍 Example (Detects hardcoded C2 domain):

rule C2_Domain_Detection {
    strings:
        $c2 = "http://malicious[.]com/api"  // defanged for publication; a real rule needs the literal domain with "."
    condition:
        $c2
}

b) Hex Byte Patterns (Opcodes & Shellcode)

✔ Detects malware packing, encryption, or XOR routines
✔ Finds shellcode or function hooking techniques

🔍 Example (Detects common shellcode):

rule Shellcode_Pattern {
    strings:
        $shellcode = { 90 90 90 E8 ?? ?? ?? ?? 83 C4 04 }
    condition:
        $shellcode
}

c) File Metadata (Headers, PE Sections, Imports)

✔ Identifies malware families by file properties
✔ Useful for differentiating packed vs. unpacked samples

🔍 Example (Detects a UPX-packed executable):

import "pe"

rule UPX_Packed {
    condition:
        uint16(0) == 0x5A4D and pe.sections[0].name == "UPX0"
}
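
The section name that the pe module exposes comes from the 8-byte name field of each entry in the PE section table. As a rough illustration of where that value lives, the sketch below walks the headers by hand using only the standard library. It assumes a well-formed file and skips most validation; pe_section_names is a hypothetical helper, and real tooling would normally rely on a dedicated PE parser.

```python
import struct


def pe_section_names(data: bytes):
    """Return the section names from a PE image, or [] if the headers look wrong."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return []
    (pe_off,) = struct.unpack_from("<I", data, 0x3C)  # e_lfanew: offset of "PE\0\0"
    if data[pe_off:pe_off + 4] != b"PE\0\0" or len(data) < pe_off + 24:
        return []
    (num_sections,) = struct.unpack_from("<H", data, pe_off + 6)   # COFF NumberOfSections
    (opt_size,) = struct.unpack_from("<H", data, pe_off + 20)      # COFF SizeOfOptionalHeader
    table = pe_off + 24 + opt_size  # section table follows the optional header
    names = []
    for i in range(num_sections):
        raw = data[table + 40 * i: table + 40 * i + 8]  # 40-byte entries, name first
        if len(raw) < 8:
            break  # truncated section table
        names.append(raw.rstrip(b"\0").decode("ascii", "replace"))
    return names
```

UPX conventionally names its sections UPX0 and UPX1, so a first entry of "UPX0" in the returned list is the property the rule relies on. Section names are easy to change, so this is a hint rather than proof of packing.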

d) API Calls (Process Injection, Memory Manipulation)

✔ Malware often calls suspicious APIs like VirtualAlloc, CreateRemoteThread, NtWriteVirtualMemory.
✔ Use function imports and syscalls to detect malicious behavior.

🔍 Example (Detects process injection):

rule Process_Injection {
    strings:
        $alloc = "VirtualAlloc"
        $write = "WriteProcessMemory"
        $execute = "CreateRemoteThread"

    condition:
        all of ($alloc, $write, $execute)
}

2. Extracting IOCs from a Malware Sample

To find reliable IOCs, use tools like:

🔹 Strings Extraction

Linux: strings malware.bin | less
Windows: strings.exe -n 6 malware.exe
For UTF-16LE ("wide") strings with GNU strings: strings -e l malware.exe
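
When the strings utility is not available, the same idea is easy to approximate in Python. This is a simplified sketch that finds runs of printable ASCII and their UTF-16LE equivalents; extract_strings is a hypothetical helper, and min_len plays the role of the -n option.

```python
import re


def extract_strings(data: bytes, min_len: int = 6):
    """Return (ascii_strings, wide_strings) found in data."""
    ascii_re = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    # UTF-16LE text in the ASCII range: each printable byte followed by 0x00
    wide_re = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)
    ascii_hits = [m.group().decode("ascii") for m in ascii_re.finditer(data)]
    wide_hits = [m.group().decode("utf-16-le") for m in wide_re.finditer(data)]
    return ascii_hits, wide_hits
```

Strings that show up only in the wide list are candidates for the wide modifier in a YARA strings section.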

🔹 Hex & Opcode Analysis

xxd: xxd malware.exe | less
Radare2: r2 -A malware.exe (runs the aaa analysis on load)
Ghidra/IDA Pro for deeper analysis.

🔹 PE Analysis

PEStudio (Windows)
pescanner (Linux)
YARA PE Module

Example to check for suspicious imports:

import "pe"

rule Suspicious_PE {
    condition:
        pe.imports("kernel32.dll", "LoadLibraryA") and
        pe.imports("advapi32.dll", "RegOpenKeyExA")
}

3. Ensuring High-Quality IOCs

To avoid false positives, use these IOC selection strategies:

✅ Use Uncommon Strings

Avoid detecting common OS libraries (e.g., kernel32.dll).
Use malware-specific keywords (e.g., "Stage2_payload.bin").

✅ Check for Variability

Compare multiple samples and variants of the same family, for example by extracting the strings from each one.
A string that appears in only one sample may be unreliable as a family-wide indicator.
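
One simple way to check variability is to intersect the string sets extracted from several samples and drop anything known to be common in clean software. The sketch below assumes the strings have already been extracted into lists; common_strings and the sample values are hypothetical.

```python
def common_strings(samples, ignore=()):
    """Return strings present in every sample, minus any in the ignore collection."""
    string_sets = [set(strings) for strings in samples]
    if not string_sets:
        return set()
    shared = set.intersection(*string_sets)
    return shared - set(ignore)


variant_a = ["Stage2_payload.bin", "kernel32.dll", "build-0417"]
variant_b = ["Stage2_payload.bin", "kernel32.dll", "build-0522"]
candidates = common_strings([variant_a, variant_b], ignore={"kernel32.dll"})
print(candidates)
```

Here the build identifiers differ per variant and kernel32.dll is too common to be useful, which leaves Stage2_payload.bin as the only candidate.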

✅ Correlate with Threat Intelligence

Cross-check with VirusTotal, Any.Run, or MISP.
Map observed behavior to MITRE ATT&CK techniques to decide which artifacts are worth detecting.

✅ Use Logical Combinations

Instead of a single condition, combine multiple conditions:

  rule Multi_Layered_Detection {
      strings:
          $s1 = "malicious.com"
          $s2 = "backdoor_key"
          $s3 = { 8B 45 FC 8B 4D F8 }
      condition:
          all of them or (filesize > 500KB and any of ($s*))
  }

4. Where to Get IOCs for YARA Rules

5. Example: Advanced YARA Rule Using Multiple IOCs

import "pe"

rule Advanced_Malware_Detection {
    meta:
        author = "Kenneth G. Hartman"
        description = "Detects APT malware variant"
        malware_family = "APT123"

    strings:
        $s1 = "stealth_mode_enabled"
        $s2 = "cmd.exe /c taskkill /F /IM antivirus.exe"
        $s3 = /https?:\/\/malicious[.]domain[.]com\//
        $hex1 = { 55 8B EC 83 E4 F8 83 EC 20 }

    condition:
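        // $hex1 is a generic function prologue, so it is paired with a string to limit false positives
        (pe.imports("kernel32.dll", "VirtualAlloc") and all of ($s*)) or ($hex1 and any of ($s*))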
}

✅ Detects API calls, C2 domains, and process-killing behavior.
✅ Uses regex, hex, and PE import analysis.

Final Checklist for Writing a Good YARA Rule

✔ Use multiple IOC types (strings, hex, API calls, regex, PE metadata).
✔ Test rules on both malware samples and clean files.
✔ Ensure IOCs are unique to the malware family.
✔ Use meta section for tracking rule versions and sources.
✔ Update rules as malware evolves.

Additional Resources

Kenneth G. Hartman

Digital Forensics Expert, Cloud Security Specialist, and SANS Institute Instructor