Button-Pusher to MasterBuilder: Automating SIEM Workflows
Introduction
In this article I’ll deviate from my recent focus on threat research and detection engineering to highlight an important skill set: automation. The ability to program the steps of an otherwise time-consuming, repetitive, and/or manual process is a valuable asset in cybersecurity. In my 2+ years in cybersecurity (and my dozen years in Geographic Information Systems before that), I’ve seen the impact that automation has on careers, operations, job satisfaction, and professional growth. Investing time in learning automation has catapulted my career forward in ways I could never possibly have imagined when I signed up for a “Python for GIS” class in 2012. So, if you are interested in building your security automation skills and want to practice some concrete, hands-on development tasks that will help you get started, read on. You may be about to take your first step towards being a MasterBuilder!
A Little More About Automation
Before we proceed, here is a bit about how I approach automation and fit it into my work. The term “automation” is used so frequently as to almost lose its meaning. If a newly-minted cybersecurity professional or student were to Google “cybersecurity automation” and take the top search results at face value, they would be at risk of getting off track. Below I share a couple of observations and questions, which could cause confusion and make it hard for an early-career security professional to find their pathway in the world of cybersecurity automation:
- A top result states “Cybersecurity automation is a concept used to describe advanced systems powered by artificial intelligence (AI) and machine learning (ML).” (Really?)
- Are alert rules automation? After all, alert rules monitor streams of telemetry, notifying their human overlords when patterns or thresholds are met.
- Can automation be achieved “off-the-shelf” using a product? Plenty of vendors offer “Automations” in their product suite. Should I focus on that?
- Is everything already automated? Is it worth pursuing when ChatGPT will nullify the career benefits of spending time learning how to code?
First off, everything is NOT automated, and ChatGPT is not going to make programming obsolete. From what I’ve seen, the demand for practitioners who really know how to code is greater than ever. Every organization faces unique problems that can be solved through automation, so the idea that a single vendor or out-of-the-box suite of products will solve those problems is not correct. For a great breakdown of the role that automation and coding can play in a cybersecurity career, I highly recommend Ricky Tan’s video “Is Coding Important for Cyber Security?” which I’ve included below:
It’s tough, because while a lot of students and early-career security professionals want to automate, there seems to be a real gap between Intro to Python classes (where you might code a simple port scanner or password generator), and practical, useful scripting skills. I hope this article can help bridge that gap!
My Approach to Cybersecurity Automation
I like to think about automation in terms of opportunities, rewards, risks, and costs. That’s right, there are risks and costs to automation! Since no one is born an elite coder, it costs time, effort, and sometimes money to get the experience and training that make someone capable of productively automating things.
The risks can be tremendous as well. Ever heard of the crash of Air France Flight 447? In that terrible disaster, pilots who had become over-reliant on automated systems failed to engage their superior human judgement and decision-making skills in time, leading to delayed and poor decisions and ultimately a tragic loss of life. I’ll share another (far less horrific) story of automation risk from my personal experience later in the article when we get into the walkthrough, so keep reading!
How about the good stuff, like opportunities and rewards? Some of the rewards I’ve gained (for myself and my employers) from automation are time savings, cost savings, improved job satisfaction, and the opportunity to focus on the more complex, higher-order aspects of my job as a security professional.
The point of automation is to focus more of your time on the stuff that can’t be automated.
As you gain confidence and skill, opportunities start to pop out at you. Once you’ve leveled up a few times, you will start to see these opportunities in terms of the components you have, and you can visualize how to put them together almost in real time (even if you don’t know exactly what to do at each moment), kind of like the MasterBuilder concept from The Lego Movie.
But I digress! I put automation opportunities in the following [loose] categories:
- Recurring, regularly-scheduled tasks that require no human interaction. Data management, synchronization, and reporting frequently fall under this category. We’ll cover one of these in the examples below!
- User-operated tools which require a limited set of inputs from a human and complete otherwise repetitive or monotonous tasks on their own. These are worth building and sharing when the same process must be completed multiple times by multiple people with varying sets of input parameters.
- Ad-hoc automations, which automate a task that comes up once in a while, but not frequently enough to necessitate a full-blown tool. These sometimes develop into tools, though, so we definitely want to keep track of them!
- Event-driven automations, which I still need to learn more about, so they won’t be in the scope of this article. If you have some favorite examples of event-driven automations or projects you are working on, please share them in the comments!
The Writeup (Finally)
In the following sections, I’ll share step-by-step how I would go about building some scripts and tools representing opportunities #1–3 from the list above. All of these examples will be built using a modern SIEM platform, real APIs, and industry-standard components as used by detection engineers, security analysts, and other cybersecurity professionals.
For opportunity 1, we’ll write a script that could be deployed on a server as a scheduled task to pull in malware IOCs from Abuse.ch Feodo Tracker (a fantastic open-source threat intel feed) into a threat intel source within our SIEM. This script could be scheduled to run on a recurring basis to continually refresh the IOC-based alerts viewed by our analysts.
For opportunity 2, we’ll create a CLI script tool to find and close investigations left open by our colleagues in the SOC. This will help keep our imaginary queue tidy so our analysts can stay focused on what’s important.
For opportunity 3, we’ll assume the role of junior threat hunter, and look at ways to convert Sigma rules on an emerging threat into our SIEM’s query format, then save those to our list of saved queries so we can use them to initiate our hunts.
Ready to start your journey from button-pusher (🖱️) to MasterBuilder (🏗️)? Let’s get it!
Setting Up
To demonstrate these automation processes, I’ll use Rapid7 InsightIDR, a modern cloud-based SIEM product from a leading vendor in the field. If you’d like to follow along, head over to the InsightIDR Free Trial page and sign up. You’ll have to go through the usual product sign-up steps but what you end up with should be something that looks like this:
From there, you’ll need an Application Programming Interface (API) key. If you’re new to APIs, think of them as the underlying functional components that will enable us to automate the steps of our workflow…sort of like Lego bricks make up a complete set!
The beauty of APIs is that pretty much any modern security tool that uses HTTP(S) to receive instructions and output and/or change data will have one. This means that with curiosity, creativity, and persistence, you can apply the automation skills you’ll learn in this article to virtually any tool that you get stuck with using (ahem, get the opportunity to use) in your day-to-day life as a security professional.
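To make the Lego-brick idea a little more concrete, here is a minimal sketch of what a single API call looks like before any helper library gets involved. It only builds the request and never sends it. The URL pattern, the `us` region default, and the `X-Api-Key` header name are my assumptions about how the Rapid7 Insight platform expects requests to look, so check the official API documentation before relying on them.

```python
import urllib.request


def build_list_investigations_request(api_key, region="us"):
    """Build (but do not send) a GET request for an investigations endpoint."""
    # Assumed URL pattern and header name; confirm both against the API docs.
    url = "https://{}.api.insight.rapid7.com/idr/v2/investigations".format(region)
    request = urllib.request.Request(url, method="GET")
    request.add_header("X-Api-Key", api_key)
    request.add_header("Accept", "application/json")
    return request


# "example-key" is a placeholder, not a real credential
request = build_list_investigations_request("example-key")
print(request.get_method(), request.full_url)
```

Everything else in this article is a more convenient wrapper around requests shaped like this one.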
Now back to the walkthrough! In the upper-right hand corner of the InsightIDR dashboard, click the gear icon and go to API Keys:
Create a new User Key and store the value securely, such as in your password manager. You’ll use this API key in lieu of a username and password to authenticate to the SIEM from your code.
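The examples in this article paste the key straight into the script to keep things readable. If you'd rather keep it out of your source files, one common option is to read it from an environment variable. This is only a sketch: the variable name `IDR_API_KEY` and the helper `load_api_key` are names I made up for illustration.

```python
import os


def load_api_key(variable_name="IDR_API_KEY"):
    """Return the API key from the environment, or raise a clear error."""
    # IDR_API_KEY is a hypothetical variable name; use whatever suits your setup
    api_key = os.environ.get(variable_name, "").strip()
    if not api_key:
        raise RuntimeError(
            "Set the {} environment variable first.".format(variable_name))
    return api_key
```

With that in place, a script could call `idr.InsightIDR(load_api_key())` instead of referencing a hard-coded string.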
Next, we’ll set up the Python environment that we will use to create and run our tools. You can do that here:
To create our scripts and tools, we’ll use InsightIDR4Py, a Python module that I wrote to interact with the InsightIDR API. You can read more about InsightIDR4Py here (give it a ⭐ if you like it!):
To install the module, you can just run the following command from your command prompt:
python -m pip install InsightIDR4Py
This command should install InsightIDR4Py and all its dependencies into your Python environment. And with that, we are good to go!
“But wait!” you say… “What IDE should I use???”
To that question I offer this simple response:
Task 1: Threat Feed Integration
To integrate Abuse.ch Feodo Tracker IOCs into our InsightIDR SIEM, we’ll follow these steps:
- Create a Threat (threat feed containing indicators that is evaluated for matches on your ingested logs) within the InsightIDR platform. Learn more about InsightIDR threats here.
- Retrieve the unique identifier key for our newly-created threat.
- Retrieve IP and domain IOCs from the Feodo Tracker recommended JSON blocklist. This blocklist is maintained by Abuse.ch to minimize false positives and track active C2 infrastructure. Learn more about Feodo Tracker here.
- Write a script to refresh the IOCs in our InsightIDR threat with the current, active indicators from Feodo Tracker. This script could then be scheduled using Windows task scheduler or a Linux cron job to keep the threat feed updated on a regular cadence.
To create the threat in InsightIDR, you can click on Detection Rules in the left-side menu and then “Community Threats” along the top. Then click “Add Threat” in the upper-right. Add in a name, description, and at least one indicator, such as example.com, to get started. Don’t worry about which indicator you choose, we’re going to replace it momentarily. Or, if you want to create the threat using Python, run the following:
# import the required module
import InsightIDR4Py as idr
# reference the API key you created during setup
api_key = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
# connect to your InsightIDR instance using the API key
api = idr.InsightIDR(api_key)
# create a dummy indicator object
indicators = {"domain_names": ["notrealwebsite.fake"]}
# create the threat
name = "Feodo Tracker IOCs"
desc = ("Contains IP and domain indicators from the Feodo Tracker "
        "recommended blocklist.")
new_threat = api.CreateThreat(threat_name=name,
                              threat_description=desc,
                              indicators=indicators)
You should now see the Feodo Tracker IOCs threat in your list of Community Threats:
Click “View” under the threat, and scroll down until you see Threat Key on the right. Copy the threat key value as you will need this soon. Next, we’ll use the code below to retrieve the indicators from Feodo Tracker’s recommended blocklist, and replace the indicators of our threat with those values.
# import the required module
import InsightIDR4Py as idr
import requests
# reference the API key you created during setup
api_key = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
# connect to your InsightIDR instance using the API key
api = idr.InsightIDR(api_key)
# reference the threat key from the Feodo Tracker threat within the InsightIDR platform
threat_key = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
# retrieve the indicators from Feodo Tracker
blocklist_url = "https://feodotracker.abuse.ch/downloads/ipblocklist_recommended.json"
response = requests.get(blocklist_url, timeout=30)
# stop here if the download failed, rather than replacing the threat with bad data
response.raise_for_status()
ioc_data = response.json()
ips = [ioc["ip_address"] for ioc in ioc_data]
domains = [ioc["hostname"] for ioc in ioc_data if ioc["hostname"]]
# replace the indicators in our InsightIDR Feodo Tracker threat
api.ReplaceThreatIndicators(threat_key, ips=ips, domains=domains)
And there we have it! You should now see the refreshed IOCs in the threat feed indicator listing (you may need to refresh the app).
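Since this script replaces every indicator in the threat, it may be worth sanity-checking the feed before pushing it. Below is a rough sketch of one way to do that using only the standard library and hand-made sample records. The helper `extract_indicators` and the sample data are hypothetical; only the `ip_address` and `hostname` field names come from the script above.

```python
import ipaddress


def extract_indicators(records):
    """Return de-duplicated (ips, domains) lists from feed records."""
    ips, domains = [], []
    for record in records:
        raw_ip = (record.get("ip_address") or "").strip()
        try:
            # normalise the address and reject anything that is not a valid IP
            ip = str(ipaddress.ip_address(raw_ip))
        except ValueError:
            # design choice for this sketch: skip the whole record
            continue
        if ip not in ips:
            ips.append(ip)
        hostname = (record.get("hostname") or "").strip().lower()
        if hostname and hostname not in domains:
            domains.append(hostname)
    if not ips and not domains:
        # refuse to continue, so an empty download never wipes the threat
        raise ValueError("Feed produced no usable indicators.")
    return ips, domains


# hand-made sample records (documentation-range IPs, not real IOCs)
sample_feed = [
    {"ip_address": "192.0.2.10", "hostname": None},
    {"ip_address": "192.0.2.10", "hostname": "c2.example.net"},
    {"ip_address": "not-an-ip", "hostname": "ignored.example.net"},
]
ips, domains = extract_indicators(sample_feed)
print(ips, domains)
```

Raising an error on an empty result is deliberate: a scheduled job that fails loudly is easier to live with than one that silently empties your threat feed.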
And don’t forget, as shown above, you can always switch to dark mode, because…well…
Side note: Earlier in this article I talked a bit about the risks and costs associated with automation. I’ll share a story from personal experience about that here. So, while Abuse.ch is a very legitimate (see below) open-source threat intel provider, it is not without an occasional mishap.
I certainly felt proud to have integrated this threat intel source into the SIEM at work, but imagine my horror when my colleagues started getting alerted by the notoriously malicious (sarcasm here, folks) website…
hxxps[:]//www[.]google[.]com!
Luckily, I spotted the issue before any widespread impacts, and I quickly modified the import to exclude Google from our threat feeds (imagine having to do that) and wrote an ad-hoc script (see Task 3 below) to remove this indicator from our threat sources. Turns out, it was a totally honest mistake, and the reporter, the legendary @drb_ra, was able to quickly find and correct the issue.
All was well that ended well, but it was a good lesson to consider the risks and potential costs of an automated approach!
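For what it's worth, the kind of exclusion I added after that incident can be sketched in a few lines. This is not the exact code I used at work; the allowlist contents and the function names are placeholders for illustration.

```python
# hypothetical allowlist; in practice this would reflect your own environment
ALLOWLISTED_DOMAINS = {"google.com", "microsoft.com"}


def is_allowlisted(domain, allowlist=ALLOWLISTED_DOMAINS):
    """True if the domain equals, or is a subdomain of, an allowlisted domain."""
    domain = domain.lower().rstrip(".")
    return any(domain == allowed or domain.endswith("." + allowed)
               for allowed in allowlist)


def screen_domains(domains, allowlist=ALLOWLISTED_DOMAINS):
    """Split domains into (kept, dropped) so the dropped ones can be reviewed."""
    kept = [d for d in domains if not is_allowlisted(d, allowlist)]
    dropped = [d for d in domains if is_allowlisted(d, allowlist)]
    return kept, dropped


kept, dropped = screen_domains(["www.google.com", "c2.example.net", "notgoogle.com"])
print("kept:", kept)
print("dropped:", dropped)
```

Matching on the leading dot keeps lookalikes such as `notgoogle.com` in the feed, while returning the dropped list gives a human something to review instead of hiding the decision.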
Task 2: Investigation Closure Tool
With the threat feed script under our belt, let’s proceed with a tool that will allow a SOC lead to bulk-close investigations based on a set of criteria, adding a comment to each investigation explaining why it was closed. Think about the components of this that we’ll need to put together:
🧱 A way to collect the choices of the user. These include which investigations to close, and what comment to add to the investigation.
🧱 A list of investigations matching the criteria set by the user.
🧱 A way to have the user confirm their choice to close the investigations.
🧱 A way to update the investigations to a CLOSED state.
🧱 A way to add a comment to the investigation reflecting that it was closed through a bulk operation.
Let’s start with collecting the choices of the user. To gather user inputs for CLI-style tools, I like to use the built-in argparse library. To be honest, I used to really not like argparse. But with a little patience and persistence I made peace with it and figured out the basics, which I’ll share with you here.
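If argparse is new to you, here is about the smallest example I can think of. It defines one optional argument and parses a hand-written argument list, so you can run it in an interpreter without touching the command line. The argument shown is just for demonstration.

```python
import argparse

# a parser with a single optional argument and a default value
parser = argparse.ArgumentParser(description="Demo parser with one optional argument.")
parser.add_argument("-p", "--priority",
                    dest="priority",
                    help="Priority value to filter on.",
                    required=False,
                    type=str,
                    default="LOW")

# passing a list to parse_args() stands in for what the user would type
demo_args = parser.parse_args(["--priority", "HIGH"])
default_args = parser.parse_args([])
print(demo_args.priority, default_args.priority)
```

Calling `parse_args()` with no arguments, as a real tool would, reads from `sys.argv` instead.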
First though, we’ll need to decide which criteria our users will use to decide which stale investigations to close. To see the options, check out the API documentation for listing investigations, which you can find here. To find the criteria we can use, take a look at the example response from the “List Investigations” operation, which will show us the properties of each investigation that we can filter on and decide whether to close or not (I’ve indicated which properties might make sense to include in our tool’s parameters):
Based on these available options and my own judgement, here’s the header for our script tool:
import InsightIDR4Py as idr
import argparse
import sys
# collect user choices
parser = argparse.ArgumentParser(
    description="Closes investigations in bulk depending on user selections.",
    epilog="Example usage: `python InvestigationClosure.py --assignee-email "
           "swilliams@acme.com --days-since-last-access 30 --source ALERT --disposition BENIGN`")
parser.add_argument("-p", "--priority",
                    dest="priority",
                    help="Comma-separated list of priority values for the investigation. Options include [CRITICAL,HIGH,MEDIUM,LOW].",
                    required=False,
                    type=str)
parser.add_argument("-d", "--disposition",
                    dest="disposition",
                    help="Comma-separated list of disposition values for the investigation. Options include [BENIGN,MALICIOUS,NOT_APPLICABLE,UNDECIDED] or ALL.",
                    required=False,
                    type=str,
                    default="BENIGN,NOT_APPLICABLE")
parser.add_argument("-s", "--source",
                    dest="source",
                    help="Comma-separated list of source values for the investigation. Options include [USER,HUNT,ALERT] or ALL.",
                    required=False,
                    type=str)
parser.add_argument("-ae", "--assignee-email",
                    dest="assignee_email",
                    help="Email address of the investigation's assignee.",
                    required=False,
                    type=str)
parser.add_argument("-dlac", "--days-since-last-access",
                    dest="days_since_last_access",
                    help="Minimum number of days since investigation was last viewed or modified.",
                    required=False,
                    type=int)
parser.add_argument("-dlal", "--days-since-last-alert",
                    dest="days_since_last_alert",
                    help="Minimum number of days since the last alert associated with the investigation.",
                    required=False,
                    type=int)
parser.add_argument("-cc", "--close-comment",
                    dest="close_comment",
                    help="The comment message to add to the investigation when closing.",
                    required=False,
                    type=str)
# parse inputs
args = parser.parse_args()
priority = args.priority
disposition = args.disposition
source = args.source
assignee_email = args.assignee_email
days_since_last_access = args.days_since_last_access
days_since_last_alert = args.days_since_last_alert
close_comment = args.close_comment
# validate user 'priority' selection
if priority:
    priorities = [item.upper().strip() for item in priority.split(",")]
    for item in priorities:
        if item not in ["CRITICAL", "HIGH", "MEDIUM", "LOW"]:
            raise ValueError("Error, the priority selection {} is not a valid choice!".format(item))
else:
    priorities = ["CRITICAL", "HIGH", "MEDIUM", "LOW"]
# validate user 'disposition' selection
if disposition.upper() != "ALL":
    dispositions = [item.upper().strip() for item in disposition.split(",")]
    for item in dispositions:
        if item not in ["BENIGN", "MALICIOUS", "NOT_APPLICABLE", "UNDECIDED"]:
            raise ValueError("Error, the disposition selection {} is not a valid choice!".format(item))
else:
    dispositions = ["BENIGN", "MALICIOUS", "NOT_APPLICABLE", "UNDECIDED"]
# validate user 'source' selection (omitting it, or passing ALL, selects every source)
if source and source.upper() != "ALL":
    sources = [item.upper().strip() for item in source.split(",")]
    for item in sources:
        if item not in ["USER", "HUNT", "ALERT"]:
            raise ValueError("Error, the source selection {} is not a valid choice!".format(item))
else:
    sources = ["USER", "HUNT", "ALERT"]
In the snippet above, we’ve defined the parameters for the tool, parsed the selections of the user, and validated the user’s selections to ensure they align with the categories we can use to categorize our investigations as stale or not stale.
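The three validation blocks follow the same pattern, so if you find yourself adding a fourth, it might be time to pull the pattern into a function. Here's a sketch of what that could look like. The helper name `parse_choices` is my own invention and is not part of the finished tool.

```python
def parse_choices(raw_value, valid_choices, label):
    """Turn a comma-separated string into a validated list of upper-case choices."""
    # no selection, or the keyword ALL, means every valid choice is selected
    if not raw_value or raw_value.strip().upper() == "ALL":
        return list(valid_choices)
    selections = [item.strip().upper() for item in raw_value.split(",") if item.strip()]
    for item in selections:
        if item not in valid_choices:
            raise ValueError(
                "Error, the {} selection {} is not a valid choice!".format(label, item))
    return selections


PRIORITIES = ["CRITICAL", "HIGH", "MEDIUM", "LOW"]
print(parse_choices("high, low", PRIORITIES, "priority"))
print(parse_choices(None, PRIORITIES, "priority"))
```

One function means one place to fix a bug, and it also makes the validation easy to test without running the whole tool.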
Next we will create an initial list of investigations and use the user selections to winnow that down to just the ones we want to close. By default, the ListInvestigations() method will only return investigations from the past four weeks/28 days. Since some investigations might be older than that, we can override that default as shown below.
# connect to InsightIDR and list investigations
print("Connecting to InsightIDR.")
api_key = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
api = idr.InsightIDR(api_key)
# use a start time of one year ago to catch (hopefully) all investigations
start_time = (idr.datetime.now(idr.timezone.utc) - idr.timedelta(365)).strftime("%Y-%m-%dT%H:%M:%SZ")
# list investigations
print("\nListing investigations.")
all_investigations = api.ListInvestigations(assignee_email,
                                            start_time,
                                            multi_customer=True,
                                            priorities=priorities,
                                            statuses=["OPEN", "INVESTIGATING"])
# filter by disposition
print("\nFiltering investigations.")
filtered_investigations = [inv for inv in all_investigations if inv["disposition"] in dispositions]
# filter by source
filtered_investigations = [inv for inv in filtered_investigations if inv["source"] in sources]
# filter by last access date
if days_since_last_access is not None:
    # get the threshold last access date
    threshold_access_date = idr.datetime.now(idr.timezone.utc) - idr.timedelta(days_since_last_access)
    stale_investigations = []
    for investigation in filtered_investigations:
        # get the investigation last access date and compare
        last_access_date = idr.datetime.strptime(investigation["last_accessed"], "%Y-%m-%dT%H:%M:%S.%f%z")
        # keep only investigations last accessed on or before the threshold date
        if last_access_date <= threshold_access_date:
            stale_investigations.append(investigation)
    # build a new list rather than removing items from the list being looped over
    filtered_investigations = stale_investigations
# filter by last alert date
if days_since_last_alert is not None:
    # get the threshold last alert date
    threshold_alert_date = idr.datetime.now(idr.timezone.utc) - idr.timedelta(days_since_last_alert)
    stale_investigations = []
    for investigation in filtered_investigations:
        # investigations with no last alert time are kept, since there is nothing to compare
        if not investigation["latest_alert_time"]:
            stale_investigations.append(investigation)
            continue
        # get the investigation last alert date and compare
        last_alert_date = idr.datetime.strptime(investigation["latest_alert_time"], "%Y-%m-%dT%H:%M:%S.%f%z")
        # keep only investigations whose last alert is on or before the threshold date
        if last_alert_date <= threshold_alert_date:
            stale_investigations.append(investigation)
    filtered_investigations = stale_investigations
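Date filters are easy to get backwards, so I like to try them on a couple of hand-made records before pointing them at real data. The sketch below keeps an investigation when its timestamp is at least the given number of days old (or when it has no timestamp at all) and builds a new list instead of removing items mid-loop. The helper `keep_if_older_than`, the sample records, and the fixed "now" are all made up for the demonstration.

```python
from datetime import datetime, timedelta, timezone

TIME_FORMAT = "%Y-%m-%dT%H:%M:%S.%f%z"


def keep_if_older_than(investigations, field, min_days, now=None):
    """Keep investigations whose `field` timestamp is at least `min_days` old."""
    now = now or datetime.now(timezone.utc)
    threshold = now - timedelta(days=min_days)
    kept = []
    for investigation in investigations:
        raw = investigation.get(field)
        # records without a timestamp are kept, since there is nothing to compare
        if not raw or datetime.strptime(raw, TIME_FORMAT) <= threshold:
            kept.append(investigation)
    return kept


# hand-made sample records and a fixed "now" so the result is repeatable
fixed_now = datetime(2023, 6, 30, tzinfo=timezone.utc)
samples = [
    {"title": "Old", "last_accessed": "2023-05-01T12:00:00.000Z"},
    {"title": "Recent", "last_accessed": "2023-06-29T12:00:00.000Z"},
]
stale = keep_if_older_than(samples, "last_accessed", 30, now=fixed_now)
print([inv["title"] for inv in stale])
```

Passing `now` as a parameter is purely a testing convenience; in the real tool the current time is fine.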
Now at last, we’ve filtered the investigations thoroughly based on the user’s criteria. Let’s give them a list of the investigations they’ve chosen to close and have them confirm their choice:
# confirm the choice
print("\nInvestigations slated for closure:")
for investigation in filtered_investigations:
    print("- {} (Created {})".format(investigation["title"], investigation["created_time"]))
choice = ""
while choice.lower() != "y":
    choice = input("\nYou have chosen to close the investigations listed above. Continue? (y/n)")
    if choice.lower() == "n":
        print("Exiting!")
        sys.exit()
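If you end up reusing this prompt in other tools, the loop can be wrapped in a small function. In this sketch the function that asks the question is passed in as a parameter, which lets us feed it scripted answers for a quick test. The names `confirm` and `ask` are hypothetical.

```python
def confirm(prompt, ask=input):
    """Keep asking until the user answers yes or no; return True for yes."""
    while True:
        answer = ask(prompt).strip().lower()
        if answer in ("y", "yes"):
            return True
        if answer in ("n", "no"):
            return False


# demonstration with scripted answers instead of a live keyboard prompt
scripted = iter(["maybe", "Y"])
confirmed = confirm("Continue? (y/n) ", ask=lambda _prompt: next(scripted))
print(confirmed)
```

In the actual tool you would call `confirm(...)` with the default `input` and exit when it returns `False`.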
…the result of which might look like:
Finally, use the UpdateInvestigation() method to close the selected investigations, and optionally add the comment to the closed investigation:
# close the investigations and add the comment
print("\nClosing investigations.")
for investigation in filtered_investigations:
    print("- Closing {}.".format(investigation["title"]))
    inv_id = investigation["rrn"]
    result = api.UpdateInvestigation(inv_id, status="CLOSED")
    if close_comment:
        comment = api.CreateComment(inv_id, close_comment)
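Bulk operations are exactly where the automation risks I mentioned earlier like to show up, so here is one possible safety net: a dry-run switch, plus error handling so that one failed update doesn't stop the rest. This is a sketch, not part of the finished tool. `close_all` and `fake_close` are hypothetical names, and the fake function stands in for the real API call.

```python
def close_all(investigations, close_fn, dry_run=False):
    """Close each investigation with close_fn; return (closed, failed) lists."""
    closed, failed = [], []
    for investigation in investigations:
        inv_id = investigation["rrn"]
        if dry_run:
            # report what would happen without changing anything
            print("- Would close {}.".format(investigation["title"]))
            continue
        try:
            close_fn(inv_id)
            closed.append(inv_id)
        except Exception as exc:
            # record the failure and carry on with the remaining investigations
            failed.append((inv_id, str(exc)))
    return closed, failed


def fake_close(inv_id):
    """Stand-in for the real API call; fails for one made-up identifier."""
    if inv_id == "rrn:bad":
        raise RuntimeError("simulated API error")


sample_investigations = [
    {"rrn": "rrn:one", "title": "First"},
    {"rrn": "rrn:bad", "title": "Second"},
]
closed, failed = close_all(sample_investigations, fake_close)
print(closed, failed)
```

In the real tool, `close_fn` could be something like `lambda inv_id: api.UpdateInvestigation(inv_id, status="CLOSED")`.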
Back in the InsightIDR platform, we should see a lovely listing of closed investigations along with the helpful comment describing why they were closed:
Awesome job!
Here’s the complete tool if you want to check it out — just don’t forget to add your own API key in place of the redacted value on line 83!
https://gist.github.com/mbabinski/86f2412ee94dbeac0b404aeb77fd9550
Task 3: Threat Hunting Saved Searches from Sigma
With our threat feed cron job/scheduled task completed and our Investigation Closure tool in the can, it’s time to move on to a useful ad-hoc automation task that could help save us time and boost our productivity on a one-off basis. Like all good SIEMs, InsightIDR allows us to create saved searches of interest that we can run against our logs, refine, and even promote to custom alerts if we are happy with the results we are getting from a detection standpoint.
To get some threat hunting/saved search inspiration, we’ll look at the brand-new Sigma rule category for Emerging Threats, described by Nasreddine Bencherchali here:
Before we proceed, I want to clarify that there are multiple ways to do this. Experienced Sigma project users will rightly point out that sigma-cli is a convenient way to efficiently convert large numbers of Sigma rules from the command line, which is different from the approach in this example. You can read all about sigma-cli and other exciting recent developments to the Sigma project here:
For today, however, here’s what we’ll do:
- Retrieve the latest 2023 Emerging Threats Sigma rules from the Sigma rules repository.
- Convert the 2023 Emerging Threats Sigma rules into InsightIDR Log Entry Query Language (LEQL) format.
- Save those converted rules as saved queries within the InsightIDR platform.
First, grab the latest-and-greatest set of Sigma rules by running the command:
$ git clone https://github.com/SigmaHQ/sigma.git
Next, install the powerful pySigma library and the pySigma backend (conversion library) for InsightIDR in one go by running this in your terminal or command prompt:
python -m pip install pysigma pysigma-backend-insightidr
From here, we can write some Python code to load the Sigma rules we are interested in into a pySigma “SigmaCollection” object, a format from which we can convert them to InsightIDR LEQL query format. Note: you’ll need to replace the file path with wherever these rules reside on your system.
from sigma.collection import SigmaCollection
import glob
# set our root glob directory to the 2023 emerging threats folder
path = r"C:\Temp\SIEM_Automation_Demo\sigma\rules-emerging-threats\2023"
# use glob to recursively list files with a .yml extension
files = glob.glob(path + "/**/*.yml", recursive=True)
# create the SigmaCollection rule set
rule_collection = SigmaCollection.load_ruleset(files, collect_errors=True)
With the rule collection created, we can import the InsightIDR pySigma backend functionality we need to convert the rule.
from sigma.backends.insight_idr import insight_idr
# creates the backend
insight_idr_backend = insight_idr.InsightIDRBackend()
Now let’s convert the rules to LEQL format, where possible. The InsightIDR Sigma backend doesn’t support all rule types, so we’ll test the conversion, then if it succeeds we’ll create a saved query with the output query and the time frame of “Last 7 Days.”
# rule conversion
for rule in rule_collection.rules:
    try:
        # convert the rule
        query = insight_idr_backend.convert_rule(rule, "leql_advanced_search")[0]
        # import to InsightIDR as a saved query
        result = api.CreateSavedQuery(rule.title[:32], query, time_range="Last 7 Days")
    except Exception as exc:
        # ignore rules that could not be converted
        pass
Note that the max allowable length for a saved query name in InsightIDR is 32 characters, so I’ve limited the name length in my code above. Having run this step, we now see a lovely set of saved queries in the Log Search section of InsightIDR.
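One side effect of cutting names to 32 characters is that two rules with long, similar titles can end up with the same saved query name, which makes them hard to tell apart. Here's a small sketch of one way to keep the shortened names distinct. The helper `unique_short_name` and the sample titles are made up for illustration, and I haven't checked how InsightIDR itself treats duplicate names.

```python
def unique_short_name(title, used_names, max_length=32):
    """Truncate a title to max_length, adding a numeric suffix if it is taken."""
    candidate = title[:max_length]
    counter = 2
    while candidate in used_names:
        suffix = " ({})".format(counter)
        # shorten the title further so the suffix still fits within the limit
        candidate = title[:max_length - len(suffix)] + suffix
        counter += 1
    used_names.add(candidate)
    return candidate


# hypothetical rule titles that share their first 32 characters
used = set()
titles = [
    "Potential Exploitation Attempt Of CVE-2023-0001 - Process",
    "Potential Exploitation Attempt Of CVE-2023-0001 - File",
]
names = [unique_short_name(title, used) for title in titles]
for name in names:
    print(len(name), name)
```

In the conversion loop, `rule.title[:32]` could then be swapped for `unique_short_name(rule.title, used)`.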
Happy hunting!
Conclusion
If you’ve made it this far and followed along, congratulations! You now have experience automating real-life processes within a modern, cloud-based SIEM. If you’re new to Python tool development or web APIs, this may seem cumbersome and complicated. Sometimes, it may seem more costly to automate a task than to simply copy and paste, or click through whatever the task entails, in the GUI. However, from an efficiency standpoint, automating processes quickly pays off. As you develop your skills and grow in your career, three things will happen:
- You will be able to write scripts and automate processes faster.
- You and your team will more quickly realize the benefits of automation due to shortened development time and fewer errors.
- Your judgement will improve with experience, allowing you to more accurately evaluate which processes would be good candidates for automation, based on your current skill level.
So, for any students or early-career security professionals interested in automating processes, I’ll leave you with the advice to stick with it, ask for help, and take a break when you need one.
Happy building! 🔧👷🔨