A Splunk Phantom Playbook for Masking Sensitive Data
Introduction
Splunk Phantom is an orchestration, automation, and response technology for running “Playbooks” to respond to various conditions. Phantom connects to Splunk Enterprise using the Phantom App for Splunk, so that actions can be taken on knowledge derived from data indexed in Splunk.
IRI DarkShield is a powerful data masking package that can discover, delete, de-identify, and/or deliver PII hidden in a wide range of unstructured data sources. In v3, a command line interface (CLI) was added, allowing third-party applications, including Phantom, to embed or run remediation (masking) jobs configured for DarkShield.
It is thus now possible to automate security responses to PII vulnerabilities in dark data uncovered in DarkShield PII searches. Specifically, Phantom can automatically run DarkShield to plug those holes through playbooks that use Splunk to evaluate the data that DarkShield found. Compare this to sending an alert email, described in our prior article on using the Splunk Adaptive Response Framework with DarkShield.
Prerequisites
Here are the underlying components used for this turnkey solution:
- Splunk® Phantom Version 4.5+
- IRI DarkShield Version 3.0
- IRI DarkShield CLI (included)
- Splunk Enterprise Version 7.3+
- SSH server-enabled host
- Virtualization software such as VMware Fusion®, VMware Fusion Pro®, VMware Workstation Player®, VMware Workstation Pro®, or Oracle® VirtualBox.
Phantom Setup
To start, create a Splunk Phantom account at https://my.phantom.us/ if you don’t already have one. Once signed into the Phantom Community site, the Splunk Phantom Community Edition virtual appliance is available for download from the “Products” section of the website. The virtual appliance can be utilized with a large variety of virtualization software.
In this example, I use Oracle VirtualBox. The Splunk Phantom virtual appliance needs to be running to access the Splunk Phantom server. Follow the instructions found on the Phantom Community site until you reach the point where the server's IP address is displayed. Open this IP address in your web browser, and sign in to Splunk Phantom.
The Splunk Phantom virtual appliance is a CentOS Linux virtual machine that hosts the Phantom server. Because playbooks run inside that virtual machine, SSH needs to be used to run command line actions on your host machine. While this means more information is needed to run the IRI DarkShield Remediation Playbook, it also makes the playbook more versatile: the host that holds the DarkShield CLI and the DarkShield .search file can be any machine reachable over SSH.
DarkShield CLI Setup
The DarkShield CLI runs DarkShield externally to mask data; i.e., it allows the masking jobs to run outside the graphical development and execution environment of IRI Workbench.
Download the DarkShield CLI, which requires a DarkShield or Voracity license to run.
Once downloaded, unzip the contents of the darkshield.zip archive. You should have the following structure:
darkshield\
├── darkshield
├── darkshield.bat
├── darkshield.jar
The darkshield folder should be added to the system path so that darkshield can run from anywhere, such as in the DarkShield Remediation Phantom Playbook described in this article.
Using the DarkShield CLI requires the following:
- Java 8 (JRE) or OpenJDK 11
- Windows / Linux / Mac OS X host running on premises or in the cloud
- .search/.darkdata files (see the DarkShield Product Overview booklet)
- Valid DarkShield or Voracity license
Like most IRI software, DarkShield relies on a core data processing program called SortCL, which in this case masks the data. By default, DarkShield uses the $COSORT_HOME environment variable to find its bin directory and the sortcl executable within it.
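The two setup requirements above (the darkshield launcher on the system path, and $COSORT_HOME pointing at a directory with a bin folder) can be checked before the playbook ever runs. The following is a minimal sketch, not part of the DarkShield distribution; the function name and the warning messages are my own, and it only checks what this article states about the setup.

```python
import os
import shutil


def check_darkshield_environment(env=None, path=None):
    """Return a list of problems found; an empty list means the checks passed.

    Hypothetical helper for illustration. `env` and `path` default to the
    current process environment and PATH, and exist mainly to ease testing.
    """
    env = os.environ if env is None else env
    problems = []

    # The darkshield launcher should be discoverable on the system path.
    if shutil.which("darkshield", path=path) is None:
        problems.append("darkshield launcher not found on PATH")

    # By default DarkShield looks for sortcl under $COSORT_HOME/bin.
    cosort_home = env.get("COSORT_HOME")
    if not cosort_home:
        problems.append("COSORT_HOME is not set")
    elif not os.path.isdir(os.path.join(cosort_home, "bin")):
        problems.append("COSORT_HOME has no bin directory")

    return problems


if __name__ == "__main__":
    for problem in check_darkshield_environment():
        print("WARNING:", problem)
```

Running the script on the DarkShield host prints one warning per failed check and nothing when both checks pass.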
Install and Configure the Phantom App for Splunk
In this case, I will be transferring data from Splunk Enterprise to Splunk Phantom. I can even speed that flow through Splunk Universal Forwarder, though Splunk Phantom supports many other data sources.
Note that the word ‘search’ has more than one meaning in this article. A Splunk search runs against data indexed in Splunk, which here comes from the results of an earlier DarkShield search. A DarkShield search is defined in a .search file, which is produced by running the Dark Data Discovery Wizard in IRI Workbench. The .search file can be used with the DarkShield CLI to search for (and optionally remediate) instances of PII in unstructured file types.
You must have a Splunk account and a current instance of Splunk Enterprise or Splunk Enterprise Security in order to download and use the Phantom App for Splunk. Within Splunk, make sure that both the Phantom App for Splunk and the Phantom Remote Search app have been installed. These apps send Splunk search results to Phantom as an event.
Go to the Phantom Server configuration tab from the Phantom App for Splunk navbar to specify the server details needed to connect to your Splunk Phantom instance. This includes the IP address of the Phantom instance, and an authorization token called a ph-auth token that authorizes the connection.
Create a server, entering the IP address and ph-auth token of the Splunk Phantom instance. The ph-auth token can be found by clicking on the automation user under Administration > User Management > Users in the Splunk Phantom menu. Copy the full Authorization Configuration for REST API from that page.
Once the Phantom Server is configured in the Phantom App for Splunk, it should be possible to test the connection.
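The same connection details can also be checked outside Splunk with a direct REST call. This is a rough sketch under assumptions: that the Phantom REST API accepts the token in a ph-auth-token header and exposes a /rest/version endpoint (verify both against the REST API documentation for your Phantom release), and that the server address and token values are placeholders you supply. The function names are mine.

```python
import json
import ssl
import urllib.request


def build_version_request(server, token):
    """Build a GET request for the (assumed) Phantom version endpoint."""
    # The automation user's token travels in the ph-auth-token header.
    return urllib.request.Request(
        f"https://{server}/rest/version",
        headers={"ph-auth-token": token},
        method="GET",
    )


def fetch_version(server, token, verify_tls=True):
    """Send the request and return the decoded JSON response body."""
    context = ssl.create_default_context()
    if not verify_tls:
        # A test appliance may present a self-signed certificate; only
        # disable verification on an isolated lab network.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    request = build_version_request(server, token)
    with urllib.request.urlopen(request, context=context, timeout=10) as response:
        return json.load(response)
```

A call such as fetch_version("192.0.2.10", "your-token", verify_tls=False) should return a small JSON document if the address and token are valid, and raise an HTTP or URL error otherwise.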
Export DarkShield Search Results to Phantom via Splunk
Phantom allows Splunk searches from Splunk Enterprise and Splunk ES (as well as many other SIEMs and sources) to be exported to Phantom. A practical use of this is to set searches that yield results only for a certain parameter, value, or threshold.
The DarkShield Remediation playbook is designed to act when a certain value or threshold of PII vulnerability as discovered in the DarkShield search process (the .search file results) is reached. The playbook will run DarkShield through its CLI to mask that PII when Splunk finds a specified number of unprotected instances in the indexed results.
The source of the Splunk search information in this case is the output of a PII scan (requiring a .search file from the Dark Data Discovery Wizard). The scan writes a .txt file that contains detailed information about instances of unprotected PII in unstructured files. This .txt file can be indexed into Splunk either manually or through the Splunk Universal Forwarder.
To get the data indexed from the DarkShield .txt file out of Splunk and into Phantom, set up a search in Splunk that yields results only when you want a Phantom playbook to run. In this case, I search the DarkShield .txt log file to discover when DarkShield found more than 3 instances of unprotected PII in a document (such as a PDF).
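The threshold rule itself is simple, and it can help to see it outside Splunk. The sketch below is illustrative only: it assumes the findings have already been parsed into (file path, PII type) pairs, which is a hypothetical shape and not the actual layout of the DarkShield .txt file, and the sample values are made up.

```python
from collections import Counter


def files_over_threshold(findings, threshold=3):
    """Return {file_path: count} for files with more than `threshold` findings.

    `findings` is an iterable of (file_path, pii_type) pairs. This record
    shape is hypothetical; the real DarkShield log layout may differ.
    """
    counts = Counter(file_path for file_path, _ in findings)
    return {path: n for path, n in counts.items() if n > threshold}


if __name__ == "__main__":
    # Made-up sample data: four findings in one PDF, two in a DOCX.
    sample = [("report.pdf", "SSN")] * 4 + [("notes.docx", "EMAIL")] * 2
    print(files_over_threshold(sample))
```

Only files that exceed the threshold are returned, which mirrors a Splunk search that yields results only when the playbook should run.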
Save the search as a report via the “Save As” dropdown. Then click on Settings > Searches, Reports, and Alerts. Make sure the search is stored under the Phantom App. Once the Splunk search has been saved as a report under the Phantom App, navigate to the Event Forwarding page in the Phantom App.
Click the green “New Saved Search Export” button to open the export form.
Fill out the necessary details, such as the name of the export, the saved search to use, the destination (the Phantom server) receiving the search results, and the level of significance of the alert.
Clicking the green “Save and Close” button will automatically send the Splunk search results to Phantom. Clicking “Save and Preview” will allow you to audit which of those search result entries to send to Phantom.
Running the Playbook
From Splunk Phantom, playbooks such as the IRI DarkShield Remediation Playbook can be run to remedy the unprotected files that contain personally identifiable information.
The DarkShield Remediation Playbook works by using SSH to access the machine with the .search file produced by DarkShield.
When this playbook runs, it first prompts the user for the IP address or hostname of the machine that holds the .search file. Three more prompts follow.
The next two prompts ask for the username and then the password to use for the SSH connection to that machine. The final prompt asks for the absolute location of the .search file, so the DarkShield CLI will know which .search file to execute (lest it mask the wrong files!).
After the Phantom administrator has answered all 4 prompts, the DarkShield CLI will launch a remediation (data masking) job on the computer at the specified address. This should mask the instances of PII that DarkShield identified and that were indexed into Splunk.
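To make the flow concrete, here is a rough sketch of how the prompt answers could be turned into a remote command. It is not the code of the downloadable playbook. Passing the .search path as the only argument to darkshield is an assumption made for illustration, so check the CLI's own usage text for the real syntax. The sketch also assumes a POSIX-style remote path and leaves authentication to ssh (for example, key-based login) instead of handling the password prompt. Both function names are mine.

```python
import posixpath
import shlex


def build_remediation_command(search_file, executable="darkshield"):
    """Return the shell command to run on the remote host.

    ASSUMPTION: the CLI takes the .search path as its sole argument.
    """
    if not posixpath.isabs(search_file):
        raise ValueError("the .search file location must be an absolute path")
    if not search_file.endswith(".search"):
        raise ValueError("expected a file with the .search extension")
    # Quote the path so spaces or shell metacharacters cannot alter the command.
    return f"{executable} {shlex.quote(search_file)}"


def build_ssh_argv(host, username, search_file):
    """Return an argument list suitable for subprocess.run (no local shell)."""
    return ["ssh", f"{username}@{host}", build_remediation_command(search_file)]


if __name__ == "__main__":
    # Placeholder host, user, and path; substitute the prompt answers.
    print(build_ssh_argv("192.0.2.20", "iri", "/opt/jobs/scan.search"))
```

Validating the path before connecting guards against the wrong-file risk mentioned above, since a relative or mistyped location is rejected before any remote command is built.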
Download this Playbook
You can use the same IRI DarkShield Remediation Playbook I did. Download the archive here, which includes notes and the playbook. You can modify the playbook using Phantom’s Python Playbook editor.
Looking Forward
This example presents a turnkey solution to automating the protection of sensitive information hidden in unstructured files on an event-driven basis. It takes some familiarity with the IRI DarkShield command line interface, Splunk Enterprise, and Splunk Phantom, but once built, it’s something that can run for years.
There is potential for integrations with other IRI products and Phantom playbooks. If certain values are detected through a regular Splunk instance after Voracity produces them — for example in customer transaction or IoT sensor data — actions can be taken. These actions can be executed remotely, which allows for smooth automation procedures.
See the article IRI Voracity Data Munging and Masking App for Splunk for more ideas. Contact info@iri.com if you want to automate data discovery and manipulation jobs in Splunk Phantom.