Whether DarkShield can find the sensitive values in your sources depends on the search matcher or matchers you associated with your data classes (see prior tab). You can choose from:
Location Matchers (faster, for structured or semi-structured sources)
- Column name or pattern-match to a column name
- List file value (from a list of locations, like a list of column names)
- Range (matches within a specified range of indexed locations)
- Excel cells (and worksheets)
- CSV (or other delimited file) header row names
- JSON or XML path
and/or
Data Matchers (to scan the contents of each item for a match to one of the following):
- RegEx pattern with or without computational validation in JavaScript
- Dictionary (set file) value lookup (exact match)
- Fuzzy lookup value (a close match rather than an exact one)
- Open NLP (NER) model
- PyTorch (NER) model
- TensorFlow (NER) model
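To illustrate what "RegEx pattern with computational validation" means in the list above, here is a minimal sketch. It is written in Python for illustration only; per the list, DarkShield's own validation scripts are JavaScript, and nothing here reflects DarkShield's actual API. The pattern, the function names, and the choice of the Luhn checksum as the validator are all assumptions made for the example.

```python
import re
from typing import List

# Hypothetical pattern: 13 to 19 digits, optionally separated by spaces or hyphens.
CARD_PATTERN = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(candidate: str) -> bool:
    """Computational validation step: the Luhn checksum used by payment card numbers."""
    digits = [int(ch) for ch in candidate if ch.isdigit()]
    total = 0
    # Walk from the rightmost digit, doubling every second one.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return len(digits) >= 13 and total % 10 == 0


def find_card_numbers(text: str) -> List[str]:
    """Keep only the pattern matches that also pass the checksum."""
    return [m.group(0) for m in CARD_PATTERN.finditer(text) if luhn_valid(m.group(0))]
```

The pattern alone would flag any long digit run; the checksum step is what removes most false positives, which is the reason to pair a pattern with validation.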
When you run a search-only or search-and-mask job in the IRI Workbench GUI for DarkShield, the search engine scans the file and database silos specified in your connection profiles, using the matcher or matchers you chose (above) to find sensitive data.
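The difference between location and data matchers during a scan can be sketched as follows. This is a conceptual illustration, not how DarkShield is implemented: the header pattern, the name list standing in for a set file, the 0.8 similarity cutoff, and the `scan_csv` function are all hypothetical, and the fuzzy step uses Python's standard `difflib` as a stand-in for whatever fuzzy algorithm the product applies.

```python
import csv
import difflib
import io
import re

# Hypothetical location matcher: header names that look like e-mail columns.
EMAIL_COLUMN = re.compile(r"e-?mail", re.IGNORECASE)
# Hypothetical dictionary (stand-in for a set file) of known last names.
LAST_NAMES = ["Smith", "Johnson", "Nguyen"]


def scan_csv(text):
    """Return (row_number, column, value, matcher) tuples for every hit."""
    hits = []
    reader = csv.DictReader(io.StringIO(text))
    for row_number, row in enumerate(reader, start=1):
        for column, value in row.items():
            if not column or not value:
                continue  # skip ragged or empty cells
            if EMAIL_COLUMN.search(column):
                # Location match: the column name alone flags the value.
                hits.append((row_number, column, value, "column-name"))
            elif value in LAST_NAMES:
                # Data match: exact dictionary lookup on the cell contents.
                hits.append((row_number, column, value, "dictionary"))
            elif difflib.get_close_matches(value, LAST_NAMES, n=1, cutoff=0.8):
                # Data match: close, but not exact, to a dictionary entry.
                hits.append((row_number, column, value, "fuzzy"))
    return hits
```

Note that the location matcher never inspects the cell contents, which is why it is the faster option for structured sources, while the data matchers must examine every value.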
Information about the data found during your searches is recorded in a JSON annotation log, which DarkShield can use at search time or later to mask the same data with the function you assigned to each data class. Search results can also be directed to a delimited text log that contains metadata associated with (file search) results. Optionally, DarkShield can also produce a layout for that log file in SortCL data definition format (DDF) to facilitate combined CoSort transformation and reporting on those very logs.
Data from these logs is also used in an HTML5-compatible dashboard that DarkShield can display to help you locate PII by data class and ranked locations. The log data can also be exported to external SIEM/SOC and log analytics platforms like Splunk for additional query and visual insight, or for actions like a Splunk Adaptive Response or Phantom Playbook that trigger an email or a DarkShield masking job.