Challenges
While masking data, or producing useful test data, you need output values that look real, but do not reveal personally identifiable information (PII). This is particularly true with the names of people, places, and things.
Encryption, scrambling, redaction, hashing, and many other data obfuscation functions protect data at risk, but do not provide the level of realism that certain recipients require. You need an easier way to change the individualizing characteristics of data by substituting a different, but realistic, output value. This is also referred to as data shuffling.
You must also ensure that the real name cannot be readily discovered through reversal or guesswork. And if you want to provide replacement names, or pseudonyms, for people in production or test data environments, the replacements need to remain consistent to preserve referential integrity, and the values need to stay updated as original names come and go.
Solutions
If you work with PII in tables or flat files, use IRI FieldShield -- or the SortCL program in the IRI CoSort product or IRI Voracity platform -- to replace that data with safe, but realistic replacement output stored in DB tables or external data sets called set files. If you need to do the same with ranges in Excel, use IRI CellShield, or for unstructured data sources, use IRI DarkShield. They support:
Recoverable Pseudonymization
Specify a lookup set where real and fake names are either pre-associated, or automatically associated at random. Use the restore set to recover the original names.
Unrecoverable Pseudonymization
Randomly select substitute names for the original value from a set file containing real or fake names. This way the original name value has no automatic basis for restoration.
Consistent, Self-Updating Pseudonymization
Choose a hash set rule or a palette item in IRI Workbench to keep pseudonyms consistent and up to date while preserving uniqueness and referential integrity.
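The three approaches above can be illustrated with a minimal Python sketch. This is not IRI's implementation or SortCL syntax; the `PSEUDONYMS` list stands in for a set file, and every function name and the `key` parameter are hypothetical. Note that the hash-based variant here can map two different originals to the same pseudonym, so unlike the product feature described above it does not by itself guarantee uniqueness.

```python
import hashlib
import hmac
import random

# Hypothetical stand-in for a set file of replacement names.
PSEUDONYMS = ["Alex Morgan", "Jamie Lee", "Sam Carter", "Robin Diaz", "Taylor Kim"]


def build_lookup(originals, pseudonyms, seed=None):
    """Recoverable: pair each distinct original with a distinct pseudonym at random.

    Returns a forward map (real -> fake) and a restore map (fake -> real).
    """
    distinct = sorted(set(originals))
    if len(distinct) > len(pseudonyms):
        raise ValueError("not enough pseudonyms for a one-to-one mapping")
    rng = random.Random(seed)
    chosen = rng.sample(pseudonyms, len(distinct))  # sampling without replacement
    forward = dict(zip(distinct, chosen))
    restore = {fake: real for real, fake in forward.items()}
    return forward, restore


def random_substitute(_original, pseudonyms, rng=random):
    """Unrecoverable: pick any pseudonym; no mapping to the original is kept."""
    return rng.choice(pseudonyms)


def consistent_pseudonym(original, pseudonyms, key):
    """Consistent: a keyed hash of the original selects the replacement,
    so the same input and key always yield the same pseudonym."""
    digest = hmac.new(key, original.encode("utf-8"), hashlib.sha256).digest()
    index = int.from_bytes(digest[:8], "big") % len(pseudonyms)
    return pseudonyms[index]
```

In this sketch, only `build_lookup` keeps a restore map, so it is the only variant whose output can be reversed. `consistent_pseudonym` needs no stored table, which is why newly arriving names get a stable replacement without any update step, provided the key stays secret.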
Specify the pseudonymization method for your output fields in simple 4GL job scripts, or use the pseudonymization options in the masking dialogs of the FieldShield GUI or the DarkShield wizards, which share the same Eclipse™ IDE. You can also use CellShield, which supports pseudonymous lookup replacements of values in Excel.
Pseudonymization is only one method you can use to shuffle the contents and thereby de-identify information in a record. You can also combine pseudonyms with other field-level data security functions.
Need Test Names?
In addition to pseudonymizing and otherwise masking production data, there is a standalone solution for producing safe, but realistic first and last names of either gender (or other nouns). IRI RowGen uses the same metadata as FieldShield (and SortCL) to create and format pseudonyms for use as test data values (or in formatted test data targets).
RowGen is especially helpful for providing anonymous, but real-looking, test data when production data is unavailable or insufficient. RowGen builds structurally and referentially correct test data into database, file, and report targets. RowGen is also included in Voracity.
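As a rough illustration of what "structurally and referentially correct" test data means, the Python sketch below builds synthetic customer names from two small name sets and then generates orders that only reference customer IDs that exist. It is not RowGen's implementation; the name lists, field names, and functions are all hypothetical.

```python
import random

# Hypothetical stand-ins for first-name and last-name set files.
FIRST_NAMES = ["Maria", "James", "Aisha", "Chen", "Lucas"]
LAST_NAMES = ["Nguyen", "Okafor", "Silva", "Brown", "Haddad"]


def make_customers(count, seed=0):
    """Build parent rows with sequential keys and randomly chosen names."""
    rng = random.Random(seed)  # seeded so runs are repeatable
    return [
        {
            "customer_id": i,
            "first_name": rng.choice(FIRST_NAMES),
            "last_name": rng.choice(LAST_NAMES),
        }
        for i in range(1, count + 1)
    ]


def make_orders(customers, count, seed=0):
    """Build child rows whose foreign keys are drawn only from existing customers."""
    rng = random.Random(seed)
    valid_ids = [c["customer_id"] for c in customers]
    return [
        {"order_id": i, "customer_id": rng.choice(valid_ids)}
        for i in range(1, count + 1)
    ]
```

Because no production data is read at any point, the generated names cannot leak a real identity, and drawing foreign keys from the parent rows keeps joins between the two tables valid.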
Related Solutions