Challenges
Your organization manages personally identifiable information (PII). Your data governance efforts must prevent the kinds of data disasters posted at the Privacy Rights Clearinghouse. You must comply with industry and government data privacy rules.
Since you cannot eliminate PII, you have to discover it, classify it, protect it, and verify that you protected it. Then you have to continue monitoring and addressing data risks going forward.
At protection time, technology choices are difficult. Traditional encryption of entire databases, files, disks, or devices is inefficient (especially in volume), restricts access to non-sensitive data, and is subject to complete exposure from a single password breach. Many data obfuscation tools are insecure, complex, expensive, or render the protected data unusable for testing.
Moreover, with current data masking tools you may not get:
- support for finding, extracting, classifying, or applying rules to data that meets PII criteria
- an audit trail detailing how you managed risk -- without one, you face a costly validation exercise
- a separation of encryption and key management (in case either is compromised)
- the ability to simultaneously apply multiple protections to multiple data sources
- the ability to combine data protection with other data processing operations
"Solutions that provide encryption at the file, database field, and application level provide the highest level of security while allowing authorized individuals ready access to the information. Decentralized encryption and decryption provide higher performance and require less network bandwidth, increase availability by eliminating points of failure, and ensure superior protection by moving data around more frequently but securely."
-- Gary Palgon, Enterprise Systems Journal
Solutions
IRI Data Protector Suite 'shield' products such as IRI FieldShield -- like their parent SortCL program in the IRI CoSort package and IRI Voracity platform -- support field-level data masking functions for data in tables and files. They protect PII, and:
- find and classify it so global masking rules can be applied later or at the same time
- consistently apply the data masking (replace, encrypt, pseudonymize, hash, etc.) function you choose for each class of data, preserving referential integrity
- maintain data realism with format-preserving encryption, pseudonymization, referential integrity, etc.
- save time, money, and inconvenience by not masking non-sensitive data
- strengthen security by supporting the application of different functions to different data sources and elements
- improve efficiency by combining data protection with data transformation and reporting
- verify compliance with multiple data privacy laws with re-ID risk scoring and query-ready audit logs of protection jobs
- send compliant data to applications, reports, databases, the cloud, and BI / analytic targets
- implement data loss prevention (DLP) programs properly, and without undue complexity
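To make the "consistent masking" and "referential integrity" points above concrete, here is a minimal Python sketch. It is not IRI syntax or an IRI API: the `mask_id` and `apply_rule` helpers, the sample tables, and the demo key are all hypothetical. It assumes a keyed hash (HMAC-SHA256) as the masking function, so the same input always yields the same token and masked key columns still join across tables, while non-sensitive fields are left untouched.

```python
import hashlib
import hmac

# Hypothetical demo key; a real key would live in a key manager,
# stored separately from the masked data.
SECRET_KEY = b"demo-key-not-for-production"


def mask_id(value: str, key: bytes = SECRET_KEY) -> str:
    """Deterministically map a value to a 12-hex-digit token (HMAC-SHA256)."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:12]


def apply_rule(rows, field, rule):
    """Return copies of rows with `rule` applied to `field` only."""
    return [{**row, field: rule(row[field])} for row in rows]


# Two hypothetical sources that share a sensitive key column.
customers = [
    {"ssn": "123-45-6789", "name": "Ann Lee"},
    {"ssn": "987-65-4321", "name": "Bob Ray"},
]
orders = [
    {"order_id": 1, "ssn": "123-45-6789"},
    {"order_id": 2, "ssn": "987-65-4321"},
    {"order_id": 3, "ssn": "123-45-6789"},
]

# The same rule is applied to the same class of data in both sources.
masked_customers = apply_rule(customers, "ssn", mask_id)
masked_orders = apply_rule(orders, "ssn", mask_id)

# Orders 1 and 3 still point at the first customer after masking.
print(masked_orders[0]["ssn"] == masked_customers[0]["ssn"])
```

Because the token depends only on the input and the key, the rule can run on each source independently and the masked tables remain joinable.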
IRI CellShield software does the same for PII in Excel spreadsheets, and IRI DarkShield does it for PII hidden in structured, semi-structured, and unstructured sources, including RDBs, free-form text, MF and PDF documents, NoSQL DBs, Parquet and image files.
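As a rough illustration of what finding and masking PII in free-form text involves, the Python sketch below searches a string with two regular expressions and replaces each match with its class label. It does not reproduce how any IRI product works: the `PII_PATTERNS` table and the `find_pii` and `redact` helpers are hypothetical, and the patterns are deliberately simplistic.

```python
import re

# Illustrative patterns only; production classifiers need far more
# robust rules (validation, context, dictionaries, and so on).
PII_PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "US_SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def find_pii(text):
    """Return (class_name, matched_text, start, end) tuples in text order."""
    hits = []
    for name, pattern in PII_PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((name, match.group(), match.start(), match.end()))
    return sorted(hits, key=lambda hit: hit[2])


def redact(text):
    """Replace every match with a [CLASS] placeholder."""
    for name, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{name}]", text)
    return text


note = "Contact jane.doe@example.com about SSN 123-45-6789."
print(find_pii(note))
print(redact(note))  # Contact [EMAIL] about SSN [US_SSN].
```

Keeping discovery (`find_pii`) separate from protection (`redact`) mirrors the idea that classification results can drive masking rules either immediately or in a later pass.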
Non-Recoverable Functions
Irreversible data masking options in IRI masking tools include:
- redaction or omission
- anonymization (blur or bucket)
- randomization
- pseudonymization (via random pull)
- hashing or tokenization
- custom field functions
- deletion
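Three of the irreversible options above (redaction, bucketing, and hashing) can be sketched generically in Python. This is an assumption-laden illustration, not IRI's implementation: the function names, the keep-last-four policy, the ten-year bucket width, and the salt are all hypothetical choices.

```python
import hashlib


def redact_keep_last4(value: str, mask_char: str = "*") -> str:
    """Mask every letter or digit except the last four; keep separators."""
    total = sum(ch.isalnum() for ch in value)
    seen = 0
    out = []
    for ch in value:
        if ch.isalnum():
            seen += 1
            out.append(ch if seen > total - 4 else mask_char)
        else:
            out.append(ch)
    return "".join(out)


def bucket_age(age: int, width: int = 10) -> str:
    """Generalize an exact age into a range such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def salted_hash(value: str, salt: bytes) -> str:
    """One-way SHA-256 digest of salt + value, as a hex string."""
    return hashlib.sha256(salt + value.encode("utf-8")).hexdigest()


print(redact_keep_last4("4111-1111-1111-1234"))  # ****-****-****-1234
print(bucket_age(37))                            # 30-39
print(salted_hash("123-45-6789", b"demo-salt"))
```

None of these outputs carries enough information to rebuild the input on its own. A plain hash of a low-entropy value such as a nine-digit number can still be brute-forced, however, which is one reason a secret salt or key matters.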
Recoverable Functions
Reversible data masking functions in IRI masking tools include:
- encryption (decryption)
- bit scrambling (less secure)
- binary encoding / decoding
- pseudonymization (via restore set)
- expression / string logic
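Two of the reversible options above, encoding and pseudonymization via a restore set, can be illustrated with a short Python sketch. It is a generic illustration under stated assumptions rather than IRI's method: Base64 stands in for "binary encoding", and the `build_restore_set` helper, its seed, and the sample names are hypothetical. Encoding alone offers no secrecy, and the restore set itself must be protected because it is the key to reversing the mapping.

```python
import base64
import random


def encode(value: str) -> str:
    """Base64-encode a string (obscures, but does not secure, the value)."""
    return base64.b64encode(value.encode("utf-8")).decode("ascii")


def decode(token: str) -> str:
    """Reverse encode()."""
    return base64.b64decode(token.encode("ascii")).decode("utf-8")


def build_restore_set(originals, pseudonyms, seed=0):
    """Pair each distinct original with a distinct pseudonym.

    Returns (forward, reverse) dictionaries; `reverse` is the restore set.
    """
    distinct = sorted(set(originals))
    pool = sorted(set(pseudonyms))
    if len(pool) < len(distinct):
        raise ValueError("need at least as many pseudonyms as originals")
    random.Random(seed).shuffle(pool)
    forward = dict(zip(distinct, pool))
    reverse = {pseudo: orig for orig, pseudo in forward.items()}
    return forward, reverse


names = ["Ann", "Bob", "Ann"]
forward, reverse = build_restore_set(names, ["Kim", "Lou", "Max"])

masked = [forward[name] for name in names]
restored = [reverse[pseudo] for pseudo in masked]

print(masked)
print(restored == names)
print(decode(encode("123-45-6789")) == "123-45-6789")
```

The mapping is one-to-one, so repeated values mask consistently and can be restored exactly by anyone holding the restore set.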
Synthetic Test Data
IRI RowGen uses the same metadata as FieldShield and Voracity to randomly generate or select realistic test data column-by-column. While synthesizing that data, RowGen can also uniquely transform and report from it at the same time!
Use RowGen to create anonymous, referentially correct test data for databases and files without production data.
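The idea of referentially correct test data built without production data can be sketched in a few lines of Python. This is not RowGen syntax or behavior: the table layouts, value lists, and function names are hypothetical, and the sketch assumes a simple parent/child pair in which every generated order must reference a generated customer.

```python
import random

# Hypothetical value sets to select from; no production data is involved.
FIRST_NAMES = ["Ana", "Ben", "Chen", "Dara", "Eli"]
LAST_NAMES = ["Ito", "Khan", "Lopez", "Meyer", "Novak"]


def make_customers(count, rng):
    """Generate parent rows with unique primary keys."""
    customers = []
    for customer_id in range(1, count + 1):
        first = rng.choice(FIRST_NAMES)
        last = rng.choice(LAST_NAMES)
        customers.append({
            "customer_id": customer_id,
            "name": f"{first} {last}",
            # example.com is a domain reserved for documentation use.
            "email": f"{first}.{last}{customer_id}@example.com".lower(),
        })
    return customers


def make_orders(customers, count, rng):
    """Generate child rows whose foreign keys all exist in `customers`."""
    ids = [c["customer_id"] for c in customers]
    return [
        {
            "order_id": n,
            "customer_id": rng.choice(ids),
            "amount": round(rng.uniform(5, 500), 2),
        }
        for n in range(1, count + 1)
    ]


def generate(seed=42):
    """Build both tables from one seeded generator for repeatable output."""
    rng = random.Random(seed)
    customers = make_customers(5, rng)
    orders = make_orders(customers, 12, rng)
    return customers, orders


customers, orders = generate()
valid_ids = {c["customer_id"] for c in customers}
print(all(o["customer_id"] in valid_ids for o in orders))
```

Seeding the generator makes a test set reproducible, which helps when a failing test needs to be rerun against identical data.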