Format-Preserving Encryption … Another Important FieldShield Function [video]
According to Trent Henry of the Burton Group,
Data protection needs to take into account both internal and external threats, including those against critical database and application resources. However, many protection mechanisms require application retooling or add complexity that delays or halts deployment. Format-preserving encryption overcomes these issues. It’s an exciting step toward improved, simpler data protection and compliance with regulatory requirements.
Format-preserving encryption refers to encrypting data in such a way that the output is in the same format as the input. That format could be representative of a field length like 16 digits for a credit card number, an American birth date represented as a valid MM/DD/YYYY format, or even a simple valid English word being converted into another valid English word. So the format could be anything representative of valid data that is intended to be present in the field.
Format-preserving encryption typically addresses the need to alleviate problems that arise from integrating encrypted data back into existing applications. If a program is designed to encrypt a 16-digit credit card number, and converts it into hexadecimal values using AES 128-ECB, the resulting character string will produce many bytes that are considered invalid for that field. And because the hexadecimal values create a string of characters that is much longer than the original 16-digit credit card number, it can not be stored in the same field location because the size is too large.
FieldShield offers several encryption and decryption functions, including 128 and 256-bit AES standard encryption, Triple DES, FIPS-compliant OpenSSL, and GPG routines. Encryption can be applied to individual columns of data, or to several columns of data at once, even on a variety of data sources (table, flat file, pipe, Excel spreadsheet or custom input procedure). Protection can also be applied en masse to specified columns (using name pattern-matching) in an entire database schema, or to just two or more specified tables. Wizards are available in the IRI Workbench to facilitate all protection operations.
Encryption with FieldShield is reversible, and therefore every supplied encryption routine has a corresponding decryption routine. You can apply various types of encryption to various columns within the same, or multiple data sources, and just as easily restore all original values in a new job using the appropriate decryption routines, all with the same wizards used for encryption.
Format-Preserving Encryption (on a single data source):
Consider the following input data consisting of name, email address, social security number, and phone number fields:
Adams John|adams@gmail.com|770-75-3849|(684) 460-0228 Jefferson Thomas|Jefferson@Thomas.com|345-78-3485|(566) 549-1332 Madison James|Madison@James.com|643-34-3478|(129) 375-7753 Monroe James|Monroe@James.com|123-85-4783|(035) 049-9327 Jackson Andrew|Jackson@Andrew.com|335-99-6655|(569) 213-9949 Adams Quincy|Adams@Quincys.com|234-87-1122|(201) 092-4157 Van Martin|Van@Martin.com|763-76-2348|(519) 329-1939 Harrison Henry|Harrison@Henry.com|574-57-8934|(384) 951-2973 Taylor Zachary|Taylor@Zachary.com|676-99-7474|(370) 080-4086 Fillmore Millard|Fillmore@Millard.com|554-44-4444|(867) 711-8324
In this example, the phone number values will undergo 256-bit format-preserving encryption. The screenshot above illustrates the selection of the native FieldShield routine and its related options, including Source name, Passphrase, and Exclude. In this case, the field for Source name is the “Phone” number although the option to apply protection to multiple fields at once is also supported. For this encryption type, the passphrase can be a string (in this case 12345ABC), a file that you browse for in a permission-restricted directory that holds the passphrase, or an environment variable. You may wish to use more advanced key management techniques by selecting one of the SSL or GPG encryption options. For the Exclude option, which is optional, you can enter one or more characters that may be present in the source data to be excluded from encryption.
On completion of the dialog, the relevant entry that will appear automatically in the /OUTFILE section of the job script is:
/FIELD=(ENC_FP_PHONE=enc_fp_aes256_alphanum(PHONE, “12345ABC”), ASCII, POSITION=4, SEPARATOR=”|”)
Note the field name is automatically prefixed with ENC_FP (to denote the use of encryption with format preservation), and that the field name referred to as an argument within the routine parameter is PHONE, which is the source field undergoing the operation. The pass-phrase “12345ABC” is used here. The encrypted output file looks like this:
Adams John|adams@gmail.com|770-75-3849|(899) 506-1645 Jefferson Thomas|Jefferson@Thomas.com|345-78-3485|(282) 339-4045 Madison James|Madison@James.com|643-34-3478|(507) 796-9227 Monroe James|Monroe@James.com|123-85-4783|(215) 732-2737 Jackson Andrew|Jackson@Andrew.com|335-99-6655|(526) 797-5510 Adams Quincy|Adams@Quincys.com|234-87-1122|(292) 737-7668 Van Martin|Van@Martin.com|763-76-2348|(958) 331-9706 Harrison Henry|Harrison@Henry.com|574-57-8934|(166) 688-4609 Taylor Zachary|Taylor@Zachary.com|676-99-7474|(751) 153-4405 Fillmore Millard|Fillmore@Millard.com|554-44-4444|(635) 581-7398
As you can see, the phone number field (final column) still retains the format of a phone number — (XXX) XXX-XXXX — but its values have been encrypted.
Decryption:
To decrypt and restore the phone number values in a subsequent job, you must select the counterpart to AES 256-bit encryption used above, i.e., the routine named dec_fp_aes256_alphanum. And, you *must* use / refer to the same passphrase that was used for the encryption phase. For decryption in this case, the output field generated upon completion of the wizard would therefore appear as:
/FIELD=(DEC_ENC_FP_PHONE=enc_fp_aes256_alphanum(ENC_FP_PHONE, "12345ABC"), ASCII, POSITION=4, SEPARATOR="|")
Note the field name is automatically prefixed with DEC to denote the use of decryption with format preservation, and that the field name referred to as an argument within the routine parameters is ENC_FP_PHONE, which is the field representing the encrypted output from the previous encryption operation. The same passphrase, “12345ABC”, is used here.
Format-Preserving Encryption and Decryption (on multiple tables):
The same options for encryption that are available for protecting one or more columns in a single source of data are also supported when protecting multiple columns across multiple tables. The FieldShield multi-table protection wizard allows you to select your database schema and tables to be protected, and a pattern-matching column name selection as shown below.
You can save the selection and protection specifications to a rules library, or load previously saved specifications from a rule library. The final step is to select and define the encryption routine.
The following screenshot shows 256-bit alphanumeric format-preserving encryption being selected, and the options provided that will apply to all identified columns within the selected tables:
In this case, the source field is ${FieldName} which is a variable rather than a specific column name, because the protection will apply to all columns that match the pattern-matching string. The encryption key can be a string like 12345ABC in this case, a file that you browse for in a permission-restricted directory that holds the passphrase, or an environment variable.
You can use more advanced key management techniques by selecting one of the SSL or GPG encryption options. For the Exclude option, you can enter one or more characters that may be present in the source data to be excluded from encryption.
On completion of this dialog, multiple job scripts will be created, each reflecting the desired encryption routine being applied, along with a batch script and data flow outline. A single operation extracts data from the desired tables, shields the columns that satisfy the pattern-matching string, and loads the protected data to the same, or to another, database schema.
For more information on FieldShield, visit www.iri.com/products/fieldshield. There is also a demonstration if you would like to see how to use format preserving encryption on select fields when creating a FieldShield script using IRI Workbench on our YouTube channel found here: