Using Ripcurrent for DB Structure Change Reporting
Abstract: This is the final in a series of 4 articles pertaining to IRI Ripcurrent, an IRI-developed command-line Java application that utilizes the Debezium embedded engine and the streaming feature of the IRI (CoSort) SortCL program to react in real-time to database (DB) change events. Typically this involves replicating data to downstream target(s), optionally with transformation (e.g., PII masking) rules consistently applied based on the classification of the data.
While Ripcurrent is mainly designed to take action on DB change events that involve the modification of data in tables, it also monitors change events involving structural changes in a DB schema (such as adding a new column to a table) and records them to a log file. A separate program can then monitor this log for new events, and then issue an alert or otherwise take a specified action.
Sample Log Monitoring Program
This publicly available Python program is an example of monitoring a log file and triggering alerts; note that Python must be installed to use it. After Python is installed, dependencies for the program can be installed with the command pip install -r requirements.txt.
Targeted for Windows, the program triggers toast notifications when a new structure change event is recorded by Ripcurrent to the log. An alternative notification possibility (which would be cross-platform) could be to send alerts via email.
Demonstration
The following example shows how a database structure change alert set up in Ripcurrent would be used to help reconfigure an existing FieldShield database masking operation so it can be modified to accommodate a new column.
Here is the original FieldShield job script defining the source, and masked target, table before a new column was added:
I am going to run Ripcurrent, configured to monitor this (local) MySQL database. I will also have the aforementioned Python program running in the background to monitor new structure change events and record them in the Ripcurrent log file (located at the path I specified as an argument to the program).
When I ran that Python program, I received a notification that a new column (EMAIL) has been added to a table named ‘sensitive’, which contains personally identifiable information:
My existing FieldShield job (above) to mask the data in this table did not include this column (it did not exist at the time). The job is not automatically updated when a new column is inserted.
Therefore, In order to keep existing FieldShield masking job scripts in sync with structural changes in a database, I have to re-run the Schema Data Class Search wizard (available by right-clicking on the schema in the Data Source Explorer of IRI Workbench):
A page in the wizard will open like the one above. Options available within the wizard include the selection of schemas to include in the search and selection of data classes to use in the search:
After I finish the schema data class search, the data class library is updated with the new column:
After reviewing the data classifications, from the FieldShield menu, the new Data Class Database Masking Job wizard can be run again to update FieldShield job scripts needed to mask tables with a new structure.
Completion of this wizard results in a new, updated job being produced that includes the newly added email column, along with a masking rule applied to the email column based on the default rule paired with the email data class.
While this example shows just one structural change to one table being made for clarity of demonstration, re-running the aforementioned process will synchronize the FieldShield masking job with any number of changes in any number of tables that may have been made since it was originally created.
To give a basis for comparison, the image below displays the original data in the table.
After executing the FieldShield script, the SSN column is partially redacted, the first name is pseudonymized, the credit card is redacted except for the last four digits, and email addresses are encrypted with format-preserving AES-256 encryption.
All masking rules are consistently applied across tables based on column data classification.
Summary
In addition to reacting in real-time to changes of data in a database, Ripcurrent can record changes to the structure of a database in a log. This log can be monitored to trigger alerts for any manual actions that may need to be taken due to the modified structure of the database.