Data Generation Rules in IRI Workbench
IRI Workbench contains a section of field-level Data Generation Rules for use in IRI RowGen test data synthesis, FieldShield data masking, and other SortCL-compatible tasks or Voracity (ETL) workflows. This article introduces these rules, which can be used ad hoc or applied globally in multi-source jobs, just as data masking or data quality functions can.
Credit Card Number Generator
This rule generates credit card numbers. There are two algorithms to choose from. The first generates all types of credit card numbers without any separator. The second takes two optional parameters: the card type (for example, VISA) and the separator.
Sample output of the first algorithm:
4062775790116515 5289210718910777 6011024793679576 348860334376413 378578214824319
Sample output of the second algorithm with VISA and a space as parameters:
4930 0692 0548 3698
4767 0255 9172 5769
4037 7345 7480 2370
4701 9222 2608 5057
4709 6816 0502 6791
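The article does not say how Workbench computes these numbers. As a rough illustration only, the sketch below builds a 16-digit number with a VISA-style leading 4, appends a Luhn check digit (the checksum that payment card numbers conventionally satisfy), and joins four-digit groups with an optional separator. The function names and parameters are hypothetical and are not IRI's API.

```python
import random

def luhn_check_digit(partial: str) -> int:
    """Return the digit that makes partial + digit pass the Luhn check."""
    total = 0
    # Walk right to left; the digit next to the check digit is doubled first.
    for i, ch in enumerate(reversed(partial)):
        d = int(ch)
        if i % 2 == 0:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return (10 - total % 10) % 10

def luhn_valid(number: str) -> bool:
    """Check a full card number, ignoring any separators."""
    digits = [int(c) for c in number if c.isdigit()]
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:  # every second digit, starting left of the check digit
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

def generate_card(prefix="4", length=16, separator="", rng=random):
    """Hypothetical generator: prefix + random digits + Luhn check digit."""
    filler = length - len(prefix) - 1
    body = prefix + "".join(str(rng.randint(0, 9)) for _ in range(filler))
    number = body + str(luhn_check_digit(body))
    groups = [number[i:i + 4] for i in range(0, len(number), 4)]
    return separator.join(groups)
```

Calling generate_card(separator=" ") yields a value shaped like the VISA samples above; an empty separator yields the unseparated form.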
Date Range Generator
The date range generator can create dates and times in the specified format within the specified range.
1953-09-28 01:13:52
2070-03-04 13:23:44
1909-12-21 00:02:47
1906-09-03 03:39:10
1965-06-06 20:35:58
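To make the idea concrete, here is a minimal Python sketch of drawing a timestamp uniformly between two bounds and rendering it in a chosen format. The function name, parameters, and the uniform draw are assumptions for illustration; they do not describe how Workbench implements the rule.

```python
import random
from datetime import datetime, timedelta

def random_datetime(start, end, fmt="%Y-%m-%d %H:%M:%S", rng=random):
    """Return a formatted timestamp chosen uniformly from [start, end]."""
    span_seconds = int((end - start).total_seconds())
    offset = rng.randint(0, span_seconds)  # both bounds are inclusive
    return (start + timedelta(seconds=offset)).strftime(fmt)
```

For example, random_datetime(datetime(1900, 1, 1), datetime(2099, 12, 31)) returns a string in the same layout as the samples above.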
Distribution
Test data values can be generated in desired frequencies to represent natural occurrences of the values in production, per this article. There are three types of distribution wizards:
Linear Distribution
This rule creates a simple linear distribution.
98 42 85 28 26
Normal Distribution
This bell curve wizard has two options: one for a range, and another for a mean and standard deviation.
9 58 38 64 50
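The two options can be pictured with the following Python sketch. It is an illustration under stated assumptions, not the wizard's actual logic: the range variant centers the curve on the midpoint of the range, treats the range as three standard deviations on each side, and redraws any value that falls outside. Both function names are hypothetical.

```python
import random

def normal_from_mean(mean, stddev, rng=random):
    """Draw one integer from a normal curve with the given mean and stddev."""
    return round(rng.gauss(mean, stddev))

def normal_in_range(low, high, rng=random):
    """Draw one integer from a bell curve confined to [low, high]."""
    # Assumption: midpoint as the mean, range width as six standard deviations.
    mean = (low + high) / 2
    stddev = (high - low) / 6
    while True:
        value = round(rng.gauss(mean, stddev))
        if low <= value <= high:  # redraw the rare out-of-range values
            return value
```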
Weighted Distribution of Items
This dialog controls how often certain literal values occur in relation to others. For example, it can specify that, regardless of the number of test rows generated, Alabama, Georgia, and Florida will account for 10%, 40%, and 50% of the occurrences, respectively.
Florida Alabama Florida Georgia Florida
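As a rough sketch of the idea, Python's random.choices can draw literals according to relative weights. Note the assumption: this is probabilistic sampling, so the observed shares only approach 10:40:50 as the row count grows, whereas the dialog may enforce the ratio differently. The function name is hypothetical.

```python
import random

def weighted_values(items, weights, n, rng=random):
    """Return n items drawn with probability proportional to their weights."""
    return rng.choices(items, weights=weights, k=n)

states = ["Alabama", "Georgia", "Florida"]
sample = weighted_values(states, [10, 40, 50], 5)
```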
Email Generator
The email address generator can produce addresses with random data, or can be configured to use a defined mail server, domain, and/or country domain.
wkHqveyXG@gmail.com nB108hRiOzkmy@gmail.com BEdAZtAQKwCZ@gmail.com pEdBR2Mfp832Q@gmail.com IEjo2b9Dbzgh@gmail.com
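A minimal Python sketch of the same idea follows: a random alphanumeric local part joined to a fixed domain. The function name, the length bounds, and the example.com default are assumptions chosen for illustration only.

```python
import random
import string

def random_email(domain="example.com", min_len=8, max_len=13, rng=random):
    """Return a random alphanumeric local part at the given domain."""
    alphabet = string.ascii_letters + string.digits
    length = rng.randint(min_len, max_len)
    local = "".join(rng.choice(alphabet) for _ in range(length))
    return f"{local}@{domain}"
```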
National ID Functions
There are multiple algorithms for creating country-specific identification numbers (e.g., tax IDs). The parameters of each are defined in the detail pane. For example, US Social Security Numbers can be generated for a specified state:
436-04-6935 436-02-5751 280-08-5682 614-28-9700 737-12-5122
Percent of Nulls Value
This rule assigns a percentage of values as null during data generation.
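One simple way to picture this rule is to null each value independently with the given probability, as in the Python sketch below. This is an assumption for illustration: the share of nulls is then approximate for small row counts, and Workbench may apply the percentage differently. The function name is hypothetical.

```python
import random

def with_nulls(values, null_percent, rng=random):
    """Replace roughly null_percent percent of the values with None."""
    return [None if rng.random() * 100 < null_percent else v for v in values]
```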
Random Value Generation
This rule can create specified types of data within a size range or draw data randomly from a set file.
'^-9)R_h3e/O=|i.%{rM Esx)^B]Ll4}]U;Mwx1x ce80hovA#f"fPN'sP+#X o+!v~OLe04e[F 2l' k@`,1X{\
Row ID Value
This rule can create a row ID with a specified initial value, increment step, and limit.
1 11 21 31 41
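The three parameters map naturally onto a small generator. The Python sketch below assumes the limit is an inclusive upper bound at which generation stops; the article does not say what Workbench does on reaching the limit, so treat that behavior and the function name as hypothetical.

```python
def row_ids(start=1, step=10, limit=None):
    """Yield start, start + step, ... up to and including limit (if given)."""
    value = start
    while limit is None or value <= limit:
        yield value
        value += step

ids = list(row_ids(start=1, step=10, limit=41))
```

With a start of 1, a step of 10, and a limit of 41, the sketch reproduces the sample sequence above.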
Set File Selection
This rule points to a set file from which to draw data. The Selection type defines how values are pulled from the file.
ATL BWI BOS BUR ORD
String Generation Functions
This wizard exposes multiple algorithms for creating strings. The parameters of each are defined in the detail pane.
ef84c47a-6bb2-4e21-93d1-c44f4feb3b24 3fe288e0-ecf8-47d4-8cf3-96a3fcb011ef c145f46a-2e06-4039-a53a-c011d39ba111 1bdce8ba-9d5a-4d26-9a17-eecabd0ee9d1 dad1e325-6407-4759-977d-9b24f93e5dcf
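The sample values above have the layout of version 4 (random) UUIDs. Assuming that is the algorithm selected in this example, equivalent strings can be produced in Python with the standard uuid module; this shows the output format only and is not the Workbench implementation.

```python
import uuid

def random_uuid_string():
    """Return a random (version 4) UUID in canonical 8-4-4-4-12 form."""
    return str(uuid.uuid4())
```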
Table Lookup
This rule looks up values in a table. You define the table details, including which column to search and which column to return. The field is the script field that contains the data to look for. A default value is also set.
FL TX GA FL FL
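The moving parts of a lookup (search column, return column, input field, default) can be sketched in Python as follows. The table contents, column names, and function name are invented for illustration, and returning the default when no row matches is an assumption about how the default value is used.

```python
def table_lookup(table, search_column, return_column, field_value, default):
    """Return return_column from the first row whose search_column matches."""
    for row in table:
        if row[search_column] == field_value:
            return row[return_column]
    return default  # assumption: the default covers unmatched values

# Hypothetical lookup table mapping city names to state codes.
cities = [
    {"city": "Miami", "state": "FL"},
    {"city": "Dallas", "state": "TX"},
    {"city": "Atlanta", "state": "GA"},
]
codes = [table_lookup(cities, "city", "state", c, "??")
         for c in ["Miami", "Dallas", "Atlanta", "Oslo"]]
```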
For more information on each rule, all the dialogs include help and hover tips to guide the user. If you need help implementing data mapping, data quality, data masking, or test data generation rules in IRI Workbench, contact your IRI representative or email voracity@iri.com.