ETL Task Testing with the IRI Voracity Preview Feature
During the design of IRI Voracity workflows in the IRI Workbench (Eclipse) GUI, you can preview the results of one or more transforms before saving or running the project. This is typically done in Voracity data warehouse ETL operations.
The preview can contain a subset of actual input data or, if that data is unavailable or confidential, synthetic test data (via embedded IRI RowGen field generation functions).
To test your target formatting, use the Preview with Test Data option available in the Workflow designer diagram. In this case, the transform mapping is defined in this SortCL-based job script familiar to CoSort users:
/INFILE=//cloudy/home/proto_etl/fixedin
    /PROCESS=RECORD
    /ALIAS=chiefs
    /FIELD=(STATE, TYPE=ASCII, POSITION=1, SEPARATOR='|')
    /FIELD=(PARTY, TYPE=ASCII, POSITION=2, SEPARATOR='|')
    /FIELD=(TERM, TYPE=ASCII, POSITION=3, SEPARATOR='|')
    /FIELD=(PRESIDENT, TYPE=ASCII, POSITION=4, SEPARATOR='|')
/SORT
    /KEY=(STATE, ASCENDING)
/OUTFILE=C:\IRI\CoSort95\voracity\workbench\workspace\proto_etl\sort1map2flat.out
    /PROCESS=RECORD
    /FIELD=(PRESIDENT, TYPE=ASCII, POSITION=1, SIZE=23, PRECISION=0)
    /FIELD=(TERM, TYPE=ASCII, POSITION=28, SIZE=9, PRECISION=0)
    /FIELD=(PARTY, TYPE=ASCII, POSITION=40, SIZE=3, PRECISION=0)
    /FIELD=(STATE, TYPE=ASCII, POSITION=45, SIZE=2, PRECISION=0)
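To make the mapping in the job script above easier to follow, here is a minimal Python sketch of the same idea: read pipe-delimited records, sort them by STATE, and write each field at its fixed target position. It is only an illustration of the layout, not how SortCL works internally. The function names and the two sample rows are hypothetical, and padding or truncating each value to its SIZE is an assumption of this sketch.

```python
# Output layout copied from the /OUTFILE section: (name, 1-based position, size).
OUT_LAYOUT = [
    ("PRESIDENT", 1, 23),
    ("TERM", 28, 9),
    ("PARTY", 40, 3),
    ("STATE", 45, 2),
]

# Input field order copied from the /INFILE section (pipe-delimited).
IN_FIELDS = ["STATE", "PARTY", "TERM", "PRESIDENT"]


def parse_record(line):
    """Split one pipe-delimited input line into a dict keyed by field name."""
    values = line.rstrip("\n").split("|")
    return dict(zip(IN_FIELDS, values))


def format_record(rec):
    """Place each field at its fixed position, padded or truncated to its size."""
    width = max(pos + size - 1 for _, pos, size in OUT_LAYOUT)
    buf = [" "] * width
    for name, pos, size in OUT_LAYOUT:
        text = rec.get(name, "")[:size].ljust(size)
        buf[pos - 1:pos - 1 + size] = text
    return "".join(buf)


def transform(lines):
    """Sort records by STATE (ascending), then remap them to the target layout."""
    records = sorted((parse_record(line) for line in lines),
                     key=lambda rec: rec["STATE"])
    return [format_record(rec) for rec in records]


# Hypothetical sample rows in the assumed input layout.
sample = [
    "MA|FED|1797-1801|Adams, John",
    "KY|REP|1861-1865|Lincoln, Abraham",
]
for row in transform(sample):
    print(repr(row))
```

Each output record is 46 characters wide because the last field (STATE) starts at position 45 and has a size of 2; the columns between fields are left blank.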
After creating a workflow via a wizard or the design palette, right-click the Transform Mapping Block (pink-beige by default) and select IRI Diagram Actions >> Preview with Test Data.
As soon as the option is selected, the Preview tab opens and displays the generated test rows.
In this case, alphanumeric data was generated with the same formatting called for in the transform block (/OUTFILE section of the .scl task). You can control the number of rows to preview in the Workbench preferences (IRI, Flow) menu.
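As a rough illustration of what such a test-data preview contains, the following Python sketch fills the target layout with random alphanumeric characters for a configurable number of rows. It does not use or reproduce RowGen; the function names, the `row_count` parameter, and the seeding are assumptions made for this example.

```python
import random
import string

# Target layout from the /OUTFILE section: (name, 1-based position, size).
OUT_LAYOUT = [
    ("PRESIDENT", 1, 23),
    ("TERM", 28, 9),
    ("PARTY", 40, 3),
    ("STATE", 45, 2),
]


def random_alnum(size, rng):
    """Return a random string of letters and digits of the given size."""
    alphabet = string.ascii_letters + string.digits
    return "".join(rng.choice(alphabet) for _ in range(size))


def preview_rows(row_count, seed=None):
    """Build row_count fixed-width rows of random alphanumeric test data."""
    rng = random.Random(seed)
    width = max(pos + size - 1 for _, pos, size in OUT_LAYOUT)
    rows = []
    for _ in range(row_count):
        buf = [" "] * width
        for _, pos, size in OUT_LAYOUT:
            # Fill the field's columns; gaps between fields stay blank.
            buf[pos - 1:pos - 1 + size] = random_alnum(size, rng)
        rows.append("".join(buf))
    return rows


# Print a small preview; the values are meaningless, only the layout matters.
for row in preview_rows(3, seed=42):
    print(repr(row))
```

The values carry no meaning, but the column positions and widths match the target definition, which is what a layout check needs.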
If you have production data available and want to display the task result before running the entire job, select IRI Diagram Actions >> Preview instead.
Actual values then appear in the Preview tab.
Clearly, real data looks better than synthetic test data, but even with the latter, you at least get a sense of the layout.
For more realistic test data in ETL task prototypes, use the New Test Data Job wizard (and possibly the Generate New Set File wizard) from the RowGen toolbar menu to generate column values from randomly selected real data in set files.
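The idea of drawing column values at random from a set file can be sketched in Python as follows. This assumes a set file is a plain text file with one value per line; the file name `states.set`, its contents, and the function names are hypothetical, and the sketch does not reproduce the RowGen wizards.

```python
import random
import tempfile
from pathlib import Path


def load_set_file(path):
    """Read a set file, assumed here to hold one value per line."""
    text = Path(path).read_text(encoding="utf-8")
    return [line for line in text.splitlines() if line]


def pick_values(values, count, seed=None):
    """Select count values at random (with replacement) from the set."""
    rng = random.Random(seed)
    return [rng.choice(values) for _ in range(count)]


# Demonstration with a temporary, hypothetical set file of state codes.
with tempfile.TemporaryDirectory() as tmp_dir:
    set_path = Path(tmp_dir) / "states.set"
    set_path.write_text("KY\nMA\nVA\n", encoding="utf-8")
    states = load_set_file(set_path)

print(pick_values(states, 5, seed=0))
```

Because every generated value comes from the set, the test rows look like real data even though no production record is exposed.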