Creating a Voracity Flow Using Existing IRI Scripts (Part…
This is the third in a series of articles on creating an IRI Voracity ETL flow for a month-end job that processes sales transactions.
In the first article, we brought an existing CoSort SortCL job script that processes month-end sales transactions into Voracity and made modifications. In the second (previous) article, we demonstrated how to:
- add another SortCL job script called MonthEndSales.scl to the existing Voracity flow diagram by adding a second Transform Mapping Block to the right of the one for the script SortTransSelect.scl
- connect the two Transform Mapping Blocks so that the output of one block is the input of the next
- add a second output for the second transform mapping diagram
- define an aggregation for that output and connect the aggregation field to the field from which the aggregation was derived
- change the Output Write Type for an output table so the table will not be truncated
In this article, we will expand the flow by using:
- Command Line blocks where we can do things like define environment variables, write to log files, and call programs
- Decision blocks where the path that is followed by the flow is decided by whether a condition is true or false
Currently, the flow contains a Start block and two Transform Mapping blocks that represent two SortCL job scripts. A batch file that sequentially runs the two scripts has been created from the flow. At the same time, the two job scripts were updated.
Below is the flow diagram from the end of the previous article in this series:
Here is the batch script for the month end job as it now exists.
    @echo off
    sortcl/SPECIFICATION=SortTransSelect.scl
    sortcl/SPECIFICATION=MonthEndSales.scl
We now want to:
- create a log file that tracks progress of the job
- define environment variables for use in a SortCL job script and in a batch script
- give the exit status for the SortCL scripts
- record the statistics for each SortCL script in the log file
- decide whether to execute the second script based on the exit status of the first script
- record the start and end time for the job
This is done with a combination of Command Line blocks, Decision blocks, and small batch files.
Remove the Start Block Connector
We need to remove connectors wherever we are adding blocks. We are adding two Command Line blocks between the Start block and the first Transform Mapping Block. Right-click on the connector arrow (it will turn blue) and select Edit > Delete from Model.
Create the First Command Line Block
Command Line blocks define the execution of an operating system command. To specify more than one command at a time, you could:
- Create multiple sequential blocks
- Concatenate the commands with an ampersand (&)
- Call a batch file containing the commands
From the Palette, drag the Command Line utility into the space between the Start block and the SortTransSelect.scl Transform Mapping Block. In this block, we want to set the two date values that are used in the /QUERY statement of the script SortTransSelect.scl using environment variables. This is because the job is run each month and the values will change based on the month being processed.
Double-click on the Command Line block, type SetDates for the Name, type set STARTDATE=161201 & set ENDDATE=170101 in Command, and click Finish. This sets the two environment variables STARTDATE and ENDDATE. Below is the Command Line dialog.
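One detail of the ampersand form is worth checking in your own environment: with an unquoted set, cmd.exe typically keeps the space that precedes the & as part of the value. The sketch below is not part of the generated flow. It only contrasts the unquoted form typed into the SetDates block with a quoted variant, which is an assumption about what you may prefer rather than a Voracity requirement. The square brackets make any trailing space visible.

```bat
@echo off
rem Unquoted form, as typed in the SetDates block.
rem The space before the ampersand is typically stored in STARTDATE.
set STARTDATE=161201 & set ENDDATE=170101
echo [%STARTDATE%] [%ENDDATE%]

rem Quoted form: the quotes delimit the assignment,
rem so nothing after the closing quote becomes part of the value.
set "STARTDATE=161201" & set "ENDDATE=170101"
echo [%STARTDATE%] [%ENDDATE%]
```

If the first echo shows a space before the closing bracket and the second does not, the unquoted form is carrying a trailing space. Whether that matters depends on how the value is consumed downstream.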
Create the Second Command Line Block
From the Palette, drag the Command Line utility component into the space between the SetDates block and the SortTransSelect.scl Transform Mapping Block. In this block, we call a batch file that:
- Logs the system start time for the MonthEnd Process to a file called time.log
- Creates the file MonthEnd.log by declaring the start of the MonthEnd job
- Saves the date interval for the transactions being processed using environment variables set in the preceding Command Line block to the file called MonthEnd.log.
Create and save the file begin.bat with an IRI Workbench text editor. Enter these contents:
    echo Start time for MonthEnd process: %TIME% > time.log
    echo Start MonthEnd Process >> MonthEnd.log
    echo From %STARTDATE% up to %ENDDATE% >> MonthEnd.log
The environment variable %TIME% gives the current system time.
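The two redirection operators in begin.bat behave differently: > creates or overwrites the target file, while >> appends to it, creating the file if it does not exist. The minimal sketch below shows the difference using a scratch file (demo.log is a hypothetical name, not part of the flow).

```bat
@echo off
rem demo.log is a scratch file used only for this illustration.
echo first line > demo.log
echo second line >> demo.log

rem demo.log now holds two lines.
type demo.log

rem A single ">" starts the file over, discarding the earlier lines.
echo fresh start > demo.log
type demo.log

rem Clean up the scratch file.
del demo.log
```

Because begin.bat only appends to MonthEnd.log, entries from earlier runs stay in the file unless it is deleted first or the first write uses > instead.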
Double-click on the Command Line block, type BeginMonthEnd for the Name, type call begin.bat for Command, and click Finish. Now connect Start to SetDates, SetDates to BeginMonthEnd, and BeginMonthEnd to the infile for SortTransSelect.scl.
Use Date Environment Variables
Double-click on the SortTransSelect.scl Transform Mapping Block and name it SortTrans Transform Mapping Diagram. In the Section Options of the input, double-click on the Query.
In the Query dialog, change 161201 to $STARTDATE and 170101 to $ENDDATE, then click Finish. Save the diagram.
Notice that we are not using the Windows syntax for environment variables, where the variable name is enclosed in percent signs (%). The syntax here is the leading dollar sign ($) used on Linux and Unix operating systems, because this is a requirement of the SortCL scripting language.
Create the Third Command Line Block
Remove the connector from the first Transform Mapping Block to the second.
Create a file named status1.bat using a text editor. This file is called in a new Command Line block and has lines that:
- use the command echo. to create blank lines in the log file
- use other echo commands to create lines of demarcation
- capture the exit status or ERRORLEVEL from the SortCL script SortTransSelect.scl and save it to an environment variable
- write the exit status that was captured in the previous environment variable to the log file
- send the statistics file associated with the SortCL script to the log file
The lines in the file status1.bat are:
    set SORTTRANSSTATUS=%ERRORLEVEL%
    echo. >> MonthEnd.log
    echo. >> MonthEnd.log
    echo ================================================================ >> MonthEnd.log
    echo %SORTTRANSSTATUS% is the exit Status for SortTransSelect.scl >> MonthEnd.log
    type monthend.stat >> MonthEnd.log
    echo ================================================================ >> MonthEnd.log
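The set line comes first in status1.bat for a reason: %ERRORLEVEL% reflects the most recent command that set an exit code, so it should be saved before anything else can change it. The sketch below imitates that pattern without SortCL. Here cmd /c exit 3 is only a stand-in for a program that ends with a nonzero status, and the value 3 and the variable name STATUS are arbitrary choices for illustration.

```bat
@echo off
rem Stand-in for a job that fails: a child cmd.exe exits with status 3.
cmd /c exit 3

rem Save the status immediately, as status1.bat does on its first line.
set "STATUS=%ERRORLEVEL%"

rem A later command that succeeds resets ERRORLEVEL to 0...
cmd /c exit 0

rem ...but the saved copy still holds the earlier value.
echo Saved status: %STATUS%
echo Current ERRORLEVEL: %ERRORLEVEL%
```

The saved copy is what a later test can rely on, which is why the flow compares against the saved variable rather than against %ERRORLEVEL% itself.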
Drag the Command Line utility from the Palette into the space between the two Transform Mapping blocks. Double-click on the block, type Status1 for the Name, enter call status1.bat for the Command, and click Finish.
Create a Decision Block
A Decision block directs the flow based on whether a condition is true or false. In our flow, the decision will be based on the exit status of the SortCL script SortTransSelect.scl. If that job executes without errors (status of 0), the flow continues to the script MonthEndSales.scl. If the status is not 0, then the flow goes to the end of the job without executing anything further.
All blocks that the Decision block references must be created first, so we need to create the End_of_Process block before the Decision block. Drag a Command Line utility from the Palette into the space between and below the Status1 and MonthEndSales.scl blocks. Create a batch file called end.bat. This file writes the final entries to the log file: the start time and end time for the MonthEnd job, and a line indicating that the job has ended.
Here are the lines in end.bat:
    echo. >> MonthEnd.log
    echo. >> MonthEnd.log
    echo ================================================================== >> MonthEnd.log
    type time.log >> MonthEnd.log
    echo End time for MonthEnd process: %TIME% >> MonthEnd.log
    echo ================================================================== >> MonthEnd.log
    echo End of MonthEnd process >> MonthEnd.log
Double-click on the Command Line block, name it End_of_Process, type call end.bat for Command, and click Finish.
Drag the Decision utility into the space between Status1 and MonthEndSales.scl. Fill out the Decision dialog: type %SORTTRANSSTATUS% equ 0 for Criteria. Select the block MonthEndSales.scl from the drop-down for True, select the block End_of_Process from the drop-down for False, and click Finish.
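In batch terms, the criteria entered here is an IF test, and equ compares the two sides as numbers when both are numeric. The sketch below is a stand-alone illustration, not the code Voracity generates: the status is set by hand, and the echo lines stand in for the True and False branches.

```bat
@echo off
rem Set by hand here; in the flow this value comes from status1.bat.
set "SORTTRANSSTATUS=0"

rem The closing parenthesis and ELSE must share a line for cmd.exe to pair them.
if %SORTTRANSSTATUS% equ 0 (
    echo True branch: run MonthEndSales.scl next
) else (
    echo False branch: go straight to End_of_Process
)
```

If SORTTRANSSTATUS were ever empty, the comparison would have nothing on its left side and would not parse. That should not happen in this flow as long as status1.bat runs before the Decision block.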
Arrows now connect the Decision block to the blocks selected for True and False. Next, connect an output file from SortTransSelect.scl to Status1, and Status1 to the Decision block.
Create Another Command Line Status Block
Create another .bat file similar to status1.bat and call it status2.bat. The lines in it are:
    set MONTHENDSALESSTATUS=%ERRORLEVEL%
    echo. >> MonthEnd.log
    echo. >> MonthEnd.log
    echo ================================================================ >> MonthEnd.log
    echo %MONTHENDSALESSTATUS% is the exit Status for MonthEndSales.scl >> MonthEnd.log
    echo ================================================================ >> MonthEnd.log
    echo Below are the SortCL statistics for MonthEndSales.scl >> MonthEnd.log
    type monthend.stat >> MonthEnd.log
Drag a Command Line utility from the Palette into the space below MonthEndSales.scl and to the right of End_of_Process. Double-click on the block, name it Status2, type call status2.bat for Command, and click Finish.
Add connectors from an output file of MonthEndSales.scl to Status2, and from Status2 to End_of_Process.
We do not need a second decision block because the flow will go to the End_of_Process block whether or not the script MonthEndSales.scl executes without errors.
Create the Flow Batch File
Here is the completed flow diagram:
Now create the batch file MonthEnd.bat by right-clicking the flowlet and selecting IRI Diagram Actions > Export Flow Component. On the Batch Component Options dialog, verify the project, confirm the File name is MonthEnd.bat, and click Finish. Answer Yes to any save prompts.
There is now a new MonthEnd.bat file in the project:
    @echo off
    set STARTDATE=161201 & set ENDDATE=170101
    call begin.bat
    sortcl/SPECIFICATION=SortTransSelect.scl
    call status1.bat
    IF %SORTTRANSSTATUS% equ 0 (
        sortcl /SPECIFICATION=MonthEndSales.scl
        call status2.bat
        call end.bat
    ) ELSE (
        call end.bat
    )
To execute, right-click on MonthEnd.bat in the Project Explorer and select Run As > Batch Program. The batch script runs each command and script in sequence and writes to the file MonthEnd.log. Below are partial contents of the log file:
    Start MonthEnd Process
    From 161201 up to 170101
    ==================================================================
    0 is the exit Status for SortTransSelect.scl
    ==================================================================
    Below are the SortCL statistics for SortTransSelect.scl
    ___________________________________________________________________
    CoSort Version 9.5.3 R95160317-1135 32B SortCL STATISTICS
    . . . . . . . . . . . .
    Records processed: 14 read 14 kept 12 sorted 24 output
    Began: 16:32:18 Ended: 16:32:18 Total: 00:00:00.22
    ___________________________________________________________________
    ==================================================================
    0 is the exit Status for MonthEndSales.scl.
    Below are the SortCL statistics for MonthEndSales.scl:
    ___________________________________________________________________
    CoSort Version 9.5.3 R95160317-1135 32B SortCL STATISTICS
    . . . . . . . . . . . .
    Records processed: 1 sorted 4 output
    Began: 16:32:18 Ended: 16:32:19 Total: 00:00:00.26
    ==================================================================
    Start time for MonthEnd process: 16:32:18.61
    End time for MonthEnd process: 16:32:19.16
    ==================================================================
    End of MonthEnd process
This shows that the MonthEnd job begins by logging the date interval specified by the environment variables we defined. Each script ran without errors, and the statistics for each are in the log file. The start and end times for the whole job appear at the end of the log file.
Remember to contact voracity@iri.com if you have any questions or need help with your flow.