ASN.1 Integration with SortCL (IRI Voracity)
This article takes a closer look at ASN.1 and at how SortCL, the program underlying the products in the IRI Voracity platform, processes data messages encoded according to ASN.1 schemas. It is the second article in a five-part series on ASN.1 and its current support in IRI software.
The ASN.1 Schema File
With each ASN.1 encoded data message there is always an associated ASN.1 schema. An ASN.1 schema offers a complete definition of every field in the message.
Each type described (such as Product, ProductColor, and ProductName) in ASN.1 must begin with an uppercase letter. Items that are components of a message (such as name, color) are called identifiers and must begin with a lowercase letter.
Types may also be defined with restrictions called constraints. Constraints limit the permissible values of a type, and usually result in a smaller memory footprint and more compact encodings.
In ASN.1, all definitions are placed inside a module, whose body begins with the BEGIN keyword and ends with the END keyword.
A very simple schema may look like this:
ProductInfo DEFINITIONS XER INSTRUCTIONS ::=
BEGIN
    ProductName ::= IA5String (FROM ("a".."z")) (SIZE(1..10))
    ProductColor ::= ENUMERATED {red(0), green(1), blue(2), black(3)}
    Availability ::= BOOLEAN
    Price ::= INTEGER (10..100)
    Product ::= SEQUENCE {
        name ProductName,
        color ProductColor,
        available [ATTRIBUTE] Availability,
        price Price
    }
    ProductList ::= SEQUENCE (SIZE(1..23)) OF Product
    ENCODING-CONTROL XER
        GLOBAL-DEFAULTS MODIFIED-ENCODINGS
END
A basic interpretation of the schema is that the module is called ProductInfo — it has information about products in a store.
- ProductName type is a string between 1 and 10 characters long, of which the characters must be between a-z, inclusive.
- ProductColor is an ENUMERATED type, meaning it must be one of the four described values – red, green, blue, or black.
- Availability is a BOOLEAN type, meaning it must be either TRUE or FALSE.
- Price is an INTEGER type that must be between (and including) the values of 10 and 100.
- Product is a sequence, or grouped collection of types.
- ProductList is a SEQUENCE OF Product, meaning that the ProductList can contain multiple Products. The SIZE(1..23) constraint applies to the list itself: it must hold at least 1 and at most 23 Product entries.
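The constraints above can be made concrete with a small validation sketch. This is only an illustration in plain Python, not part of any IRI tool or ASN.1 library; the names Product, product_errors, and product_list_errors are hypothetical, and values are assumed to be already decoded into ordinary Python objects.

```python
from dataclasses import dataclass
from typing import List

# ProductColor ::= ENUMERATED {red(0), green(1), blue(2), black(3)}
COLORS = {"red": 0, "green": 1, "blue": 2, "black": 3}


@dataclass
class Product:
    name: str
    color: str
    available: bool
    price: int


def product_errors(p: Product) -> List[str]:
    """Return the constraint violations for one Product (empty if valid)."""
    errors = []
    # ProductName ::= IA5String (FROM ("a".."z")) (SIZE(1..10))
    if not 1 <= len(p.name) <= 10:
        errors.append("name: length must be 1..10")
    if any(not ("a" <= ch <= "z") for ch in p.name):
        errors.append("name: only characters a-z are permitted")
    # ProductColor must be one of the four enumerated values
    if p.color not in COLORS:
        errors.append("color: must be red, green, blue, or black")
    # Availability ::= BOOLEAN
    if not isinstance(p.available, bool):
        errors.append("available: must be TRUE or FALSE")
    # Price ::= INTEGER (10..100); bool is excluded because it subclasses int
    if (isinstance(p.price, bool) or not isinstance(p.price, int)
            or not 10 <= p.price <= 100):
        errors.append("price: must be an integer in 10..100")
    return errors


def product_list_errors(products: List[Product]) -> List[str]:
    """Check ProductList ::= SEQUENCE (SIZE(1..23)) OF Product."""
    errors = []
    if not 1 <= len(products) <= 23:
        errors.append("list: must hold 1..23 products")
    for i, p in enumerate(products):
        errors.extend(f"product[{i}].{e}" for e in product_errors(p))
    return errors
```

For example, product_errors(Product("widget", "blue", True, 25)) returns an empty list, while a product named "Widget99" is rejected because its name contains characters outside a-z.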
This schema is saved in a file I’m calling products.asn. Note that it is not a standard specification, but a custom (user-defined) sample specification for products sold in a store.
IRI Components for Processing ASN.1-Encoded Data
Support in the IRI Voracity platform and its included products — CoSort, NextForm, FieldShield, and RowGen — for files encoded per ASN.1 schema relies on IRI’s core program for processing structured data — called SortCL — and these two components:
- libcsasn1, an external module that gets called only when a SortCL job script specifies /PROCESS=ASN1 as the file type for a given /INFILE (data source) or /OUTFILE (target). This module processes or outputs the data based on the type of encoding rules specified in the SortCL script. For example, PER can be specified to process PER-encoded data instead of the default BER encoding. The data must meet the constraints of the ASN.1 specification.
- asn1_2ddf, a command-line executable (also embedded as a utility in IRI Workbench) that converts ASN.1 file specifications into the data definition file (DDF) format metadata that SortCL jobs require to parse and process those data layouts in a structured fashion. This tool thus eases the creation of the aforementioned types of scripts as a whole.
The syntax in SortCL programs for defining ASN.1 sources and targets looks like this:
/(IN/OUT)FILE="{path_to_asn1_spec.asn[,path_to_spec2.asn]}, TYPE_OF_ENCODING_RULES[,PDU_TYPE_NAME];filename"
Syntax rules for these statements are documented in applicable IRI product manuals.
The asn1_2ddf Converter
The asn1_2ddf command-line utility, also supported in IRI Workbench, reads an ASN.1 specification and an input file to generate a data definition file (DDF) metadata layout compatible with all structured SortCL job scripts. Input and output field formats in DDF syntax are required for all IRI data processing operations run through SortCL.
The jobs written for IRI Voracity, CoSort, FieldShield, NextForm, or RowGen operations rely on their typical data descriptions to parse and map data in structured and symbolic fashion, respectively. A DDF file representing an ASN.1 record could look like this:
[Image: execution of asn1_2ddf from the command line to get the SortCL metadata of a Basic Safety Message used in Intelligent Transportation Systems]
Configuration Options for ASN.1-Encoded Files
IRI support for ASN.1 encoded files is broad and dynamic, allowing for any combination of encoding rules and schema(s). Compare this to other solutions that target ASN.1 encoded messages based on a specific, fixed schema and only support one type of encoding rule.
Many options can be specified in the asn1_2ddf utility (or in its graphical interface in IRI Workbench), to match the wide variety of ASN.1 encoded data that exists.
The type of encoding rules can be specified optionally in asn1_2ddf. This should match the encoding rules of the data. The default encoding is assumed to be BER otherwise.
The type name of the PDU can be specified optionally as well. As mentioned previously, there are many protocols that do not follow the best practice of a single PDU referencing all other types.
In this case, data can only be interpreted relative to a particular PDU rather than in one uniform way. Adding the -t flag with the type name of the PDU to interpret produces a DDF built for that particular PDU. If there are multiple PDUs and no -t flag is specified, the default behavior is to take the first PDU referenced in the schema(s).
A -p flag can also be passed to asn1_2ddf as a command-line argument. This flag outputs a commented process line, which is not part of the DDF itself but can serve as a reminder of the type of process that the eventual SortCL script will need to include.
To ease discovering the type names of the PDUs in those ASN.1 schemas with no single PDU that references all other types, -l can be specified as an option for asn1_2ddf. This will generate a list of all the PDU type names in the ASN.1 schema to the output file specified, rather than a DDF.
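To illustrate what a list of PDU type names amounts to, here is a rough heuristic sketch in Python. It assumes that a PDU candidate is any type that no other type in the module references, which is a common convention but not a documented description of how asn1_2ddf works. The function name unreferenced_types is hypothetical, and the regular expressions ignore ASN.1 comments, imports, and other syntax a real parser would handle.

```python
import re

# Type assignments from the sample products.asn schema shown earlier.
SCHEMA = """
ProductName ::= IA5String (FROM ("a".."z")) (SIZE(1..10))
ProductColor ::= ENUMERATED {red(0), green(1), blue(2), black(3)}
Availability ::= BOOLEAN
Price ::= INTEGER (10..100)
Product ::= SEQUENCE {
    name ProductName,
    color ProductColor,
    available [ATTRIBUTE] Availability,
    price Price
}
ProductList ::= SEQUENCE (SIZE(1..23)) OF Product
"""


def unreferenced_types(schema: str) -> list:
    """Return type names that are defined but never used by another type."""
    # A type assignment starts a line with an uppercase-initial name and "::=".
    matches = list(
        re.finditer(r"^[ \t]*([A-Z][A-Za-z0-9-]*)\s*::=", schema, re.MULTILINE)
    )
    names = [m.group(1) for m in matches]
    referenced = set()
    for i, m in enumerate(matches):
        # The body of a definition runs until the next type assignment.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(schema)
        body = schema[m.end():end]
        for token in re.findall(r"[A-Za-z][A-Za-z0-9-]*", body):
            if token in names and token != m.group(1):
                referenced.add(token)
    return [n for n in names if n not in referenced]


print(unreferenced_types(SCHEMA))
```

For the sample schema, only ProductList is left unreferenced, which matches the best practice of a single PDU that references all other types. A schema that does not follow that practice would yield several names.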
A flag can also be set to generate a CSV file that contains the converted data from the original input file. In that case, the DDF will be for the CSV file.
This flag will be used by IRI Workbench in the Discover Metadata wizard to generate a temporary CSV file that then can be viewed in the wizard. This method will also be used by other wizards to load and search records from encoded files.
The reason is that encoded files cannot simply be loaded like ordinary files: their layout is defined by a specification, and many are encoded as binary data.
The “import file metadata” option in the IRI Workbench metadata menu retrieves the metadata of a file whose data is encoded per an ASN.1 schema. It calls the asn1_2ddf utility with the flags specified in the wizard.
[Image: the graphical interface for asn1_2ddf options available from “import file metadata” in IRI Workbench]
Depending on the complexity of the ASN.1 specification, the final DDF may contain anywhere from several fields to several thousand.
Sometimes multiple ASN.1 schemas may need to be compiled at once. In this case, pass the path to each required schema file with its own -a flag. For example, the command:
asn1_2ddf -eBER -p -aRAP-0102.asn -aTAP-0310.asn rap_bin.ber
compiles both the RAP-0102.asn and TAP-0310.asn schema files needed to decode rap_bin.ber.
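When the same command has to be assembled for varying sets of schema files, a small helper can build the argument list. The sketch below is an assumption-laden convenience, not an IRI API: the function name asn1_2ddf_args and its parameters are hypothetical, and it only reproduces the flag layout visible in the example above (-e and -a with their values attached, -p on its own, data file last).

```python
import shlex
from typing import List, Sequence


def asn1_2ddf_args(data_file: str, schemas: Sequence[str],
                   encoding: str = "BER",
                   process_line: bool = True) -> List[str]:
    """Build an asn1_2ddf argument list in the style of the example above."""
    args = ["asn1_2ddf", f"-e{encoding}"]
    if process_line:
        args.append("-p")  # request the commented process line
    # One -a flag per schema file, in the order given.
    args.extend(f"-a{path}" for path in schemas)
    args.append(data_file)
    return args


args = asn1_2ddf_args("rap_bin.ber", ["RAP-0102.asn", "TAP-0310.asn"])
print(shlex.join(args))
```

The printed command line is identical to the one shown above. On a machine where asn1_2ddf is installed, the same list could be handed to subprocess.run.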
A snippet of the beginning of the RAP-0102.asn schema looks like this:
RapDataInterChange ::= CHOICE
{
    returnBatch ReturnBatch,
    acknowledgement Acknowledgement
}

ReturnBatch ::= [APPLICATION 534] SEQUENCE
{
    rapBatchControlInfo RapBatchControlInfo,
    returnDetails ReturnDetailList,
    rapAuditControlInfo RapAuditControlInfo
}

Acknowledgement ::= [APPLICATION 535] SEQUENCE
{
    sender Sender,
    recipient Recipient,
    rapFileSequenceNumber RapFileSequenceNumber,
    ackFileCreationTimeStamp AckFileCreationTimeStamp,
    ackFileAvailableTimeStamp AckFileAvailableTimeStamp,
    fileTypeIndicator FileTypeIndicator OPTIONAL,
    operatorSpecList OperatorSpecList OPTIONAL
}

ReturnDetailList ::= [APPLICATION 536] SEQUENCE OF ReturnDetail

ReturnDetail ::= CHOICE
{
    missingReturn MissingReturn,
    fatalReturn FatalReturn,
    severeReturn SevereReturn
}

--
-- Structure of the individual RAP records
--

RapBatchControlInfo ::= [APPLICATION 537] SEQUENCE
{
    sender Sender,
    recipient Recipient,
    rapFileSequenceNumber RapFileSequenceNumber,
    rapFileCreationTimeStamp RapFileCreationTimeStamp,
    rapFileAvailableTimeStamp RapFileAvailableTimeStamp,
    specificationVersionNumber SpecificationVersionNumber OPTIONAL,
    releaseVersionNumber ReleaseVersionNumber OPTIONAL,
    rapSpecificationVersionNumber RapSpecificationVersionNumber,
    rapReleaseVersionNumber RapReleaseVersionNumber,
    fileTypeIndicator FileTypeIndicator OPTIONAL,
    roamingPartner RoamingPartner OPTIONAL,
    operatorSpecList OperatorSpecList OPTIONAL
}
A snippet of the first 25 fields of the resulting SortCL DDF looks like this:
/FIELD=(RETURNBATCH_RAPBATCHCONTROLINFO_SENDER, POSITION=1, TYPE=ASCII, SEPARATOR=",")
/FIELD=(RETURNBATCH_RAPBATCHCONTROLINFO_RECIPIENT, POSITION=2, TYPE=ASCII, SEPARATOR=",")
/FIELD=(RETURNBATCH_RAPBATCHCONTROLINFO_RAPFILESEQUENCENUMBER, POSITION=3, TYPE=ASCII, SEPARATOR=",")
/FIELD=(RETURNBATCH_RAPBATCHCONTROLINFO_RAPFILECREATIONTIMESTAMP_LOCALTIMESTAMP, POSITION=4, TYPE=ASCII, SEPARATOR=",")
/FIELD=(RETURNBATCH_RAPBATCHCONTROLINFO_RAPFILECREATIONTIMESTAMP_UTCTIMEOFFSET, POSITION=5, TYPE=ASCII, SEPARATOR=",")
/FIELD=(RETURNBATCH_RAPBATCHCONTROLINFO_RAPFILEAVAILABLETIMESTAMP_LOCALTIMESTAMP, POSITION=6, TYPE=ASCII, SEPARATOR=",")
/FIELD=(RETURNBATCH_RAPBATCHCONTROLINFO_RAPFILEAVAILABLETIMESTAMP_UTCTIMEOFFSET, POSITION=7, TYPE=ASCII, SEPARATOR=",")
/FIELD=(RETURNBATCH_RAPBATCHCONTROLINFO_SPECIFICATIONVERSIONNUMBER, POSITION=8, TYPE=NUMERIC, PRECISION=0, SEPARATOR=",")
/FIELD=(RETURNBATCH_RAPBATCHCONTROLINFO_RELEASEVERSIONNUMBER, POSITION=9, TYPE=NUMERIC, PRECISION=0, SEPARATOR=",")
/FIELD=(RETURNBATCH_RAPBATCHCONTROLINFO_RAPSPECIFICATIONVERSIONNUMBER, POSITION=10, TYPE=NUMERIC, PRECISION=0, SEPARATOR=",")
/FIELD=(RETURNBATCH_RAPBATCHCONTROLINFO_RAPRELEASEVERSIONNUMBER, POSITION=11, TYPE=NUMERIC, PRECISION=0, SEPARATOR=",")
/FIELD=(RETURNBATCH_RAPBATCHCONTROLINFO_FILETYPEINDICATOR, POSITION=12, TYPE=ASCII, SEPARATOR=",")
/FIELD=(RETURNBATCH_RAPBATCHCONTROLINFO_ROAMINGPARTNER, POSITION=13, TYPE=ASCII, SEPARATOR=",")
/FIELD=(RETURNBATCH_RAPBATCHCONTROLINFO_OPERATORSPECLIST, POSITION=14, TYPE=ASCII, SEPARATOR=",")
/FIELD=(RETURNBATCH_RETURNDETAILS_MISSINGRETURN_STARTMISSINGSEQNUMBER, POSITION=15, TYPE=ASCII, SEPARATOR=",")
/FIELD=(RETURNBATCH_RETURNDETAILS_MISSINGRETURN_ENDMISSINGSEQNUMBER, POSITION=16, TYPE=ASCII, SEPARATOR=",")
/FIELD=(RETURNBATCH_RETURNDETAILS_MISSINGRETURN_OPERATORSPECLIST, POSITION=17, TYPE=ASCII, SEPARATOR=",")
/FIELD=(RETURNBATCH_RETURNDETAILS_FATALRETURN_FILESEQUENCENUMBER, POSITION=18, TYPE=ASCII, SEPARATOR=",")
/FIELD=(RETURNBATCH_RETURNDETAILS_FATALRETURN_TRANSFERBATCHERROR_ERRORDETAIL_ERRORCONTEXT_PATHITEMID, POSITION=19, TYPE=ASCII, SEPARATOR=",")
/FIELD=(RETURNBATCH_RETURNDETAILS_FATALRETURN_TRANSFERBATCHERROR_ERRORDETAIL_ERRORCONTEXT_ITEMOCCURRENCE, POSITION=20, TYPE=ASCII, SEPARATOR=",")
/FIELD=(RETURNBATCH_RETURNDETAILS_FATALRETURN_TRANSFERBATCHERROR_ERRORDETAIL_ERRORCONTEXT_ITEMLEVEL, POSITION=21, TYPE=ASCII, SEPARATOR=",")
/FIELD=(RETURNBATCH_RETURNDETAILS_FATALRETURN_TRANSFERBATCHERROR_ERRORDETAIL_ITEMOFFSET, POSITION=22, TYPE=ASCII, SEPARATOR=",")
/FIELD=(RETURNBATCH_RETURNDETAILS_FATALRETURN_TRANSFERBATCHERROR_ERRORDETAIL_ERRORCODE, POSITION=23, TYPE=ASCII, SEPARATOR=",")
/FIELD=(RETURNBATCH_RETURNDETAILS_FATALRETURN_NOTIFICATIONERROR_NOTIFICATION_SENDER, POSITION=24, TYPE=ASCII, SEPARATOR=",")
/FIELD=(RETURNBATCH_RETURNDETAILS_FATALRETURN_NOTIFICATIONERROR_NOTIFICATION_RECIPIENT, POSITION=25, TYPE=ASCII, SEPARATOR=",")
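The field names in this listing follow a visible pattern: the identifiers along the path from the top-level component down to each leaf are upper-cased and joined with underscores. The Python sketch below imitates that pattern for a tiny hand-written subset of the structure. It is an illustration of the naming convention only and makes no claim about how asn1_2ddf is implemented; the SAMPLE tree, its leaf types, and the function names are assumptions made for the example.

```python
# Nested identifiers; each leaf holds the SortCL data type seen in the listing.
SAMPLE = {
    "returnBatch": {
        "rapBatchControlInfo": {
            "sender": "ASCII",
            "recipient": "ASCII",
            "specificationVersionNumber": "NUMERIC",
        }
    }
}


def leaf_paths(tree, prefix=()):
    """Yield (identifier path, data type) for every leaf, in definition order."""
    for identifier, node in tree.items():
        path = prefix + (identifier,)
        if isinstance(node, dict):
            yield from leaf_paths(node, path)
        else:
            yield path, node


def field_lines(tree):
    """Render one /FIELD line per leaf, numbering positions from 1."""
    lines = []
    for position, (path, data_type) in enumerate(leaf_paths(tree), start=1):
        name = "_".join(part.upper() for part in path)
        attrs = [name, f"POSITION={position}", f"TYPE={data_type}"]
        if data_type == "NUMERIC":
            attrs.append("PRECISION=0")  # as in the NUMERIC fields above
        attrs.append('SEPARATOR=","')
        lines.append("/FIELD=(" + ", ".join(attrs) + ")")
    return lines


for line in field_lines(SAMPLE):
    print(line)
```

Because the sample tree is abbreviated, the positions it produces differ from those in the full listing, but the names and attribute layout match.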
In the next article, complete SortCL script examples for messages encoded based on an ASN.1 schema will be demonstrated. These scripts make use of the metadata that can be generated through IRI Workbench or asn1_2ddf.
The other articles in the series are:
- Introduction to ASN.1
- SortCL ASN.1 Examples
- Using IRI Workbench with ASN.1 encoded data
- Gaining insight from Call Detail Records