Challenges
JSON is a popular semi-structured data interchange format, common in big data applications, NoSQL database collections, and IoT data streams. However, conversions between legacy indexed or flat files and JSON rely on slow parsing technologies that cannot process the data at the same time.
Other JSON query or transformation tools on the market cannot turn high volumes of JSON data into analytic subsets or compiled information quickly, if at all. There has been no efficient way to rapidly convert, process, protect, or create huge JSON files.
For example, you may need to:
- Sort a huge JSON file
- Extract data from, or report on, a JSON file
- Convert a CSV, LDIF, or other file to JSON
- Convert JSON to text, CSV, LDIF, ISAM, etc.
- Join data in a JSON file with another JSON source or a different kind of source
- Mask, encrypt, or otherwise de-identify PII in a JSON file
- Load JSON data to a spreadsheet or database
- Create a JSON file from a legacy or extract file
- Generate test data in JSON file formats
You may even need to perform more than one of these functions at the same time, against many massive source and target files.
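To make two of the tasks above concrete (converting CSV to JSON and sorting the result), here is a minimal sketch in plain Python. It uses only the standard library and is not an IRI tool or script; the function name `csv_to_sorted_json` and the sample columns are hypothetical. It also loads everything into memory, which is exactly the approach that stops scaling on huge files.

```python
import csv
import io
import json


def csv_to_sorted_json(csv_text: str, sort_key: str) -> str:
    """Parse CSV text, sort the rows by one column, and return a JSON array.

    Hypothetical helper for illustration only. Values stay as strings,
    so the sort is a plain string comparison.
    """
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    rows.sort(key=lambda row: row[sort_key])
    return json.dumps(rows, indent=2)


# Small, made-up sample input
sample_csv = "id,name\n2,Baker\n1,Adams\n"
result = csv_to_sorted_json(sample_csv, "id")
print(result)
```

Because the whole file is parsed into a list before sorting, memory use grows with input size. That is the bottleneck the challenges above describe for large JSON sources and targets.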
Solutions
As of CoSort v10, IRI delivers JSON data conversion and processing functionality in several products. Choose based on need:
JSON File Conversion Only
Use the IRI NextForm product to move JSON files* into DB tables, convert a JSON file into another file format (like CSV, LDIF, COBOL, XML, etc.), or convert other file formats into JSON.
NextForm includes a JSON file parser that automatically creates the field layouts used in the file conversion scripts. NextForm also supports data type conversion at the field level, and the remapping of record layouts. NextForm job definitions also work in SortCL-compatible products like Voracity if you upgrade later.
* There are known initial limitations in handling functional calls and multiple array elements in unstructured JSON files. If you can, provide IRI with sample input for analysis first.
JSON File Conversion, Transformation, Masking, and/or Reporting
Use the SortCL program in the IRI Voracity platform or IRI CoSort package to convert, transform, mask, report from, and create new JSON files and other targets that represent structured data.
Declare one or more JSON and non-JSON files for input and output as part of any SortCL job involving data:
- filtering (select, scrub, links to DQ tools)
- transformation (sort, join, aggregate, calc, etc.)
- conversion (data-type and file-format migrations)
- reporting (CDC, detail and summary formats)
- protection (field encryption, de-ID, masking)
SortCL makes all of these capabilities, one or more at a time, available to data architects who need to work with JSON and other sources.
JSON Data Protection
Use IRI FieldShield to encrypt, mask, or otherwise de-identify values in structured JSON files, or use IRI DarkShield when the data you need to find and mask in JSON files or NoSQL DBs is less structured. Both FieldShield and DarkShield are included in IRI Voracity.
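As a rough illustration of what field-level de-identification in structured JSON means, the sketch below replaces named fields with a salted SHA-256 digest using the Python standard library. It does not represent how FieldShield or DarkShield work; the function name `mask_fields`, the field names, and the salt are assumptions made for this example.

```python
import hashlib
import json


def mask_fields(record: dict, fields: set, salt: str = "demo-salt") -> dict:
    """Return a copy of record with the named scalar fields replaced by a digest.

    Hypothetical helper: nested objects, and objects inside arrays, are
    walked recursively. Other values are copied unchanged.
    """
    masked = {}
    for key, value in record.items():
        if isinstance(value, dict):
            masked[key] = mask_fields(value, fields, salt)
        elif isinstance(value, list):
            masked[key] = [
                mask_fields(item, fields, salt) if isinstance(item, dict) else item
                for item in value
            ]
        elif key in fields:
            digest = hashlib.sha256((salt + str(value)).encode("utf-8")).hexdigest()
            masked[key] = digest[:16]  # truncated for readability only
        else:
            masked[key] = value
    return masked


# Made-up sample record
customer = {"name": "Pat Doe", "ssn": "123-45-6789", "contact": {"email": "pat@example.com"}}
masked_customer = mask_fields(customer, {"ssn", "email"})
print(json.dumps(masked_customer, indent=2))
```

The same input and salt always produce the same digest, so masked values remain joinable across files. A production job would also need key management, format preservation, and audit logging, which this sketch leaves out.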
JSON Test Data
Use IRI RowGen if you need test data in JSON file formats. RowGen is included in IRI Voracity, and uses the same layout metadata as CoSort, NextForm, and FieldShield, so you can easily move between test data generation and real data transformation.
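For a sense of what JSON test data generation involves at its simplest, here is a small standard-library Python sketch that builds synthetic records from a seeded random generator. It is not RowGen and does not use RowGen metadata; the function name `generate_records`, the field names, and the value lists are hypothetical.

```python
import json
import random

# Hypothetical value set to draw names from
FIRST_NAMES = ["Ana", "Ben", "Chen", "Dee"]


def generate_records(count: int, seed: int = 0) -> list:
    """Build count synthetic records; a fixed seed makes runs repeatable."""
    rng = random.Random(seed)
    return [
        {
            "id": i + 1,
            "name": rng.choice(FIRST_NAMES),
            "balance": round(rng.uniform(0, 1000), 2),
        }
        for i in range(count)
    ]


records = generate_records(3)
print(json.dumps(records, indent=2))
```

Seeding the generator keeps test runs reproducible, which matters when the generated JSON feeds regression tests.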