Match and Join


Challenges


Matching data between large tables for query, reporting, or virtualization purposes can take a long time. SQL join operations are typically inefficient in large-scale data integration (unification) jobs. Custom programs designed to bring unlinked items together may also be slow or difficult to maintain.

You may also need a fast and easy way to compare two or more files on one or more fields. How do you identify the changes (inserts, updates, deletes) that have occurred between two files, especially when the data are in different file formats or in tables across different databases?
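To make the comparison problem concrete, here is a minimal Python sketch of key-based change detection between two record sets. It is an illustration only, not SortCL syntax; the function name `diff_by_key` and the sample fields (`id`, `name`, `city`) are hypothetical, and the sketch assumes both inputs fit in memory and have a unique key.

```python
def diff_by_key(old_rows, new_rows, key):
    """Classify changes between two record sets that share a unique key field."""
    # Index both sides by key so each lookup is a constant-time dict access.
    old = {row[key]: row for row in old_rows}
    new = {row[key]: row for row in new_rows}

    # Keys present only in the new set are inserts.
    inserts = [new[k] for k in new if k not in old]
    # Keys present only in the old set are deletes.
    deletes = [old[k] for k in old if k not in new]
    # Keys present in both sets with differing field values are updates.
    updates = [new[k] for k in new if k in old and new[k] != old[k]]
    return inserts, updates, deletes


# Hypothetical sample data standing in for two versions of a file or table.
old_rows = [
    {"id": 1, "name": "Ann", "city": "Oslo"},
    {"id": 2, "name": "Bob", "city": "Rome"},
    {"id": 3, "name": "Cy", "city": "Lima"},
]
new_rows = [
    {"id": 1, "name": "Ann", "city": "Oslo"},
    {"id": 2, "name": "Bob", "city": "Paris"},
    {"id": 4, "name": "Dee", "city": "Kyiv"},
]

inserts, updates, deletes = diff_by_key(old_rows, new_rows, "id")
```

With this sample data, record 4 is classified as an insert, record 2 as an update, and record 3 as a delete. Sources in different formats would first need to be parsed into a common record layout before a comparison like this applies.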

Solutions


The Sort Control Language (SortCL) program in the IRI CoSort data transformation package and IRI Voracity data management (ETL) platform can filter, sort, join, aggregate, and reformat multiple table and file sources in the same job.

SortCL uses simple, explicit 4GL text files to define data sources, targets, and transformations. Automatic script creation, cross-platform execution, modification, and management are supported in the free Eclipse GUI, IRI Workbench.

SortCL supports inner and outer joins to produce combined outputs and file comparisons based on specified conditions. With SortCL you can:

  1. Input, join, and output one or more pre-sorted or unsorted tables and/or files
  2. Eliminate inner join results from an outer join
  3. Eliminate and reformat null records
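The join variants in the list above can be sketched generically in Python. This is not SortCL syntax or its implementation; it is a small hash-join illustration in which the function name `join`, the `how` values, and the sample `customers` and `orders` records are all hypothetical. The `left_only` mode shows what eliminating inner join results from an outer join leaves behind: the left-side records with no match.

```python
from collections import defaultdict


def join(left, right, key, how="inner"):
    """Join two lists of dict records on `key`.

    how: "inner"      -> matched pairs only
         "left_outer" -> matched pairs plus unmatched left records
         "left_only"  -> unmatched left records only (outer minus inner)
    """
    # Build a lookup of right-side records grouped by join key.
    index = defaultdict(list)
    for r in right:
        index[r[key]].append(r)

    out = []
    for l in left:
        matches = index.get(l[key], [])
        if matches:
            if how != "left_only":
                # Emit one combined record per matching right-side record.
                out.extend({**l, **r} for r in matches)
        elif how in ("left_outer", "left_only"):
            # Unmatched left record: right-side fields are simply absent.
            out.append(dict(l))
    return out


# Hypothetical sample inputs; neither list needs to be pre-sorted here.
customers = [{"cust_id": 1, "name": "Ann"}, {"cust_id": 2, "name": "Bob"}]
orders = [{"cust_id": 1, "total": 50}, {"cust_id": 1, "total": 20}]

matched = join(customers, orders, "cust_id", how="inner")
everything = join(customers, orders, "cust_id", how="left_outer")
unmatched = join(customers, orders, "cust_id", how="left_only")
```

Here `matched` holds the two order records for Ann, `everything` adds Bob's record without order fields, and `unmatched` contains only Bob. Unmatched records in this sketch just lack the right-side fields; a real job would decide whether to drop them or fill those fields with a placeholder.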

In the same simple job script and I/O pass as the join, you can also:

  1. Cross-calculate and derive new values from matched results
  2. Add field-level data masking functions to sensitive fields (PII)
  3. Custom-define multiple detail and summary report targets
  4. Hand off selected information in different formats for data visualization tools

The bottom line? The big data matching techniques supported in SortCL join operations allow you to bring matching data from disparate sources together, exclude non-matches, compare files and table data externally, capture changed data, produce business intelligence from it, and reduce database query and refresh overhead.

Did you know that IRI CoSort was the first data management product to join data in flat files? IRI introduced join operations in CoSort in 1999.
