Frequently Asked CoSort Tuning Questions
Since the advent of automatic tuning in CoSort V10, it has been easier to work with the many available performance settings in CoSort Resource Control (which we call cosortrc or “RC”) files. Nonetheless, we are still asked good questions and think that sharing them and our answers can help other users.
These settings are relevant in standalone and API CoSort sorting (and some aggregation) operations. Those sorts also drive the performance of ETL and offline reorg jobs in IRI Voracity, as well as IRI RowGen DB subsetting and test data synthesis jobs (which join, sort, and load). In all these cases, the base ‘SortCL’ program is used.
Q1: How can I tell which RC settings are being used?
CoSort resource control (RC) settings can be set in several places, and the settings in one file can be overridden by another, allowing preferred tuning to be set for a user or project. The search order for RC settings is explained in Appendix D of the CoSort manual.
The sortcl executable will display the RC settings when you type the “sortcl /RC” command. RC settings are dependent on the user running sortcl and the current directory when sortcl is run. Thus, be sure to run sortcl /RC from the directory where sortcl jobs will run, and with the username that would run the sortcl jobs.
The output from sortcl /RC first shows the default RC settings, then the RC files found and the RC settings modified by those files.
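Because the reported settings depend on the current directory, it can help to script the check. The following is a minimal Python sketch, not part of CoSort: the helper name show_rc_settings and the example path are hypothetical, and it assumes sortcl is on the PATH and that the script runs as the same user who runs the jobs. It returns both output streams because the source does not say which one sortcl writes to.

```python
import subprocess


def show_rc_settings(job_dir, runner=subprocess.run):
    """Run `sortcl /RC` from job_dir and return whatever it printed.

    job_dir should be the directory the SortCL jobs run from, since the
    RC settings found depend on the current directory.
    `runner` is injectable only so the helper can be exercised without sortcl.
    """
    result = runner(
        ["sortcl", "/RC"],
        cwd=job_dir,            # run from the job's directory
        capture_output=True,
        text=True,
        check=False,
    )
    # Return both streams; which one sortcl uses is not assumed here.
    return result.stdout + result.stderr


# Hypothetical usage (path is an illustration only):
# print(show_rc_settings("/data/jobs/nightly"))
```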
After each SortCL job runs, performance details are appended to a cosort.log file, and a more detailed .cserrlog file. Their common directory location is specified in the RC file.
The .cserrlog file contains information about memory used, but is overwritten by the next SortCL run. The amount of detail in the .cserrlog can be increased via the “MONITOR_LEVEL” setting.
Example .cserrlog output:
Q2: Which RC settings affect job performance the most?
Many RC settings adjust options that have little or no impact on performance. The most influential settings for job performance are MEMORY_MAX, BLOCKSIZE, THREAD_MAX, WORK_AREAS, and WORKFILE_COMPRESSION.
MEMORY_MAX specifies the amount of memory to be allocated for sort buffers. The AUTO setting attempts to determine the volume of input and, if possible, allocates enough memory to perform the sort in memory, avoiding the need for temporary work files.
This produces the best performance when the required memory is available. The amount of available memory will decrease when running with other processes competing for memory, degrading performance.
The MEMORY_MAX MINIMIZE setting allocates a small amount of memory for sort buffers and relies on temporary work files. This setting should be used when there are many processes running and competing for physical memory.
A constant value can be used for MEMORY_MAX and BLOCKSIZE, and may be optimal when most of the jobs use a similar size and record length. Records being sorted are stored in blocks of memory specified by the RC BLOCKSIZE. Many blocks should fit in the MEMORY_MAX value and increasing BLOCKSIZE often improves performance.
A BLOCKSIZE that is too large can cause problems such as insufficient merge memory. For this reason, the default setting is small. The AUTO and MINIMIZE settings also override the RC BLOCKSIZE setting and use a relatively small value.
WORK_AREAS sets the path where temporary work files are stored. The WORK_AREAS setting and IO subsystem performance will significantly impact performance on large sort jobs.
If no WORK_AREAS are specified, the current directory is used. Assigning multiple work areas, each on a different physical path, will improve performance. It is best to create as many work areas as the THREAD_MAX RC setting, and to avoid the paths used by input or output files.
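The work-area guidance above can be turned into a quick pre-flight check. This is a minimal Python sketch under stated assumptions: check_work_areas is a hypothetical helper, not a CoSort utility, and it only compares path strings, so it cannot tell whether two different paths share one physical device.

```python
import os


def check_work_areas(work_areas, thread_max, io_dirs):
    """Return a list of warnings for a planned WORK_AREAS layout.

    work_areas: planned work area directories
    thread_max: the THREAD_MAX value the jobs will run with
    io_dirs:    directories holding the job's input or output files
    """
    warnings = []
    normalized = [os.path.abspath(p) for p in work_areas]

    # One work area per thread is the recommended layout.
    if len(normalized) != thread_max:
        warnings.append(
            f"{len(normalized)} work areas for THREAD_MAX={thread_max}"
        )

    # The same path listed twice adds no extra I/O capacity.
    if len(set(normalized)) != len(normalized):
        warnings.append("duplicate work area paths")

    # Work files should not compete with input or output files.
    io_set = {os.path.abspath(p) for p in io_dirs}
    for path in normalized:
        if path in io_set:
            warnings.append(f"{path} is also an input/output directory")

    return warnings
```

An empty list means the layout follows the three rules of thumb; it does not confirm that the paths are on separate disks.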
The amount of physical and virtual memory available depends on the system hardware, the O/S, and settings within the O/S. Many jobs perform very well on 32-bit systems, but 64-bit sortcl is recommended when running jobs that process input data near 2GB or more.
If you encounter an “Error 2: insufficient memory” on Unix or Linux, use the “ulimit” command to display and increase user resource limits. If the error persists, contact support@iri.com and include the output from the sortcl /RC command and the .cserrlog file.
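The same user limits that ulimit reports can also be read from a script, which is convenient when attaching diagnostics to a support request. The sketch below uses Python's standard resource module (Unix only). The helper names are hypothetical, and the choice of RLIMIT_AS and RLIMIT_DATA (what "ulimit -v" and "ulimit -d" report in bash) is an assumption about which limits matter most for a memory error. Note that the module reports bytes, while ulimit displays these values in kilobytes.

```python
import resource


def memory_limits():
    """Return {limit name: (soft, hard)} for the memory-related limits."""
    limits = {}
    for name in ("RLIMIT_AS", "RLIMIT_DATA"):
        soft, hard = resource.getrlimit(getattr(resource, name))
        limits[name] = (soft, hard)
    return limits


def describe(value):
    """Format one limit value, in bytes, for display."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{value} bytes"


# Print the limits for the current user and process.
for name, (soft, hard) in memory_limits().items():
    print(f"{name}: soft={describe(soft)}, hard={describe(hard)}")
```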
WORKFILE_COMPRESSION decreases the amount of I/O at the expense of additional memory and processing time. Processing is usually faster than the I/O.
This RC setting defaults to AUTO, which samples the initial compression ratio to determine whether to compress the temporary sort work files or not. If your system has extremely fast I/O, or you need to reduce memory usage, set WORKFILE_COMPRESSION OFF.
Increasing the RC MONITOR_LEVEL, or using the RC AUDIT_LOG setting, creates additional work which can slow the job a little.
Q3: How do I know the number of threads and RAM I’m actually using?
The cosort.log file, the .cserrlog file, and the output from sortcl /RC all show the number of threads specified in the RC file. Run sortcl /RC as the same user, and from the same directory, as the job to see the RC settings and where they were set. To see the resources actually used when the job ran, look in the hidden .cserrlog file.
Q4: (How) does the THREAD_MAX value affect MEMORY_MAX?
The MEMORY_MAX value is divided evenly among the threads, whose number is set by THREAD_MAX. Each thread also uses additional memory for local storage.
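As a worked example of that division, the Python sketch below computes each thread's share of the sort buffers and how many blocks fit in that share. The function names and the sample values (8 GiB, 4 threads, 2 MiB blocks) are illustrative assumptions, not CoSort defaults, and the per-thread local storage mentioned above is not included.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def per_thread_buffer(memory_max_bytes, thread_max):
    """Even share of MEMORY_MAX for each of thread_max threads, in bytes."""
    if thread_max < 1:
        raise ValueError("thread_max must be at least 1")
    return memory_max_bytes // thread_max


def blocks_per_thread(memory_max_bytes, thread_max, blocksize_bytes):
    """How many whole blocks of BLOCKSIZE fit in one thread's share."""
    return per_thread_buffer(memory_max_bytes, thread_max) // blocksize_bytes


# Illustrative values only: 8 GiB shared by 4 threads, 2 MiB blocks.
share = per_thread_buffer(8 * GIB, 4)            # 2 GiB per thread
blocks = blocks_per_thread(8 * GIB, 4, 2 * MIB)  # 1024 blocks per thread
print(share // GIB, "GiB per thread,", blocks, "blocks per thread")
```

Doubling THREAD_MAX at a fixed MEMORY_MAX halves each thread's share, which is worth keeping in mind when choosing a constant BLOCKSIZE.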
Q5: Are there any conditions or settings for memory usage over MEMORY_MAX?
The sort operates fastest when the data can fit in RAM so temporary work files are not needed.
When MEMORY_MAX is set to AUTO, CoSort will sort in RAM if enough RAM is available to process the amount of input data. Otherwise, a small amount of RAM is allocated and temporary work files are used.
When multiple jobs are running, CoSort might not be able to allocate the RAM, or the load on the system can lead to excessive paging which will decrease performance. In this case, MEMORY_MAX MINIMIZE can be set to use less RAM and more I/O for temporary work files.
Q6: Let’s say my machine has 128GB memory. If I set the MEMORY_MAX value above that, could CoSort fail or work improperly?
When a large MEMORY_MAX value is used, CoSort will fail if it is unable to allocate that amount of physical RAM. Verify the actual amount of system RAM and user resource limits.
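One way to verify the installed RAM before choosing a constant MEMORY_MAX is sketched below in Python. It assumes a Linux (or similar Unix) system where os.sysconf exposes SC_PAGE_SIZE and SC_PHYS_PAGES. The helper names are hypothetical, and the check covers installed RAM only, not user resource limits or memory already in use by other processes.

```python
import os

GIB = 1024 ** 3


def physical_ram_bytes():
    """Installed physical RAM in bytes, as reported by the OS."""
    return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")


def exceeds_physical_ram(memory_max_bytes, physical_bytes=None):
    """True if the planned MEMORY_MAX is larger than physical RAM."""
    if physical_bytes is None:
        physical_bytes = physical_ram_bytes()
    return memory_max_bytes > physical_bytes


# The 128GB machine from the question, with a planned MEMORY_MAX of 129 GiB:
print(exceeds_physical_ram(129 * GIB, 128 * GIB))
```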
Q7: What is the upper limit for memory usage when MEMORY_MAX is set to AUTO? Is it the upper limit of physical memory on the server?
There is no upper limit imposed by CoSort on MEMORY_MAX except by the hardware and operating system. CoSort will reduce the amount of RAM requested when the system cannot supply the requested amount, but will fail if still unable to allocate that RAM.
The physical memory on the server is used for the operating system and processes running on the system. Each process can access a large amount of virtual address space so not all of the virtual memory must exist in physical RAM at the same time.
A 32-bit process is limited to 2GB, but 64-bit processes may use several TB of virtual address space, depending on the hardware, the OS, and OS settings. The amount of physical RAM installed and the swap space available limit the amount of virtual memory that can be used.
Q8: What is the difference between AUTO and value-specific max memory settings?
The value assigned to MEMORY_MAX affects the amount of RAM used for buffers during the sort and merge operations but additional RAM is used for other purposes. CoSort will always attempt to allocate the entire amount of RAM specified by MEMORY_MAX for sort buffers. A MEMORY_MAX value setting often allocates more memory or less memory than needed for optimum performance. The AUTO setting prevents allocating more memory than the current job requires.
In addition to the information here, we also suggest you consult:
- Section D in the Appendix chapter of your CoSort (V10 or above) manual
- https://www.iri.com/ftp9/pdf/CoSort/CoSort10BestPractices2020.pdf
- https://www.iri.com/company/faqs#cosort-sorting-apps-&-performance-tuning
- support@iri.com, subject to your maintenance status, for further assistance