This is an iterator activity. It reads each row in a CSV file and, on each iteration, outputs the values of up to the first 50 columns found in the row. The processing logic nested beneath the FOR_EACH_CSVROW activity is repeated for each row read.
This activity is not intended for routine processing of large volumes of data. Although the MAXROWS parameter permits the activity to read and process more than 999 rows of data, doing so is not recommended in most instances. The activity can be useful, however, for transferring limited amounts of information between activities, transformation maps and the processing sequence variable pool.
This activity provides a PARSEOPTION parameter to permit you to control the way that a CSV file is parsed to best suit your case. More detailed information on the parsing options is given below under the heading CSV Parsing Options.
INPUT Parameters:
CSVFILEPATH: Required
This parameter must contain the full path and name of the file to be read.
e.g. C:\order.csv
or /orders/order_jan.csv
SEPARATOR: Optional
If the CSV file uses a separator other than a comma (for example, a semicolon is commonly used in some locales), then this value should specify the separator character. The special value *TAB indicates a horizontal tab character. Otherwise, the separator must be exactly one character in length and can be any character. If not specified, a default value of comma (,) is assumed.
NOTE: if a separator other than comma or *TAB is specified, then the value of the PARSEOPTION parameter will be disregarded. The *EXTENDED parsing option is always used in this case.
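The SEPARATOR rules above can be illustrated with a short Python sketch. This is only an illustration of the documented rules, not the activity's implementation. The function name resolve_separator is hypothetical, and raising an error for an invalid value is an assumption, since the text does not say how the activity reacts to a separator longer than one character.

```python
def resolve_separator(separator=None):
    """Return the separator character implied by a SEPARATOR parameter value.

    Hypothetical helper: it mirrors the documented rules only.
    """
    if separator is None or separator == "":
        return ","   # documented default when SEPARATOR is not specified
    if separator == "*TAB":
        return "\t"  # special value meaning a horizontal tab character
    if len(separator) != 1:
        # Assumption: the documentation does not describe the failure mode.
        raise ValueError("SEPARATOR must be a single character or *TAB")
    return separator
```

For example, resolve_separator(";") returns the semicolon unchanged, while resolve_separator("*TAB") returns a tab character.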
MAXROWS: Optional
This activity is not intended for routine processing of large volumes of data. For this reason (and to avoid unintended "runaway" processes), it is limited, by default, to processing a maximum of 999 rows of data. This parameter allows you to override that maximum, permitting you to process larger amounts of data if appropriate for your solution. The default value for this parameter is 999. Increasing it is not recommended in most instances.
PARSEOPTION: Optional
Because there are no standards governing the format of data in CSV files, the PARSEOPTION parameter is provided to offer a choice of parsing techniques to suit commonly-used formats. You may choose from the following:
*SIMPLE
*STANDARD (this is the default value, except as noted below)
*EXTENDED
More detailed information on the parsing options is given below under the heading CSV Parsing Options.
Note that if the SEPARATOR parameter specifies a separator other than comma or *TAB, then the value of the PARSEOPTION parameter will be disregarded. The *EXTENDED parsing option is always used in this case.
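The interaction between SEPARATOR and PARSEOPTION described above can be sketched in Python as follows. The function name effective_parse_option is hypothetical and exists only for illustration. It takes the parameter values as they would be specified on the activity and returns the parsing option that would actually apply.

```python
def effective_parse_option(separator=",", parse_option="*STANDARD"):
    """Return the parsing option that applies for the given parameter values.

    Hypothetical helper: a separator other than comma or *TAB always
    forces *EXTENDED, regardless of the PARSEOPTION value.
    """
    if separator not in (",", "*TAB"):
        return "*EXTENDED"
    return parse_option  # *STANDARD is the documented default
```

With these rules, effective_parse_option(";", "*SIMPLE") yields *EXTENDED, whereas effective_parse_option("*TAB", "*SIMPLE") yields *SIMPLE.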
SKIPFIRSTROW: Optional
Frequently the first row of a CSV file contains identifiers or column headings, with the actual data not starting until row 2. In this case, specify *YES for this parameter to have the activity automatically skip the first row. If not specified, a default of *NO is assumed.
OUTPUT Parameters:
CSVROW
Upon each iteration, this output parameter will contain the row number for the current CSV row read.
Note that if you use the SKIPFIRSTROW parameter to skip the first row in the file (for example, if it contains column headings), the skipped row is not counted in determining the row number. The first row returned still has row number 1, even though it may have been the second row in the file.
CSVCOLUMN1
CSVCOLUMN2
…
CSVCOLUMN50
Upon each iteration, these output parameters will contain the value for the corresponding column for the current CSV row read, up to the number of columns present in the data or a maximum of 50 columns.
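The iteration behaviour described by the input and output parameters can be approximated with the following Python sketch. It is an illustration only and makes several assumptions: the function name for_each_csvrow and its arguments are hypothetical, Python's csv module stands in for the activity's own parsing (it does not reproduce the *SIMPLE, *STANDARD or *EXTENDED options), the iteration is assumed to simply stop once MAXROWS rows have been returned, and the skipped first row is assumed not to count towards MAXROWS.

```python
import csv
import io

MAX_COLUMNS = 50  # the activity outputs CSVCOLUMN1 to CSVCOLUMN50


def for_each_csvrow(text, separator=",", maxrows=999, skipfirstrow=False):
    """Yield (csvrow, columns) pairs for each row of CSV text.

    Hypothetical sketch of the documented behaviour, not the real
    implementation.
    """
    reader = csv.reader(io.StringIO(text), delimiter=separator)
    if skipfirstrow:
        next(reader, None)  # the skipped row is not counted in CSVROW
    for csvrow, row in enumerate(reader, start=1):
        if csvrow > maxrows:
            break  # assumption: reading stops at the MAXROWS limit
        yield csvrow, row[:MAX_COLUMNS]  # only the first 50 columns


sample = "id,name\n1,Ann\n2,Bob\n"
for csvrow, columns in for_each_csvrow(sample, skipfirstrow=True):
    print(csvrow, columns)
```

In this sketch the first data row is reported as row 1 even though it is the second line of the input, which matches the CSVROW numbering rule described above.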
CSV Parsing Options
Unlike some other file types, there are no standards or even universally accepted practices for the formatting of data in CSV files. This activity provides a PARSEOPTION parameter to permit you to strike the best balance between performance and the flexibility to handle a variety of commonly-used formats.
The PARSEOPTION parameter permits you to specify one of the values *SIMPLE, *STANDARD or *EXTENDED. These are further described below.
NOTE: if the SEPARATOR parameter specifies a separator other than comma or *TAB, then the value of the PARSEOPTION parameter will be disregarded. The *EXTENDED parsing option is always used in this case.
Parsing option: *SIMPLE
This parsing option offers the best performance. However:
(As further information for users familiar with LANSA development, the activity implements this option using the TRANSFORM_FILE built-in function with either 'O' (comma separator) or 'T' (tab separator) specified for the Input file format argument.)
Parsing option: *STANDARD
This parsing option offers a good balance between performance and flexibility.
However:
(As further information for users familiar with LANSA development, the activity implements this option using the TRANSFORM_FILE built-in function with either 'O' (comma separator) or 'T' (tab separator) specified for the Input file format argument, along with post-processing of the column values to handle quoted strings as described.)
Parsing option: *EXTENDED
This parsing option offers the most flexibility to handle a wide variety of CSV formatting cases.
The *EXTENDED option is the most functional, but it is also the slowest. For best performance, especially if you are expecting to process large amounts of data, you should make sure that you use the *SIMPLE or *STANDARD option unless your CSV case truly requires the additional functionality offered by the *EXTENDED option.
(As further information for users familiar with LANSA development, the TRANSFORM_FILE built-in function is NOT used in the implementation of this option. Instead, the parsing is implemented entirely in LANSA Composer code.)