# CSV Schema Settings

Schema Parser implements position-aware parsing of CSV files. Once the file is converted into a tabular model, each cell is assigned a unique address and its value can be retrieved using the cell(rowIndex, columnIndex) function. The schema parser reads rows and columns from top left to bottom right. The range of rows and columns processed is controlled with index and step arguments passed to the select function.

The select() function implements RFC 7111 selections using URI Fragment Identifiers, including row# , col# , and cell# with a custom extension controlling iteration step. See extended ABNF syntax here.

The cell where the cursor is located is called the active cell. Its value is obtained with the cell(row, col) function where row and col arguments represent indexes of the current row and column. Values of other cells can be obtained using absolute or relative references. For example, cell(1, col) refers to the cell located in the 1st row and the same column as the active cell. cell(row, col-1) refers to the cell located in the same row to the left of the active cell.

Value retrieved with the cell(rowIndex, columnIndex) function can be used to set metric, entity, timestamp, and tag fields to assemble series , property , or message commands from the value of the active cell and referenced cells in the header and lead columns. JavaScript expressions are supported for modifying and filtering cell values.

Schema Parser example:

Input File:

DateTime ; sensor01 ; Status ; sensor02 ; Status 2015 - 10 - 29T00 : 00 : 00Z ; 19.2 ; Provis ; 11.3 ; ok 2015 - 10 - 29T00 : 05 : 00Z ; 19.8 ; ok ; 12.9 ; ok

Schema:

select ( "#row=2-*!1" ) . select ( "#col=2-*!2" ) . addSeries ( ) . timestamp ( cell ( row , 1 ) ) . entity ( cell ( 1 , col ) ) . metric ( 'power_kwh' ) . tag ( 'status' , cell ( row , col + 1 ) . toLowerCase ( ) ) ;

Explanation:

select("#row=2-*") : RFC7111 selection. Read rows starting with 2nd row with step 1 > '2015-10-29T00:00:00Z; 19.2; provis; 11.3; ok' .

: selection. Read rows starting with 2nd row with step 1 > . select("#col=2-*!2") : RFC7111 selection. Read columns in the current row starting with 2nd column with step 2: 2,4,6. etc. > '19.2' .

: selection. Read columns in the current row starting with 2nd column with step 2: 2,4,6. etc. > . timestamp(cell(row, 1)) : Set time to '2015-10-29T00:00:00Z' which is the value of the cell located in the current row, 1st column.

: Set time to which is the value of the cell located in the current row, 1st column. entity(cell(1, col)) : Set entity to 'sensor01' which is value of cell located in the 1st row, current column.

: Set entity to which is value of cell located in the 1st row, current column. metric('power_kwh') : Set metric name to a predefined value.

: Set metric name to a predefined value. tag('status',cell(row, col+1).toLowerCase()) : Set tag status to 'provis' which is the lowercase value of the cell located in the current row to the right of the current column (col + 1) .

: Set tag to which is the lowercase value of the cell located in the current row to the right of the current column . Iterate to the next column with step 2, select("#col=2-*!2") , to cell '11.3' . Repeat chained functions after addSeries() .

Commands:

series e : sensor01 m : power_kwh = 19.2 d : 2015 - 10 - 29T00 : 00 : 00Z t : status = provis series e : sensor02 m : power_kwh = 11.3 d : 2015 - 10 - 29T00 : 00 : 00Z t : status = ok series e : sensor01 m : power_kwh = 19.8 d : 2015 - 10 - 29T00 : 05 : 00Z t : status = ok series e : sensor02 m : power_kwh = 12.9 d : 2015 - 10 - 29T00 : 05 : 00Z t : status = ok

If Schema parsing is enabled, only the following fields from the parser configuration are applied:

Delimiter

Line Delimiter

Text Qualifier

Comment Symbol

Padding Symbol

Decimal Separator

Grouping Separator

Fields Lengths

Date fields: Time Pattern, Offset, Time Zone

Replace Entities

Process Events

Discard NaN

Ignore Line Errors

Renamed Columns

Filter

# Schema Functions

# Select and Filter Functions

Name Required Description select(expression) Yes Selects rows, columns, or cell range to process using RFC 7111 selection syntax. filter(condition) No Optionally filter rows, columns and cells depending on rowText and cellText values, for example rowText.indexOf('test')>=0 .

# Initialize Command Functions

Name Required Description addSeries() No Create series command. addProperty() No Create property command. addMessage() No Create message command.

# Set Command Field Functions

Name Required ( addSeries ) Required ( addProperty ) Required ( addMessage ) Description entity(entityName) Yes Yes Yes Set entity name. timestamp(timestampValue) Yes Yes Yes Set timestamp. metric(metricName) Yes Unsupported Unsupported Set metric name. tag(tagName, tagValue) No No No Add tag with defined name and value. key(keyName, keyValue) Unsupported No Unsupported Add key with defined name and value. type(typeName) Unsupported Yes Unsupported Set property type. value(value) No Unsupported Unsupported Overrides series value (default value is current cell content). messageText(text) Unsupported Unsupported No Set message text. appendText(text, delimiter) Unsupported Unsupported No Append text to current message text. forEach(expression) No No No Accepts RFC 7111 #col= scheme, iterates over matched cells in the current row and applies chained-after functions to each cell, for example forEach('#col=5!2').tag(cell(row,col), cell(row,col+1));

Name Type Description col Integer Column index of the active cell. row Integer Row index of the active cell. value, cellText String Text content of the active cell. rowText String Current row full text. columnCount Integer Column count for current row. fileName String CSV file name being parsed, if available.

# Lookup Functions

Name Type Description cell(rowIndex, colIndex) String Returns content from the specified cell. notEmptyLeft(rowIndex, colIndex) String Finds a non-empty cell located to the left from the specified cell. notEmptyUp(rowIndex, colIndex) String Finds a non-empty cell located in the current or prior row in the specified column.

Row and column indexes begin with 1 .

. Row index of the active cell can be referenced with the row parameter.

parameter. Column index of the active cell can be referenced with the col parameter.

parameter. Relative index is specified with +/- , for example col+1 .

, for example . Row index can be smaller or equal to the index of the current row due to the streaming nature of the parser.

If index is not specified, the current index is used. Same as +0 or -0 .

# RFC 7111 Step Extension Syntax

ABNF Extension Syntax:

singlespec = position [ "-" position [ "!" step]] cellspec = cellrow "," cellcol [ "-" cellrow "," cellcol [ "!" steprow "," stepcol ] ] steprow = step stepcol = step step = number

RFC 7111 base syntax: https://tools.ietf.org/html/rfc7111#section-3

Examples:

#row=1-*!2 – Select odd rows.

– Select odd rows. #col=10-*!3 – Select every 3rd column starting with column 10.

– Select every 3rd column starting with column 10. #cell=1,2-5,*!1,2 – Select even columns in the first 5 rows.

# Schema-based Parser Examples