I needed to determine, first, if a certain character was present in a file's field, and then how many times it occurs. Using RPG's file input I would not be able to do this without reading and checking every record's field. Could this be done easier in SQL? Yes.

I am going to give examples of various ways of being able to tell if a certain character is in a field in a file, using increasing complex methods I found. At first I will just give the SQL Select statement for each, and at the end the SQLRPGLE code I created to do this in a program.

Regular readers of this blog will not be surprised that my data is held in a file called TESTFILE , and it contains just one field F1 . If I run a very simple Select statement over this file I can see what rows/records are in the file.

01 SELECT F1 FROM TESTFILE

it would be hard to come up with a simpler statement than that. When executed I see:

F1 SIMON EDWARD SUSAN SIMMON SIMOM MURRAY ---------M

I want to select only those rows/records that contain the letter M. By using the WHERE clause with a wild card I can select all of those rows/records.

01 SELECT F1 FROM TESTFILE 02 WHERE F1 LIKE '%M%'

Line 2: The wild card character in SQL is the percent symbol ( % ). When I place a percent symbol either side of the M it will return the rows/records that contains M anywhere in the field.

F1 SIMON SIMMON SIMOM MURRAY ---------M

When searching the first thing I found was how to determine the first position of the character in the field. I know this was not relevant to the task at hand, but I could resist trying it out.

The LOCATE scalar function returns the first place the search character is found in the column/field. The format for this scalar function is:

LOCATE(<string to search for>, <field or column>) <starting position>

In my example I want to return the first place an M is found in the field:

01 SELECT F1,LOCATE('M',F1) AS FIRST_POSITION 02 FROM TESTFILE 03 WHERE LOCATE('M',F1) > 0

Line 1: I am defining a LOCATE as one of the columns in the returned results, and giving that column a meaningful name. I do not need a starting position here.

Line 3: This time I do need to define a start position for the scalar function. Here I am checking for anywhere in the field, i.e. greater than zero.

The results are:

F1 FIRST_POSITION SIMON 3 SIMMON 3 SIMOM 3 MURRAY 1 ---------M 10

If I change the WHERE to only look in the third position of the file:

01 SELECT F1,LOCATE('M',F1) AS FIRST_POSITION 02 FROM TESTFILE 03 WHERE LOCATE('M',F1) = 3

The results are:

F1 FIRST_POSITION SIMON 3 SIMMON 3 SIMOM 3

What if I change the starting position of the search to greater than three.

01 SELECT F1,LOCATE('M',F1) AS FIRST_POSITION 02 FROM TESTFILE 03 WHERE LOCATE('M',F1) > 3

The results are:

F1 FIRST_POSITION ---------M 10

I have to admit the first time I saw the result I was surprised, until I remembered only rows/records would be returned if the first occurrence of the character in the column/field is in a position greater than three. This row/record was returned as it is the only one where the first M occurs in a place greater than the third.

Returning to task at hand, finding the rows/records with my desired character and the number of times that character occurs in the field. I am going to show the Select statement I came up with and then describe scalar functions I used.

01 SELECT LENGTH(F1) - LENGTH(REPLACE(F1,'M','')),F1 02 LENGTH(F1),LENGTH(REPLACE(F1,'M','')) 03 FROM TESTFILE 04 WHERE LENGTH(F1) - LENGTH(REPLACE(F1,'M','')) > 0

Let me start with the REPLACE . As its name suggests it replaces one character in a field with another.

REPLACE(<column or field>, <character>, <replacement character>)

In my Select statement I am replace ever occurrence of M with null. Null is indicated by the two apostrophes ( ' ) without anything between them.

The first LENGTH functions returns the length of the field F1 , 10. The second LENGTH will return a value of less than 10 as the Ms have replaced by nulls decreasing the length of F1 . I know it sounds a bit strange, which is why I included these two values as columns in the results, line 2.

Line 4: I am only going to return any rows/records where the difference between the length of F1 and the value of the length when the Ms have been replaced by null is greater than zero. The more Ms there are the greater the difference, which, of course, corresponds to the number of Ms.

Numeric Expression F1 LENGTH ( F1 ) LENGTH 1 SIMON 10 9 2 SIMMON 10 8 2 SIMOM 10 8 1 MURRAY 10 9 1 ---------M 10 9

Now I have the SQL statement how can I put it in a SQLRPGLE program?

In this program I am going to make the character that is selected and counted be a variable. This could be passed to a program, subprocedure, etc. I am just going to show this code in its simplest form.

01 **free 02 dcl-s Char char(1) inz('M') ; 03 dcl-s SQL char(300) ; 04 exec sql SET OPTION COMMIT = *NONE ; 05 exec sql DROP TABLE QTEMP.OUTPUT ; 06 SQL = 'CREATE TABLE QTEMP.OUTPUT AS + 07 (SELECT F1, + 08 LENGTH(F1) - LENGTH(REPLACE(F1,''' + Char + 09 ''','''')) + 10 AS NBR_OF_CHAR + 11 FROM TESTFILE + 12 WHERE LENGTH(F1) - LENGTH(REPLACE(F1,''' + Char + 13 ''','''')) > 0) + 14 WITH DATA ' ; 15 exec sql EXECUTE IMMEDIATE :SQL ; 16 *inlr = *on ;

Line 1: As this is on an IBM i running 7.3 why would I not use totally free RPG.

Line 2: This is the definition of the variable I will be using for the search character.

Line 3: This variable will be used to contain the string for the SQL Select statement.

Line 4: I do not want to have commitment control on the file I will be creating. By using the SET OPTION it is turned off.

Line 5: I always like to drop/delete, or try to drop/delete, the output table/file before I create it.

Lines 6 – 14: This is the SQL statement that uses the Select Statement I developed before, to generate a table/file in QTEMP.

Line 8 and 9: Yes, lots of apostrophes. Three either side of the variable Char , and two either side of the null value. I have to do the same on lines 12 and 13.

line 14: Need the WITH DATA otherwise my table/file will be empty.

If I use debug and look at the value in SQL I can see that all the apostrophes are all aligned correctly.

SQL = ....5...10...15...20...25...30...35...40...45...50...55...60 1 'CREATE TABLE QTEMP.OUTPUT AS (SELECT F1, LENGTH(F1) - LENGTH' 61 '(REPLACE(F1,'M','')) AS NBR_OF_CHAR FROM TESTFILE WHERE LENG' 121 'TH(F1) - LENGTH(REPLACE(F1,'M','')) > 0) WITH DATA ' 181 ' '

Line 15: I am using the EXECUTE IMMEDIATE to execute the SQL command string in the SQL variable.

When the program is compiled and run it generates a table/file in QTEMP, that contains the following:

F1 NBR_OF_CHAR SIMON 1 SIMMON 2 SIMOM 2 MURRAY 1 ---------M 1

You can learn more about these scalar functions on the IBM website page about SQL scalar functions here.

This article was written for IBM i 7.3, and should work for earlier releases too.