



T-SQL Function to Count Number of Specific Character In a String or Text Considering Database Collation

If you need to count the number of times a character occurs in a string or text by using t-sql, you can use the REPLACE string function with LEN string function.

During my sql development, I realized how collation effects the results of counting specific characters in a given especially nvarchar() string variables or parameters.

You can find below a t-sql function which you can use to count the occurence or number of specific characters in a given text parameter.









Here is the source codes of the sql user-defined function which counts the number of occurences of a specific character in a given text or string variable taking the collation of the data into consideration.

If the collation parameter is set as NULL or is not given, the function will make the replacement in t-sql codes using the default database collation.

If the collation is specified as an input parameter to the user defined t-sql function, then the replacement process will be done according to that information.

----------------------------------------

--- CountNumberCharacterInString

--- T-SQL Function that returns

--- number of count of a specific character

--- in an input string or text

----------------------------------------

CREATE FUNCTION CountNumberCharacterInString

(

@string nvarchar(max),

@character nchar(1),

@collation varchar(100)

)

RETURNS int

BEGIN



DECLARE @count int



IF @collation IS NULL

SELECT

@count = LEN(@string) - LEN(

REPLACE(

@string,

@character,

''

)

)

ELSE IF @collation = N'Turkish_CI_AS'

SELECT

@count = LEN(@string) - LEN(

REPLACE(

@string COLLATE Turkish_CI_AS,

@character COLLATE Turkish_CI_AS,

''

)

)



RETURN @count



END



GO

You can see how important is the collation parameter especially for other than English like Turkish character sets.

In Turkish there are two different letters, letter "I" or "ı" and letter "İ" or "i".

This will cause problem or results into wrong solutions if you replace characters İ,i,I or ı in a given input string variable.

declare @input_string as nvarchar(50)

set @input_string = N'AIıaIiİaA'



select

dbo.CountNumberCharacterInString(@input_string, N'I', NULL),

dbo.CountNumberCharacterInString(@input_string, N'ı', NULL),

dbo.CountNumberCharacterInString(@input_string, N'İ', NULL),

dbo.CountNumberCharacterInString(@input_string, N'i', NULL)

select

dbo.CountNumberCharacterInString(@input_string, N'I', 'Turkish_CI_AS'),

dbo.CountNumberCharacterInString(@input_string, N'ı', 'Turkish_CI_AS'),

dbo.CountNumberCharacterInString(@input_string, N'İ', 'Turkish_CI_AS'),

dbo.CountNumberCharacterInString(@input_string, N'i', 'Turkish_CI_AS')

And the results of the above t-sql function calls results with the below returned data rows

I hope you have seen how the collation effects the count the number of specific characters in a string by using t-sql methods.

If you carefully check the t-sql codes of the udf (user-defined function), you will see that you have to code the conditions for each possible collation using If statements for example.

The reason for that is you can not create dynamic sql codes and execute the resultant t-sql statement within a user defined function.

But you can convert the t-sql user defined function into a sql stored procedure, and build your sql statement dynamically and execute the resultant sql statement codes using EXEC sp_ExecuteSQL.

If you build dynamic sql codes in a user defined function you will probably get the following error messages: Msg 557, Level 16, State 2, Line 1

Only functions and extended stored procedures can be executed from within a function.

Here is the t-sql codes for creating a sql proc which will return the number of occurences of a character in a given string parameter.

----------------------------------------

--- CountNumberCharacterInString

--- T-SQL Procedure that returns

--- number of count of a specific character

--- in an input string or text

----------------------------------------

CREATE PROCEDURE spCountNumberCharacterInString

(

@searchin nvarchar(max),

@searchfor nvarchar(max),

@collation varchar(100) = NULL,

@count int OUTPUT

)

AS



DECLARE @sql nvarchar(max), @ParmDefinition nvarchar(max)

SET @sql = '

SELECT

@count_p = (LEN(@searchin_p) - LEN(

REPLACE(

@searchin_p' + ISNULL(' COLLATE ' + @collation, '') + ',

@searchfor_p' + ISNULL(' COLLATE ' + @collation, '') + ',

''''

)

)) / LEN(@searchfor_p)'

SET @ParmDefinition = N'

@searchin_p nvarchar(max),

@searchfor_p nvarchar(max),

@count_p int OUTPUT'



EXEC sp_ExecuteSQL @sql, @ParmDefinition, @searchin_p = @searchin, @searchfor_p = @searchfor, @count_p = @count OUTPUT



GO





declare @count int

exec spCountNumberCharacterInString N'İiiİ', N'İI', N'Turkish_CI_AS', @count OUTPUT

select @count











Tweet

Related SQL Resources



