T-SQL Ninja #40

STRING_SPLIT

What does STRING_SPLIT do?

Over the weeks you’ve gotten to know some possibilities to create comma-separated lists in SQL, besides STUFF in combination with SELECT FOR XML and STRING_AGG also CONCAT_WS. This week we want to go the opposite way and split a string containing comma-separated values into its constituent parts. Since SQL Server 2016 there is a way to do this in the Microsoft relational database engine with STRING_SPLIT.

How to use STRING_SPLIT

Until SQL Server 2016, if you wanted to split strings, you usually had to write a custom function that would either operate in a WHILE loop using CHARINDEX and SUBSTRING or convert the list to an XML object that could be queried, as follows

SELECT item = y.i.value('(./text())[1]', 'nvarchar(4000)')
FROM 
( 
    SELECT x = CONVERT(XML, '' 
            + REPLACE('Akio,Hiraku,Kazuo', ',', '') 
            + '').query('.')
) AS a CROSS APPLY x.nodes('i') AS y(i)

Meanwhile this has become much easier, because SQL Server offers with STRING_SPLIT a method to split a comma-separated list into its parts without further pull-ups. The STRING_SPLIT function returns a table with exactly one column called “value” containing the individual components of the string. To see the components of the string, you have to query the “value” column from this return value:

SELECT [value] FROM STRING_SPLIT('Akio,Hiraku,Kazuo', ',')  

It is important to note that unlike, for example, the REPLACE function, the STRING_SPLIT function only accepts a string with a length of one as the second argument, i.e. as a separator. It is therefore not possible to separate an HTML string, for example, by 
 tags. But what happens if your string contains, for example, two commas directly after each other?

SELECT [value] FROM STRING_SPLIT('Akio,Hiraku,,Kazuo', ',') 

In this case, your return table contains a row with an empty string, but no NULL value. The same happens to you if the string ends with a comma:

SELECT [value] FROM STRING_SPLIT('Akio,Hiraku,,Kazuo,', ',')  

If you want to stop these empty returns, you have to filter them using a WHERE condition, which is possible, of course, since you are doing a query on the return value. To do this, we first have the length of each part output:

SELECT 
     [value]
    ,LEN(TRIM([value])) AS [Length] 
FROM STRING_SPLIT('Akio,Hiraku,,Kazuo,', ',')  

Of course you can now filter the result set to this length:

SELECT 
     [value]
    ,LEN(TRIM([value])) AS [Length] 
FROM STRING_SPLIT('Akio,Hiraku,,Kazuo,', ',')  
WHERE LEN(TRIM([value])) > 0

If you now use this in combination with STRING_AGG, you can, for example, clean up a comma-separated list and filter the empty values and the commas at the end:

SELECT 
    STRING_AGG([value], ',')
FROM STRING_SPLIT('Akio,Hiraku,,Kazuo,', ',')  
WHERE LEN(TRIM([value])) > 0

After you have learned many methods to connect single strings to strings in the last weeks, you have now learned the opposite function to split a string back into its parts. Use this katana to split your comma-separated strings properly in the future without ugly custom functions.

References

Ninja-Notebooks @ GitHub

Leave a Reply

Your email address will not be published. Required fields are marked *