How To Extract Serialized Data From Strings In One Go In T-SQL
Consider the following data:
DECLARE @OrderString VARCHAR(MAX) = '1=1;|2=|3=|4=1;|5=|';
This data was being returned in an externally generated XML file, so out of the control of the developer.
Each element of information is delimited by a pipe |. Within each element, the first number, to the left the equal sign =, represented an Item Number. The second number, to the right of the equal sign, represented an Order Quantity. This could be any number, and if empty, it meant zero.
The developer needed to return this data in form of a table. He also needed to return only the rows with an Item Quantity greater than zero.
In T-SQL, this conversion can be done in one go, using a single query. Let’s go through the steps.
Remove Unnecessary Characters
First of all, if take a look at the source data, we can clearly see an oddity there. The semi-colon character seems to appear only after and when there is a declared Item Value. This is extra information we don’t need, as we already have each block delimited by the pipe character. So let’s take this out of the equation:
WITH Step0 AS
(
SELECT
String = REPLACE(@OrderString, ';', '')
),
Identify Elements
Now, to make our lives easier, let’s split each element, or, “block” into a separate row. This can be accomplished with a good old recursive CTE:
Step1 AS
(
SELECT
Block = SUBSTRING(String, 1, CHARINDEX('|', String) - 1),
String = SUBSTRING(String, CHARINDEX('|', String) + 1, LEN(String))
FROM
Step0
UNION ALL
SELECT
Block = SUBSTRING(String, 1, CHARINDEX('|', String) - 1),
String = SUBSTRING(String, CHARINDEX('|', String) + 1, LEN(String))
FROM
Step1
WHERE
LEN(String) > 0
),
Identify Columns
Now that we have a row identified for each block, we can then extract the columns themselves:
Step2 AS
(
SELECT
ItemNumber = SUBSTRING(Block, 1, CHARINDEX('=', Block) - 1),
ItemValue = SUBSTRING(Block, CHARINDEX('=', Block) + 1, LEN(Block))
FROM
Step1
),
Convert & Filter
Finally, now with proper columns in place, we can filter out the zero quantity items:
Step3 AS
(
SELECT
ItemNumber = CAST(ItemNumber AS INT),
ItemValue = CAST(ItemValue AS INT)
FROM
Step2
WHERE
CAST(ItemValue AS INT) > 0
)
SELECT
*
FROM
Step3;
And that was it. Nothing to it.
Full Code
DECLARE @OrderString VARCHAR(MAX) = '1=1;|2=|3=|4=1;|5=|';
WITH Step0 AS
(
SELECT
String = REPLACE(@OrderString, ';', '')
),
Step1 AS
(
SELECT
Block = SUBSTRING(String, 1, CHARINDEX('|', String) - 1),
String = SUBSTRING(String, CHARINDEX('|', String) + 1, LEN(String))
FROM
Step0
UNION ALL
SELECT
Block = SUBSTRING(String, 1, CHARINDEX('|', String) - 1),
String = SUBSTRING(String, CHARINDEX('|', String) + 1, LEN(String))
FROM
Step1
WHERE
LEN(String) > 0
),
Step2 AS
(
SELECT
ItemNumber = SUBSTRING(Block, 1, CHARINDEX('=', Block) - 1),
ItemValue = SUBSTRING(Block, CHARINDEX('=', Block) + 1, LEN(Block))
FROM
Step1
),
Step3 AS
(
SELECT
ItemNumber = CAST(ItemNumber AS INT),
ItemValue = CAST(ItemValue AS INT)
FROM
Step2
WHERE
CAST(ItemValue AS INT) > 0
)
SELECT
*
FROM
Step3;