Technology Temerity

Encapsulated XML ID List Parsing – In Progress

SQL

Introduction

As part of my ongoing MSSQL Versioning Project, it is often times necessary to pass a list of records to stored procedures, either from the controlling application or from a calling procedure. By use of standardized table design, you will normally only need to pass a list of primary keys. The procedure can query for any other information it needs from that point forward, making it fully self contained more maximum encapsulation.

For the list itself there are several options I know of, but only one is fully viable for my own needs. Your mileage may vary of course:

  1. Delimited List: On the surface this is the simplest of means. Just slap together a comma delimited string (“x, y, z”), break it down at the database console and off you go. If only it were actually that simple, and even if it was there’s not a chance on Earth I’m doing that. Neither should you. This is breaking the law of First Normal Form, something you never want to do for reasons well beyond the scope of this article. If you are curious, several (but by no means all) of the pitfalls of this approach are explained quite nicely here.
  2. Table Valued Parameters: TVPs are extremely powerful when used correctly and have their place, but for purposes of acting as lists or caches in a stored procedure, they have two serious drawbacks.
    1. TVPs are assumed by the query optimizer to contain a single row. This is of little to no consequence if only a few records are involved, but it can be disastrous once a threshold of ~100 is reached. Queries with execution times normally in the millisecond range may suddenly balloon into resource hogs requiring several minutes to complete.
    2. It’s rather unusual, but our environment is WIMP (Windows, IIS, MSSQL, PHP). Connectivity is provided by the sqlsrv driver, with an object oriented wrapper penned by yours truly. Unfortunately at time of writing, this otherwise excellent driver set does not support TVP.
  3. Common Language Runtime: Lots of fun to be had here, but like TVPs, it depends on a very specific environment. Otherwise it simply isn’t applicable. Even when it is, the realities of business often mean taking advantage of CLR adds layers of extra management and time wasted on service requests for the simplest of modifications. No thank you.
  4. XML: This is my method of choice. It’s reasonably fast, only giving up speed to CLR and TVP, and eventually surpassing the later as the number of records increases. It’s also T-SQL compliant, and thus quite portable. The downside is there’s more of a learning curve and you’ll want to design carefully to avoid huge strings. Let’s have a closer look at how…

Considerations

  • Efficiency: We want the XML itself and parser to be compact and fast as possible.
  • Scaling: The design should be solid and not break down under a heavy load.
  • Reuse. We need to encapsulate and standardize our code. It won’t do much good if every query or procedure requires an inline rewrite of the XML parsing.

Implementation

There are three basic scenarios where we will need to parse a list of IDs via XML.

Procedure A executes Procedure B, Sending List of IDs

This will be a common occurrence – in my MSSQL versioning design every procedure that updates data must first update the Master Table. Should sub data be involved, then multiple updates to Master table must first take place – one for each row of sub data updated. First Procedure A will establish the list of records to update as a temp table, as in the following example:

id created update_by update_ip active
1 2016-12-28 115 128.163.237.37 False
2 2016-12-28 115 128.163.237.37 False
3 2016-12-28 115 128.163.237.37 False

Once the table is ready, this query is run against it:

SET @xml_string_var = (SELECT id FROM <temp table of items to update> FOR XML RAW, ROOT)

The variable @xml_string_var will be populated with an XML string as follows. Note <root> and row. These are default outputs that we could change these by modifying our SQL above, but I prefer to leave them be. Since this little bit of SQL will be in nearly every data update procedure, let’s keep it simple and reusable as possible.

<root>
<row id=”1″ />
<row id=”2″ />
<row id=”3″ />
</root>

We can now execute Procedure B passing @xml_string_var as an XML string argument.

Procedure B Receives XML From Procedure A

Upon execution, Procedure B will need to break the XML back down into a table. Rather thanĀ Procedure B breaking the XML down inline, let’s outsource the work. We could do this with a stored procedure, but the moment we executed a procedure that in turn executed our XML parser, we would run smack into the irritating limitation of nested executions. For those unfamiliar, MSSQL 2008 and below simply do not allow nested stored procedure execution. Any attempted to do so will produce the following error:

Msg 8164, Level 16, State 1, Procedure <procedure_name>, Line <line> An INSERT EXEC statement cannot be nested.

In short, encapsulation as a stored procedure just won’t work. That really just leaves user defined functions. I personally loathe them for a lot of different reasons. They appeal to the programmer in me, but in SQL tend to cause more trouble than they’re worth. Still, if we want to encapsulate the XML parsing (and we DO), a table valued function is the best way to go. We’ll call it tvf_get_id_list:

-- tvf_get_id_list
-- Caskey, Damon V.
-- 2017-01-25
-- Returns recordset of IDs from xml list
-- 
-- <root>
--		<row id="INT" />
--		... 
-- </root>
			

CREATE FUNCTION tvf_get_id_list (@param_id_list xml)
RETURNS TABLE AS
RETURN (SELECT x.y.value('.','int') AS id	
			FROM @param_id_list.nodes('root/row/@id') AS x(y))

Procedure B will call tvf_get_id_list, passing along the XML. The tvf_get_id_list will break the XML down and produce a record-set of IDs, which we can then insert into temp table:

id
1
2
3

Procedure B will now have a access to record set of IDs that it can use to perform whatever work we need done.

As you can see, the XMl parsing work is fairly simple – we specifically planned the XML markup for easy break down. Even so encapsulating the XML work out to a separate function gives us a couple of advantages over just pasting the XML parsing code inline.

  • Obviously we will use the fastest and best scaled technique for breaking down the XML (see here for examples), but should even better techniques be developed, we only need to modify this one function.

Procedure B and any other procedures that we send our XML list to are simpler and more compact. They need only call tvf_get_id_list to break down the XML list into a usable record set.

Procedure Called By Control Code

This is more or less identical to procedures executing each other, except the procedures are being called by application code. In this case, it is the application’s responsibility to send XML formatted as above. The simplicity of XML makes this rather easy, and the parsing code can be made part of a class file.

foreach($this->id as $key => $id)
{	
					
	if($id == NULL) $id = DB_DEFAULTS::NEW_ID;
							
	$result .= '<row id="'.$id.'">';									
}

Author: Damon Caskey

Hello all, Damon Caskey here - the esteemed owner of this little slice of cyberspace. Welcome!

Leave a Reply