Monday, April 19, 2010

Selecting Distinct Node Values from an eConnect XML Document

I'm still working on a long project that involves using eConnect to import millions of transactions into Dynamics GP. Yes, millions. Over 7 million transactions so far, and many more to go before I sleep.

In this case, it was easier for the client to provide the data in XML format, so I'm importing the transactions by sending the XML directly to eConnect. This has had some benefits, but also posed several challenges.

One challenge is that the client is even including the Batch Number in the XML data. Normally this wouldn't be an issue, but as part of another requirement, I need to know every batch number that was created by each XML file.

To do this, I would like to use XPath to query the XML file and provide me with a distinct list of batch numbers. There are other techniques, but at the moment I believe that XPath is the simplest approach.

This was a bit trickier and more obscure than I realized, and although I am a big fan of XPath, I'm a bit rusty with complex queries, so this was definitely outside of my comfort zone. The tricky part is that there are many different ways to get a distinct list of nodes or node values, and your XML structure and the specific distinct values you need will dictate which approach and syntax you use.

I happen to be working on the POP Receipt import at the moment, so I'll use it's XML as an example. I basically need distinct values for the following node:

eConnect/POPReceivingsType/taPopRcptHdrInsert/BACHNUMB

Or:

//taPopRcptHdrInsert/BACHNUMB

To figure out how to do this, I dug through several similar requests on Experts-Exchange.com and discovered an answer that led me to create the following XPath query:

//taPopRcptHdrInsert[not(BACHNUMB = preceding::taPopRcptHdrInsert/BACHNUMB)]/BACHNUMB

This compact little gem scans the entire XML file (which in my case has thousands of transactions) and returns a tidy node list of distinct batch numbers.

To use it, I just use a SelectNodes statement:

batchNodes = popXML.SelectNodes("//taPopRcptHdrInsert[not(BACHNUMB = preceding::taPopRcptHdrInsert/BACHNUMB)]/BACHNUMB")

Is that cool, or what? (the inner geek in you screams "YES!")

You could use a similar XPath query to find a unique list of customer IDs, vendor IDs, transaction numbers, item numbers, you name it.

Full credit goes to the geniuses that invented XPath, and the clever folks on Experts Exchange that showed me how to use it.

No comments: