
Tim Ewald on XML pipelining in .NET

Pipeline XML processing in .NET with the WSE

Interesting article by Tim Ewald. It starts out by describing a filter model for munging SOAP envelopes. Filters accept envelopes as their argument:

SoapEnvelope env = new SoapEnvelope();
XmlElement body = env.CreateBody();
TimestampOutputFilter tsOutput = new TimestampOutputFilter();
// Hand the envelope to the filter, which modifies it in place
tsOutput.ProcessMessage(env);

If you're in Java, you'll see it works a bit like Servlet filters except that what's being passed around are SOAP Envelope Infosets (I'll get back to this point).

Web Services Enhancements (WSE) ships with 10 filters. You can compose these filters, and your own, into ordered collections, called pipelines:

SoapInputFilterCollection inputFilters =
    new SoapInputFilterCollection();
SoapOutputFilterCollection outputFilters =
    new SoapOutputFilterCollection();
outputFilters.Add(new TraceOutputFilter());
outputFilters.Add(new TimestampOutputFilter());
SoapEnvelope env = new SoapEnvelope();
XmlElement body = env.CreateBody();
Pipeline pipe = new Pipeline(inputFilters, outputFilters);
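
To make the composition idea concrete without depending on the WSE assemblies, here is a minimal sketch of an ordered filter collection. Everything in it (IEnvelopeFilter, TagFilter, SimplePipeline, PipelineDemo) is a hypothetical stand-in I made up for illustration, not a WSE type, and a plain XmlDocument plays the part of the envelope. The thing to notice is that every filter receives the same object reference and mutates it in place, in the order the filters were added.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

// Hypothetical stand-in for a WSE output filter: it receives the envelope
// as an in-memory XML tree and modifies it in place.
public interface IEnvelopeFilter
{
    void ProcessMessage(XmlDocument envelope);
}

// Illustrative filter: appends a <Trace> child naming the filter that ran.
public sealed class TagFilter : IEnvelopeFilter
{
    private readonly string name;

    public TagFilter(string name) { this.name = name; }

    public void ProcessMessage(XmlDocument envelope)
    {
        XmlElement mark = envelope.CreateElement("Trace");
        mark.InnerText = name;
        envelope.DocumentElement.AppendChild(mark);
    }
}

// An ordered collection of filters. Each filter sees the same reference,
// so nothing is copied or serialized between steps.
public sealed class SimplePipeline
{
    private readonly List<IEnvelopeFilter> filters = new List<IEnvelopeFilter>();

    public SimplePipeline Add(IEnvelopeFilter filter)
    {
        filters.Add(filter);
        return this;
    }

    public void Process(XmlDocument envelope)
    {
        foreach (IEnvelopeFilter filter in filters)
        {
            filter.ProcessMessage(envelope);
        }
    }
}

public static class PipelineDemo
{
    // Builds a two-filter pipeline, runs it over an empty envelope,
    // and returns the resulting markup.
    public static string Run()
    {
        XmlDocument envelope = new XmlDocument();
        envelope.LoadXml("<Envelope/>");
        new SimplePipeline()
            .Add(new TagFilter("first"))
            .Add(new TagFilter("second"))
            .Process(envelope);
        return envelope.OuterXml;
    }
}
```

Calling PipelineDemo.Run() returns the envelope markup with one Trace element per filter, in the order the filters were added.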

WSE has the right approach, but could do with some refinements (again for those in Java, Axis is also architected on a pipelined model). One criticism of WSE is that it passes around an Infoset instead of the raw XML (strings or streams), but this is consistent with the MS philosophy for XML processing - don't process the XML, process the bound data. Technically, this produces one difference from the classic pipes and filters approach, where the things being piped, in this case SOAP envelopes, are passed through the components of a pipe. With WSE looking like it's passing around envelope references, it's more like the pipeline is moving over the envelope.

Two things appear to be missing in WSE for high throughput pipelining. First is queuing of documents between filters - if you're not careful, one overloaded filter (for example one that's reading or writing a disk resource or a DB) might propagate a blockage upstream, possibly as far back as causing the webserver to stop accepting requests. The second comes from my experience building uber-scalable filtering intermediaries for HTTP and XML messaging. At some point, a developer will want to be able to drop down to the document, character or even byte/stream level to get things done. As long as you're dealing only with Infosets/DOM, that facility is locked out of the API - and herein is the downside of making developers access XML through object models.
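
Both missing pieces can be sketched in a few lines, again without WSE. This is only a rough illustration under my own assumptions: QueuedStages, StripBom and AppendNewline are hypothetical names, and the two byte-level transforms are arbitrary examples of work you can't do through an object model. Each stage runs on its own task, exchanges raw byte arrays rather than a DOM, and is connected to the next stage by a bounded BlockingCollection.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class QueuedStages
{
    // Byte-level work: drop a leading UTF-8 byte order mark if present.
    public static byte[] StripBom(byte[] data)
    {
        bool hasBom = data.Length >= 3
            && data[0] == 0xEF && data[1] == 0xBB && data[2] == 0xBF;
        if (!hasBom) return data;
        byte[] trimmed = new byte[data.Length - 3];
        Array.Copy(data, 3, trimmed, 0, trimmed.Length);
        return trimmed;
    }

    // Byte-level work: terminate each document with a newline byte.
    public static byte[] AppendNewline(byte[] data)
    {
        byte[] result = new byte[data.Length + 1];
        Array.Copy(data, result, data.Length);
        result[data.Length] = (byte)'\n';
        return result;
    }

    // Runs one stage on its own task: take from 'input', transform,
    // put on 'output'. Add blocks only when the output queue is full.
    private static Task StartStage(
        BlockingCollection<byte[]> input,
        BlockingCollection<byte[]> output,
        Func<byte[], byte[]> transform)
    {
        return Task.Run(() =>
        {
            try
            {
                foreach (byte[] message in input.GetConsumingEnumerable())
                {
                    output.Add(transform(message));
                }
            }
            finally
            {
                // Tell the downstream consumer no more messages are coming.
                output.CompleteAdding();
            }
        });
    }

    // Pushes documents through two queued stages and collects the results.
    // 'capacity' bounds how many documents may wait between stages.
    public static List<byte[]> Run(IEnumerable<byte[]> documents, int capacity)
    {
        var toStrip = new BlockingCollection<byte[]>(capacity);
        var toTerminate = new BlockingCollection<byte[]>(capacity);
        var finished = new BlockingCollection<byte[]>(); // unbounded sink

        Task strip = StartStage(toStrip, toTerminate, StripBom);
        Task terminate = StartStage(toTerminate, finished, AppendNewline);

        foreach (byte[] document in documents)
        {
            toStrip.Add(document);
        }
        toStrip.CompleteAdding();
        Task.WaitAll(strip, terminate);

        return new List<byte[]>(finished.GetConsumingEnumerable());
    }
}
```

A bounded queue doesn't make a slow stage faster. It absorbs bursts, and once it fills up the blocking still travels upstream, but at a point you control. A front end that would rather shed load than stall could call TryAdd with a timeout instead of Add and reject the request when the queue is full. Error handling is left out of the sketch.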

January 3, 2003 07:48 PM

