Saturday, 16 March 2013

Data Integration with XSLT and Java XML Web Services

Introduction

Due to the rapid growth of the Web, social media, and online shopping, the volume of data held by internet businesses is increasing quickly. A large part of this data is XML. XSLT, alongside XQuery, SAX, StAX, and similar technologies, is a technology of choice when integrating XML data. XSLT is often considered to perform poorly compared to technologies that keep very little information in memory, such as SAX. Yet the expressiveness of XSLT and its design as an XML transformation language still make it superior to less declarative approaches such as XML programming APIs for general-purpose programming languages. This blog post shows how some performance problems of XML transformations can be overcome by calling XML web services from XSLT.

In my own experience, the performance of XSLT transformations that access large XML documents can be improved by a factor of 10 by having a web service supply fragments of those large documents. Storing large documents in a database with appropriate indexes, or keeping them in memory as a Map, gives logarithmic access time to those fragments (a hash-based Map typically does even better). Without such an index, the XSLT processor has to search the entire document, which takes linear time.
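The difference between the two access patterns can be sketched in a few lines of Java. This is only an illustration under assumptions: the class name `FragmentLookup`, the ids, and the sample fragments are made up for this example, and a `TreeMap` stands in for whatever index a real database or cache would provide.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration: names and data are assumptions, not a real system.
class FragmentLookup {

    // One fragment of a large document, identified by an id.
    static final class Fragment {
        final String id;
        final String xml;

        Fragment(String id, String xml) {
            this.id = id;
            this.xml = xml;
        }
    }

    // Linear scan: every fragment may have to be inspected for one lookup.
    static String scan(List<Fragment> all, String id) {
        for (Fragment f : all) {
            if (f.id.equals(id)) {
                return f.xml;
            }
        }
        return null;
    }

    // Build a sorted index once; later lookups take logarithmic time.
    static Map<String, String> index(List<Fragment> all) {
        Map<String, String> byId = new TreeMap<>();
        for (Fragment f : all) {
            byId.put(f.id, f.xml);
        }
        return byId;
    }

    public static void main(String[] args) {
        List<Fragment> all = List.of(
            new Fragment("1111", "<Person><name>Some Body</name></Person>"),
            new Fragment("2222", "<Person><name>Any Body</name></Person>"));
        Map<String, String> byId = index(all);
        System.out.println(scan(all, "2222"));   // walks the list
        System.out.println(byId.get("2222"));    // uses the index
    }
}
```

Both calls return the same fragment; only the cost of finding it differs, and that difference grows with the size of the document.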

Calling XML Web Services from XSLT

One of the weak points of XSLT is that it cannot directly access databases, although some proprietary extensions supply this functionality. As a result, XSL transformations often hold large XML documents in memory, and joins between elements of different XML sources must be computed without access to indexes or hash maps. What is often overlooked is that the XSLT document() function can read XML data not only from disk or from static XML documents, but can also call Web services, which in turn can access databases and return the retrieved data to the XSL transformation. Here is a simple example:

<xsl:stylesheet version="2.0" 
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <xsl:template match="/">
      <xsl:copy-of 
        select="document(
          'http://myserivce.example.com/get?id=1111'
        )" />
    </xsl:template>

</xsl:stylesheet>

The above XSLT calls an XML Web service at the given URI. This web service can do anything that the programming language it is written in can do. In particular, it can access databases and return XML fragments, which are included directly in the XML result.
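To try the stylesheet above without a running service, it can be driven from Java through the standard `javax.xml.transform` API. The following is a sketch under assumptions: the class name `DocumentCallDemo` and the canned reply are invented for this example, and a `URIResolver` replaces the real HTTP call so that the code runs offline. The stylesheet is declared as version 1.0 here because the processor bundled with the JDK implements XSLT 1.0.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Sketch only: the resolver below replaces the real HTTP call with a canned reply.
class DocumentCallDemo {

    // Same idea as the stylesheet above, written as XSLT 1.0.
    static final String STYLESHEET =
        "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:template match='/'>"
        + "<xsl:copy-of select=\"document("
        + "'http://myserivce.example.com/get?id=1111')\"/>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Assumed response of the (hypothetical) service for id=1111.
    static final String CANNED_REPLY =
        "<Person><name>Some Body</name><age>4</age></Person>";

    static String transform(String inputXml) throws TransformerException {
        Transformer t = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        // Consulted for URIs passed to document(); a real setup would omit
        // this and let the processor fetch the URI over HTTP.
        t.setURIResolver((href, base) ->
            new StreamSource(new StringReader(CANNED_REPLY), href));
        StringWriter out = new StringWriter();
        t.transform(new StreamSource(new StringReader(inputXml)),
                    new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        // The input document is ignored by the stylesheet's single template.
        System.out.println(transform("<ignored/>"));
    }
}
```

The resolver is also a convenient place for tests or caching, since the stylesheet itself does not need to change when the source of the fragments does.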

XML Web Services with Java and JAXB

Java is possibly the most frequently used language for providing XML Web services. This section briefly describes how to use JAXB to turn a simple Java servlet into an XML Web service. Using an XML technology such as JAXB to generate the XML makes sense because it works at a higher level than assembling XML by hand in Java, and is thus less error-prone. This is achieved in three small steps:

  1. Create a Java class (entity) that holds the data to be serialized,
  2. insert the appropriate annotations for generating the XML tags for the properties, and
  3. serialize that class using a JAXB context to the servlet response.

Here is an example class annotated with JAXB annotations (steps 1 and 2) that could be used by the Web service:


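import javax.xml.bind.annotation.XmlAccessType;
import javax.xml.bind.annotation.XmlAccessorType;
import javax.xml.bind.annotation.XmlElement;
import javax.xml.bind.annotation.XmlRootElement;

/* Bind the fields directly, so the getters and setters are not mapped a second time */
@XmlRootElement(name="Person")
@XmlAccessorType(XmlAccessType.FIELD)
public class Person {

    /* Default constructor required by JAXB */
    public Person() { }

    public Person(String name, int age) {
        this.name = name;
        this.age = age;
    }

    @XmlElement
    private String name;

    @XmlElement
    private Integer age;

    /* Getters and Setters omitted for the sake of brevity */
}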


When serialized using JAXB, this will yield an XML document of the following form:


<?xml version="1.0" ?>
<Person>
    <name>Some Body</name>
    <age>4</age>
</Person>

See the JAXB documentation on how to generate attribute values, or serialize lists or embedded entities. Any kind of XML document can be generated with JAXB.

Here is example code that serializes a Person entity within a servlet:


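/* Imports: java.io.IOException, javax.servlet.ServletException,
   javax.servlet.http.*, javax.xml.bind.* */
@Override
public void doGet(
        HttpServletRequest req,
        HttpServletResponse resp) throws ServletException, IOException {

    resp.setContentType("application/xml");
    try {
        JAXBContext context =
            JAXBContext.newInstance(Person.class);
        Marshaller m = context.createMarshaller();
        m.marshal(new Person("Some Body", 4), resp.getWriter());
    } catch (JAXBException e) {
        /* doGet may only declare ServletException and IOException */
        throw new ServletException(e);
    }
}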
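
The servlet above always returns the same Person. To serve the fragment selected by the id parameter, as in the `get?id=1111` URI used earlier, the request handler needs a lookup step. The following is a minimal sketch under assumptions: it uses the JDK's built-in `com.sun.net.httpserver.HttpServer` as a stand-in for a servlet container so that it runs without extra dependencies, and the class name `FragmentService`, the in-memory index, and the `<notFound/>` reply are invented for this example.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;

// Sketch only: the JDK's HttpServer stands in for a servlet container.
class FragmentService {

    // Hypothetical index from id to XML fragment; a database could sit here.
    static final Map<String, String> FRAGMENTS = new TreeMap<>(Map.of(
        "1111", "<Person><name>Some Body</name><age>4</age></Person>"));

    // Returns the fragment for a query string such as "id=1111", or null.
    static String fragmentFor(String query) {
        if (query == null) {
            return null;
        }
        for (String pair : query.split("&")) {
            if (pair.startsWith("id=")) {
                return FRAGMENTS.get(pair.substring(3));
            }
        }
        return null;
    }

    // Starts the service on a free local port; the caller should stop() it.
    static HttpServer start() throws IOException {
        HttpServer server =
            HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/get", exchange -> {
            String xml = fragmentFor(exchange.getRequestURI().getQuery());
            int status = (xml == null) ? 404 : 200;
            byte[] body = (xml == null ? "<notFound/>" : xml)
                .getBytes(StandardCharsets.UTF_8);
            exchange.getResponseHeaders()
                .set("Content-Type", "application/xml; charset=UTF-8");
            exchange.sendResponseHeaders(status, body.length);
            try (OutputStream os = exchange.getResponseBody()) {
                os.write(body);
            }
        });
        server.start();
        return server;
    }

    public static void main(String[] args) throws IOException {
        HttpServer server = start();
        System.out.println("Serving http://127.0.0.1:"
            + server.getAddress().getPort() + "/get?id=1111");
    }
}
```

A stylesheet can then pass the printed URI to document(), and only the requested fragment, not the whole large document, reaches the XSL transformation.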