XML Parsing Atul Kahate [email_address]
Agenda XML Parsing/Processing Basics Simple API for XML (SAX) Document Object Model (DOM) XML and Java using JAXP XML and ASP.NET
XML Parsing/Processing
XML Processing XML processing means Reading an XML document  Parsing it in the desired manner Allows handling the contents of an XML document the way we want
XML Parser Software that sits between an application and the XML files Shields programmers from having to manually parse through XML documents Programmers are free to concentrate on the contents of the XML file, not syntax Programmers use the parser APIs to access/manipulate an XML file
XML Processing Approaches Process as a sequence of events Simple API for XML Processing (SAX) Process as a hierarchy of nodes Document Object Model (DOM) Pull approach Streaming API (StAX)
SAX Versus DOM
StAX Pulls events from the XML document via the parser Also an event-based API, but differs from SAX The application, and not the parser, controls the flow
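The pull model can be sketched with the standard javax.xml.stream API. This is a minimal illustration, not part of the original deck: the class name StaxPullDemo, the method elementNames and the sample XML are assumptions made for the example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical demo class: the application pulls events by calling next().
class StaxPullDemo {

    // Returns the names of all start tags, in document order.
    static List<String> elementNames(String xml) {
        List<String> names = new ArrayList<>();
        try {
            XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                // The application asks for the next event; nothing is pushed to it.
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    names.add(reader.getLocalName());
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not read XML", e);
        }
        return names;
    }

    public static void main(String[] args) {
        String xml = "<books><book><name>Learning XML</name></book></books>";
        System.out.println(elementNames(xml)); // prints [books, book, name]
    }
}
```

Because the loop belongs to the application, it can stop reading as soon as it has what it needs, which a SAX callback cannot easily do.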
Simple API for XML (SAX)
XML Processing as Sequence of Events – 1 Process as a sequence of events Event is the occurrence of something noticeable e.g. in Windows, mouse movement, keyboard input are events The OS captures all events and sends messages to a program The programmer has to take an appropriate action to deal with the event
XML Processing as Sequence of Events – 2 Process as a sequence of events Event-based model can be applied to XML documents also Various events that occur while reading an XML document sequentially Start of document Start tag of an element End tag of an element Comments
XML Processing as Sequence of Events – 3 Process as a sequence of events The programmer has to write code to handle these events These pieces of code are called event handlers
Sequential Processing Example – 1 Consider the following XML document <?xml version="1.0"?> <books> <book> <name> Learning XML </name> <author> Simon North </author> <publication> TMH </publication> </book> <book> <name> XML by Example </name> <author> Don Box </author> <publication> Pearson </publication> </book> </books>
Sequential Processing Example – 2 Events generated when we read the above XML file Start document Start element: books Start element: book Start element: name Characters: Learning XML  End element: name Start element: author Characters: Simon North  End element: author Start element: publication  Characters: TMH  End element: publication … End element: book End document
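The event sequence listed above can be reproduced with a small SAX handler that records each callback. This is a sketch under stated assumptions: the class name SaxEventLog and the helper eventsFor are invented for the example, and whitespace-only text is skipped so that the log matches the slide.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler that logs every SAX event it receives.
class SaxEventLog extends DefaultHandler {
    private final List<String> events = new ArrayList<>();

    @Override public void startDocument() { events.add("Start document"); }
    @Override public void endDocument()   { events.add("End document"); }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        events.add("Start element: " + qName);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        events.add("End element: " + qName);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Ignore whitespace between tags; a parser may also deliver text in several chunks.
        String text = new String(ch, start, length).trim();
        if (!text.isEmpty()) events.add("Characters: " + text);
    }

    // Parses the given XML text and returns the recorded events.
    static List<String> eventsFor(String xml) {
        SaxEventLog handler = new SaxEventLog();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
        return handler.events;
    }

    public static void main(String[] args) {
        String xml = "<books><book><name>Learning XML</name></book></books>";
        eventsFor(xml).forEach(System.out::println);
    }
}
```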
Sample XML Tree
Tree Processing Sequence [Figure: the sample XML tree with its nodes numbered 1 to 17 in the order in which they are processed]
Sequential Traversal: Summary Order Top to bottom Left to right Advantages Simple Fast Requires less amount of memory Drawback Not possible to  look ahead
SAX Concept
JAXP Java API for XML Processing
JAXP Concept Application program written in Java for working with XML Java API for XML Processing (JAXP) JAXP APIs Simple API for XML Processing (SAX)  Document Object Model (DOM) Sequential processing Tree-based processing
JAXP Java API for XML Processing Standardized by Sun Very thin layer on top of SAX or DOM Makes application code parser-independent Our programs should use JAXP, which in turn, calls parser APIs Include package  javax.xml.parsers.*
JAXP: API or Abstraction? JAXP is an API, but is better described as an abstraction layer Does not provide new means of parsing XML Does not add to SAX or DOM Does not give new functionality to Java or XML handling Makes working with SAX and DOM easier It is vendor-neutral
JAXP and Parsing JAXP is not a replacement for SAX, DOM, JDOM etc Some vendor must supply the implementation of SAX, DOM, etc JAXP provides APIs to use these implementations In the early versions of JDK, Sun had supplied a parser called Crimson Now, Sun provides Apache Xerces Neither is a part of the JAXP API – they are part of the JAXP distribution In JDK, the org.xml.sax and org.w3c.dom packages contain the SAX and DOM interfaces; the bundled Xerces implementation lives in internal packages under com.sun.org.apache.xerces.internal
JAXP API The main JAXP APIs are defined in the package javax.xml.parsers Contains two vendor-neutral factory classes SAXParserFactory – Gives a SAXParser object DocumentBuilderFactory – Gives a DocumentBuilder object DocumentBuilder, in turn, gives Document object
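The two factory classes can be exercised as follows. This is a minimal sketch: the class name JaxpFactories and its helper methods are assumptions for illustration, while the JAXP calls themselves (newInstance, newSAXParser, newDocumentBuilder, newDocument) are standard.

```java
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

// Hypothetical helper showing the two JAXP entry points side by side.
class JaxpFactories {

    // SAXParserFactory gives a SAXParser object.
    static SAXParser newSaxParser() {
        try {
            return SAXParserFactory.newInstance().newSAXParser();
        } catch (ParserConfigurationException | SAXException e) {
            throw new IllegalStateException("No SAX parser available", e);
        }
    }

    // DocumentBuilderFactory gives a DocumentBuilder, which in turn gives a Document.
    static Document newEmptyDocument() {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("No DOM builder available", e);
        }
    }

    public static void main(String[] args) {
        // The printed class names depend on the parser implementation in use.
        System.out.println("SAX parser: " + newSaxParser().getClass().getName());
        System.out.println("DOM document: " + newEmptyDocument().getClass().getName());
    }
}
```

The application code names only JAXP and W3C types; the concrete classes are chosen by the factories at run time.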
Package Details javax.xml.parsers   The JAXP APIs, which provide a common interface for different vendors' SAX and DOM parsers.  org.w3c.dom   Defines the Document class (a DOM), as well as classes for all of the components of a DOM.  org.xml.sax   Defines the basic SAX APIs.  javax.xml.transform   Defines the XSLT APIs that let you transform XML into other forms.
Which Packages to use in JAXP? We need to include two sets of packages – one for JAXP and the other for SAX/DOM, as appropriate // JAXP import javax.xml.parsers.SAXParserFactory; // SAX import org.xml.sax.Attributes; import org.xml.sax.SAXException; import org.xml.sax.SAXNotRecognizedException; import org.xml.sax.SAXNotSupportedException; import org.xml.sax.SAXParseException; import org.xml.sax.XMLReader; import org.xml.sax.helpers.DefaultHandler; import org.xml.sax.helpers.XMLReaderFactory;
SAX Programming in JAXP
SAX Approach
Key SAX APIs – 1 SAXParserFactory Creates an instance of the parser determined by the system property, javax.xml.parsers.SAXParserFactory. SAXParser An abstract class that defines several kinds of parse() methods. In general, you pass an XML data source and a DefaultHandler object to the parser, which processes the XML and invokes the appropriate methods in the handler object.
Key SAX APIs – 2 XMLReader The SAXParser wraps an XMLReader (org.xml.sax.XMLReader). Typically, you don't care about that, but every once in a while you need to get hold of it using SAXParser's getXMLReader(), so you can configure it. It is the XMLReader which carries on the conversation with the SAX event handlers you define. DefaultHandler Not shown in the diagram, a DefaultHandler implements the ContentHandler, ErrorHandler, DTDHandler, and EntityResolver interfaces (with null methods), so you can override only the ones you're interested in.
Design Patterns Factory Pattern
"new" means "Concrete" Vehicle vehicle = new Car (); We want to use an interface (say Vehicle) to keep code flexible However, we must create an instance of a concrete class (e.g. Car) Makes the code more fragile and less flexible – Why? See next slide.
Using “new” – 1 Vehicle vehicle; if (picnic) vehicle = new Car (); else if (work) vehicle = new Bus (); else vehicle = new Scooter ();
Using “new” – 2 We do not know until run time which class to instantiate Whenever code needs to be changed, we need to reopen this code and examine what needs to be added or removed Mandates application changes at multiple places, making it difficult to maintain
What is wrong with “new”? Nothing as such Problem is changes to code and their impact on “new” By coding to an interface, we know that we are insulated from changes made to a system This is because different classes would implement the interface using polymorphism appropriately
Key OO Principle Identify the aspects of code that vary and separate them from what stays the same Code should be open for extension, but closed for modifications
Pizza Class – Ideal Situation Pizza orderPizza () { Pizza pizza = new Pizza (); pizza.prepare (); pizza.bake (); pizza.cut (); pizza.pack (); return pizza; } Ideally, we would like this to be an abstract class or an interface, but we cannot directly instantiate an abstract class or an interface!
Pizza Class – Ideal Situation Pizza orderPizza () { Pizza pizza = new CheesePizza (); pizza.prepare (); pizza.bake (); pizza.cut (); pizza.pack (); return pizza; } We are left with no choice but to instantiate a concrete class
Solution Pizza orderPizza (String type) { Pizza pizza; if (type.equals ("cheese")) { pizza = new CheesePizza (); } else if (type.equals ("corn")) { pizza = new CornPizza (); } else { pizza = new GeneralPizza (); } pizza.prepare (); pizza.bake (); pizza.cut (); pizza.pack (); return pizza; } We are passing the type of pizza to orderPizza () method. Based on the type of pizza, we instantiate the correct concrete class. Each pizza has to implement the Pizza interface. Each pizza sub-type (e.g. cheese) knows how to prepare itself.
Is this correct? Let us review the principles: Identify the aspects of code that vary and separate them from what stays the same Code should be open for extension, but closed for modifications
Problems What if we remove one pizza type, and add another? We need to touch the code See next slide
Problem – Code Pizza orderPizza (String type) { Pizza pizza; if (type.equals ("veg")) { pizza = new VeggiePizza (); } else if (type.equals ("corn")) { pizza = new CornPizza (); } else { pizza = new GeneralPizza (); } pizza.prepare (); pizza.bake (); pizza.cut (); pizza.pack (); return pizza; } Problem is that we end up touching code for modifications. This is not what we want. What is the solution?
Code Modified Further Pizza orderPizza (String type) { Pizza pizza; if (type.equals ("cheese")) { pizza = new CheesePizza (); } else if (type.equals ("corn")) { pizza = new CornPizza (); } else { pizza = new GeneralPizza (); } pizza.prepare (); pizza.bake (); pizza.cut (); pizza.pack (); return pizza; } The code that varies: if (type.equals ("cheese")) { pizza = new CheesePizza (); } else if (type.equals ("corn")) { pizza = new CornPizza (); } else { pizza = new GeneralPizza (); } Abstract out the code that varies, and put it into a separate class, that would only worry about how to create objects. If any other object needs a pizza object, this is the class to come to.
This new Class is our "Factory" public class PizzaFactory { public Pizza createPizza (String type) { Pizza pizza = null; if (type.equals ("cheese")) { pizza = new CheesePizza (); } else if (type.equals ("corn")) { pizza = new CornPizza (); } else { pizza = new GeneralPizza (); } return pizza; } } This new class creates new pizzas for its clients. It has a createPizza () method, which all clients will use to instantiate new objects. It contains code plucked out of the orderPizza () method.
Modified Client Code public class PizzaStore { PizzaFactory factory = new PizzaFactory (); public Pizza orderPizza (String type) { Pizza pizza = factory.createPizza (type); pizza.prepare (); pizza.bake (); pizza.cut (); pizza.pack (); return pizza; } } A variation of this is: PizzaFactory factory = PizzaFactory.newInstance (); In this case, newInstance () would be a static method in PizzaFactory, since we are not creating any object of PizzaFactory here
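Put together, the factory and its client might look like the following self-contained sketch. The name() method, the default implementations of prepare, bake, cut and pack, and the PizzaDemo class are assumptions added only so that the example runs; the slides do not define them.

```java
// Product interface; name() and the default step methods are illustrative additions.
interface Pizza {
    String name();
    default void prepare() { System.out.println("Preparing " + name()); }
    default void bake()    { System.out.println("Baking " + name()); }
    default void cut()     { System.out.println("Cutting " + name()); }
    default void pack()    { System.out.println("Packing " + name()); }
}

class CheesePizza implements Pizza { public String name() { return "cheese pizza"; } }
class CornPizza implements Pizza { public String name() { return "corn pizza"; } }
class GeneralPizza implements Pizza { public String name() { return "general pizza"; } }

// The only class that knows which concrete pizza classes exist.
class PizzaFactory {
    public Pizza createPizza(String type) {
        if ("cheese".equals(type)) return new CheesePizza();
        if ("corn".equals(type)) return new CornPizza();
        return new GeneralPizza();
    }
}

// Client code: depends on the Pizza interface and the factory, never on "new XxxPizza()".
class PizzaStore {
    private final PizzaFactory factory = new PizzaFactory();

    public Pizza orderPizza(String type) {
        Pizza pizza = factory.createPizza(type);
        pizza.prepare();
        pizza.bake();
        pizza.cut();
        pizza.pack();
        return pizza;
    }
}

class PizzaDemo {
    public static void main(String[] args) {
        Pizza pizza = new PizzaStore().orderPizza("corn");
        System.out.println("Ordered: " + pizza.name());
    }
}
```

Adding or removing a pizza type now touches only PizzaFactory; PizzaStore stays closed for modification.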
Another Factory Pattern Example
Factory Pattern – The Need – 1  Consider this: Connection connection = new OracleConnection (); Connection connection = new SqlServerConnection (); Connection connection = new DB2Connection (); ... What are the problems? How to resolve them?
Problems Summarized Sometimes, an Application (or framework) at runtime, cannot anticipate the class of object that it must create. The Application (or framework) may know that it has to instantiate classes, but it may only know about abstract classes (or interfaces), which it cannot instantiate. Thus the Application class may only know when it has to instantiate a new Object of a class, not what kind of subclass to create. A class may want its subclasses to specify the objects to be created. A class may delegate responsibility to one of several helper subclasses so that knowledge can be localized to specific helper subclasses.
Factory Pattern – The Need – 2 public Connection createConnection (String type) { if (type.equals ("Oracle")) { return new OracleConnection (); } else if (type.equals ("SQL Server")) { return new SqlServerConnection (); } else if (type.equals ("DB2")) { return new DB2Connection (); } throw new IllegalArgumentException ("Unknown connection type: " + type); } Does it resolve all problems?
More on Factory Pattern – 1 Factory Method is a creational pattern. This pattern helps to model an interface for creating an object which at creation time can let its subclasses decide which class to instantiate. We call this a Factory Pattern since it is responsible for "Manufacturing" an Object. It helps instantiate the appropriate Subclass by creating the right Object from a group of related classes. The Factory Pattern promotes loose coupling by eliminating the need to bind application-specific classes into the code. Factories have a simple function: Churn out objects.
More on Factory Pattern – 2 Obviously, a factory is not needed to make an object. A simple call to new will do it for you. However, the use of factories gives the programmer the opportunity to abstract the specific attributes of an Object into specific subclasses which create them. The Factory Pattern is all about: "Define an interface for creating an object, but let the subclasses decide which class to instantiate. The Factory Method lets a class defer instantiation to subclasses."
Factory Pattern – The Need – 3 We still need to add new code for a new connection type The existing class needs to undergo changes every time When object creation changes a lot, use a  factory
Factory Pattern – The Need – 4 Client code to use the factory FirstFactory factory = FirstFactory.getInstance (); Connection connection = factory.createConnection ("Oracle");
The Factory Class – 1 public class FirstFactory { protected static String type; public static FirstFactory getInstance () { type = null;   return new FirstFactory (); } ... }
The Factory Class – 2 public class FirstFactory { protected static String type; public static FirstFactory getInstance () { type = ""; return new FirstFactory (); } public Connection createConnection (String t) { type = t; if (type.equals ("Oracle")) { return new OracleConnection (); } else if (type.equals ("SQL Server")) { return new SQLServerConnection (); } else { /* if (type.equals ("DB2")) */ return new DB2Connection (); } } }
Connection Classes public interface Connection { public String description (); } public class OracleConnection implements Connection { public OracleConnection () { /* Logic specific to Oracle */ } public String description () { return "Oracle"; } } public class SQLServerConnection implements Connection { public SQLServerConnection () { /* Logic specific to SQL Server */ } public String description () { return "SQL Server"; } } public class DB2Connection implements Connection { public DB2Connection () { /* Logic specific to DB2 */ } public String description () { return "DB2"; } }
Client Code public class TestConnection { public static void main (String args []) { FirstFactory factory = FirstFactory.getInstance (); Connection connection = factory.createConnection ("DB2"); System.out.println ("You are connected with " + connection.description ()); } }
Factory Pattern - Exercise
Exercise We want to be able to create any of the following objects that have some similarities and some differences. Design using factory method design pattern. Employee Student Player
SAX
Sequential Traversal: SAX SAX (Simple API for XML) Specify the parser to be used Create a parser instance Create an event handler to respond to parsing events Invoke the parser with the designated content handler and document
1 – Specify the Parser Various approaches are possible Set a system property for javax.xml.parsers.SAXParserFactory Specify the parser in jre_dir/lib/jaxp.properties Use system-dependent default parser (check documentation) Usually done at the time of JDK installation itself automatically
1 – Specify the Parser Example public static void main (String [] args) { String jaxpPropertyName = "javax.xml.parsers.SAXParserFactory"; … }
2 – Create a Parser Instance Steps Create an instance of a parser factory Use that to create a SAXParser object Example SAXParserFactory factory = SAXParserFactory.newInstance (); SAXParser p = factory.newSAXParser ();
3 – Create an Event Handler Event handler responds to parsing events It is a subclass of DefaultHandler public class MyHandler extends DefaultHandler { … } Main event methods (callbacks) startDocument, endDocument startElement, endElement characters, ignorableWhitespace
3 – Create an Event Handler Example method: startElement Declaration public void startElement (String nameSpaceURI, String localName, String qualifiedName, Attributes attributes) throws SAXException Arguments nameSpaceURI URI identifying the namespace uniquely localName Element name without namespace prefix qualifiedName Complete element name, including namespace prefix attributes Attributes object, representing attributes of the element
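To see which factory implementation the lookup actually selects, a program can print the system property and the class of the factory it receives. This is a small sketch; the class name WhichSaxFactory and the method factoryClassName are assumptions, and the printed class name depends on the JDK or parser library in use.

```java
import javax.xml.parsers.SAXParserFactory;

// Hypothetical helper that reports the SAX parser factory chosen by JAXP.
class WhichSaxFactory {
    // Name of the system property consulted by SAXParserFactory.newInstance().
    static final String PROPERTY = "javax.xml.parsers.SAXParserFactory";

    static String factoryClassName() {
        return SAXParserFactory.newInstance().getClass().getName();
    }

    public static void main(String[] args) {
        // Prints "null" for the property when it has not been set explicitly.
        System.out.println(PROPERTY + " = " + System.getProperty(PROPERTY));
        System.out.println("Factory in use: " + factoryClassName());
    }
}
```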
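3 – Create an Event Handler <cwp:book xmlns:cwp="http://www.test.com/xml/"> <cwp:chapter number="23" part="Server programming"> <cwp:title> XML made easy </cwp:title> </cwp:chapter> </cwp:book> For the cwp:chapter start tag: nameSpaceURI is http://www.test.com/xml/, qualifiedName is cwp:chapter, localName is chapter, and the Attributes object holds number and part, addressed by index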
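The startElement arguments for this document can be inspected with a namespace-aware parser. This is a sketch, assuming a hypothetical class NamespaceArgsDemo; note that the factory must be made namespace-aware, otherwise the URI and local name are typically reported as empty strings.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler that records the arguments passed to startElement.
class NamespaceArgsDemo extends DefaultHandler {
    private final List<String> lines = new ArrayList<>();

    @Override
    public void startElement(String nameSpaceURI, String localName,
                             String qualifiedName, Attributes attributes) {
        lines.add(qualifiedName + " -> uri=" + nameSpaceURI + ", local=" + localName
                + ", attributes=" + attributes.getLength());
    }

    // Parses the XML text with namespace processing switched on.
    static List<String> describe(String xml) {
        NamespaceArgsDemo handler = new NamespaceArgsDemo();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // off by default in JAXP
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
        return handler.lines;
    }

    public static void main(String[] args) {
        String xml = "<cwp:book xmlns:cwp=\"http://www.test.com/xml/\">"
                + "<cwp:chapter number=\"23\" part=\"Server programming\">"
                + "<cwp:title>XML made easy</cwp:title>"
                + "</cwp:chapter></cwp:book>";
        describe(xml).forEach(System.out::println);
    }
}
```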
4 – Invoke the Parser Call the parse method, supplying: The content handler The XML document File or Input stream p.parse (fileName, handler);
Sample XML File (emp.xml) <?xml version="1.0" encoding="UTF-8"?> <root> <employee>test 1</employee> <employee>test 1</employee> <employee>test 1</employee> <employee>test 1</employee> <employee>test 1</employee> <employee>test 1</employee> <employee>test 1</employee> </root>
Java Program to Count Total Number of Elements import javax.xml.parsers.SAXParser; import javax.xml.parsers.SAXParserFactory; import org.xml.sax.*; import org.xml.sax.helpers.DefaultHandler; public class SAXEmployeeCount extends DefaultHandler { int tagCount = 0; public void startElement (String uri, String localName, String rawName, Attributes attributes) { tagCount++; } public void endDocument () { System.out.println ("There are " + tagCount + " elements."); } public static void main (String[] args) { SAXEmployeeCount handler = new SAXEmployeeCount (); try { SAXParserFactory spf = SAXParserFactory.newInstance (); SAXParser parser = spf.newSAXParser (); parser.parse ("emp.xml", handler); } catch (Exception ex) { System.out.println (ex); } } }
Count Only Book Elements <?xml version="1.0"?> <books> <book category="reference"> <author>Nigel Rees</author> <title>Sayings of the Century</title> <price>8.95</price> </book> <book category="fiction"> <author>Evelyn Waugh</author> <title>Sword of Honour</title> <price>12.99</price> </book> <book category="fiction"> <author>Herman Melville</author> <title>Moby Dick</title> <price>8.99</price> </book> </books>
Parsing Code in JAXP import javax.xml.parsers.SAXParser; import javax.xml.parsers.SAXParserFactory; import org.xml.sax.Attributes; import org.xml.sax.SAXException; import org.xml.sax.helpers.DefaultHandler; public class BookCount extends DefaultHandler { private int count = 0; public void startDocument () throws SAXException { System.out.println ("Start document ..."); } public void startElement (String uri, String local, String raw, Attributes attrs) throws SAXException { System.out.println ("Current element = " + raw); if (raw.equals ("book")) { count++; } } public void endDocument () throws SAXException { System.out.println ("The total number of books = " + count); } public static void main (String[] args) throws Exception { BookCount handler = new BookCount (); try { SAXParserFactory spf = SAXParserFactory.newInstance (); SAXParser parser = spf.newSAXParser (); parser.parse ("book.xml", handler); } catch (SAXException e) { System.err.println (e.getMessage ()); } } }
Specifying Parser Name import java.io.IOException; import javax.xml.parsers.SAXParser; import javax.xml.parsers.SAXParserFactory; import org.xml.sax.Attributes; import org.xml.sax.SAXException; import org.xml.sax.helpers.DefaultHandler; public class SAXApp extends DefaultHandler { /* default parser to use (not referenced by the JAXP calls below, which rely on the factory default) */ protected static final String DEFAULT_PARSER_NAME = "org.apache.xerces.parsers.SAXParser"; private int count = 0; public void countTopics () throws IOException, SAXException { /* create parser */ try { System.out.println ("Inside countTopics"); } catch (Exception e) { e.printStackTrace (System.err); } } public void startElement (String uri, String local, String raw, Attributes attrs) throws SAXException { if (raw.equals ("topic")) count++; System.out.println (raw); } public void endDocument () throws SAXException { System.out.println ("There are " + count + " topics"); } public static void main (String[] args) throws Exception { System.out.println ("Inside main ..."); SAXApp handler = new SAXApp (); try { SAXParserFactory spf = SAXParserFactory.newInstance (); SAXParser parser = spf.newSAXParser (); parser.parse ("contents.xml", handler); } catch (SAXException e) { System.err.println (e.getMessage ()); } } }
Exercise Consider the following XML file and write a program to count the number of elements that have at least one attribute. <?xml version="1.0"?> <BOOKS> <BOOK pubyear="1929"> <BOOK_TITLE>Look Homeward, Angel</BOOK_TITLE> <AUTHOR>Wolfe, Thomas</AUTHOR> </BOOK> <BOOK pubyear="1973"> <BOOK_TITLE>Gravity's Rainbow</BOOK_TITLE> <AUTHOR>Pynchon, Thomas</AUTHOR> </BOOK> <BOOK pubyear="1977"> <BOOK_TITLE>Cards as Weapons</BOOK_TITLE> <AUTHOR>Jay, Ricky</AUTHOR> </BOOK> <BOOK pubyear="2001"> <BOOK_TITLE>Computer Networks</BOOK_TITLE> <AUTHOR>Tanenbaum, Andrew</AUTHOR> </BOOK> </BOOKS>
Solution import javax.xml.parsers.SAXParser; import javax.xml.parsers.SAXParserFactory; import org.xml.sax.Attributes; import org.xml.sax.SAXException; import org.xml.sax.helpers.DefaultHandler; public class countAttr extends DefaultHandler { private int count = 0; public void startDocument () throws SAXException { System.out.println ("Start document ..."); } public void startElement (String uri, String local, String raw, Attributes attrs) throws SAXException { System.out.println ("Current element = " + raw); if (attrs.getLength () != 0) { count++; } } public void endDocument () throws SAXException { System.out.println ("The total number of elements with attributes = " + count); } public static void main (String[] args) throws Exception { countAttr handler = new countAttr (); try { SAXParserFactory spf = SAXParserFactory.newInstance (); SAXParser parser = spf.newSAXParser (); parser.parse ("countAttr.xml", handler); } catch (SAXException e) { System.err.println (e.getMessage ()); } } }
Exercise For the same XML file, display element names only if the book is published in the 1970s.
Solution import javax.xml.parsers.SAXParser; import javax.xml.parsers.SAXParserFactory; import org.xml.sax.Attributes; import org.xml.sax.SAXException; import org.xml.sax.helpers.DefaultHandler; public class seventiesBooks extends DefaultHandler { private int count = 0; public void startDocument () throws SAXException { System.out.println ("Start document ..."); } public void startElement (String uri, String local, String raw, Attributes attrs) throws SAXException { int year = 0; String attrValue; System.out.println ("Current element = " + raw); if (attrs.getLength () > 0) { attrValue = attrs.getValue (0); year = Integer.parseInt (attrValue); /* the 1970s are the years 1970 to 1979 inclusive */ if (year >= 1970 && year <= 1979) { count++; } } } public void endDocument () throws SAXException { System.out.println ("The total number of matching elements = " + count); } public static void main (String[] args) throws Exception { seventiesBooks handler = new seventiesBooks (); try { SAXParserFactory spf = SAXParserFactory.newInstance (); SAXParser parser = spf.newSAXParser (); parser.parse ("countAttr.xml", handler); } catch (SAXException e) { System.err.println (e.getMessage ()); } } }
Exercise Consider the following XML document (stock.xml) <?xml version="1.0"?> <stock> <stockinfo symbol="IFL"> <company>i-flex solutions limited</company> <price>2500</price> </stockinfo> <stockinfo symbol="HLL"> <company>Hindustan Lever</company> <price>1840</price> </stockinfo> <stockinfo symbol="LT"> <company>Larsen and Toubro</company> <price>2678</price> </stockinfo> <stockinfo symbol="Rel"> <company>Reliance Communications</company> <price>1743</price> </stockinfo> </stock> Produce output as shown on the next slide
Expected Output
Solution import java.io.*; import org.xml.sax.*; import org.xml.sax.helpers.*; import javax.xml.parsers.*; public class DisplayStockDetails extends DefaultHandler { public void startDocument () throws SAXException { System.out.println ("\nDisplaying Stock Details"); System.out.println ("=========================\n"); } public void endDocument () throws SAXException { System.out.println ("\nEnd of Details"); System.out.println ("==============\n"); } public void startElement (String uri, String local, String raw, Attributes attrs) throws SAXException { /* Skip processing root element; the qualified name is used because the default factory is not namespace-aware */ if (raw.equals ("stock")) return; /* Skip processing if there are no attributes */ if (attrs == null) return; for (int i=0; i<attrs.getLength (); i++) { System.out.println ("[Symbol: " + attrs.getValue (i) + "]"); } } public void endElement (String uri, String local, String raw) throws SAXException { /* System.out.println (); */ } public void characters (char[] ch, int start, int length) throws SAXException { System.out.println (new String (ch, start, length)); } public static void main (String[] args) throws Exception { DisplayStockDetails handler = new DisplayStockDetails (); try { SAXParserFactory spf = SAXParserFactory.newInstance (); SAXParser parser = spf.newSAXParser (); parser.parse ("stock.xml", handler); } catch (SAXException e) { System.err.println (e.getMessage ()); } } }
Exercise Consider the following XML file and write a program to find out and display the total cost for all CDs. <?xml version="1.0" encoding="ISO-8859-1"?> <catalog> <cd> <title>Empire Burlesque</title> <artist>Bob Dylan</artist> <country>USA</country> <company>Columbia</company> <price>10.90</price> <year>1985</year> </cd> <cd> <title>Candle in the wind</title> <artist>Elton John</artist> <country>UK</country> <company>HMV</company> <price>8.20</price> <year>1998</year> </cd> </catalog>
Solution import javax.xml.parsers.SAXParser; import javax.xml.parsers.SAXParserFactory; import org.xml.sax.Attributes; import org.xml.sax.SAXException; import org.xml.sax.helpers.DefaultHandler; public class CDPrice extends DefaultHandler { private double total = 0; private boolean flagIsCurrentElementPrice = false; public void startDocument () throws SAXException { System.out.println ("Start document ..."); } public void startElement (String uri, String local, String raw, Attributes attrs) throws SAXException { System.out.println ("Current element = " + raw); if (raw.equals ("price")) { flagIsCurrentElementPrice = true; } } public void characters (char [] ch, int start, int len) throws SAXException { if (flagIsCurrentElementPrice) { /* prices such as 10.90 are decimals, so parse them as double rather than int */ String str = new String (ch, start, len).trim (); double uprice = Double.parseDouble (str); total += uprice; flagIsCurrentElementPrice = false; System.out.println ("Current total = " + total); } } public void endDocument () throws SAXException { System.out.println ("The total price of available CDs = " + total); } public static void main (String[] args) throws Exception { CDPrice handler = new CDPrice (); try { SAXParserFactory spf = SAXParserFactory.newInstance (); SAXParser parser = spf.newSAXParser (); parser.parse ("cdcatalog2.xml", handler); } catch (SAXException e) { System.err.println (e.getMessage ()); } } }
Document Object Model (DOM)
DOM – Basic Flow
Basic Concepts
JAXP and DOM – Overview Class DocumentBuilderFactory public abstract class javax.xml.parsers.DocumentBuilderFactory extends java.lang.Object Defines a factory API that enables applications to obtain a parser that produces DOM object trees from XML documents newDocumentBuilder method: Returns a DocumentBuilder, whose parse method parses the contents of an XML document and returns the contents as a new Document object
JAXP and DOM – Overview Class DocumentBuilder public abstract class javax.xml.parsers. DocumentBuilder extends java.lang.Object Defines the API to obtain DOM Document instances from an XML document
JAXP and DOM – Overview Interface Document public interface Document extends Node The Document interface represents the entire HTML or XML document Conceptually, it is the root of the document tree, and provides the primary access to the document's data
JAXP and DOM – Overview Interface Element public interface Element extends Node The Element interface represents an element in an HTML or XML document Elements may have attributes associated with them Since Element inherits from Node, the generic Node method getAttributes may be used to retrieve the set of all attributes for an element
JAXP and DOM DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance (); DocumentBuilder builder = factory.newDocumentBuilder (); Document document = builder.parse (fileName); Element root = document.getDocumentElement ();
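The four calls above can be tried without a file by parsing XML text held in a string. This is a minimal sketch; the class name DomRootDemo, the helper parseRoot and the sample XML are assumptions for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical demo: factory -> builder -> document -> root element.
class DomRootDemo {

    // Builds the DOM tree for the given XML text and returns its root element.
    static Element parseRoot(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document document = builder.parse(new InputSource(new StringReader(xml)));
            return document.getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Element root = parseRoot(
            "<root><employee>test 1</employee><employee>test 2</employee></root>");
        System.out.println("Root element: " + root.getTagName());
        System.out.println("Employees: " + root.getElementsByTagName("employee").getLength());
    }
}
```

Unlike SAX, the whole tree is in memory once parse returns, so the program can query it in any order.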
Example – XML File Count the number of Employee elements from this XML using DOM <?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot;?> <root> <employee>test 1</employee> <employee>test 1</employee> <employee>test 1</employee> <employee>test 1</employee> <employee>test 1</employee> <employee>test 1</employee> <employee>test 1</employee> </root>
Example – Java Code package javaapplication1; import org.w3c.dom.*; public class Main { public static void main(String[] args)  { try { DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance (); DocumentBuilder builder = factory.newDocumentBuilder (); Document document = builder.parse (&quot;cdcatalog.xml&quot;); Element root = document.getDocumentElement (); NodeList nodes = document.getElementsByTagName(&quot;employee&quot;); System.out.println(&quot;There are &quot; + nodes.getLength() +  &quot;  elements.&quot;); } catch (Exception ex) { System.out.println(ex); } } }
Check if a File is Well-Formed
package sicsr;
import javax.xml.parsers.*;
public class IsWellFormed {
    public static void main(String[] args) {
        try {
            DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
            DocumentBuilder domBuilder = domFactory.newDocumentBuilder();
            // parse() throws a SAXException if the file is not well-formed
            domBuilder.parse("NWF.xml");
            System.out.println("File is well-formed");
        } catch (org.xml.sax.SAXException ex) {
            System.out.println("File is not well-formed");
        } catch (FactoryConfigurationError ex) {
            System.out.println(ex.toString());
        } catch (ParserConfigurationException ex) {
            System.out.println(ex.toString());
        } catch (Exception ex) {
            System.out.println(ex.toString());
        }
    }
}
JAXP Code to Open an XML File
import org.w3c.dom.*;
import javax.xml.parsers.*;
import org.xml.sax.*;
public class DOMExample1 {
    public static void main(String[] args) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document document = builder.parse("cdcatalog.xml");
            Element root = document.getDocumentElement();
            System.out.println("In main ... XML file opened successfully ...");
        } catch (ParserConfigurationException e1) {
            System.out.println("Exception: " + e1);
        } catch (SAXException e2) {
            System.out.println("Exception: " + e2);
        } catch (java.io.IOException e3) {
            System.out.println("Exception: " + e3);
        }
    }
}
Case Study – XML File
<?xml version="1.0"?>
<catalog>
    <cd>
        <title>Empire Burlesque</title>
        <artist>Bob Dylan</artist>
        <country>USA</country>
        <company>Columbia</company>
        <price>10</price>
        <year>1985</year>
    </cd>
    <cd>
        <title>Candle in the wind</title>
        <artist>Elton John</artist>
        <country>UK</country>
        <company>HMV</company>
        <price>8</price>
        <year>1998</year>
    </cd>
</catalog>
Problem Write a program to find out if an element by the name price exists in the XML file and display its contents
Solution
import org.w3c.dom.*;
import javax.xml.parsers.*;
import org.xml.sax.*;
public class DOMExample2 {
    public static void main(String[] args) {
        NodeList elements;
        String elementName = "price";
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document document = builder.parse("cdcatalog.xml");
            Element root = document.getDocumentElement();
            System.out.println("In main ... XML file opened successfully ...");
            elements = document.getElementsByTagName(elementName);
            // is there anything to do? (the list is never null, so test its length)
            int elementCount = elements.getLength();
            if (elementCount == 0) {
                System.out.println("No " + elementName + " element found");
                return;
            }
            // print all elements
            System.out.println("Count = " + elementCount);
            for (int i = 0; i < elementCount; i++) {
                Element element = (Element) elements.item(i);
                System.out.println("Element Name  = " + element.getNodeName());
                System.out.println("Element Type  = " + element.getNodeType());
                // getNodeValue() is always null for an element; getTextContent() returns its text
                System.out.println("Element Value = " + element.getTextContent());
                System.out.println("Has attributes = " + element.hasAttributes());
            }
        } catch (ParserConfigurationException e1) {
            System.out.println("Exception: " + e1);
        } catch (SAXException e2) {
            System.out.println("Exception: " + e2);
        } catch (DOMException e2) {
            System.out.println("Exception: " + e2);
        } catch (java.io.IOException e3) {
            System.out.println("Exception: " + e3);
        }
    }
}
Problem Write a program to display element names and their attribute names and values
Solution
import org.w3c.dom.*;
import javax.xml.parsers.*;
import org.xml.sax.*;
public class DOMExample3 {
    public static void main(String[] args) {
        NodeList elements;
        String elementName = "cd";
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document document = builder.parse("cdcatalog.xml");
            Element root = document.getDocumentElement();
            System.out.println("In main ... XML file opened successfully ...");
            elements = document.getElementsByTagName(elementName);
            // is there anything to do?
            if (elements == null) {
                return;
            }
            // print all elements
            int elementCount = elements.getLength();
            System.out.println("Count = " + elementCount);
            for (int i = 0; i < elementCount; i++) {
                Element element = (Element) elements.item(i);
                System.out.println("Element Name  = " + element.getNodeName());
                System.out.println("Element Type  = " + element.getNodeType());
                // getNodeValue() is always null for element nodes
                System.out.println("Element Value  = " + element.getNodeValue());
                System.out.println("Has attributes = " + element.hasAttributes());
                // If attributes exist, print them
                if (element.hasAttributes()) {
                    // store them in a NamedNodeMap object
                    NamedNodeMap attributes = element.getAttributes();
                    // iterate through the NamedNodeMap and get the attribute names and values
                    for (int j = 0; j < attributes.getLength(); j++) {
                        System.out.println("Attribute: " + attributes.item(j).getNodeName()
                                + " = " + attributes.item(j).getNodeValue());
                    }
                }
            }
        } catch (ParserConfigurationException e1) {
            System.out.println("Exception: " + e1);
        } catch (SAXException e2) {
            System.out.println("Exception: " + e2);
        } catch (DOMException e2) {
            System.out.println("Exception: " + e2);
        } catch (java.io.IOException e3) {
            System.out.println("Exception: " + e3);
        }
    }
}
Problem For a given element, find out all the child elements and display their types
Solution
import org.w3c.dom.*;
import javax.xml.parsers.*;
import org.xml.sax.*;
public class DOMExample4 {
    public static void main(String[] args) {
        NodeList elements, children;
        String elementName = "cd";
        Element element = null;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document document = builder.parse("cdcatalog.xml");
            Element root = document.getDocumentElement();
            System.out.println("In main ... XML file opened successfully ...");
            elements = document.getElementsByTagName(elementName);
            // is there anything to do?
            if (elements == null) {
                return;
            }
            // print all elements
            int elementCount = elements.getLength();
            System.out.println("Count = " + elementCount);
            for (int i = 0; i < elementCount; i++) {
                element = (Element) elements.item(i);
                System.out.println("Element Name  = " + element.getNodeName());
                System.out.println("Element Type  = " + element.getNodeType());
                // getNodeValue() is always null for element nodes
                System.out.println("Element Value  = " + element.getNodeValue());
                System.out.println("Has attributes = " + element.hasAttributes());
                // Find out if child nodes exist for this element.
                // The list holds every child node, including whitespace text nodes (#text).
                children = element.getChildNodes();
                for (int j = 0; j < children.getLength(); j++) {
                    Node child = children.item(j);
                    System.out.println("Child node name = " + child.getNodeName()
                            + ", type = " + child.getNodeType());
                }
            }
        } catch (ParserConfigurationException e1) {
            System.out.println("Exception: " + e1);
        } catch (SAXException e2) {
            System.out.println("Exception: " + e2);
        } catch (DOMException e2) {
            System.out.println("Exception: " + e2);
        } catch (java.io.IOException e3) {
            System.out.println("Exception: " + e3);
        }
    }
}
Node Types
Value  Named constant                 Node type           nodeName returns
1      ELEMENT_NODE                   Element             The element name
2      ATTRIBUTE_NODE                 Attribute           The attribute name
3      TEXT_NODE                      Text                #text
4      CDATA_SECTION_NODE             CDATA               #cdata-section
5      ENTITY_REFERENCE_NODE          Entity reference    The entity reference name
6      ENTITY_NODE                    Entity              The entity name
7      PROCESSING_INSTRUCTION_NODE    PI                  The PI target
8      COMMENT_NODE                   Comment             #comment
9      DOCUMENT_NODE                  Document            #document
10     DOCUMENT_TYPE_NODE             DocType             Root element
11     DOCUMENT_FRAGMENT_NODE         DocumentFragment    #document-fragment
12     NOTATION_NODE                  Notation            The notation name
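To see these node types on a real tree, a short recursive walk helps. This is only a sketch: the class NodeTypeSketch and its methods typeName, dump and describe are hypothetical names, and only a few of the twelve constants are mapped.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class NodeTypeSketch {
    // Maps a few of the Node type constants to readable names
    static String typeName(Node node) {
        switch (node.getNodeType()) {
            case Node.ELEMENT_NODE:  return "ELEMENT_NODE";
            case Node.TEXT_NODE:     return "TEXT_NODE";
            case Node.COMMENT_NODE:  return "COMMENT_NODE";
            case Node.DOCUMENT_NODE: return "DOCUMENT_NODE";
            default:                 return "OTHER(" + node.getNodeType() + ")";
        }
    }

    // Appends one line per node (type constant, then nodeName), depth-first
    static void dump(Node node, String indent, StringBuilder out) {
        out.append(indent).append(typeName(node)).append(' ')
           .append(node.getNodeName()).append('\n');
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            dump(children.item(i), indent + "  ", out);
        }
    }

    // Parses an XML string and returns the type/name listing for the whole tree
    static String describe(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        StringBuilder out = new StringBuilder();
        dump(doc, "", out);
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.print(describe("<cd><!-- note --><title>Empire Burlesque</title></cd>"));
    }
}
```

The listing shows the text inside title as a separate #text child, which is why getNodeValue on the element itself returns null.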
Making Use of Node Types
import org.w3c.dom.*;
import javax.xml.parsers.*;
import org.xml.sax.*;
public class DOMExample4 {
    public static void main(String[] args) {
        NodeList elements, children;
        String elementName = "cd";
        Element element = null;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document document = builder.parse("cdcatalog.xml");
            Element root = document.getDocumentElement();
            System.out.println("In main ... XML file opened successfully ...");
            elements = document.getElementsByTagName(elementName);
            // is there anything to do?
            if (elements == null) {
                return;
            }
            // print all elements
            int elementCount = elements.getLength();
            System.out.println("Count = " + elementCount);
            for (int i = 0; i < elementCount; i++) {
                element = (Element) elements.item(i);
                System.out.println("Element Name  = " + element.getNodeName());
                System.out.println("Element Type  = " + element.getNodeType());
                // getNodeValue() is always null for element nodes
                System.out.println("Element Value  = " + element.getNodeValue());
                System.out.println("Has attributes = " + element.hasAttributes());
                // Find out if child nodes exist for this element
                children = element.getChildNodes();
                for (int j = 0; j < children.getLength(); j++) {
                    Node child = children.item(j);
                    // Use the node type to skip whitespace text nodes, comments, etc.
                    if (child.getNodeType() == Node.ELEMENT_NODE) {
                        System.out.println("Child element name = " + child.getNodeName());
                    }
                }
            }
        } catch (ParserConfigurationException e1) {
            System.out.println("Exception: " + e1);
        } catch (SAXException e2) {
            System.out.println("Exception: " + e2);
        } catch (DOMException e2) {
            System.out.println("Exception: " + e2);
        } catch (java.io.IOException e3) {
            System.out.println("Exception: " + e3);
        }
    }
}
Problem Write a program to create XML contents dynamically and write them to a file on the disk
Solution
import java.io.File;
import org.w3c.dom.*;
import javax.xml.parsers.*;
import javax.xml.transform.Source;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.Result;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
public class DOMExample5 {
    public static void main(String[] args) {
        Source source;
        File file;
        Result result;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            // Create a new XML document
            Document document = builder.newDocument();
            Element root = document.createElement("Order");
            // Insert child Manifest
            document.appendChild(root);
            Node manifestChild = document.createElement("Manifest");
            root.appendChild(manifestChild);
            // Insert Items (CreateOrderDOM is a helper class that is not shown on this slide)
            CreateOrderDOM co = new CreateOrderDOM();
            co.insertItem(document, manifestChild, "101", "Name one", "$29.99");
            co.insertItem(document, manifestChild, "108", "Name two", "$19.99");
            co.insertItem(document, manifestChild, "125", "Name three", "$39.99");
            co.insertItem(document, manifestChild, "143", "Name four", "$59.99");
            co.insertItem(document, manifestChild, "118", "Name five", "$99.99");
            // Normalizing the DOM
            document.getDocumentElement().normalize();
            // Prepare the DOM document for writing
            source = new DOMSource(document);
            // Prepare the output file
            file = new File("test.xml");
            result = new StreamResult(file);
            // Write the DOM document to the file
            // Get Transformer
            Transformer xformer = TransformerFactory.newInstance().newTransformer();
            // Write to a file
            xformer.transform(source, result);
        } catch (Exception ex) {
            ex.printStackTrace();
        }
    }
}
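The solution relies on a CreateOrderDOM helper whose source is not part of the slides. The sketch below shows one way such an insertItem method might look; the element names Item, Name and Price and the id attribute are assumptions made for illustration, and the result is serialized to a string instead of a file so it can be inspected directly.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

public class CreateOrderSketch {
    // Hypothetical stand-in for CreateOrderDOM.insertItem; the element layout is assumed
    static void insertItem(Document doc, Node parent, String id, String name, String price) {
        Element item = doc.createElement("Item");
        item.setAttribute("id", id);
        Element nameElement = doc.createElement("Name");
        nameElement.appendChild(doc.createTextNode(name));
        Element priceElement = doc.createElement("Price");
        priceElement.appendChild(doc.createTextNode(price));
        item.appendChild(nameElement);
        item.appendChild(priceElement);
        parent.appendChild(item);
    }

    // Builds Order/Manifest/Item in memory and serializes the tree to a string
    static String buildOrderXml() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element root = doc.createElement("Order");
        doc.appendChild(root);
        Element manifest = doc.createElement("Manifest");
        root.appendChild(manifest);
        insertItem(doc, manifest, "101", "Name one", "$29.99");
        insertItem(doc, manifest, "108", "Name two", "$19.99");
        // A Transformer created without a stylesheet copies the source to the result unchanged
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(buildOrderXml());
    }
}
```

Swapping the StringWriter for new StreamResult(new File("test.xml")) gives the file output used in the solution.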
Streaming API for XML (StAX) – Brief Overview To be covered in more depth in “Web Services”
What is StAX? Addition in Java EE 5.0 Pull approach Event-based API Different from SAX, since the application pulls events from the XML document/parser, and not the other way round Supports both reading and writing XML
StAX Classification Two APIs Cursor-based API (XMLStreamReader) Allows walk-through of an XML document in document order, one event at a time Gives low-level access to the structural and content information at the current cursor position Iterator-based API (XMLEventReader) Similar to the cursor API, but returns each event as a discrete event object
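The pull style is easiest to see with the cursor API. Below is a minimal sketch using XMLStreamReader from javax.xml.stream; the class StaxCursorSketch and the method countElements are illustrative names, and the input is a shortened version of the books document from the SAX slides.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

public class StaxCursorSketch {
    // Counts start tags with the given local name by pulling events one at a time
    static int countElements(String xml, String name) throws Exception {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
        int count = 0;
        try {
            // The application, not the parser, decides when to advance to the next event
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT
                        && name.equals(reader.getLocalName())) {
                    count++;
                }
            }
        } finally {
            reader.close();
        }
        return count;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<books>"
                + "<book><name>Learning XML</name></book>"
                + "<book><name>XML by Example</name></book>"
                + "</books>";
        System.out.println("Books found: " + countElements(xml, "book"));
    }
}
```

Because the loop belongs to the application, it can stop reading as soon as it has what it needs, which a SAX handler cannot do as easily.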
Using XSLT in JAXP
Applying an XSLT to an XML File Programmatically
package sicsr;
import javax.xml.transform.*;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import java.io.*;
public class ApplyXSLT {
    public static void main(String[] args) {
        try {
            StreamSource xmlFile = new StreamSource(new File("history.xml"));
            StreamSource xslFile = new StreamSource(new File("history.xsl"));
            TransformerFactory xslFactory = TransformerFactory.newInstance();
            // The Transformer is compiled from the stylesheet
            Transformer transformer = xslFactory.newTransformer(xslFile);
            // Send the transformation result to standard output
            StreamResult resultStream = new StreamResult(System.out);
            transformer.transform(xmlFile, resultStream);
        } catch (Exception ex) {
            ex.printStackTrace();
        }
    }
}
Details about the Transformer TransformerFactory is an abstract class in the javax.xml.transform package Can be used to create a Transformer object Transformer is also an abstract class in the javax.xml.transform package An instance of this class can transform a source tree into a result tree
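The same two classes also work on in-memory data. The sketch below assumes a tiny stylesheet written for this example (it is not the history.xsl used above) that lists the CD titles from the case-study catalog; the class XsltSketch and the method transform are hypothetical names.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class XsltSketch {
    // Applies the stylesheet text to the XML text and returns the result as a string
    static String transform(String xml, String xsl) throws Exception {
        TransformerFactory factory = TransformerFactory.newInstance();
        Transformer transformer = factory.newTransformer(new StreamSource(new StringReader(xsl)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<catalog>"
                + "<cd><title>Empire Burlesque</title></cd>"
                + "<cd><title>Candle in the wind</title></cd>"
                + "</catalog>";
        // Illustrative stylesheet: emits each CD title followed by a semicolon, as plain text
        String xsl = "<xsl:stylesheet version=\"1.0\" "
                + "xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
                + "<xsl:output method=\"text\"/>"
                + "<xsl:template match=\"/\">"
                + "<xsl:for-each select=\"catalog/cd\">"
                + "<xsl:value-of select=\"title\"/><xsl:text>;</xsl:text>"
                + "</xsl:for-each>"
                + "</xsl:template>"
                + "</xsl:stylesheet>";
        System.out.println(transform(xml, xsl));
    }
}
```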
XML and ASP.NET – An Overview
XmlReader and XmlWriter XmlReader Pull-style API for XML Forward-only, read-only access to XML documents XmlReader is an abstract class from which concrete readers such as XmlTextReader and XmlNodeReader derive In ASP.NET 2.0, XmlReader also acts as a factory We need not specify which implementation of XmlReader to use We call the static Create method, supply the necessary parameters, and let .NET decide which implementation to instantiate
Example – XML Document
<?xml version="1.0" encoding="utf-8"?>
<bookstore>
    <book genre="autobiography" publicationdate="1981" ISBN="1-861003-11-0">
        <title>The Autobiography of Benjamin Franklin</title>
        <author>
            <first-name>Benjamin</first-name>
            <last-name>Franklin</last-name>
        </author>
        <price>8.99</price>
    </book>
    <book genre="novel" publicationdate="1967" ISBN="0-201-65512-2">
        <title>The Confidence Man</title>
        <author>
            <first-name>Herman</first-name>
            <last-name>Melville</last-name>
        </author>
        <price>11.99</price>
    </book>
    <book genre="philosophy" publicationdate="1991" ISBN="1-861001-57-6">
        <title>The Gorgias</title>
        <author>
            <first-name>Sidas</first-name>
            <last-name>Plato</last-name>
        </author>
        <price>9.99</price>
    </book>
</bookstore>
Example – ASP.NET Page
using System;
using System.Data;
using System.Configuration;
using System.Collections;
using System.Web;
using System.Web.Security;
using System.Web.UI;
using System.Web.UI.WebControls;
using System.Web.UI.WebControls.WebParts;
using System.Web.UI.HtmlControls;
using System.Xml;
using System.IO;
public partial class XMLReader2 : System.Web.UI.Page {
    protected void Page_Load(object sender, EventArgs e) {
        int bookCount = 0;
        XmlReaderSettings settings = new XmlReaderSettings();
        settings.IgnoreWhitespace = true;
        settings.IgnoreComments = true;
        string booksFile = Path.Combine(Request.PhysicalApplicationPath, "Books.xml");
        using (XmlReader reader = XmlReader.Create(booksFile, settings)) {
            // Pull nodes one at a time and count the book start tags
            while (reader.Read()) {
                if (reader.NodeType == XmlNodeType.Element && "book" == reader.LocalName) {
                    bookCount++;
                }
            }
        }
        Response.Write(String.Format("Found {0} books!", bookCount));
    }
}
Validating an XML Document Against a Schema
using System.Xml.Schema;
using System;
using System.Xml;
using System.IO;
public partial class XMLReader3 : System.Web.UI.Page {
    protected void Page_Load(object sender, EventArgs e) {
        int bookCount = 0;
        XmlReaderSettings settings = new XmlReaderSettings();
        string booksSchemaFile = Path.Combine(Request.PhysicalApplicationPath, "books.xsd");
        // Register the schema and turn on schema validation
        settings.Schemas.Add(null, XmlReader.Create(booksSchemaFile));
        settings.ValidationType = ValidationType.Schema;
        settings.ValidationFlags = XmlSchemaValidationFlags.ReportValidationWarnings;
        settings.ValidationEventHandler += new ValidationEventHandler(settings_ValidationEventHandler);
        settings.IgnoreWhitespace = true;
        settings.IgnoreComments = true;
        string booksFile = Path.Combine(Request.PhysicalApplicationPath, "Books.xml");
        using (XmlReader reader = XmlReader.Create(booksFile, settings)) {
            // Validation happens as the reader moves through the document
            while (reader.Read()) {
                if (reader.NodeType == XmlNodeType.Element && "book" == reader.LocalName) {
                    bookCount++;
                }
            }
        }
        Response.Write(String.Format("Found {0} books!", bookCount));
    }
    void settings_ValidationEventHandler(object sender, System.Xml.Schema.ValidationEventArgs e) {
        // Report each validation problem on the page
        Response.Write(e.Message);
    }
}
Thank you! Any Questions?

  • 14.
    Sequential Processing Example– 2 Events generated when we read the above XML file Start document Start element: books Start element: book Start element: name Characters: Learning XML End element: name Start element: author Characters: Simon North End element: author Start element: publication Characters: TMH End element: publication … End element: book End document
  • 15.
  • 16.
    Tree Processing Sequence1 2 8 3 4 9 10 14 15 5 6 7 11 12 13 16 17
  • 17.
    Sequential Traversal: SummaryOrder Top to bottom Left to right Advantages Simple Fast Requires less amount of memory Drawback Not possible to look ahead
  • 18.
  • 19.
    JAXP Java APIfor XML Processing
  • 20.
    JAXP Concept Applicationprogram written in Java for working with XML Java API for XML Processing (JAXP) JAXP APIs Simple API for XML Processing (SAX) Document Object Model (DOM) Sequential processing Tree-based processing
  • 21.
    JAXP Java APIfor XML Processing Standardized by Sun Very thin layer on top of SAX or DOM Makes application code parser-independent Our programs should use JAXP, which in turn, calls parser APIs Include package javax.xml.parsers.*
  • 22.
    JAXP: API orAbstraction? JAXP is an API, but is called as an abstraction layer Does not provide new means of parsing XML Does not add to SAX or DOM Does not give new functionality to Java or XML handling Makes working with SAX and DOM easier It is vendor-neutral
  • 23.
    JAXP and ParsingJAXP is not a replacement for SAX, DOM, JDOM etc Some vendor must supply the implementation of SAX, DOM, etc JAXP provides APIs to use these implementations In the early versions of JDK, Sun had supplied a parser called Crimson Now, Sun provides Apache Xerces Both are not a part of JAXP API – they are part of JAXP distribution In JDK, we can locate Xerces implementations in the org.xml.sax and org.w3c.dom packages
  • 24.
    JAXP API Themain JAXP APIs are defined in the package javax.xml.parsers Contains two vendor-neutral factory classes SAXParserFactory – Gives a SAXParser object DocumentBuilderFactory – Gives a DocumentBuilder object DocumentBuilder, in turn, gives Document object
  • 25.
    Package Details javax.xml.parsers The JAXP APIs, which provide a common interface for different vendors' SAX and DOM parsers. org.w3c.dom Defines the Document class (a DOM), as well as classes for all of the components of a DOM. org.xml.sax Defines the basic SAX APIs. javax.xml.transform Defines the XSLT APIs that let you transform XML into other forms.
  • 26.
    Which Packages touse in JAXP? We need to include two sets of packages – one for JAXP and the other for SAX/DOM, as appropriate // JAXP import javax.xml.parsers.SAXParserFactory; // SAX import org.xml.sax.Attributes; import org.xml.sax.SAXException; import org.xml.sax.SAXNotRecognizedException; import org.xml.sax.SAXNotSupportedException; import org.xml.sax.SAXParseException; import org.xml.sax.XMLReader; import org.xml.sax.helpers.DefaultHandler; import org.xml.sax.helpers.XMLReaderFactory;
  • 27.
  • 28.
  • 29.
    Key SAX APIs– 1 SAXParserFactory Creates an instance of the parser determined by the system property, javax.xml.parsers.SAXParserFactory.   SAXParser An interface that defines several kinds of parse() methods. In general, you pass an XML data source and a DefaultHandler object to the parser, which processes the XML and invokes the appropriate methods in the handler object.    
  • 30.
    Key SAX APIs– 2 SAXReader The SAXParser wraps a SAXReader. Typically, you don't care about that, but every once in a while you need to get hold of it using SAXParser's getXMLReader(), so you can configure it. It is the SAXReader which carries on the conversation with the SAX event handlers you define. DefaultHandler Not shown in the diagram, a DefaultHandler implements the ContentHandler, ErrorHandler, DTDHandler, and EntityResolver interfaces (with null methods), so you can override only the ones you're interested in.
  • 31.
  • 32.
    “ new” means“Concrete” Vehicle vehicle = new Car (); We want to use an interface (say Vehicle) to keep code flexible However, we must create an instance of a concrete class (e.g. Car) Makes the code more fragile and less flexible – Why? See next slide.
  • 33.
    Using “new” –1 Vehicle vehicle; if (picnic) vehicle = new Car (); else if (work) vehicle = new Bus (); else vehicle = new Scooter ();
  • 34.
    Using “new” –2 We do not know until run time which class to instantiate Whenever code needs to be changed, we need to reopen this code and examine what needs to be added or removed Mandates application changes at multiple places, making it difficult to maintain
  • 35.
    What is wrongwith “new”? Nothing as such Problem is changes to code and their impact on “new” By coding to an interface, we know that we are insulated from changes made to a system This is because different classes would implement the interface using polymorphism appropriately
  • 36.
    Key OO PrincipleIdentify the aspects of code that vary and separate them from what stays the same Code should be open for extension, but closed for modifications
  • 37.
    Pizza Class –Ideal Situation Pizza orderPizza () { Pizza pizza = new Pizza () ; Pizza.prepare (); Pizza.bake (); Pizza.cut (); Pizza.pack(); return pizza; } Ideally, we would like this to be an abstract class or an interface, but we cannot directly instantiate an abstract class or an interface!
  • 38.
    Pizza Class –Ideal Situation Pizza orderPizza () { Pizza pizza = new Cheese Pizza () ; Pizza.prepare (); Pizza.bake (); Pizza.cut (); Pizza.pack(); return pizza; } We are left with no choice but to instantiate a concrete class
  • 39.
    Solution Pizza orderPizza(String type) { Pizza pizza; if (type.equals (“cheese”)) { pizza = new CheesePizza (); } else if (type.equals (“corn”)) { pizza = new CornPizza (); } else { pizza = new GeneralPizza (); } Pizza.prepare (); Pizza.bake (); Pizza.cut (); Pizza.pack(); return pizza; } We are passing the type of pizza to orderPizza () method. Based on the type of pizza, we instantiate the correct concrete class. Each pizza has to implement the Pizza interface. Each pizza sub-type (e.g. cheese) knows how to prepare itself.
  • 40.
    Is this correct?Let us review the principles: Identify the aspects of code that vary and separate them from what stays the same Code should be open for extension, but closed for modifications
  • 41.
    Problems What ifwe remove one pizza type, and add another? We need to touch the code See next slide
  • 42.
    Problem – CodePizza orderPizza (String type) { Pizza pizza; if (type.equals (“veg”)) { pizza = new VeggiePizza (); } else if (type.equals (“corn”)) { pizza = new CornPizza (); } else { pizza = new GeneralPizza (); } Pizza.prepare (); Pizza.bake (); Pizza.cut (); Pizza.pack(); return pizza; return pizza; } Problem is that we end up touching code for modifications. This is not what we want. What is the solution?
  • 43.
    Code Modified FurtherPizza orderPizza (String type) { Pizza pizza; if (type.equals (“cheese”)) { pizza = new CheesePizza (); } else if (type.equals (“corn”)) { pizza = new CornPizza (); } else { pizza = new GeneralPizza (); } Pizza.prepare (); Pizza.bake (); Pizza.cut (); Pizza.pack(); return pizza; return pizza; } if (type.equals (“cheese”)) { pizza = new CheesePizza (); } else if (type.equals (“corn”)) { pizza = new CornPizza (); } else { pizza = new GeneralPizza (); } Abstract out the code that varies, and put it into a separate class, that would only worry about how to create objects. If any other object needs a pizza object, this is the class to come to.
  • 44.
    This new Classis our “Factory” public class PizzaFactory { public Pizza createPizza (String type) { Pizza pizza = null; if (type.equals (“cheese”)) { pizza = new CheesePizza (); } else if (type.equals (“corn”)) { pizza = new CornPizza (); } else { pizza = new GeneralPizza (); } return pizza; } } This new class creates new pizzas for its clients. It has a createPizza () method, which all clients will use to instantiate new objects. It contains code plucked out of the orderPizza () method.
  • 45.
    Modified Client Codepublic class PizzaStore { PizzaFactory factory = new PizzaFactory (); public Pizza orderPizza (String type) { Pizza pizza = factory.createPizza (“Cheese”); pizza.prepare (); pizza.bake (); pizza.cut (); pizza.pack (); return pizza; } } A variation of this is: PizzaFactory factory = PizzaFactory.newInstance (); In this case, newInstance () would be a static method in PizzaFactory, since we are not creating any object of PizzaFactory here
  • 46.
  • 47.
    Factory Pattern –The Need – 1 Consider this: Connection connection = new OracleConnection (); Connection connection = new SqlServerConnection (); Connection connection = new DB2Connection (); ... What are the problems? How to resolve them?
  • 48.
    Problems Summarized Sometimes,an Application (or framework) at runtime, cannot anticipate the class of object that it must create. The Application (or framework) may know that it has to instantiate classes, but it may only know about abstract classes (or interfaces), which it cannot instantiate. Thus the Application class may only know when it has to instantiate a new Object of a class, not what kind of subclass to create. A class may want it's subclasses to specify the objects to be created. A class may delegate responsibility to one of several helper subclasses so that knowledge can be localized to specific helper subclasses.
  • 49.
    Factory Pattern –The Need – 2 public Connection createConnection (String type) { if (type.equals (&quot;Oracle&quot;) { return new OracleConnection (); } else if (type.equals (&quot;SQL Server&quot;) { return new SqlServerConnection (); } else if (type.equals (&quot;DB2&quot;) { return new DB2Connection (); } } Does it resolve all problems?
  • 50.
    More on FactoryPattern – 1 Factory Method is a creational pattern. This pattern helps to model an interface for creating an object which at creation time can let its subclasses decide which class to instantiate. We call this a Factory Pattern since it is responsible for &quot;Manufacturing&quot; an Object. It helps instantiate the appropriate Subclass by creating the right Object from a group of related classes. The Factory Pattern promotes loose coupling by eliminating the need to bind application-specific classes into the code. Factories have a simple function: Churn out objects.
  • 51.
    More on FactoryPattern – 2 Obviously, a factory is not needed to make an object. A simple call to new will do it for you. However, the use of factories gives the programmer the opportunity to abstract the specific attributes of an Object into specific subclasses which create them. The Factory Pattern is all about &quot; Define an interface for creating an object, but let the subclasses decide which class to instantiate. The Factory method lets a class defer instantiation to subclasses &quot;
  • 52.
    Factory Pattern –The Need – 3 We still need to add new code for a new connection type The existing class needs to undergo changes every time When object creation changes a lot, use a factory
  • 53.
    Factory Pattern –The Need – 4 Client code to use the factory FirstFactory factory = FirstFactory.getInstance (); Connection connection = factory.createConnection (“Oracle”);
  • 54.
    The Factory Class – 1
    public class FirstFactory {
        protected static String type;

        public static FirstFactory getInstance () {
            type = null;
            return new FirstFactory ();
        }
        ...
    }
  • 55.
    The Factory Class – 2
    public class FirstFactory {
        protected static String type;

        public static FirstFactory getInstance () {
            type = "";
            return new FirstFactory ();
        }

        public Connection createConnection (String t) {
            type = t;
            if (type.equals ("Oracle")) {
                return new OracleConnection ();
            } else if (type.equals ("SQL Server")) {
                return new SQLServerConnection ();
            } else {
                // Any other value, including "DB2", falls through to DB2
                return new DB2Connection ();
            }
        }
    }
  • 56.
    Connection Classes
    public interface Connection {
        public String description ();
    }

    public class OracleConnection implements Connection {
        public OracleConnection () {
            // Logic specific to Oracle
        }
        public String description () { return "Oracle"; }
    }

    public class SQLServerConnection implements Connection {
        public SQLServerConnection () {
            // Logic specific to SQL Server
        }
        public String description () { return "SQL Server"; }
    }

    public class DB2Connection implements Connection {
        public DB2Connection () {
            // Logic specific to DB2
        }
        public String description () { return "DB2"; }
    }
  • 57.
    Client Code
    public class TestConnection {
        public static void main (String args []) {
            FirstFactory factory = FirstFactory.getInstance ();
            Connection connection = factory.createConnection ("DB2");
            System.out.println ("You are connected with " + connection.description ());
        }
    }
  • 58.
  • 59.
    Exercise We want to be able to create any of the following objects, which have some similarities and some differences. Design a solution using the Factory Method design pattern. Employee, Student, Player
  • 60.
  • 61.
    Sequential Traversal: SAX
    SAX (Simple API for XML) processing takes four steps:
    1. Specify the parser to be used
    2. Create a parser instance
    3. Create an event handler to respond to parsing events
    4. Invoke the parser with the designated content handler and document
  • 62.
    1 – Specify the Parser Various approaches are possible:
    Set a system property for javax.xml.parsers.SAXParserFactory
    Specify the parser in jre_dir/lib/jaxp.properties
    Use the system-dependent default parser (check documentation)
    Usually this is done automatically at the time of JDK installation itself
  • 63.
    1 – Specify the Parser Example
    public static void main (String [] args) {
        String jaxpPropertyName = "javax.xml.parsers.SAXParserFactory";
        …
    }
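To see which of these approaches took effect on a given machine, one option is simply to ask JAXP what it selected. The sketch below only reads the property and reports the chosen factory class; the class name ParserLookup is made up for illustration, and the printed factory name depends on the JDK in use.

```java
import javax.xml.parsers.SAXParserFactory;

class ParserLookup {
    // Property name JAXP consults first when choosing a SAX parser factory.
    static final String JAXP_PROPERTY = "javax.xml.parsers.SAXParserFactory";

    // Returns the class name of the factory JAXP actually selected.
    static String selectedFactory() {
        return SAXParserFactory.newInstance().getClass().getName();
    }

    public static void main(String[] args) {
        // A null value means the property is unset and JAXP uses its fallbacks.
        System.out.println("Property value: " + System.getProperty(JAXP_PROPERTY));
        System.out.println("Factory in use: " + selectedFactory());
    }
}
```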
  • 64.
    2 – Create a Parser Instance Steps: Create an instance of a parser factory. Use that to create a SAXParser object. Example
    SAXParserFactory factory = SAXParserFactory.newInstance ();
    SAXParser p = factory.newSAXParser ();
  • 65.
    3 – Create an Event Handler The event handler responds to parsing events. It is a subclass of DefaultHandler:
    public class MyHandler extends DefaultHandler { … }
    Main event methods (callbacks):
    startDocument, endDocument
    startElement, endElement
    characters, ignorableWhitespace
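A small handler that records each callback makes the order of events visible. This is a sketch under assumptions: the class name EventTrace and the "start:"/"end:"/"text:" labels are invented for illustration, and whitespace-only text is dropped to keep the trace short.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class EventTrace extends DefaultHandler {
    private final List<String> events = new ArrayList<>();

    @Override public void startDocument() { events.add("startDocument"); }
    @Override public void endDocument()   { events.add("endDocument"); }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        events.add("start:" + qName);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        events.add("end:" + qName);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Only the slice [start, start + length) belongs to this event.
        String text = new String(ch, start, length).trim();
        if (!text.isEmpty()) events.add("text:" + text);
    }

    // Parses the given XML string and returns the callbacks in arrival order.
    static List<String> trace(String xml) {
        EventTrace handler = new EventTrace();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
        return handler.events;
    }

    public static void main(String[] args) {
        trace("<books><book><name>Learning XML</name></book></books>")
                .forEach(System.out::println);
    }
}
```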
  • 66.
    3 – Create an Event Handler Example method: startElement
    Declaration
    public void startElement (String nameSpaceURI, String localName, String qualifiedName, Attributes attributes) throws SAXException
    Arguments
    nameSpaceURI: URI identifying the namespace uniquely
    localName: Element name without namespace prefix
    qualifiedName: Complete element name, including namespace prefix
    attributes: Attributes object, representing attributes of the element
  • 67.
    3 – Create an Event Handler
    <cwp:book xmlns:cwp="http://www.test.com/xml/">          (nameSpaceURI: the xmlns:cwp value)
      <cwp:chapter number="23" part="Server programming">    (qualifiedName: cwp:chapter; attribute[1])
        <cwp:title>XML made easy</cwp:title>                 (localName: title)
      </cwp:chapter>
    </cwp:book>
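The arguments in this example can be checked by feeding the same document to a parser. One point worth knowing is that a JAXP SAXParserFactory is not namespace-aware by default, so the sketch below switches that on explicitly. The class name NamespaceArgsDemo and the "|"-separated record format are assumptions made for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class NamespaceArgsDemo {

    // Records "namespaceURI|localName|qualifiedName|attributeCount"
    // for every start tag in the document.
    static List<String> describe(String xml) {
        List<String> seen = new ArrayList<>();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);   // off by default in JAXP
            factory.newSAXParser().parse(
                    new InputSource(new StringReader(xml)),
                    new DefaultHandler() {
                        @Override
                        public void startElement(String uri, String localName,
                                                 String qName, Attributes attrs) {
                            seen.add(uri + "|" + localName + "|" + qName
                                    + "|" + attrs.getLength());
                        }
                    });
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
        return seen;
    }

    public static void main(String[] args) {
        String xml = "<cwp:book xmlns:cwp=\"http://www.test.com/xml/\">"
                + "<cwp:chapter number=\"23\" part=\"Server programming\">"
                + "<cwp:title>XML made easy</cwp:title>"
                + "</cwp:chapter></cwp:book>";
        describe(xml).forEach(System.out::println);
    }
}
```

Without the setNamespaceAware(true) call, the parser typically passes empty strings for the namespace URI and local name, which is why the later examples in this deck compare against the qualified (raw) name.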
  • 68.
    4 – Invoke the Parser Call the parse method, supplying: the XML document (a file name or an input stream) and the content handler
    p.parse (fileName, handler);
  • 69.
    Sample XML File (emp.xml)
    <?xml version="1.0" encoding="UTF-8"?>
    <root>
      <employee>test 1</employee>
      <employee>test 1</employee>
      <employee>test 1</employee>
      <employee>test 1</employee>
      <employee>test 1</employee>
      <employee>test 1</employee>
      <employee>test 1</employee>
    </root>
  • 70.
    Java Program to Count Total Number of Elements
    import javax.xml.parsers.SAXParser;
    import javax.xml.parsers.SAXParserFactory;
    import org.xml.sax.*;
    import org.xml.sax.helpers.DefaultHandler;

    public class SAXEmployeeCount extends DefaultHandler {
        int tagCount = 0;

        // Called once per start tag, including the root element
        public void startElement (String uri, String localName, String rawName, Attributes attributes) {
            tagCount++;
        }

        public void endDocument () {
            System.out.println ("There are " + tagCount + " elements.");
        }

        public static void main (String[] args) {
            SAXEmployeeCount handler = new SAXEmployeeCount ();
            try {
                SAXParserFactory spf = SAXParserFactory.newInstance ();
                SAXParser parser = spf.newSAXParser ();
                // File name matches the sample file on the previous slide
                parser.parse ("emp.xml", handler);
            } catch (Exception ex) {
                System.out.println (ex);
            }
        }
    }
  • 71.
    Count Only Book Elements
    <?xml version="1.0"?>
    <books>
      <book category="reference">
        <author>Nigel Rees</author>
        <title>Sayings of the Century</title>
        <price>8.95</price>
      </book>
      <book category="fiction">
        <author>Evelyn Waugh</author>
        <title>Sword of Honour</title>
        <price>12.99</price>
      </book>
      <book category="fiction">
        <author>Herman Melville</author>
        <title>Moby Dick</title>
        <price>8.99</price>
      </book>
    </books>
  • 72.
    Parsing Code in JAXP
    import javax.xml.parsers.SAXParser;
    import javax.xml.parsers.SAXParserFactory;
    import org.xml.sax.Attributes;
    import org.xml.sax.SAXException;
    import org.xml.sax.helpers.DefaultHandler;

    public class BookCount extends DefaultHandler {
        private int count = 0;

        public void startDocument () throws SAXException {
            System.out.println ("Start document ...");
        }

        public void startElement (String uri, String local, String raw, Attributes attrs) throws SAXException {
            System.out.println ("Current element = " + raw);
            // Count only <book> start tags
            if (raw.equals ("book")) {
                count++;
            }
        }

        public void endDocument () throws SAXException {
            System.out.println ("The total number of books = " + count);
        }

        public static void main (String[] args) throws Exception {
            BookCount handler = new BookCount ();
            try {
                SAXParserFactory spf = SAXParserFactory.newInstance ();
                SAXParser parser = spf.newSAXParser ();
                parser.parse ("book.xml", handler);
            } catch (SAXException e) {
                System.err.println (e.getMessage ());
            }
        }
    }
  • 73.
    Specifying Parser Name
    import java.io.IOException;
    import javax.xml.parsers.SAXParser;
    import javax.xml.parsers.SAXParserFactory;
    import org.xml.sax.Attributes;
    import org.xml.sax.SAXException;
    import org.xml.sax.helpers.DefaultHandler;

    public class SAXApp extends DefaultHandler {
        // Default parser to use. Note: nothing below reads this constant;
        // the JAXP factory in main chooses the parser implementation.
        protected static final String DEFAULT_PARSER_NAME = "org.apache.xerces.parsers.SAXParser";
        private int count = 0;

        public void countTopics () throws IOException, SAXException {
            // create parser
            try {
                System.out.println ("Inside countTopics");
            } catch (Exception e) {
                e.printStackTrace (System.err);
            }
        }

        public void startElement (String uri, String local, String raw, Attributes attrs) throws SAXException {
            if (raw.equals ("topic")) count++;
            System.out.println (raw);
        }

        public void endDocument () throws SAXException {
            System.out.println ("There are " + count + " topics");
        }

        public static void main (String[] args) throws Exception {
            System.out.println ("Inside main ...");
            SAXApp handler = new SAXApp ();
            try {
                SAXParserFactory spf = SAXParserFactory.newInstance ();
                SAXParser parser = spf.newSAXParser ();
                parser.parse ("contents.xml", handler);
            } catch (SAXException e) {
                System.err.println (e.getMessage ());
            }
        }
    }
  • 74.
    Exercise Consider the following XML file and write a program to count the number of elements that have at least one attribute.
    <?xml version="1.0"?>
    <BOOKS>
      <BOOK pubyear="1929">
        <BOOK_TITLE>Look Homeward, Angel</BOOK_TITLE>
        <AUTHOR>Wolfe, Thomas</AUTHOR>
      </BOOK>
      <BOOK pubyear="1973">
        <BOOK_TITLE>Gravity's Rainbow</BOOK_TITLE>
        <AUTHOR>Pynchon, Thomas</AUTHOR>
      </BOOK>
      <BOOK pubyear="1977">
        <BOOK_TITLE>Cards as Weapons</BOOK_TITLE>
        <AUTHOR>Jay, Ricky</AUTHOR>
      </BOOK>
      <BOOK pubyear="2001">
        <BOOK_TITLE>Computer Networks</BOOK_TITLE>
        <AUTHOR>Tanenbaum, Andrew</AUTHOR>
      </BOOK>
    </BOOKS>
  • 75.
    Solution
    import javax.xml.parsers.SAXParser;
    import javax.xml.parsers.SAXParserFactory;
    import org.xml.sax.Attributes;
    import org.xml.sax.SAXException;
    import org.xml.sax.helpers.DefaultHandler;

    public class countAttr extends DefaultHandler {
        private int count = 0;

        public void startDocument () throws SAXException {
            System.out.println ("Start document ...");
        }

        public void startElement (String uri, String local, String raw, Attributes attrs) throws SAXException {
            System.out.println ("Current element = " + raw);
            // Count the element if it carries at least one attribute
            if (attrs.getLength () != 0) {
                count++;
            }
        }

        public void endDocument () throws SAXException {
            System.out.println ("The total number of elements with attributes = " + count);
        }

        public static void main (String[] args) throws Exception {
            countAttr handler = new countAttr ();
            try {
                SAXParserFactory spf = SAXParserFactory.newInstance ();
                SAXParser parser = spf.newSAXParser ();
                parser.parse ("countAttr.xml", handler);
            } catch (SAXException e) {
                System.err.println (e.getMessage ());
            }
        }
    }
  • 76.
    Exercise For the same XML file, display element names only if the book was published in the 1970s.
  • 77.
    Solution
    import javax.xml.parsers.SAXParser;
    import javax.xml.parsers.SAXParserFactory;
    import org.xml.sax.Attributes;
    import org.xml.sax.SAXException;
    import org.xml.sax.helpers.DefaultHandler;

    public class seventiesBooks extends DefaultHandler {
        private int count = 0;

        public void startDocument () throws SAXException {
            System.out.println ("Start document ...");
        }

        public void startElement (String uri, String local, String raw, Attributes attrs) throws SAXException {
            // Only BOOK elements carry the pubyear attribute
            String attrValue = attrs.getValue ("pubyear");
            if (attrValue != null) {
                int year = Integer.parseInt (attrValue);
                // The 1970s run from 1970 to 1979 inclusive
                if (year >= 1970 && year <= 1979) {
                    System.out.println ("Current element = " + raw);
                    count++;
                }
            }
        }

        public void endDocument () throws SAXException {
            System.out.println ("The total number of matching elements = " + count);
        }

        public static void main (String[] args) throws Exception {
            seventiesBooks handler = new seventiesBooks ();
            try {
                SAXParserFactory spf = SAXParserFactory.newInstance ();
                SAXParser parser = spf.newSAXParser ();
                parser.parse ("countAttr.xml", handler);
            } catch (SAXException e) {
                System.err.println (e.getMessage ());
            }
        }
    }
  • 78.
    Exercise Consider the following XML document (stock.xml)
    <?xml version="1.0"?>
    <stock>
      <stockinfo symbol="IFL">
        <company>i-flex solutions limited</company>
        <price>2500</price>
      </stockinfo>
      <stockinfo symbol="HLL">
        <company>Hindustan Lever</company>
        <price>1840</price>
      </stockinfo>
      <stockinfo symbol="LT">
        <company>Larsen and Toubro</company>
        <price>2678</price>
      </stockinfo>
      <stockinfo symbol="Rel">
        <company>Reliance Communications</company>
        <price>1743</price>
      </stockinfo>
    </stock>
    Produce output as shown on the next slide
  • 79.
  • 80.
    Solution
    import org.xml.sax.*;
    import org.xml.sax.helpers.*;
    import javax.xml.parsers.*;

    public class DisplayStockDetails extends DefaultHandler {
        public void startDocument () throws SAXException {
            System.out.println ("\nDisplaying Stock Details");
            System.out.println ("=========================\n");
        }

        public void endDocument () throws SAXException {
            System.out.println ("\nEnd of Details");
            System.out.println ("==============\n");
        }

        public void startElement (String uri, String local, String raw, Attributes attrs) throws SAXException {
            // Skip the root element. The default JAXP parser is not namespace-aware,
            // so compare the raw (qualified) name rather than the local name.
            if (raw.equals ("stock")) return;
            // Elements without attributes simply skip the loop
            for (int i = 0; i < attrs.getLength (); i++) {
                System.out.println ("[Symbol: " + attrs.getValue (i) + "]");
            }
        }

        public void endElement (String uri, String local, String raw) throws SAXException {
            // System.out.println ();
        }

        public void characters (char[] ch, int start, int length) throws SAXException {
            System.out.println (new String (ch, start, length));
        }

        public static void main (String[] args) throws Exception {
            DisplayStockDetails handler = new DisplayStockDetails ();
            try {
                SAXParserFactory spf = SAXParserFactory.newInstance ();
                SAXParser parser = spf.newSAXParser ();
                parser.parse ("stock.xml", handler);
            } catch (SAXException e) {
                System.err.println (e.getMessage ());
            }
        }
    }
  • 81.
    Exercise Consider the following XML file and write a program to find out and display the total cost for all CDs.
    <?xml version="1.0" encoding="ISO-8859-1"?>
    <catalog>
      <cd>
        <title>Empire Burlesque</title>
        <artist>Bob Dylan</artist>
        <country>USA</country>
        <company>Columbia</company>
        <price>10.90</price>
        <year>1985</year>
      </cd>
      <cd>
        <title>Candle in the wind</title>
        <artist>Elton John</artist>
        <country>UK</country>
        <company>HMV</company>
        <price>8.20</price>
        <year>1998</year>
      </cd>
    </catalog>
  • 82.
    Solution
    import java.math.BigDecimal;
    import javax.xml.parsers.SAXParser;
    import javax.xml.parsers.SAXParserFactory;
    import org.xml.sax.Attributes;
    import org.xml.sax.SAXException;
    import org.xml.sax.helpers.DefaultHandler;

    public class CDPrice extends DefaultHandler {
        // Prices such as 10.90 are not integers, so keep the total as a decimal
        private BigDecimal total = BigDecimal.ZERO;
        private boolean flagIsCurrentElementPrice = false;
        private StringBuilder buffer = new StringBuilder ();

        public void startDocument () throws SAXException {
            System.out.println ("Start document ...");
        }

        public void startElement (String uri, String local, String raw, Attributes attrs) throws SAXException {
            System.out.println ("Current element = " + raw);
            if (raw.equals ("price")) {
                flagIsCurrentElementPrice = true;
                buffer.setLength (0);
            }
        }

        public void characters (char [] ch, int start, int len) throws SAXException {
            // The parser may deliver one text node in several chunks, so only collect here
            if (flagIsCurrentElementPrice) {
                buffer.append (ch, start, len);
            }
        }

        public void endElement (String uri, String local, String raw) throws SAXException {
            if (raw.equals ("price")) {
                total = total.add (new BigDecimal (buffer.toString ().trim ()));
                flagIsCurrentElementPrice = false;
                System.out.println ("Current total = " + total);
            }
        }

        public void endDocument () throws SAXException {
            System.out.println ("The total price of available CDs = " + total);
        }

        public static void main (String[] args) throws Exception {
            CDPrice handler = new CDPrice ();
            try {
                SAXParserFactory spf = SAXParserFactory.newInstance ();
                SAXParser parser = spf.newSAXParser ();
                parser.parse ("cdcatalog2.xml", handler);
            } catch (SAXException e) {
                System.err.println (e.getMessage ());
            }
        }
    }
  • 83.
  • 84.
  • 85.
  • 86.
    JAXP and DOM – Overview Class DocumentBuilderFactory public abstract class javax.xml.parsers.DocumentBuilderFactory extends java.lang.Object Defines a factory API that enables applications to obtain a parser that produces DOM object trees from XML documents. The parser it returns (a DocumentBuilder) has a parse method, which parses the contents of an XML document and returns them as a new Document object.
  • 87.
    JAXP and DOM – Overview Class DocumentBuilder public abstract class javax.xml.parsers.DocumentBuilder extends java.lang.Object Defines the API to obtain DOM Document instances from an XML document
  • 88.
    JAXP and DOM – Overview Interface Document public interface Document extends Node The Document interface represents the entire HTML or XML document. Conceptually, it is the root of the document tree, and provides the primary access to the document's data.
  • 89.
    JAXP and DOM – Overview Interface Element public interface Element extends Node The Element interface represents an element in an HTML or XML document. Elements may have attributes associated with them. Because Element inherits from the generic Node interface, the Node attributes property (getAttributes () in Java) may be used to retrieve the set of all attributes for an element.
  • 90.
    JAXP and DOM
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance ();
    DocumentBuilder builder = factory.newDocumentBuilder ();
    Document document = builder.parse (fileName);
    Element root = document.getDocumentElement ();
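The same four calls can be tried without a file on disk by parsing from a string. This is a minimal sketch: the class name DomQuickStart and the sample XML are made up for illustration, and checked exceptions are wrapped only to keep the example short.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class DomQuickStart {

    // Builds a DOM tree from an XML string instead of a file name.
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse(
                "<root><employee>test 1</employee><employee>test 2</employee></root>");
        // The document element is the root element of the tree.
        System.out.println("Root element: " + doc.getDocumentElement().getTagName());
        System.out.println("Employees: " + doc.getElementsByTagName("employee").getLength());
    }
}
```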
  • 91.
    Example – XML File Count the number of employee elements in this XML using DOM
    <?xml version="1.0" encoding="UTF-8"?>
    <root>
      <employee>test 1</employee>
      <employee>test 1</employee>
      <employee>test 1</employee>
      <employee>test 1</employee>
      <employee>test 1</employee>
      <employee>test 1</employee>
      <employee>test 1</employee>
    </root>
  • 92.
    Example – Java Code
    package javaapplication1;

    import javax.xml.parsers.*;
    import org.w3c.dom.*;

    public class Main {
        public static void main (String[] args) {
            try {
                DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance ();
                DocumentBuilder builder = factory.newDocumentBuilder ();
                // Parse the employee file shown on the previous slide (emp.xml)
                Document document = builder.parse ("emp.xml");
                Element root = document.getDocumentElement ();
                NodeList nodes = document.getElementsByTagName ("employee");
                System.out.println ("There are " + nodes.getLength () + " elements.");
            } catch (Exception ex) {
                System.out.println (ex);
            }
        }
    }
  • 93.
    Check if a File is Well-Formed
    package sicsr;

    import javax.xml.parsers.*;

    public class IsWellFormed {
        /**
         * @param args
         */
        public static void main (String[] args) {
            try {
                DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance ();
                DocumentBuilder domBuilder = domFactory.newDocumentBuilder ();
                // parse throws SAXException when the file is not well-formed
                domBuilder.parse ("NWF.xml");
            } catch (org.xml.sax.SAXException ex) {
                System.out.println ("File is not well-formed");
            } catch (FactoryConfigurationError ex) {
                System.out.println (ex.toString ());
            } catch (ParserConfigurationException ex) {
                System.out.println (ex.toString ());
            } catch (Exception ex) {
                System.out.println (ex.toString ());
            }
        }
    }
  • 94.
    JAXP Code to Open an XML File
    import org.w3c.dom.*;
    import javax.xml.parsers.*;
    import org.xml.sax.*;

    public class DOMExample1 {
        public static void main (String[] args) {
            try {
                DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance ();
                DocumentBuilder builder = factory.newDocumentBuilder ();
                Document document = builder.parse ("cdcatalog.xml");
                Element root = document.getDocumentElement ();
                System.out.println ("In main ... XML file opened successfully ...");
            } catch (ParserConfigurationException e1) {
                System.out.println ("Exception: " + e1);
            } catch (SAXException e2) {
                System.out.println ("Exception: " + e2);
            } catch (java.io.IOException e3) {
                System.out.println ("Exception: " + e3);
            }
        }
    }
  • 95.
    Case Study – XML File
    <?xml version="1.0"?>
    <catalog>
      <cd>
        <title>Empire Burlesque</title>
        <artist>Bob Dylan</artist>
        <country>USA</country>
        <company>Columbia</company>
        <price>10</price>
        <year>1985</year>
      </cd>
      <cd>
        <title>Candle in the wind</title>
        <artist>Elton John</artist>
        <country>UK</country>
        <company>HMV</company>
        <price>8</price>
        <year>1998</year>
      </cd>
    </catalog>
  • 96.
    Problem Write a program to find out if an element by the name price exists in the XML file and display its contents
  • 97.
    Solution
    import org.w3c.dom.*;
    import javax.xml.parsers.*;
    import org.xml.sax.*;

    public class DOMExample2 {
        public static void main (String[] args) {
            NodeList elements;
            String elementName = "price";
            try {
                DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance ();
                DocumentBuilder builder = factory.newDocumentBuilder ();
                Document document = builder.parse ("cdcatalog.xml");
                Element root = document.getDocumentElement ();
                System.out.println ("In main ... XML file opened successfully ...");
                elements = document.getElementsByTagName (elementName);
                // is there anything to do? An absent element gives an empty list, not null
                if (elements.getLength () == 0) {
                    System.out.println ("No element named " + elementName);
                    return;
                }
                // print all elements
                int elementCount = elements.getLength ();
                System.out.println ("Count = " + elementCount);
                for (int i = 0; i < elementCount; i++) {
                    Element element = (Element) elements.item (i);
                    System.out.println ("Element Name = " + element.getNodeName ());
                    System.out.println ("Element Type = " + element.getNodeType ());
                    // getNodeValue () is always null for an element; the text lives in its child nodes
                    System.out.println ("Element Value = " + element.getTextContent ());
                    System.out.println ("Has attributes = " + element.hasAttributes ());
                }
            } catch (ParserConfigurationException e1) {
                System.out.println ("Exception: " + e1);
            } catch (SAXException e2) {
                System.out.println ("Exception: " + e2);
            } catch (DOMException e2) {
                System.out.println ("Exception: " + e2);
            } catch (java.io.IOException e3) {
                System.out.println ("Exception: " + e3);
            }
        }
    }
  • 98.
    Problem Write a program to display element names and their attribute names and values
  • 99.
    Solution
    import org.w3c.dom.*;
    import javax.xml.parsers.*;
    import org.xml.sax.*;

    public class DOMExample3 {
        public static void main (String[] args) {
            NodeList elements;
            String elementName = "cd";
            try {
                DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance ();
                DocumentBuilder builder = factory.newDocumentBuilder ();
                Document document = builder.parse ("cdcatalog.xml");
                Element root = document.getDocumentElement ();
                System.out.println ("In main ... XML file opened successfully ...");
                elements = document.getElementsByTagName (elementName);
                // is there anything to do?
                if (elements == null) {
                    return;
                }
                // print all elements
                int elementCount = elements.getLength ();
                System.out.println ("Count = " + elementCount);
                for (int i = 0; i < elementCount; i++) {
                    Element element = (Element) elements.item (i);
                    System.out.println ("Element Name = " + element.getNodeName ());
                    System.out.println ("Element Type = " + element.getNodeType ());
                    System.out.println ("Element Value = " + element.getNodeValue ());
                    System.out.println ("Has attributes = " + element.hasAttributes ());
                    // If attributes exist, print them
                    if (element.hasAttributes ()) {
                        // store them in a NamedNodeMap object
                        NamedNodeMap attributesList = element.getAttributes ();
                        // iterate through the NamedNodeMap and get the attribute names and values
                        for (int j = 0; j < attributesList.getLength (); j++) {
                            System.out.println ("Attribute: " + attributesList.item (j).getNodeName () + " = " + attributesList.item (j).getNodeValue ());
                        }
                    }
                }
            } catch (ParserConfigurationException e1) {
                System.out.println ("Exception: " + e1);
            } catch (SAXException e2) {
                System.out.println ("Exception: " + e2);
            } catch (DOMException e2) {
                System.out.println ("Exception: " + e2);
            } catch (java.io.IOException e3) {
                System.out.println ("Exception: " + e3);
            }
        }
    }
  • 100.
    Problem For a given element, find out all the child elements and display their types
  • 101.
    Solution
    import org.w3c.dom.*;
    import javax.xml.parsers.*;
    import org.xml.sax.*;

    public class DOMExample4 {
        public static void main (String[] args) {
            NodeList elements, children;
            String elementName = "cd";
            String local = "";
            Element element = null;
            try {
                DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance ();
                DocumentBuilder builder = factory.newDocumentBuilder ();
                Document document = builder.parse ("cdcatalog.xml");
                Element root = document.getDocumentElement ();
                System.out.println ("In main ... XML file opened successfully ...");
                elements = document.getElementsByTagName (elementName);
                // is there anything to do?
                if (elements == null) {
                    return;
                }
                // print all elements
                int elementCount = elements.getLength ();
                System.out.println ("Count = " + elementCount);
                for (int i = 0; i < elementCount; i++) {
                    element = (Element) elements.item (i);
                    System.out.println ("Element Name = " + element.getNodeName ());
                    System.out.println ("Element Type = " + element.getNodeType ());
                    System.out.println ("Element Value = " + element.getNodeValue ());
                    System.out.println ("Has attributes = " + element.hasAttributes ());
                    // Find out if child nodes exist for this element
                    children = element.getChildNodes ();
                    if (children != null) {
                        for (int j = 0; j < children.getLength (); j++) {
                            local = children.item (j).getNodeName ();
                            // Print the node type as well, as the problem asks
                            System.out.println ("Child name = " + local + ", type = " + children.item (j).getNodeType ());
                        }
                    }
                }
            } catch (ParserConfigurationException e1) {
                System.out.println ("Exception: " + e1);
            } catch (SAXException e2) {
                System.out.println ("Exception: " + e2);
            } catch (DOMException e2) {
                System.out.println ("Exception: " + e2);
            } catch (java.io.IOException e3) {
                System.out.println ("Exception: " + e3);
            }
        }
    }
  • 102.
    Node Types
    Value  Named constant                Node type          nodeName returns
    1      ELEMENT_NODE                  Element            The element name
    2      ATTRIBUTE_NODE                Attribute          The attribute name
    3      TEXT_NODE                     Text               #text
    4      CDATA_SECTION_NODE            CDATA              #cdata-section
    5      ENTITY_REFERENCE_NODE         Entity reference   The entity reference name
    6      ENTITY_NODE                   Entity             The entity name
    7      PROCESSING_INSTRUCTION_NODE   PI                 The PI target
    8      COMMENT_NODE                  Comment            #comment
    9      DOCUMENT_NODE                 Document           #document
    10     DOCUMENT_TYPE_NODE            DocType            Root element
    11     DOCUMENT_FRAGMENT_NODE        DocumentFragment   #document-fragment
    12     NOTATION_NODE                 Notation           The notation name
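A quick way to see these codes in practice is to list the node types of an element's children. The sketch below assumes a default, non-validating parser, which keeps the whitespace between tags as text nodes; the class name ChildNodeTypes and the sample XML are illustrative only.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ChildNodeTypes {

    // Returns the numeric node type of each direct child of the root element.
    static List<Short> childTypes(String xml) {
        List<Short> types = new ArrayList<>();
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList children = doc.getDocumentElement().getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                types.add(children.item(i).getNodeType());
            }
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
        return types;
    }

    public static void main(String[] args) {
        // Line breaks between the tags become TEXT_NODE children of <cd>.
        String xml = "<cd>\n  <title>Empire Burlesque</title>\n  <!-- a comment -->\n</cd>";
        System.out.println(childTypes(xml));
    }
}
```

This is why code that walks getChildNodes() usually compares getNodeType() against Node.ELEMENT_NODE before treating a child as an element.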
  • 103.
    Making Use of Node Types
    import org.w3c.dom.*;
    import javax.xml.parsers.*;
    import org.xml.sax.*;

    public class DOMExample4 {
        public static void main (String[] args) {
            NodeList elements, children;
            String elementName = "cd";
            String local = "";
            Element element = null;
            try {
                DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance ();
                DocumentBuilder builder = factory.newDocumentBuilder ();
                Document document = builder.parse ("cdcatalog.xml");
                Element root = document.getDocumentElement ();
                System.out.println ("In main ... XML file opened successfully ...");
                elements = document.getElementsByTagName (elementName);
                // is there anything to do?
                if (elements == null) {
                    return;
                }
                // print all elements
                int elementCount = elements.getLength ();
                System.out.println ("Count = " + elementCount);
                for (int i = 0; i < elementCount; i++) {
                    element = (Element) elements.item (i);
                    System.out.println ("Element Name = " + element.getNodeName ());
                    System.out.println ("Element Type = " + element.getNodeType ());
                    System.out.println ("Has attributes = " + element.hasAttributes ());
                    // Find out if child nodes exist for this element
                    children = element.getChildNodes ();
                    if (children != null) {
                        for (int j = 0; j < children.getLength (); j++) {
                            // Use the node type to skip whitespace text nodes and comments
                            if (children.item (j).getNodeType () != Node.ELEMENT_NODE) {
                                continue;
                            }
                            local = children.item (j).getNodeName ();
                            System.out.println ("Child element name = " + local);
                        }
                    }
                }
            } catch (ParserConfigurationException e1) {
                System.out.println ("Exception: " + e1);
            } catch (SAXException e2) {
                System.out.println ("Exception: " + e2);
            } catch (DOMException e2) {
                System.out.println ("Exception: " + e2);
            } catch (java.io.IOException e3) {
                System.out.println ("Exception: " + e3);
            }
        }
    }
  • 104.
    Problem Write a program to create XML contents dynamically and write them to a file on the disk
  • 105.
    Solution
    import java.io.File;
    import org.w3c.dom.*;
    import javax.xml.parsers.*;
    import javax.xml.transform.Source;
    import javax.xml.transform.dom.DOMSource;
    import javax.xml.transform.Result;
    import javax.xml.transform.stream.StreamResult;
    import javax.xml.transform.Transformer;
    import javax.xml.transform.TransformerFactory;

    public class DOMExample5 {
        public static void main (String[] args) {
            Source source;
            File file;
            Result result;
            try {
                DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance ();
                DocumentBuilder builder = factory.newDocumentBuilder ();
                // Create a new XML document
                Document document = builder.newDocument ();
                Element root = (Element) document.createElement ("Order");
                // Insert child Manifest
                document.appendChild (root);
                Node manifestChild = document.createElement ("Manifest");
                root.appendChild (manifestChild);
                // Insert Items (CreateOrderDOM is a helper class that is not shown on this slide)
                CreateOrderDOM co = new CreateOrderDOM ();
                co.insertItem (document, manifestChild, "101", "Name one", "$29.99");
                co.insertItem (document, manifestChild, "108", "Name two", "$19.99");
                co.insertItem (document, manifestChild, "125", "Name three", "$39.99");
                co.insertItem (document, manifestChild, "143", "Name four", "$59.99");
                co.insertItem (document, manifestChild, "118", "Name five", "$99.99");
                // Normalizing the DOM
                document.getDocumentElement ().normalize ();
                // Prepare the DOM document for writing
                source = new DOMSource (document);
                // Prepare the output file
                file = new File ("test.xml");
                result = new StreamResult (file);
                // Get a Transformer and write the DOM document to the file
                Transformer xformer = TransformerFactory.newInstance ().newTransformer ();
                xformer.transform (source, result);
            } catch (Exception ex) {
                ex.printStackTrace ();
            }
        }
    }
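The solution calls insertItem on a CreateOrderDOM helper that the slides do not show. The sketch below is one plausible shape for such a helper, not the original: the class name OrderItems and the element names Item, ID, Name and Price are all assumptions made for illustration.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

class OrderItems {

    // Appends <Item><ID/><Name/><Price/></Item> under parent.
    // Element names are guesses; the original helper is not shown in the slides.
    static Element insertItem(Document doc, Node parent,
                              String id, String name, String price) {
        Element item = doc.createElement("Item");
        item.appendChild(textElement(doc, "ID", id));
        item.appendChild(textElement(doc, "Name", name));
        item.appendChild(textElement(doc, "Price", price));
        parent.appendChild(item);
        return item;
    }

    // Creates <tag>text</tag> as an element with a single text-node child.
    private static Element textElement(Document doc, String tag, String text) {
        Element element = doc.createElement(tag);
        element.appendChild(doc.createTextNode(text));
        return element;
    }

    // Creates an empty DOM document to build on.
    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("No DOM builder available", e);
        }
    }

    public static void main(String[] args) {
        Document doc = newDocument();
        Element root = doc.createElement("Order");
        doc.appendChild(root);
        Element manifest = doc.createElement("Manifest");
        root.appendChild(manifest);
        insertItem(doc, manifest, "101", "Name one", "$29.99");
        insertItem(doc, manifest, "108", "Name two", "$19.99");
        System.out.println("Items: " + doc.getElementsByTagName("Item").getLength());
    }
}
```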
  • 106.
    Streaming API for XML (StAX) – Brief Overview To be covered in more depth in "Web Services"
  • 107.
    What is StAX? Addition in Java EE 5.0. Pull approach. Event-based API. Different from SAX, since the application pulls events from the XML document/parser, and not the other way round. Can do both read and write.
  • 108.
    StAX Classification Two APIs:
    Cursor-based API (XMLStreamReader): walks through an XML document in document order, one event at a time, and gives low-level access to the structural and content information at the current position
    Iterator-based API (XMLEventReader): similar to the cursor API, but delivers that information in the form of event objects instead of low-level access
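The two styles can be compared side by side on the same task. The following sketch counts elements with a given name using each API; the class name StaxCount and its method names are made up for illustration, while the javax.xml.stream types are the standard StAX ones.

```java
import java.io.StringReader;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;
import javax.xml.stream.events.XMLEvent;

class StaxCount {

    // Cursor API: the application advances the reader and inspects its current state.
    static int countWithCursor(String xml, String name) {
        int count = 0;
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && reader.getLocalName().equals(name)) {
                    count++;
                }
            }
            reader.close();
        } catch (Exception e) {
            throw new IllegalStateException("Could not read XML", e);
        }
        return count;
    }

    // Iterator API: each call returns a self-contained event object.
    static int countWithIterator(String xml, String name) {
        int count = 0;
        try {
            XMLEventReader events = XMLInputFactory.newInstance()
                    .createXMLEventReader(new StringReader(xml));
            while (events.hasNext()) {
                XMLEvent event = events.nextEvent();
                if (event.isStartElement()
                        && event.asStartElement().getName().getLocalPart().equals(name)) {
                    count++;
                }
            }
            events.close();
        } catch (Exception e) {
            throw new IllegalStateException("Could not read XML", e);
        }
        return count;
    }

    public static void main(String[] args) {
        String xml = "<books><book/><book/><magazine/></books>";
        System.out.println("Cursor:   " + countWithCursor(xml, "book"));
        System.out.println("Iterator: " + countWithIterator(xml, "book"));
    }
}
```

In both loops the application decides when to ask for the next event, which is the "pull" behaviour that distinguishes StAX from SAX callbacks.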
  • 109.
  • 110.
    Applying an XSLT to an XML File Programmatically
    package sicsr;

    import javax.xml.transform.*;
    import javax.xml.transform.stream.StreamResult;
    import javax.xml.transform.stream.StreamSource;
    import java.io.*;

    public class ApplyXSLT {
        public static void main(String[] args) {
            try {
                // The XML input and the stylesheet to apply to it
                StreamSource xmlFile = new StreamSource(new File("history.xml"));
                StreamSource xslFile = new StreamSource(new File("history.xsl"));

                // Build a Transformer from the stylesheet
                TransformerFactory xslFactory = TransformerFactory.newInstance();
                Transformer transformer = xslFactory.newTransformer(xslFile);

                // Send the transformed output to the console
                StreamResult resultStream = new StreamResult(System.out);
                transformer.transform(xmlFile, resultStream);
            } catch (Exception ex) {
                ex.printStackTrace();
            }
        }
    }
  • 111.
    Details about the Transformer TransformerFactory is an abstract class in the javax.xml.transform package. It can be used to create a Transformer object. Transformer is also an abstract class in the javax.xml.transform package. An instance of this class can transform a source tree into a result tree.
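    The two ways of obtaining a Transformer from the factory can be sketched as follows. Calling newTransformer() with no argument yields an identity transformer that copies the source to the result unchanged; passing a stylesheet Source yields a transformer that applies that stylesheet. The class name TransformerSketch, the helper methods, and the inline XML and XSL strings are assumptions made so the example runs without external files.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class TransformerSketch {

    // Illustrative input and stylesheet (not from the slides).
    public static final String XML =
            "<books><book><name>Learning XML</name></book></books>";
    public static final String XSL =
            "<xsl:stylesheet version=\"1.0\""
          + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
          + "<xsl:output method=\"text\"/>"
          + "<xsl:template match=\"/\">"
          + "<xsl:value-of select=\"books/book/name\"/>"
          + "</xsl:template>"
          + "</xsl:stylesheet>";

    // Runs the given transformer over an XML string and returns the result as text.
    private static String run(Transformer transformer, String xml)
            throws TransformerException {
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)),
                              new StreamResult(out));
        return out.toString();
    }

    // No stylesheet: the source tree is copied to the result tree as-is.
    public static String identity(String xml) throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        return run(transformer, xml);
    }

    // With a stylesheet: the source tree is transformed according to the XSLT.
    public static String withStylesheet(String xml, String xsl)
            throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xsl)));
        return run(transformer, xml);
    }

    public static void main(String[] args) throws TransformerException {
        System.out.println(identity(XML));
        System.out.println(withStylesheet(XML, XSL));
    }
}
```

    The identity form is what the DOM-writing examples rely on when they serialize a Document to a file; the stylesheet form is what ApplyXSLT uses.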
  • 112.
    XML and ASP.NET – An Overview
  • 113.
    XmlReader and XmlWriter XmlReader is a pull-style API for XML that gives forward-only, read-only access to XML documents. It is an abstract class from which concrete readers such as XmlTextReader and XmlNodeReader derive. In ASP.NET 2.0, XmlReader also acts as a factory: we need not specify which implementation of XmlReader to use. Instead, we call the static Create method, supply the necessary parameters, and let .NET decide which reader to instantiate.
  • 114.
    Example – XML Document
    <?xml version="1.0" encoding="utf-8"?>
    <bookstore>
      <book genre="autobiography" publicationdate="1981" ISBN="1-861003-11-0">
        <title>The Autobiography of Benjamin Franklin</title>
        <author>
          <first-name>Benjamin</first-name>
          <last-name>Franklin</last-name>
        </author>
        <price>8.99</price>
      </book>
      <book genre="novel" publicationdate="1967" ISBN="0-201-65512-2">
        <title>The Confidence Man</title>
        <author>
          <first-name>Herman</first-name>
          <last-name>Melville</last-name>
        </author>
        <price>11.99</price>
      </book>
      <book genre="philosophy" publicationdate="1991" ISBN="1-861001-57-6">
        <title>The Gorgias</title>
        <author>
          <first-name>Sidas</first-name>
          <last-name>Plato</last-name>
        </author>
        <price>9.99</price>
      </book>
    </bookstore>
  • 115.
    Example – ASP.NET Page
    using System;
    using System.Data;
    using System.Configuration;
    using System.Collections;
    using System.Web;
    using System.Web.Security;
    using System.Web.UI;
    using System.Web.UI.WebControls;
    using System.Web.UI.WebControls.WebParts;
    using System.Web.UI.HtmlControls;
    using System.Xml;
    using System.IO;

    public partial class XMLReader2 : System.Web.UI.Page
    {
        protected void Page_Load(object sender, EventArgs e)
        {
            int bookCount = 0;

            // Skip whitespace and comment nodes while reading
            XmlReaderSettings settings = new XmlReaderSettings();
            settings.IgnoreWhitespace = true;
            settings.IgnoreComments = true;

            string booksFile = Path.Combine(Request.PhysicalApplicationPath, "Books.xml");

            // Let the XmlReader factory choose the concrete reader
            using (XmlReader reader = XmlReader.Create(booksFile, settings))
            {
                while (reader.Read())
                {
                    // Count every <book> start tag
                    if (reader.NodeType == XmlNodeType.Element && "book" == reader.LocalName)
                    {
                        bookCount++;
                    }
                }
            }

            Response.Write(String.Format("Found {0} books!", bookCount));
        }
    }
  • 116.
    Validating an XML Against a Schema
    using System.Xml.Schema;
    using System;
    using System.Xml;
    using System.IO;

    public partial class XMLReader3 : System.Web.UI.Page
    {
        protected void Page_Load(object sender, EventArgs e)
        {
            int bookCount = 0;
            XmlReaderSettings settings = new XmlReaderSettings();

            // Load the schema and switch on schema validation
            string booksSchemaFile = Path.Combine(Request.PhysicalApplicationPath, "books.xsd");
            settings.Schemas.Add(null, XmlReader.Create(booksSchemaFile));
            settings.ValidationType = ValidationType.Schema;
            settings.ValidationFlags = XmlSchemaValidationFlags.ReportValidationWarnings;
            settings.ValidationEventHandler += new ValidationEventHandler(settings_ValidationEventHandler);

            settings.IgnoreWhitespace = true;
            settings.IgnoreComments = true;

            string booksFile = Path.Combine(Request.PhysicalApplicationPath, "Books.xml");

            // Validation happens as the reader moves through the document
            using (XmlReader reader = XmlReader.Create(booksFile, settings))
            {
                while (reader.Read())
                {
                    if (reader.NodeType == XmlNodeType.Element && "book" == reader.LocalName)
                    {
                        bookCount++;
                    }
                }
            }

            Response.Write(String.Format("Found {0} books!", bookCount));
        }

        // Called for each validation error or warning
        void settings_ValidationEventHandler(object sender, System.Xml.Schema.ValidationEventArgs e)
        {
            Response.Write(e.Message);
        }
    }
  • 117.
    Creating an XML Document
    using System.Xml.Schema;
    using System;
    using System.Xml;
    using System.IO;

    public partial class XMLReader3 : System.Web.UI.Page
    {
        protected void Page_Load(object sender, EventArgs e)
        {
            int bookCount = 0;
            XmlReaderSettings settings = new XmlReaderSettings();

            string booksSchemaFile = Path.Combine(Request.PhysicalApplicationPath, "books.xsd");
            settings.Schemas.Add(null, XmlReader.Create(booksSchemaFile));
            settings.ValidationType = ValidationType.Schema;
            settings.ValidationFlags = XmlSchemaValidationFlags.ReportValidationWarnings;
            settings.ValidationEventHandler += new ValidationEventHandler(settings_ValidationEventHandler);

            settings.IgnoreWhitespace = true;
            settings.IgnoreComments = true;

            string booksFile = Path.Combine(Request.PhysicalApplicationPath, "Books.xml");

            using (XmlReader reader = XmlReader.Create(booksFile, settings))
            {
                while (reader.Read())
                {
                    if (reader.NodeType == XmlNodeType.Element && "book" == reader.LocalName)
                    {
                        bookCount++;
                    }
                }
            }

            Response.Write(String.Format("Found {0} books!", bookCount));
        }

        void settings_ValidationEventHandler(object sender, System.Xml.Schema.ValidationEventArgs e)
        {
            Response.Write(e.Message);
        }
    }
  • 118.
    Thank you! Any Questions?