Chapter 8: Using XML in the .NET Framework – ASP.Net Web Developer's Guide

Chapter 8

Using XML in the .NET Framework

Introduction

The Extensible Markup Language (XML) is the latest offering in the world of data access. Microsoft has actively supported the language since its inception. XML provides a universal way to exchange information between organizations. Its structure makes it well suited to online applications and to working with data residing in local or remote data sources.

Like Hypertext Markup Language (HTML), XML is a tag-based markup language. Many other technologies, such as browsers, JavaScript, VBScript, Dynamic HTML (DHTML), and Cascading Style Sheets (CSS), were developed to support the HTML documents. Similarly, XML cannot be singled out as a stand-alone technology. It is actually a family of a growing set of technologies and frameworks. The major members of this family are XML parsers, Extensible Stylesheet Language Transformations (XSLT), XPath, XLink, Simple API for XML (SAX), Schema Generators, and Document Object Model (DOM), just to name a few.

Microsoft has integrated XML technology into its .NET Framework rather tightly, and the entire ADO.NET architecture is built upon XML. Some readers may confuse the terms, so please take note: ADO.NET itself is not coded in XML; rather, it revolves around XML and provides the facilities to apply various existing and emerging XML technologies to manipulate data and information. The System.Xml namespace offers perhaps the richest collection of classes for generating, transmitting, processing, and storing information via XML. In this chapter, we will first have a brief introduction to the structural components of an XML document. Then we will look into the architecture of the XML objects in the .NET Framework. Finally, we will study several major XML.NET objects with many examples.

An Overview of XML

XML is fast becoming a standard for data exchange in the next generation’s Internet applications. XML allows user-defined tags that make XML document handling more flexible than HTML, the conventional language of the Internet. Since XML is the heart and soul of ADO.NET, sound knowledge of XML is imperative for developing applications in ASP.NET. The following section touches on some of the basic concepts of XML.

What Does an XML Document Look Like?

The idea behind XML is surprisingly simple. The major objective is to organize information in such a way that human beings can read and comprehend the data and its context, while the document itself remains technology and platform independent. Consider the following text file:

F10 Shimano Calcutta 47.76

F20 Bantam Lexica 49.99

Obviously, it is difficult to understand exactly what information the above text file contains. Now consider the XML document shown in Figure 8.1. The code is available in the Catalog1.xml file on the accompanying CD.

Figure 8.1 Example XML Document (Catalog1.xml)
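A minimal sketch of such a catalog document, assuming the Catalog, Product, ProductId, ProductName, and ListPrice element names that this chapter uses when it describes Catalog1.xml (the file on the CD may differ in detail):

```xml
<?xml version="1.0"?>
<!-- Sketch only: the same two records as the text file above, with tags that name each value -->
<Catalog>
  <Product>
    <ProductId>F10</ProductId>
    <ProductName>Shimano Calcutta</ProductName>
    <ListPrice>47.76</ListPrice>
  </Product>
  <Product>
    <ProductId>F20</ProductId>
    <ProductName>Bantam Lexica</ProductName>
    <ListPrice>49.99</ListPrice>
  </Product>
</Catalog>
```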

The above document is XML’s way of representing the data contained in a product catalog. It has many advantages. It is easily readable and comprehensible, it is self-documenting, and it is technology independent. Most importantly, it is quickly becoming the universally accepted data container and transmission format in the current information technology era. Well, welcome to the exciting world of XML!

Developing & Deploying …

XML and Its Future

XML is quickly becoming the universal protocol for transferring information from site to site via HTTP. Whereas HTML will continue to be the language for displaying documents on the Internet, developers will start using the power of XML to transmit, exchange, and manipulate data.

XML offers a very simple solution to a complex problem. It offers a standard format for structuring data or information in a self-defined document format. This way, the data are kept independent of the processes that will consume the data. Obviously, the concept behind XML is nothing new. XML happens to be a proper subset of a massive specification named SGML, which was published as an ISO standard in 1986. The W3C began to develop the standard for XML in 1996 with the motivation that XML would be simpler to use than SGML but would have a more rigid structure than HTML. Since then, many software vendors have implemented various features of XML technologies. For example, Ariba has built its entire B2B system architecture based on XML, many Web servers (such as Weblogic Server) utilize XML specifications for configuring various server-related parameters, Oracle has included the necessary parsers and utilities to develop business applications in its 8i/9i suites, and finally, .NET has also embraced XML technology.

XML contains self-defined data in document format. Hence it is platform independent. It is also easy to transmit a document from one site to another via HTTP. However, the applications of XML do not have to be limited to conventional Internet applications. It can be used to communicate and exchange information in other contexts, too. For example, a VB client can call a remote function by passing the function name and parameter values in an XML document. The server may return the result via a subsequent XML document. Basically, that is the technology behind SOAP (Simple Object Access Protocol).

Creating an XML Document

We can use Notepad to create an XML document. VS.NET offers an array of tools packaged in the XML Designer to work with XML documents. We will demonstrate the use of the XML Designer later. Right now, go ahead and open the Catalog1.xml file from the CD that accompanies this book in IE 5.0 or higher. You will see that IE displays the document in a very interesting fashion with drill-down features, as shown in Figure 8.2.

Figure 8.2 Catalog1.xml Displayed in IE

Creating an XML Document in VS.NET XML Designer

It is very easy to create an XML document in VS.NET. Use the following steps to develop an XML document:

1. From the Project menu, select Add New Item.

2. Select the XML File icon in the Add New Item dialog box.

3. Enter a name for your XML file.

4. VS.NET will automatically load the XML Designer and display the XML document template.

5. Finally, enter the contents of your XML document.

The system will display two tabs for two views: the XML view and the Data view of your XML document. These views are shown in Figures 8.3 and 8.4. The XML Designer has many other tools to work with. We will introduce these later in this chapter.

Figure 8.3 The XML View of an XML Document in VS.NET XML Designer

Figure 8.4 The Data View of an XML Document in VS.NET XML Designer

Components of an XML Document

In this section, we will introduce the major components of an XML document. An XML document contains a variety of constructs. Some of the frequently used ones are as follows:

 Declaration Each XML document may begin with the optional entry <?xml version="1.0"?>. This standard entry identifies the document as an XML document conforming to the W3C (World Wide Web Consortium) recommendation for version 1.0.

 Comment An XML document may contain HTML-style comments like <!-- Catalog data -->.

 Schema or Document Type Definition (DTD) In certain situations, a schema or DTD may precede the XML document. A schema or DTD contains the rules about the elements of the document. For example, we may specify a rule like “A product element must have a ProductName, but a ListPrice element is optional.” We will discuss schemas later in the chapter.

 Elements An XML document is mostly composed of elements. An element has a start-tag and end-tag. In between the start-tag and end-tag, we include the content of the element. An element may contain a piece of character data, or it may contain other elements. For example, in the Catalog1.xml, the Product element contains three other elements: ProductId, ProductName, and ListPrice. On the other hand, the first ProductName element contains a piece of character data like Shimano Calcutta.

 Root Element In an XML document, one single main element must contain all other elements inside it. This specific element is often called the root element. In our example, the root element is the Catalog element. The XML document may contain many Product elements, but there must be only one instance of the Catalog element.

 Attributes Okay, we agree that we didn’t tell you the whole story in our first example. So far, we have said that an element may contain other elements, or it may contain data, or both. Besides these, an element may also contain zero or more so-called attributes. An attribute is just an additional way to attach a piece of data to an element. An attribute is always placed inside the start-tag of an element, and we specify its value using the “name=value” pair protocol.

Let us revise our Catalog1.xml and add some attributes to the Product element. Here, we will assume that a Product element has two attributes named Type and SupplierId. As shown in Figure 8.5, we will simply add the Type="Spinning Reel" and SupplierId="5" attributes to the first Product element. Similarly, we will also add the attributes to the second Product element. The code shown in Figure 8.5 is also available on the accompanying CD.

Figure 8.5 Catalog2.xml

Let us not get confused by the “attribute” label! An attribute is just an additional way to attach data to an element. Rather than using attributes, we could just as easily have modeled Type and SupplierId as child elements of the Product element. Alternatively, we could have modeled the entire Product element to be composed of only attributes as follows:

<Product ProductID="F10" ProductName="Shimano Calcutta" ListPrice="47.76" Type="Spinning Reel" SupplierId="5">

</Product>
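For comparison, the other two ways of modeling the same product can be sketched as follows. These are two alternative fragments rather than one document, and they assume the element and attribute names used elsewhere in this chapter; the actual Catalog2.xml on the CD may differ.

```xml
<!-- Alternative 1: Type and SupplierId as attributes, the rest as child elements -->
<Product Type="Spinning Reel" SupplierId="5">
  <ProductId>F10</ProductId>
  <ProductName>Shimano Calcutta</ProductName>
  <ListPrice>47.76</ListPrice>
</Product>

<!-- Alternative 2: the same data modeled with child elements only -->
<Product>
  <ProductId>F10</ProductId>
  <ProductName>Shimano Calcutta</ProductName>
  <ListPrice>47.76</ListPrice>
  <Type>Spinning Reel</Type>
  <SupplierId>5</SupplierId>
</Product>
```

All three forms carry the same information; the choice mostly affects how the consuming code reads the data (attribute access versus child-element access).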

At the initial stage, the necessity of attributes may appear questionable. Nevertheless, they exist in the W3C recommendation, and in most situations they come in handy when designing otherwise-complex XML-based systems.

 Empty Element We have already mentioned a couple of times that an element may contain other elements, or data, or both. However, an element does not necessarily have to have any of them. If needed, it can be kept totally empty. For example, observe the following element:

    <Input type="text" id="txtCity" runat="server" />

The empty element is a correct XML element. The name of the element is Input. It has three attributes: type, id, and runat. However, neither does it contain any sub-elements, nor does it contain any explicit data. Hence, it is an empty element. We may specify an empty element in one of two ways:

 Just before the “>” symbol of the start-tag, add a slash (/), as shown above, or

 Terminate the element using standard end-tag as follows:

    <Input type="text" id="txtCity" runat="server" ></Input>

Examples of some empty elements are: <br/>, <Pup Age="1" />, <Story></Story>, and <Mail/>. Note that attribute values must always be quoted, even in an empty element.

Well-Formed XML Documents

At first sight, an XML document may appear to be like a standard HTML document with additional user-given tag names. However, the syntax of an XML document is much more rigorous than that of an HTML document. HTML lets us spell many tags incorrectly (the browser simply ignores them), and it is a free world out there for people who are not case-sensitive. For example, we may use <BODY> and </Body> in the same HTML document without getting into trouble. By contrast, there are certain rules that must be followed when we develop an XML document. Please refer to the http://W3C.org Web site for the details. Some basic rules, among many others, are as follows:

 The document must have exactly one root element.

 Each element must have a start-tag and end-tag.

 The elements must be properly nested.

 An attribute’s name must begin with a letter or an underscore.

 A particular attribute name may appear only once in the same start tag.

An XML document that is syntactically correct is often called a “well-formed” document. If the document is not well formed, Internet Explorer will display an error message. For example, the following XML document will produce an error message when opened in Internet Explorer, simply because the start-tag <product> and the end-tag </Product> differ in case.

<?xml version="1.0"?>

<product>

<ProductID>F10</ProductID>

</Product>
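The same check can be made in code rather than in the browser. The following is a minimal sketch, assuming a console program instead of an .aspx page and using the XmlDocument class (introduced later in this chapter) purely as a parser; the module and function names are hypothetical.

```vbnet
Imports System
Imports System.Xml

Module WellFormedCheck
    ' Returns True when the text parses as well-formed XML, False otherwise.
    Function IsWellFormed(ByVal xmlText As String) As Boolean
        Dim doc As New XmlDocument()
        Try
            doc.LoadXml(xmlText)
            Return True
        Catch ex As XmlException
            ' The parser raises XmlException for mismatched tags, bad nesting, and so on.
            Return False
        End Try
    End Function

    Sub Main()
        ' Start-tag and end-tag differ in case, so this is not well formed.
        Dim bad As String = "<?xml version=""1.0""?><product><ProductID>F10</ProductID></Product>"
        Dim good As String = "<?xml version=""1.0""?><Product><ProductID>F10</ProductID></Product>"
        Console.WriteLine(IsWellFormed(bad))   ' False
        Console.WriteLine(IsWellFormed(good))  ' True
    End Sub
End Module
```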

Schema and Valid XML Documents

An XML document may be well formed, but it may not necessarily be a valid XML document. A valid XML document is a document that conforms to the rules specified in its Document Type Definition (DTD) or schema. DTD and schema are actually two different ways to specify the rules about the contents of an XML document. The DTD has several shortcomings. First, a DTD is not written in XML syntax; that is, a DTD is not itself an XML document. Second, the data types available to define the contents of an attribute or element are very limited in DTD. This is why, although VS.NET allows both DTD and schema, we will present only the schema specification in this chapter. The W3C has put forward the candidate proposal for the standard schema specification (www.w3.org/XML/Schema#dev). The XML Schema Definition (XSD) specification by W3C has been implemented in ADO.NET. VS.NET supports the XSD specifications.

A schema is simply a set of predefined rules that describe the data contents of an XML document. Conceptually, it is very similar to the definition of a relational database table. In an XML schema, we define the structure of an XML document, its elements, the data types of the elements and associated attributes, and most importantly, the parent-child relationships among the elements. We may develop a schema in many different ways. One way is to enter the definition manually using Notepad. We may also develop schema using visual tools, such as VS.NET or XML Authority. Many automated tools may also generate a rough-cut schema from a sample XML document (similar to reverse-engineering).

If we do not want to code a schema manually, we may generate a rough-cut schema of a sample XML document using VS.NET XML Designer. We may then polish the rough-cut schema to conform to our exact business rules. In VS.NET, it is just a matter of one click to generate a schema from a sample XML document. Use the following steps to generate a rough-cut schema for our Catalog1.xml document:

 Open the Catalog1.xml file in a VS.NET project. VS.NET will display the XML document with its XML View and Data View tabs at the bottom.

 Click on the XML menu pad of the Main menu.

That’s all! The system will create the schema named Catalog1.xsd. If we double-click on the Catalog1.xsd file in the Solution Explorer, we will see the screen shown in Figure 8.6. We will see the DataSet view tab and the XML view tab at the bottom of the screen. We will elaborate on the DataSet view later in the chapter.

Figure 8.6 Truncated Version of the XSD Schema Generated by the XML Designer

For discussion purposes, we have also listed the contents of the schema in Figure 8.7. The XSD starts with certain standard entries at the top. Although the code for an XSD may appear complex, there is no need to get overwhelmed by its syntax. Actually, the structural part of an XSD is very simple. An element is defined to contain either one or more complexType or simpleType data structures. A complexType data structure nests other complexType or simpleType data structures. A simpleType data structure contains only data.

Figure 8.7 Partial Contents of Catalog1.xsd

In our XSD example (Figure 8.7), the Catalog element may contain one or more (unbounded) instances of the Product element. Thus, it is defined to contain a complexType structure. Besides containing the Product element, it may also contain other elements (for example, it could contain an element Supplier). In the XSD construct, we specify this rule using a choice structure as follows:
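A sketch of what such a schema might look like, assuming the W3C 2001 XML Schema namespace, the xs prefix, and the element names used for Catalog1.xml in this chapter; the schema that the XML Designer actually generates may carry additional attributes and a different prefix:

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Catalog">
    <xs:complexType>
      <!-- choice + unbounded: any of the listed child elements, repeated without limit -->
      <xs:choice maxOccurs="unbounded">
        <xs:element name="Product">
          <xs:complexType>
            <!-- sequence: the child elements appear in this order -->
            <xs:sequence>
              <xs:element name="ProductId" type="xs:string" minOccurs="0" />
              <xs:element name="ProductName" type="xs:string" minOccurs="0" />
              <xs:element name="ListPrice" type="xs:decimal" minOccurs="0" />
            </xs:sequence>
          </xs:complexType>
        </xs:element>
        <!-- a further element, such as Supplier, could be listed here as another choice -->
      </xs:choice>
    </xs:complexType>
  </xs:element>
</xs:schema>
```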

NOTE

An XSD is itself a well-formed XML document.

Because the Product element contains further elements, it also contains a complexType structure. This complexType structure, in turn, contains a sequence of ProductId, ProductName, and ListPrice. These elements do not contain further elements. Thus, we simply provide their data types in their definitions. The automated generator failed to identify the ListPrice element’s text as decimal data, so we converted its data type to decimal manually. The listing of Catalog1.xsd is shown in Figure 8.7. The code is also available on the accompanying CD.

Minimal knowledge about the XSD schema is required to understand the XML.NET architecture. You will find it especially useful when we discuss the XmlDataDocument.

NOTE

Readers interested in the details of DTD and Schema may explore http://msdn.microsoft.com/xml/default.asp and www.w3.org/XML.

Developing & Deploying …

XML Validation in VS.NET

VS.NET provides a number of tools to work on XML documents. One of them enables you to check if a given XML document is well formed. While on the XML view of an XML document, you may use XML>>Validate XML Data of the main menu to see if the document is well formed. The system displays its findings in the bottom-left corner of the status bar. Similarly, you can use the Schema Validation tool to check if your schema is well formed, too. While on the XML view of the schema, use the Schema>>Validate Schema of the main menu to perform this task.

However, none of the above tests guarantee that your XML data is valid according to the rules specified in the schema. To accomplish this task, you will need to link your XML document to a particular schema first. Then you can test the validity of the XML document. To assign a schema to an XML document, perform the following steps:

1. Display the XML document in XML view (in the XML Designer).

2. Display its Property sheet. (It will be captioned DOCUMENT.)

3. Open the drop-down list box at the right-hand side of the targetSchema, and select the appropriate schema.

4. Now, go ahead and validate the document using the XML>>Validate XML Data of the main menu.

By the way, there are many other third-party software packages that can also test if an XML document is well formed, and if it is valid (against a given schema). In this context, we have found the XML Authority (by TIBCO) and XML Writer (by Wattle Software) to be very good. An excellent tool named XSV is also available from www.w3.org/2000/09/webdata/xsv.

Structure of an XML Document

In an XML document, the data are stored in a hierarchical fashion. A hierarchy is also referred to as a tree in data structures. Conceptually, the data stored in the Catalog1.xml can be represented as a tree diagram, as shown in Figure 8.8. Please note that certain element names and values have been abbreviated in the tree diagram, mostly to conserve real estate on the page.

Figure 8.8 The Tree-Diagram for Catalog1.xml

In this figure, each rectangle is a node in the tree. Depending on the context, a node can be of different types. For example, each product node in the figure is an element-type node. Each product node happens to be a child node of the catalog node. The catalog node can also be termed as the parent of all product nodes. Each product node, in turn, is the parent of its PId, PName, and Price nodes.

In this particular tree diagram, the bottom-most nodes are not of element-type; rather, these are of text-type. There could have been nodes for each attribute and its value, too, although we have not shown those in this diagram.

The Product nodes are the immediate descendants of the Catalog node. Both Product nodes are siblings of each other. Similarly, the PId, PName, and Price nodes under a specific product node are also siblings of each other. In short, all children of a parent are called siblings.

At this stage, you may have been wondering why we are studying the family history rather than ASP. Well, you will find out pretty soon that all of these terminologies will play major roles in taming the beauties and the beasts of something called XML technology.

Processing XML Documents Using .NET

The entire ADO.NET Framework has been designed based on XML technology. Many of the ADO.NET data-handling methodologies, including DataTables and DataSets, use XML in the background, thus keeping it transparent to us. The .NET Framework’s System.Xml namespace provides a very rich collection of classes that can be used to store and process XML documents. These classes are also often referred to as the XML.NET.

Before we get into the details of the XML.NET objects, let us ask ourselves several questions. As ASP.NET developers, what kind of support would we need from .NET for processing XML documents? Well, at the very least, we would like .NET to assist us in creating, reading, and parsing XML documents. Anything else? Okay, if we have adequate memory, we would like to load the entire document into memory and process it directly from there. If we do not have enough memory, then we would like to read an XML document one fragment at a time. Do we want more? How about the ability to search and query the information contained in an XML document? How about instantly creating an XML document from a database query and sending it to our B2B partners? How about converting an XML document from one format to another and transmitting it to other servers? Actually, XML.NET provides all of these, and much more! All of the above questions fall into two major categories:

1. How do we read, parse and write XML documents?

2. How do we store, structure, and process them in the memory?

As mentioned earlier, XML is associated with a growing family of technologies and frameworks. The major trends in this area are W3C DOM, XSLT, XPath, XPath Query, and SAX. In XML.NET, Microsoft has incorporated almost all of these frameworks and technologies. It has also added some of its own unique ideas. There is a plethora of alternative XML.NET objects to satisfy our needs and likings. However, it’s a jungle out there! In the remainder of this section, we will take a brief look at this jungle.

Migrating…

Legacy Systems and XML

Organizational data stored in legacy systems can be converted to appropriate XML documents, if needed, reasonably easily. There is third-party software like XML Authority by Tibco Extensibility and others, which can convert legacy system’s data into XML format. We can also use VS.NET to convert legacy data to XML documents.

Reading and Writing XML Documents

Two primary classes in this group are XmlReader and XmlWriter. Both of these classes are abstract classes, and therefore we cannot create objects of these classes. Microsoft has provided a number of concrete implementations of both of these classes:

 XmlTextReader We may use an object of this class to read non-cached XML data on a forward-only basis. It checks for well-formed XML, but it does not support data validation.

 XmlNodeReader An object of this class can be used to access non-cached forward-only data from an XML node. It does not support data validation.

 XmlValidatingReader This is very similar to the XmlTextReader, except that it accommodates XML data validation.

We may create objects of these classes and use their methods and properties. If warranted, we may also extend these classes to provide further specific functionality. Fortunately, the XmlWriter class has only one concrete implementation: XmlTextWriter. It can be used to write an XML document on a forward-only basis. These classes and their relationships are shown in Figure 8.9.

Figure 8.9 Major XmlReader and XmlWriter Classes

Storing and Processing XML Documents

Once XML data are read, we need to structure these data in the computer’s memory. For this purpose, the major offerings include the XmlNode class and the XPathDocument class. The XmlNode class is an abstract class. There are a number of concrete implementations of this class, too, such as the XmlDocument, XmlAttribute, XmlDocumentFragment, and so on. We will limit our attention to the XmlDocument class, and to one of its subsequent extensions named the XmlDataDocument. The characteristics of some of these classes are as follows:

 XmlDocument This class structures an XML document according to a DOM tree (as specified in the W3C DOM Core Level 1 and 2 specifications).

 XmlDataDocument This class is a major milestone in integrating XML and database processing. It allows two views of the in-cache data: the Relational Table view, and the XML Tree View.

 XPathDocument This class employs the XSLT and XPath technologies, and enables you to transform an XML document into a desired format.

The above classes are essentially used for storing XML data in memory. Just storing data in memory serves no purpose unless we can process and query these data. The .NET Framework includes a number of classes to operate on the cached XML data. These classes include XPathNavigator, XPathNodeIterator, XslTransform, XmlNodeList, and so on. These classes are shown in Figure 8.10.

Figure 8.10 Major XML Classes for In-Memory Storage and Processing

Reading and Parsing Using the XmlTextReader Class

The XmlTextReader class provides a fast forward-only cursor that can be used to “pull” data from an XML document. An instance of it can be created as follows:

Dim myRdr As New XmlTextReader(Server.MapPath("catalog2.xml"))

Once an instance is created, the imaginary cursor is set at the top of the document. We may use its Read() method to extract fragments of data sequentially. Each fragment of data is distantly similar to a node of the underlying XML tree. The NodeType property captures the type of the data fragment read, the Name property contains the name of the node, and the Value property contains the value of the node, if any. Thus, once a data fragment has been read, we may use the following type of statement to display the node-type, name, and value of the node.

Response.Write(myRdr.NodeType.ToString() & " " & _
    myRdr.Name & ": " & myRdr.Value)

The attributes are treated slightly differently in the XmlTextReader object. When a node is read, we may use the HasAttributes property of the reader object to see if there are any attributes attached to it. If there are attributes in an element, the MoveToAttribute(i) method can be applied to iterate through the attribute collection. The AttributeCount property contains the number of attributes of the current element. Once we process all of the attributes, we need to apply the MoveToElement method to move the cursor back to the current element node. Therefore, the following code will display the attributes of an element:
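The attribute-handling steps above can be sketched as follows. This is a minimal console version rather than the book's .aspx code: it reads from an in-memory string instead of Server.MapPath, and the module, function, and constant names are hypothetical.

```vbnet
Imports System
Imports System.IO
Imports System.Text
Imports System.Xml

Module AttributeSketch
    ' Hypothetical in-memory stand-in for one Product element of Catalog2.xml.
    Const ProductXml As String = _
        "<Product Type=""Spinning Reel"" SupplierId=""5"">" & _
        "<ProductId>F10</ProductId></Product>"

    ' Returns "name=value;" for every attribute found on any element.
    Function ListAttributes(ByVal xmlText As String) As String
        Dim result As New StringBuilder()
        Dim myRdr As New XmlTextReader(New StringReader(xmlText))
        While myRdr.Read()
            If myRdr.NodeType = XmlNodeType.Element AndAlso myRdr.HasAttributes Then
                ' Visit each attribute of the current element by index.
                For i As Integer = 0 To myRdr.AttributeCount - 1
                    myRdr.MoveToAttribute(i)
                    result.Append(myRdr.Name & "=" & myRdr.Value & ";")
                Next
                ' Return the cursor to the element that owns the attributes.
                myRdr.MoveToElement()
            End If
        End While
        myRdr.Close()
        Return result.ToString()
    End Function

    Sub Main()
        Console.WriteLine(ListAttributes(ProductXml))  ' Type=Spinning Reel;SupplierId=5;
    End Sub
End Module
```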

Microsoft has loaded the XmlTextReader class with a variety of convenient class members. Some of the frequently used methods and properties are AttributeCount, Depth, EOF, HasAttributes, HasValue, IsDefault, IsEmptyElement, Item, ReadState, and Value.

Parsing an XML Document

In this section, we will apply the XmlTextReader object to parse and display all data contained in our Catalog2.xml document (as shown in Figure 8.5). The output of this example and its code are shown in Figures 8.11 and 8.12, respectively. The code shown in Figure 8.12 is available on the accompanying CD. Our objective is to start at the top of the document and then travel sequentially through its nodes using the XmlTextReader’s Read() method. When there is no more data to read, the Read() method returns “false.” Thus, we are able to build the While myRdr.Read() loop to process all data. Please review the code (Figure 8.12) and its output carefully. While displaying the data, we have separated the node-type, node-name, and values using colons. Not all nodes have names or values. Hence, you will see many empty names and values after the respective colons.

Figure 8.11 Truncated Output of the XmlTextReader1.aspx Code

Figure 8.12 XmlTextReader1.aspx
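The core of such a parsing loop can be sketched as follows. This is not the listing from the CD: it is a console version that reads a small in-memory document, and the names DumpNodes and NodeDumpSketch are hypothetical.

```vbnet
Imports System
Imports System.IO
Imports System.Text
Imports System.Xml

Module NodeDumpSketch
    ' Returns one "NodeType: Name: Value" line per node the reader reports.
    Function DumpNodes(ByVal xmlText As String) As String
        Dim output As New StringBuilder()
        Dim myRdr As New XmlTextReader(New StringReader(xmlText))
        ' Read() returns False once the end of the document is reached.
        While myRdr.Read()
            output.AppendLine(myRdr.NodeType.ToString() & ": " & _
                              myRdr.Name & ": " & myRdr.Value)
        End While
        myRdr.Close()
        Return output.ToString()
    End Function

    Sub Main()
        ' Element nodes have a name but no value; text nodes have a value but no name.
        Console.Write(DumpNodes( _
            "<Catalog><Product><ProductId>F10</ProductId></Product></Catalog>"))
    End Sub
End Module
```

For the element ProductId the line reads "Element: ProductId: " and for its text the line reads "Text: : F10", which is why the output contains many empty names and values.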

Navigating through an XML Document to Retrieve Data

In the previous section, we extracted and displayed all data, including the “whitespaces” contained in an XML document. Now, we will illustrate an example where we will navigate through the document and pick up only those data that are necessary for an application. The output of this application is shown in Figure 8.13. In this example, we will display the names of our products in a list box. We will load the list box using the Product Name data from the XML file. The user will select a particular product. Subsequently, we will search the XML document to find and display the price of the product. We will travel through the XML file twice, once to load the list box, and once to find the price of a selected product. Please be aware that we could have easily developed the application by building an array or arraylist of the products during the first pass through the XML data, thus avoiding a second pass. Nevertheless, we are reading the file twice just to illustrate various methods and properties of the XmlTextReader object.

Figure 8.13 Output of the Navigation ASPX Example XmlTextReader2.aspx

To load the List Box, we will go through the following process: We will load the list box in the Page_Load event. Here, we will read the nodes one at a time. If the node type is of element-type, we will check if its name is ProductName. If it is a ProductName node, we will perform a Read() to get to its text node and then apply the myRdr.ReadString() method to extract the value and load it in the list box. Finally, we will close the reader object. Caution: We are assuming that there is no “whitespace” between the ProductName and its Text node. If there is a “whitespace,” we will need to put the second Read() in a loop until the node-type is Text.
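The first pass can be sketched as follows, assuming a console program that collects the names into an ArrayList instead of a list box. As a variation on the two-step approach described above, this sketch calls ReadString() while the reader is still positioned on the ProductName element, which returns the element's text directly. The module and function names are hypothetical.

```vbnet
Imports System
Imports System.Collections
Imports System.IO
Imports System.Xml

Module ProductNameSketch
    ' Hypothetical in-memory stand-in for the catalog file.
    Const CatalogXml As String = _
        "<Catalog>" & _
        " <Product><ProductId>F10</ProductId>" & _
        "<ProductName>Shimano Calcutta</ProductName><ListPrice>47.76</ListPrice></Product>" & _
        " <Product><ProductId>F20</ProductId>" & _
        "<ProductName>Bantam Lexica</ProductName><ListPrice>49.99</ListPrice></Product>" & _
        "</Catalog>"

    ' Collects the text of every ProductName element in document order.
    Function GetProductNames(ByVal xmlText As String) As ArrayList
        Dim names As New ArrayList()
        Dim myRdr As New XmlTextReader(New StringReader(xmlText))
        While myRdr.Read()
            If myRdr.NodeType = XmlNodeType.Element AndAlso myRdr.Name = "ProductName" Then
                names.Add(myRdr.ReadString())
            End If
        End While
        myRdr.Close()
        Return names
    End Function

    Sub Main()
        For Each productName As String In GetProductNames(CatalogXml)
            Console.WriteLine(productName)
        Next
    End Sub
End Module
```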

To find the price of the selected product, we will go through the following process: We will include the necessary code in the “onclick” event code of the command button “Show Price.” We will create a second XmlTextReader object based on the catalog2.xml file. Of course, we may scan all nodes sequentially to find the price. However, the XmlTextReader class enables you to skip undesirable nodes, such as the “whitespace” or the declaration nodes, via the MoveToContent() method. According to Microsoft, all non-whitespace text, Element, EndElement, EntityReference, and EndEntity nodes are content nodes. The MoveToContent() method checks whether the current node is a content node. If the node is not a content node, then the method skips to the next content node. You need to be careful though. If the current node happens to be a content node, the cursor does not move to the next content node automatically on a further MoveToContent().

Initially, when we instantiate the reader object, its node type is None. It happens to be a noncontent node. Hence our first MoveToContent() statement takes us to a content node. There, we check if it is an Element-type node named “ProductName” and if its ReadString() is equal to the name of the selected product. If all are true, then we apply a Read() to go to the next node. This Read() may take us to a “whitespace” node, and thus we have applied a MoveToContent() to get to the ListPrice node. Figure 8.14 shows an excerpt of the relevant code. The complete code is available in the XmlTextReader2.aspx file on the CD.

Figure 8.14 Excerpt of XmlTextReader2.aspx
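The search logic can be sketched as follows. This is not the excerpt from the CD: it is a console version over an in-memory document, it drives the loop with Read() followed by MoveToContent(), and it assumes that ListPrice immediately follows ProductName inside each Product. The names FindPrice and PriceLookupSketch are hypothetical.

```vbnet
Imports System
Imports System.IO
Imports System.Xml

Module PriceLookupSketch
    ' Hypothetical in-memory stand-in for the catalog file, with whitespace between elements.
    Const CatalogXml As String = _
        "<Catalog> <Product> <ProductId>F10</ProductId> " & _
        "<ProductName>Shimano Calcutta</ProductName> <ListPrice>47.76</ListPrice> </Product> " & _
        "<Product> <ProductId>F20</ProductId> " & _
        "<ProductName>Bantam Lexica</ProductName> <ListPrice>49.99</ListPrice> </Product> </Catalog>"

    ' Returns the ListPrice text for the named product, or "" when it is not found.
    Function FindPrice(ByVal xmlText As String, ByVal selectedName As String) As String
        Dim price As String = ""
        Dim myRdr As New XmlTextReader(New StringReader(xmlText))
        While myRdr.Read()
            ' Skip whitespace and other noncontent nodes.
            If myRdr.MoveToContent() = XmlNodeType.Element AndAlso myRdr.Name = "ProductName" Then
                ' ReadString() leaves the cursor on the </ProductName> end tag.
                If myRdr.ReadString() = selectedName Then
                    myRdr.Read()           ' step past the end tag (may land on whitespace)
                    myRdr.MoveToContent()  ' skip ahead to the next content node
                    If myRdr.Name = "ListPrice" Then price = myRdr.ReadString()
                    Exit While
                End If
            End If
        End While
        myRdr.Close()
        Return price
    End Function

    Sub Main()
        Console.WriteLine(FindPrice(CatalogXml, "Bantam Lexica"))  ' 49.99
    End Sub
End Module
```

The explicit Read() after the match matters: an end tag is itself a content node, so MoveToContent() alone would not move off it.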

By the way, we could have also used the MoveToContent() method to load our list box more effectively. However, we just wanted to show the alternative methodologies.

NOTE

We may also read XML files from remote servers as follows:

Dim myRdr As New XmlTextReader("http://ahmed2/Chapter8/Catalog2.xml")

Writing an XML Document Using the XmlTextWriter Class

The XmlTextWriter class is a concrete implementation of the XmlWriter abstract class. An XmlTextWriter object can be used to write data sequentially to an output stream, or to a disk file as an XML document. The data to be written may come from the user’s input and/or from a variety of other sources, such as text files, databases, XmlTextReaders, or XmlDocuments. Its major methods and properties include Close, Flush, Formatting, WriteAttributes, WriteAttributeString, WriteComment, WriteElementString, WriteEndAttribute, WriteEndDocument, WriteState, and WriteStartDocument.

Generating an XML Document Using XmlTextWriter

In this section, we will collect user-given data via an .aspx page, and write the information to an XML file. The run-time view of the application is shown in Figure 8.15. On the click event of the “Create XML File” button, the application will create the XML file (on disk) and display it back in the browser as seen in Figure 8.16.

Figure 8.15 Output of the XmlTextWriter1.aspx

Figure 8.16 Generated XML File

We have included the necessary code in the click event of the command button. Our objective is to write the data in a disk file named Customer.xml. In the code, first we have created an instance of the XmlTextWriter object as follows:

Dim myWriter As New XmlTextWriter _
    (Server.MapPath("Customer.xml"), Nothing)

The second parameter, “Nothing,” selects the default encoding, so the file is written in UTF-8 format. Then it is just a matter of writing the various elements, attributes, and their values judiciously. Once the file is written, we simply employ Response.Redirect(Server.MapPath("Customer.xml")) to display the XML document’s information in the browser. The complete code for the application is shown in Figure 8.17. Both Customer.xml and XmlTextWriter1.aspx files are available on the accompanying CD.

Figure 8.17 XmlTextWriter1.aspx
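As a rough illustration of the writing sequence, the following console sketch builds a small customer document in memory. The element and attribute names (Customer, Id, Name, City) and the BuildCustomerXml helper are our own assumptions and are not the layout of the book's Customer.xml. The sketch writes to a StringWriter so that it can run anywhere; passing a file path and Nothing to the XmlTextWriter constructor, as shown earlier, would write a UTF-8 file instead.

```vb
Imports System
Imports System.IO
Imports System.Xml

Module CustomerWriterSketch
    ' Writes one hypothetical customer record and returns the XML text.
    Public Function BuildCustomerXml(id As String, name As String, city As String) As String
        Dim buffer As New StringWriter()
        Dim writer As New XmlTextWriter(buffer)
        writer.Formatting = Formatting.Indented      ' one element per line
        writer.WriteStartDocument()                  ' the XML declaration
        writer.WriteComment("Generated with XmlTextWriter")
        writer.WriteStartElement("Customer")
        writer.WriteAttributeString("Id", id)
        writer.WriteElementString("Name", name)      ' special characters are escaped
        writer.WriteElementString("City", city)
        writer.WriteEndElement()
        writer.WriteEndDocument()
        writer.Close()
        Return buffer.ToString()
    End Function

    Sub Main()
        Console.WriteLine(BuildCustomerXml("C1", "Smith & Sons", "Columbus"))
    End Sub
End Module
```

The writer takes care of escaping, so the ampersand in the sample name appears as an entity reference in the output rather than breaking the document.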

Exploring the XML Document Object Model

The W3C Document Object Model (DOM) is a set of specifications to represent an XML document in the computer’s memory. Microsoft has implemented the W3C Document Object Model via a number of .NET objects. The XmlDocument is one of these objects. When an XmlDocument object is loaded, it organizes the contents of an XML document as a “tree” (as shown in Figure 8.18). Whereas the XmlTextReader object provides a forward-only cursor, the XmlDocument object provides fast and direct access to a node. However, a DOM tree is memory intensive, especially for large XML documents.

Figure 8.18 Node Addressing Techniques in an XML DOM Tree

An XmlDocument object can be loaded from an XmlTextReader. Once it is loaded, we may navigate via the nodes of its tree using numerous methods and properties. Some of the frequently used members are the following: DocumentElement (root of the tree), ChildNodes (all children of a node), FirstChild, LastChild, HasChildNodes, ChildNodes.Count (# of children), InnerText (the content of the sub-tree in text format), Name (node name), NodeType, and Value (of a text node) among many others.

If needed, we may address a node using the parent-child hierarchy. The first child of a node is ChildNodes(0), the second child is ChildNodes(1), and so on. For example, the first product can be referenced as DocumentElement.ChildNodes(0). Similarly, the price of the second product can be addressed as DocumentElement.ChildNodes(1).ChildNodes(2).InnerText.
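The addressing scheme can be tried out with a short console sketch. The sample catalog below is hypothetical; we only assume, as the text implies, that the price is the third child of each product element. The PriceOfProduct helper is our own name.

```vb
Imports System
Imports System.Xml

Module DomAddressingSketch
    ' Hypothetical catalog: each Product has three children, price last.
    Public Const SampleXml As String =
        "<Catalog>" &
        "<Product><ProductID>F10</ProductID><ProductName>Fly Rod</ProductName><ListPrice>129.95</ListPrice></Product>" &
        "<Product><ProductID>F20</ProductID><ProductName>Spinning Reel</ProductName><ListPrice>47.76</ListPrice></Product>" &
        "</Catalog>"

    ' Addresses the price purely by ordinal position in the tree.
    Public Function PriceOfProduct(xml As String, index As Integer) As String
        Dim doc As New XmlDocument()
        doc.LoadXml(xml)
        Return doc.DocumentElement.ChildNodes(index).ChildNodes(2).InnerText
    End Function

    Sub Main()
        ' Price of the second product (index 1).
        Console.WriteLine(PriceOfProduct(SampleXml, 1))
    End Sub
End Module
```

Positional addressing is compact, but it silently depends on the order of the child elements; if the document layout may change, selecting by element name is the safer choice.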

Navigating through an XmlDocument Object

In this example we will implement our product selection page using the XML document object model. The output of the code is shown in Figure 8.19.

Figure 8.19 Output of the XmlDocument Object Example

Let’s go through the process of loading the XmlDocument (DOM tree). There are a number of different ways to load an XmlDocument object. We will load it using an XmlTextReader object. We will ask the reader to ignore the “whitespaces” (more or less to conserve memory). As you can see from the following code, we are loading the tree in the Page_Load event. On “PostBack”, we will not have access to this tree. That is why we are storing the “tree” in a Session variable. When the user makes a selection, we will retrieve the tree from the session, and search its nodes for the appropriate price.

Once the tree is loaded, we can load the list box with the InnerText property of the ProductName nodes.

Next, let’s investigate how to retrieve the price of a selected product. On click of the Show Price button, we simply retrieve the tree from the session, and get to the Price node directly. The SelectedIndex property of the list box does us a favor, as its value will match the corresponding child’s ordinal position in the Catalog (DocumentElement). Figure 8.20 shows an excerpt of the relevant code that is used to retrieve the price of a selected product. The complete code is available in the XmlDom1.aspx file on the accompanying CD.

Figure 8.20 Partial Listing of XmlDom1.aspx

Parsing an XML Document Using the XmlDocument Object

A tree is composed of nodes. Essentially, a node is also a tree because it contains all other nodes below it. A node at the bottom does not have any children; hence, most likely it will be a text-type node. We will exploit this property to travel through a tree using a VB recursive procedure. The primary objective of this example is to travel through the DOM tree and display the information contained in each of its nodes. The output of this exercise is shown in Figure 8.21.

Figure 8.21 Parsing an XmlDocument Object

We will develop two subprocedures:

1. DisplayNode(node As XmlNode) It will receive a node and check if it is a terminal node. If the node is a terminal node, this subprocedure will print its contents. If the node is not a terminal node, then the subprocedure will check if the node has any attributes. If there are attributes, it will print them.

2. TravelDownATree(tree As XmlNode) It will receive a tree, and at first it will call the DisplayNode procedure. Then it will pass each sub-tree of the received tree to itself. This is a recursive procedure. Thus, it will actually visit all nodes of a received tree, and we will get all nodes of the entire tree printed.

The complete listing of the code is shown in Figure 8.22. The code is also available in the file named XmlDom2.aspx in the accompanying CD. As usual, we will load the XmlDocument in the Page_Load() event using an XmlTextReader. After the DOM tree is loaded, we will call the TravelDownATree recursive procedure, which will accomplish the remainder of the job.

Figure 8.22 The Complete Code XmlDom2.aspx
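The two subprocedures can be sketched as follows. This is not the listing from XmlDom2.aspx; it is a console approximation in which we have added a depth parameter for indentation and a StringBuilder for collecting the output (both are our own additions), and the sample document is hypothetical.

```vb
Imports System
Imports System.Text
Imports System.Xml

Module DomTraversalSketch
    ' Prints a terminal node's content, or a non-terminal node's name
    ' followed by any attributes it carries.
    Private Sub DisplayNode(node As XmlNode, depth As Integer, output As StringBuilder)
        Dim indent As New String(" "c, depth * 2)
        If Not node.HasChildNodes Then
            output.AppendLine(indent & node.Name & ": " & node.Value)
        Else
            output.Append(indent & node.Name)
            If node.Attributes IsNot Nothing Then
                For Each attr As XmlAttribute In node.Attributes
                    output.Append(" " & attr.Name & "=" & attr.Value)
                Next
            End If
            output.AppendLine()
        End If
    End Sub

    ' Displays the received node, then recurses into each of its sub-trees.
    Private Sub TravelDownATree(tree As XmlNode, depth As Integer, output As StringBuilder)
        If tree Is Nothing Then Return
        DisplayNode(tree, depth, output)
        For Each child As XmlNode In tree.ChildNodes
            TravelDownATree(child, depth + 1, output)
        Next
    End Sub

    ' Loads the XML text and returns the printed form of the whole tree.
    Public Function Describe(xml As String) As String
        Dim doc As New XmlDocument()
        doc.LoadXml(xml)
        Dim output As New StringBuilder()
        TravelDownATree(doc.DocumentElement, 0, output)
        Return output.ToString()
    End Function

    Sub Main()
        Console.Write(Describe(
            "<Catalog><Product Id=""F10""><ProductName>Fly Rod</ProductName></Product></Catalog>"))
    End Sub
End Module
```

The recursion ends naturally at the leaves, because a node without children has an empty ChildNodes collection and the loop body never runs.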

Using the XmlDataDocument Class

The XmlDataDocument class is an extension of the XmlDocument class. It behaves almost the same way the XmlDocument does. The most fascinating feature of an XmlDataDocument object is that it provides two alternative views of the same data, the “XML view” and the “relational view.” The XmlDataDocument has a property named DataSet. It is through this property that XmlDataDocument exposes its data as one or more related or unrelated DataTables. A DataTable is actually an imaginary table-view of XML data. Once we load an XmlDataDocument object, we can treat it as a DOM tree, or we can treat its data as a DataTable (or a collection of DataTables) via its DataSet property. Figure 8.23 shows the two views of an XmlDataDocument. Because these views are drawn from the same XmlDataDocument object, they are automatically synchronized. That means that a change made through either view is reflected in the other.

Figure 8.23 Two Views of an XmlDataDocument Object

In this section, we will provide three examples:

• We will demonstrate how to load an XML document as an XmlDataDocument object, and process it as a DOM tree.

• We will illustrate how to retrieve the data from a DataTable view of the XmlDataDocument’s DataSet.

• Finally, we will demonstrate when and how the XmlDataDocument object provides multiple-table views.

Loading an XmlDocument and Retrieving the Values of Certain Nodes

In this section we will load an XmlDataDocument using our Catalog2.xml file. After we load it, we will retrieve the product names and load them in a list box. The output of this example is shown in Figure 8.24. The code for this application is listed in Figure 8.25, and it is also available in the file named XmlDataDocument1.aspx on the accompanying CD.

Figure 8.24 Output of XmlDataDocument1.aspx

Figure 8.25 XmlDataDocument1.aspx

The XmlDataDocument is a pleasant object to work with. In this example, the code is pretty straightforward. After we have loaded the XmlDataDocument, we have declared an XmlNodeList collection named productNames. We have populated the collection by using the GetElementsByTagName("ProductName") method of the XmlDataDocument object. Finally, it is just a matter of iterating through the productNames collection and loading each of its members in the list box.
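The same retrieval can be sketched in a console program. Note the substitution: we use a plain XmlDocument here, because GetElementsByTagName is inherited from that class and XmlDataDocument is not available on every .NET runtime. The sample catalog and the ProductNames helper are hypothetical.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Xml

Module ProductNamesSketch
    ' Hypothetical catalog data.
    Public Const SampleXml As String =
        "<Catalog>" &
        "<Product><ProductName>Fly Rod</ProductName><ListPrice>129.95</ListPrice></Product>" &
        "<Product><ProductName>Spinning Reel</ProductName><ListPrice>47.76</ListPrice></Product>" &
        "</Catalog>"

    ' Collects the text of every ProductName element, in document order.
    Public Function ProductNames(xml As String) As List(Of String)
        Dim doc As New XmlDocument()
        doc.LoadXml(xml)
        Dim names As New List(Of String)()
        Dim nodes As XmlNodeList = doc.GetElementsByTagName("ProductName")
        For Each node As XmlNode In nodes
            names.Add(node.InnerText)
        Next
        Return names
    End Function

    Sub Main()
        For Each name As String In ProductNames(SampleXml)
            Console.WriteLine(name)
        Next
    End Sub
End Module
```

In the Web form, the body of the loop would add each name to the list box instead of a List(Of String).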

At this stage, you will probably ask why we are not finding the unit price of the selected product. Actually, therein lies the beauty of the XmlDataDocument. Because it has extended the XmlDocument class, all of the members of the XmlDocument class are also available to us. Thus, we could use the same technique as shown in our previous example to find the price. Nevertheless, the reason for not showing the searching technique here is that we will cover it later when we discuss the XPathNodeIterator object.

Using the Relational View of an XmlDataDocument Object

In this example, we will process and display the Catalog3.xml document’s data as a relational table in a DataGrid. The Catalog3.xml is exactly the same as Catalog2.xml except that it has more data. The Catalog3.xml file is available in the accompanying CD. The output of this example is shown in Figure 8.26.

Figure 8.26 Output of XmlDataDocument DataSet View Example

If we want to process the XML data as relational data, we need to load the schema of the XML document first. We have generated the following schema for the Catalog3.xml using VS.NET. The schema specification is shown in Figure 8.27 (also available in the accompanying CD).

Figure 8.27 Catalog3.xsd

NOTE

When we create a schema from a sample XML document, VS.NET automatically inserts an xmlns attribute in the root element. The value of this attribute specifies the name of the schema. Thus when we created the schema for Catalog3.xml, the schema was named Catalog3.xsd and VS.NET inserted the following attribute in the root element of Catalog3.xml:

<Catalog xmlns="http://tempuri.org/Catalog3.xsd">

In our .aspx code, we loaded the schema using the ReadXmlSchema method of our XmlDataDocument object as:

myDataDoc.DataSet.ReadXmlSchema(Server.MapPath("Catalog3.xsd"))

Next, we have loaded the XmlDataDocument as:

myDataDoc.Load(Server.MapPath("Catalog3.xml"))

Since the XmlDataDocument provides two views, we have exploited its DataSet.Tables(0) property to load the DataGrid and display our XML file’s information in the grid. The complete listing of the code is shown in Figure 8.28. The code is also available in the XmlDataDocDataSet1.aspx file on the accompanying CD.

Figure 8.28 Complete Listing XmlDataDocDataSet1.aspx

Viewing Multiple Tables of an XmlDataDocument Object

In many instances, an XML document may contain nested elements. Suppose that a bank has many customers, and a customer has many accounts. We have modeled this simple scenario in an XML document with nested elements. This document, named Bank1.xml, is shown in Figure 8.29. It is also available in the accompanying CD.

Figure 8.29 Bank1.xml

If we load the above XML document and its schema in an XmlDataDocument object, it will provide two relational table views: one for the customers’ information, and the other for the accounts’ information. Our objective is to display the data of these relational tables in two DataGrids as shown in Figure 8.30.

Figure 8.30 Displaying Customer and Accounts Data in Two Data Grids

To develop this application, first we had to generate the schema for our Bank1.xml file. We used the VS.NET XML designer to accomplish this task. It is interesting to observe that while creating the schema, VS.NET automatically generates the one-to-many relationship between the Customer and Account elements. To establish the relationship, it also creates an auto-numbered primary key column (Customer_Id) in the Customer DataTable. Simultaneously, it inserts the appropriate values of the foreign keys in the Account DataTable. The DataSet view of the generated schema is shown in Figure 8.31.

Figure 8.31 XmlDataDocument DataSet Representation in Visual Studio .NET

In order to provide the relational view of our XML document (Bank1.xml), VS.NET included the Customer_Id attributes in both Customer and Account elements in its generated schema. It also generated the necessary schema entries to describe the implied relationship among the Customer and Account elements. Figure 8.32 shows an excerpt of the generated schema for our XML file. The complete schema is available in a file named Bank1.xsd in the accompanying CD.

Figure 8.32 Primary Key and Foreign Key Specifications in the Bank1.xsd

In the above fragment of the generated schema, the xsd:unique element specifies the Customer_Id attribute as the primary key of the Customer element. Subsequently, the xsd:keyref element specifies the Customer_Id attribute as the foreign key of the Account element. XPath expressions have been used to achieve the afore-mentioned objectives.

The complete listing of the application is shown in Figure 8.33. It is also available in the xmlDataDocDataSet2.aspx file in the accompanying CD. The code is pretty straightforward. We have loaded two data grids from two DataTables of the DataSet, associated with the XmlDataDocument object.

Figure 8.33 Complete Code of XmlDataDocDataSet2.aspx

NOTE

In a Windows Form, the DataGrid control by default provides automatic drill-down facilities for two related DataTables. Unfortunately, it does not work in this fashion in a Web form. Additional programming is needed to simulate the drill-down functionality.

In this example, we have illustrated how an XmlDataDocument object maps nested XML elements into multiple DataTables. Typically, an element is mapped to a table if it contains other elements. Otherwise, it is mapped to a column. Attributes are mapped to columns. For nested elements, the system creates the relationship automatically.
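The mapping rules can be observed with a short console sketch. Two assumptions should be stated plainly: the sample document is a hypothetical stand-in for Bank1.xml, and instead of an XmlDataDocument with a VS.NET-generated schema we let DataSet.ReadXml infer the schema, which applies similar element-to-table and element-to-column rules. The LoadBank helper is our own name.

```vb
Imports System
Imports System.Data
Imports System.IO

Module NestedTablesSketch
    ' Hypothetical bank document: a Customer element nests Account elements.
    Public Const SampleXml As String =
        "<Bank>" &
        "<Customer><Name>Dana Hill</Name>" &
        "<Account><AccountNo>A1</AccountNo><Balance>6000.00</Balance></Account>" &
        "<Account><AccountNo>A2</AccountNo><Balance>150.25</Balance></Account>" &
        "</Customer>" &
        "<Customer><Name>Beth Ray</Name>" &
        "<Account><AccountNo>A3</AccountNo><Balance>2500.00</Balance></Account>" &
        "</Customer>" &
        "</Bank>"

    ' Reads the XML and lets the DataSet infer tables, columns, and relations.
    Public Function LoadBank(xml As String) As DataSet
        Dim ds As New DataSet()
        ds.ReadXml(New StringReader(xml))
        Return ds
    End Function

    Sub Main()
        Dim ds As DataSet = LoadBank(SampleXml)
        For Each table As DataTable In ds.Tables
            Console.WriteLine(table.TableName & ": " & table.Rows.Count & " row(s)")
            For Each column As DataColumn In table.Columns
                Console.WriteLine("  column " & column.ColumnName)
            Next
        Next
        For Each rel As DataRelation In ds.Relations
            Console.WriteLine("relation " & rel.RelationName)
        Next
    End Sub
End Module
```

Elements that contain other elements (Customer, Account) surface as tables, simple elements (Name, AccountNo, Balance) surface as columns, and the generated Customer_Id column ties each account row to its parent customer.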

Querying XML Data Using XPathDocument and XPathNavigator

The XmlDocument and the XmlDataDocument have certain limitations. First of all, the entire document needs to be loaded into memory as an editable DOM tree. Often, the navigation process via the DOM tree itself gets to be clumsy. The navigation via the relational views of the data tables may not be very convenient either. To alleviate these problems, XML.NET provides the XPathDocument and XPathNavigator classes. These classes have been implemented using the W3C XPath 1.0 Recommendation (www.w3.org/TR/xpath).

The XPathDocument class holds the XML data in a read-only, in-memory store that is optimized for XPath queries, so we can process the data without building a full, editable DOM tree. An XPathNavigator object can be used to operate on the data of an XPathDocument. It can also be used to operate on XmlDocument and XmlDataDocument. It supports navigation techniques for selecting nodes, iterating over the selected nodes, and working with these nodes in diverse ways for copying, moving, and removal purposes. It uses XPath expressions to accomplish these tasks.

The W3C XPath 1.0 specification outlines the query syntax for retrieving data from an XML document. The motivation of the framework is similar to SQL; however, the syntax is significantly different. At first sight, the XPath query syntax may appear very complex. But with a certain amount of practice, you may find it very concise and effective in extracting XML data. The details of the XPath specification are beyond the scope of this chapter. However, we will illustrate several frequently used XPath query expressions. In our exercises, we will illustrate two alternative ways to construct the expressions. The first alternative follows the recent XPath 1.0 syntax. The second alternative follows XSL Patterns, which is a precursor to XPath 1.0. Let us consider the following XML document named Bank2.xml. The Bank2.xml document is shown in Figure 8.34, and it is also available in the accompanying CD. It contains data about various accounts. We will use this XML document to illustrate our XPath queries.

Figure 8.34 Bank2.xml

Sample Query Expression 1: Suppose that we want the names of all account holders. The following alternative XPath expressions will accomplish the job equally well:

• Alternative 1: descendant::Name

• Alternative 2: Bank/Account/Name

The first expression can be read as “Give me all Name nodes that are descendants of the current node.” The second expression can be read as “Give me the Name nodes of the Account nodes of the Bank node.” When evaluated from the root of Bank2.xml, both of these expressions will return the same node set.

Sample Query Expression 2: We want the records for all customers from Ohio. We may specify any one of the following expressions:

• Alternative 1: descendant::Account[child::State='OH']

• Alternative 2: Bank/Account[child::State='OH']

Sample Query Expression 3: Any one of the following alternative expressions will return the Account node-sets for all accounts with a balance of more than 5000.00:

• Alternative 1: descendant::Account[child::Balance > 5000]

• Alternative 2: Bank/Account[child::Balance > 5000.00]

Sample Query Expression 4: Suppose that we want the Account information for those accounts whose names start with the letter “D.” Note that element names are case sensitive, so the Account element must be spelled exactly as in the document.

• Alternative 1: descendant::Account[starts-with(child::Name, 'D')]

• Alternative 2: Bank/Account[starts-with(child::Name, 'D')]

Which of the alternative expressions would you use? That depends on your personal taste and on the structure of the XML document. The second alternative appears to be easier than the first one. However, in the case of a highly nested document, the first alternative will offer more compact expressions. Regardless of the syntax used, please be aware that each of the above queries will return a set of nodes. In our ASP.NET code, we will have to extract the desired information from these sets using an XPathNodeIterator.
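To see that the paired alternatives really select the same nodes, the sample queries can be run against a small document. The data below is a hypothetical stand-in for Bank2.xml (we assume Account elements with Name, State, and Balance children, as the queries imply), and CountMatches is our own helper name.

```vb
Imports System
Imports System.IO
Imports System.Xml.XPath

Module XPathQuerySketch
    ' Hypothetical accounts: two from Ohio, one balance above 5000.
    Public Const SampleXml As String =
        "<Bank>" &
        "<Account><Name>Dana Hill</Name><State>OH</State><Balance>6000.00</Balance></Account>" &
        "<Account><Name>Beth Ray</Name><State>MI</State><Balance>2500.00</Balance></Account>" &
        "<Account><Name>Dev Shah</Name><State>OH</State><Balance>5000.00</Balance></Account>" &
        "</Bank>"

    ' Evaluates the expression from the document root and counts the matches.
    Public Function CountMatches(xml As String, expression As String) As Integer
        Dim doc As New XPathDocument(New StringReader(xml))
        Dim nav As XPathNavigator = doc.CreateNavigator()
        Return nav.Select(expression).Count
    End Function

    Sub Main()
        Dim queries As String() = {
            "descendant::Name",
            "Bank/Account/Name",
            "descendant::Account[child::State='OH']",
            "Bank/Account[child::State='OH']",
            "descendant::Account[child::Balance > 5000]",
            "Bank/Account[child::Balance > 5000.00]",
            "descendant::Account[starts-with(child::Name, 'D')]",
            "Bank/Account[starts-with(child::Name, 'D')]"}
        For Each query As String In queries
            Console.WriteLine(query & " -> " & CountMatches(SampleXml, query))
        Next
    End Sub
End Module
```

Because the comparison in the third query is numeric, the account with a balance of exactly 5000.00 is not selected.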

NOTE

We found the http://staff.develop.com/aarons/bits/xpath-builder/ site to be very good in learning XPath queries interactively.

Okay, now that we have traveled through the XPath waters, we are ready to venture into the usages of the XPathDocument. In this context, we will provide two examples. The first example will extract the names of the customers from Ohio and load a list box. The second example will illustrate how to find a specific piece of data from an XPathDocument.

Using XPathDocument and XPathNavigator Objects

In this section we will use the XPathDocument and XPathNavigator objects to load a list box from our Bank2.xml file (as shown in Figure 8.34). We will load a list box with the names of customers who are from Ohio. The output of this application is shown in Figure 8.35. The complete code for this application is shown in Figure 8.36. The code is also available in the XPathDoc1.aspx file in the accompanying CD.

Figure 8.35 Using XPathDocument Object

Figure 8.36 Complete Code XPathDoc1.aspx

We loaded the Bank2.xml as an XPathDocument object as follows:

Dim myDoc As New XPathDocument(Server.MapPath("Bank2.xml"))

At this stage, we need two more objects: an XPathNavigator for retrieving the desired node-set, and an XPathNodeIterator for iterating through the members of the node-set. These are defined as follows:

Dim myNav As XPathNavigator
myNav = myDoc.CreateNavigator()
Dim myIter As XPathNodeIterator
myIter = myNav.Select("Bank/Account[child::State='OH']/Name")

The Bank/Account[child::State='OH']/Name search expression returns the Name nodes from the Account node-set whose state is “OH.” To get the value inside a particular name node, we need to use the Current.Value property of the Iterator object. Thus, the following code loads our list box:

While (myIter.MoveNext())

lstName.Items.Add(myIter.Current.Value)

End While
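For readers who want to run these statements outside a Web form, here is a self-contained console sketch of the same steps. The sample data is hypothetical, a List(Of String) stands in for the list box, and OhioNames is our own helper name.

```vb
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports System.Xml.XPath

Module OhioNamesSketch
    ' Hypothetical stand-in for Bank2.xml.
    Public Const SampleXml As String =
        "<Bank>" &
        "<Account><Name>Dana Hill</Name><State>OH</State><Balance>6000.00</Balance></Account>" &
        "<Account><Name>Beth Ray</Name><State>MI</State><Balance>2500.00</Balance></Account>" &
        "<Account><Name>Dev Shah</Name><State>OH</State><Balance>5000.00</Balance></Account>" &
        "</Bank>"

    ' Selects the Name nodes of the Ohio accounts and collects their values.
    Public Function OhioNames(xml As String) As List(Of String)
        Dim myDoc As New XPathDocument(New StringReader(xml))
        Dim myNav As XPathNavigator = myDoc.CreateNavigator()
        Dim myIter As XPathNodeIterator = myNav.Select("Bank/Account[child::State='OH']/Name")
        Dim names As New List(Of String)()
        While myIter.MoveNext()
            names.Add(myIter.Current.Value)   ' Current is a navigator on the Name node
        End While
        Return names
    End Function

    Sub Main()
        For Each name As String In OhioNames(SampleXml)
            Console.WriteLine(name)
        Next
    End Sub
End Module
```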

Using XPathDocument and XPathNavigator Objects for Document Navigation

This section will illustrate how to search an XPathDocument using a value of an attribute, and using a value of an element. We will use the Bank3.xml to illustrate these. A partial listing of the Bank3.xml is shown in Figure 8.37. The complete code is available in the accompanying CD.

Figure 8.37 Bank3.xml

The Account element of the above XML document contains an attribute named AccountNo, and three other elements. In this example, we will first load two combo boxes, one with the account numbers, and the other with the account holders’ names. The user will select an account number and/or a name. On the click event of the command buttons, we will display the balances in the appropriate text boxes. The output of the application is shown in Figure 8.38. The application has been developed in an .aspx file named XPathDoc2.aspx. Its complete listing is shown in Figure 8.39. The code is also available on the accompanying CD.

Figure 8.38 The Output of XPathDoc2.aspx

Figure 8.39 Complete Code XPathDoc2.aspx

To search for a particular value of an attribute (e.g., of an account number), we have built the following expression by concatenating the selected value into the query string:

"Bank/Account[@AccountNo='" + accNo + "']/Balance"

To search for a particular value of an element (e.g., of an account holder’s name), we have built the following expression:

"descendant::Account[child::Name='" + accName + "']/Balance"

In both cases, the Select method returns an iterator that is positioned before the first matching node. We needed to call the MoveNext method of the Iterator object in order to get to the Balance node, after which its Current.Value property holds the balance.
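The two searches can be sketched in a console program as follows. The sample data is a hypothetical stand-in for Bank3.xml (we assume an AccountNo attribute plus Name, State, and Balance elements), and the two helper names are our own. The sketch concatenates the search value into the expression exactly as described above, so it assumes the value contains no apostrophe.

```vb
Imports System
Imports System.IO
Imports System.Xml.XPath

Module BalanceLookupSketch
    ' Hypothetical stand-in for Bank3.xml.
    Public Const SampleXml As String =
        "<Bank>" &
        "<Account AccountNo=""A1""><Name>Dana Hill</Name><State>OH</State><Balance>6000.00</Balance></Account>" &
        "<Account AccountNo=""A2""><Name>Beth Ray</Name><State>MI</State><Balance>2500.00</Balance></Account>" &
        "</Bank>"

    ' Runs the query and returns the first match, or Nothing if none exists.
    Private Function FirstValue(xml As String, expression As String) As String
        Dim nav As XPathNavigator = New XPathDocument(New StringReader(xml)).CreateNavigator()
        Dim iter As XPathNodeIterator = nav.Select(expression)
        If iter.MoveNext() Then          ' move onto the first Balance node
            Return iter.Current.Value
        End If
        Return Nothing
    End Function

    ' Search by attribute value.
    Public Function BalanceByAccountNo(xml As String, accNo As String) As String
        Return FirstValue(xml, "Bank/Account[@AccountNo='" & accNo & "']/Balance")
    End Function

    ' Search by element value.
    Public Function BalanceByName(xml As String, accName As String) As String
        Return FirstValue(xml, "descendant::Account[child::Name='" & accName & "']/Balance")
    End Function

    Sub Main()
        Console.WriteLine(BalanceByAccountNo(SampleXml, "A2"))
        Console.WriteLine(BalanceByName(SampleXml, "Dana Hill"))
    End Sub
End Module
```

Checking the return value of MoveNext, rather than calling it unconditionally, keeps the lookup from failing when the selected value has no match.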

Transforming an XML Document Using XSLT

Extensible Stylesheet Language Transformations (XSLT) is the transformation component of the XSL specification by W3C (www.w3.org/Style/XSL). It is essentially a template-based declarative language, which can be used to transform an XML document to another XML document or to documents of other types (e.g., HTML and Text). We can develop and apply various XSLT templates to select, filter, and process various parts of an XML document. In .NET, we can use the Transform() method of the XslTransform class to transform an XML document.

Internet Explorer (5.5 and above) has a built-in XSL transformer that automatically transforms an XML document to an HTML document. When we open an XML document in IE, it displays the data using a collapsible list view. However, Internet Explorer cannot be used to transform an XML document to another XML document. Now, why would we need to transform an XML document to another XML document? Well, suppose that we have a very large document that contains our entire catalog’s data. We want to create another XML document from it, which will contain only the product IDs and product names of those products that belong to the “Fishing” category. We would also like to sort the elements in ascending order of the unit price. Further, we may want to add a new element in each product, such as “Expensive” or “Cheap” depending on the price of the product. To solve this particular problem, we may either develop the relevant code in a programming language like C#, or we may use XSLT to accomplish the job. XSLT is a much more convenient way to develop the application, because XSLT has been developed exclusively for this kind of scenario.

Before we can transform a document, we need to provide the Transformer with the instructions for the desired transformation of the source XML document. These instructions can be coded in XSL. We have illustrated this process in Figure 8.40.

Figure 8.40 XSL Transformation Process

In this section, we will demonstrate certain selected features of XSLT through some examples. The first example will apply XSLT to transform an XML document to an HTML document. We know that IE can automatically transform an XML document to an HTML document and can display it on the screen in collapsible list view. However, in this particular example, we do not want to display all of our data in that fashion. We want to display the filtered data in tabular fashion. Thus, we will transform the XML document to an HTML document of our choice (and not of IE’s choice). The transformation process will select and filter some XML data to form an HTML table. The second example will transform an XML document to another XML document and subsequently write the resulting document to a disk file, as well as display it in the browser.

Transforming an XML Document to an HTML Document

In this example, we will apply XSLT to extract the account’s information for Ohio customers from the Bank3.xml (as shown in Figure 8.37) document. The extracted data will be finally displayed in an HTML table. The output of the application is shown in Figure 8.41.

Figure 8.41 Transforming an XML Document to an HTML Document

If we need to use XSLT, we must first develop the XSLT style sheet (i.e., the XSLT instructions). We have saved our style sheet in a file named XSLT1.xsl. In this style sheet, we have defined a template as <xsl:template match="/"> … </xsl:template>. The match="/" will result in the selection of the root node of the XML document. Inside the body of this template, we have first included the necessary HTML elements for the desired output.

The <xsl:for-each select="Bank/Account[State='OH']"> tag is used to select all Account nodes for those customers who are from “OH.” The value of a node can be shown using <xsl:value-of select="attribute or element name"/>. In case of an attribute, its name must be prefixed with an @ symbol. For example, we are displaying the value of the State node as <xsl:value-of select="State"/>. The complete listing of the XSLT1.xsl file is shown in Figure 8.42. The code is also available on the accompanying CD.

Figure 8.42 Complete Code for XSLT1.xsl

In the .aspx file, we have included the following asp:xml control:

<asp:xml id="ourXSLTransform" runat="server" DocumentSource="Bank3.xml" TransformSource="XSLT1.xsl"/>

While defining this control, we have set its DocumentSource attribute to “Bank3.xml”, and its TransformSource attribute to XSLT1.xsl. The complete code for the .aspx file, named XSLT1.aspx, is shown in Figure 8.43. It is also available in the accompanying CD.

Figure 8.43 XSLT1.aspx
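The style sheet logic can also be exercised from a console program. The sketch below is not the XSLT1.xsl listing; it is a minimal style sheet written along the lines described above, applied to hypothetical Bank3-style data. Note the substitution: it uses the XslCompiledTransform class, the newer replacement for XslTransform in later versions of the .NET Framework, instead of the asp:xml control. TransformToHtml is our own helper name.

```vb
Imports System
Imports System.IO
Imports System.Xml
Imports System.Xml.Xsl

Module XsltToHtmlSketch
    ' Hypothetical stand-in for Bank3.xml.
    Public Const SampleXml As String =
        "<Bank>" &
        "<Account AccountNo=""A1""><Name>Dana Hill</Name><State>OH</State><Balance>6000.00</Balance></Account>" &
        "<Account AccountNo=""A2""><Name>Beth Ray</Name><State>MI</State><Balance>2500.00</Balance></Account>" &
        "</Bank>"

    ' One template matching the root; for-each filters the Ohio accounts.
    Public Const Stylesheet As String =
        "<xsl:stylesheet version=""1.0"" xmlns:xsl=""http://www.w3.org/1999/XSL/Transform"">" &
        "<xsl:output method=""html""/>" &
        "<xsl:template match=""/"">" &
        "<html><body><table border=""1"">" &
        "<tr><th>Account</th><th>Name</th><th>State</th></tr>" &
        "<xsl:for-each select=""Bank/Account[State='OH']"">" &
        "<tr><td><xsl:value-of select=""@AccountNo""/></td>" &
        "<td><xsl:value-of select=""Name""/></td>" &
        "<td><xsl:value-of select=""State""/></td></tr>" &
        "</xsl:for-each>" &
        "</table></body></html>" &
        "</xsl:template>" &
        "</xsl:stylesheet>"

    ' Applies the style sheet to the XML text and returns the HTML produced.
    Public Function TransformToHtml(xml As String, xslt As String) As String
        Dim transformer As New XslCompiledTransform()
        transformer.Load(XmlReader.Create(New StringReader(xslt)))
        Dim result As New StringWriter()
        transformer.Transform(XmlReader.Create(New StringReader(xml)), Nothing, result)
        Return result.ToString()
    End Function

    Sub Main()
        Console.WriteLine(TransformToHtml(SampleXml, Stylesheet))
    End Sub
End Module
```

The attribute is selected with the @ prefix (@AccountNo), while the child elements are selected by plain name.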

Transforming an XML Document into Another XML Document

Suppose that our company has received an order from a customer in XML format. The XML file, named OrderA.xml, is shown in Figure 8.44. The file is also available in the accompanying CD.

Figure 8.44 An Order Received from a Customer in XML Format (OrderA.xml)

Now we want to transmit a purchase order to our supplier to fulfill the previous order. Suppose that the XML format of our purchase order is different from that of our client as shown in Figure 8.45. The OrderB.xml file is also available in the accompanying CD.

Figure 8.45 The Purchase Order to Be Sent to the Supplier in XML Format (OrderB.xml)

The objective of this example is to automatically transform OrderA.xml (Figure 8.44) to OrderB.xml (Figure 8.45). The outputs of this application are shown in Figures 8.46 and 8.47.

Figure 8.46 Transformation of an XML Document to Another XML Document

Figure 8.47 The Target XML File as Displayed in Internet Explorer

We have developed an XSLT file to achieve the necessary transformation. In the XSLT code, we have used multiple templates. The complete listing of the XSLT code is shown in Figure 8.48. The code is also available in the order.xsl file on the accompanying CD.

Figure 8.48 Complete Listing of order.xsl

Subsequently, we have developed the XSLT2.aspx file to employ the XSLT code in the order.xsl file to transform OrderA.xml to OrderB.xml. The complete listing of the .aspx file is shown in Figure 8.49. This code is also available on the accompanying CD. The transformation is performed in the ShowTransformed() subprocedure of our .aspx file. In this code, the Transform method of an XslTransform object is used to transform and generate the target XML file.

Figure 8.49 Complete Listing for XSLT2.aspx
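The idea of using several templates to reshape one XML format into another can be sketched as follows. Everything in this example is hypothetical: the two order formats are invented for illustration and are not the OrderA.xml and OrderB.xml layouts, the style sheet is not order.xsl, and the sketch uses XslCompiledTransform (the later replacement for XslTransform). TransformOrder is our own helper name.

```vb
Imports System
Imports System.IO
Imports System.Xml
Imports System.Xml.Xsl

Module XmlToXmlSketch
    ' Invented source format: an Order with Item elements.
    Public Const SourceXml As String =
        "<Order>" &
        "<Item Code=""P1""><Quantity>2</Quantity></Item>" &
        "<Item Code=""P7""><Quantity>5</Quantity></Item>" &
        "</Order>"

    ' Two templates: one for the Order, one applied to each Item.
    Public Const Stylesheet As String =
        "<xsl:stylesheet version=""1.0"" xmlns:xsl=""http://www.w3.org/1999/XSL/Transform"">" &
        "<xsl:output method=""xml"" indent=""yes""/>" &
        "<xsl:template match=""/Order"">" &
        "<PurchaseOrder><xsl:apply-templates select=""Item""/></PurchaseOrder>" &
        "</xsl:template>" &
        "<xsl:template match=""Item"">" &
        "<Line sku=""{@Code}""><Qty><xsl:value-of select=""Quantity""/></Qty></Line>" &
        "</xsl:template>" &
        "</xsl:stylesheet>"

    ' Applies the style sheet and returns the target XML as text.
    Public Function TransformOrder(xml As String, xslt As String) As String
        Dim transformer As New XslCompiledTransform()
        transformer.Load(XmlReader.Create(New StringReader(xslt)))
        Dim result As New StringWriter()
        transformer.Transform(XmlReader.Create(New StringReader(xml)), Nothing, result)
        Return result.ToString()
    End Function

    Sub Main()
        Console.WriteLine(TransformOrder(SourceXml, Stylesheet))
    End Sub
End Module
```

To write the target document to disk, as the application in this section does, the StringWriter could be replaced with a StreamWriter opened on the output file.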

Working with XML and Databases

Databases are used to store and manage an organization’s data. However, it is not a simple task to transfer data from the database to a remote client or to a business partner, especially when we do not clearly know how the client will use the sent data. Well, we may send the required data using XML documents. That way, the data container is independent of the client’s platform. The databases and other related data stores are here to stay, and XML will not replace these data stores. However, XML will undoubtedly provide a common medium for exchanging data among sources and destinations. It will also allow various software systems to exchange data among themselves. In this context, XML forms a bridge between ADO.NET and other applications. Since XML is integrated in the .NET Framework, data transfer using XML is a lot easier than it is in other software development environments. The ADO.NET Framework is essentially based on DataSets, which, in turn, rely heavily on the XML architecture. The DataSet class has a rich collection of methods that are related to processing XML. Some of the widely used ones are ReadXml, WriteXml, GetXml, GetXmlSchema, InferXmlSchema, ReadXmlSchema, and WriteXmlSchema.

In this context, we will provide two simple examples. In the first example, we will create a DataSet from a SQL query, and write its contents as an XML document. In the second example, we will read back the XML document generated in the first example and load a DataSet. What are the prospective uses of these examples? Well, suppose that we need to send the data of our fishing products to a client. In earlier days, we would have sent the data as a text file. But in the .NET environment, we can instead develop an XML document very fast by running a query, and subsequently send the XML document to our client. What is the advantage? It is fast, easy, self-describing, and technology independent. The client may use any technology (like VB, Java, Oracle, etc.) to parse the XML document and subsequently develop applications. On the other hand, if we receive an XML document from our partners, we may as well apply XML.NET to develop our own applications.

Creating an XML Document from a Database Query

In this section, we will populate a DataSet with the results of a query to the Products table of the SQL Server 7.0 Northwind database. On the click event of a command button, we will write the XML file and its schema. The output of the example is shown in Figure 8.50. We have developed the application in an .aspx file named DataSet1.aspx. The complete listing of the .aspx file is shown in Figure 8.51. The file is also available on the accompanying CD.

Figure 8.50 Output of DataSet1.aspx Application

Figure 8.51 Complete Listing DataSet1.aspx

The XML file created by the application is as follows:

The code for the illustration is straightforward. The DataSet’s WriteXml and WriteXmlSchema methods were used to accomplish the desired task.
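The two method calls can be tried without a database connection. In the sketch below, we fill a DataSet by hand with a couple of stand-in rows instead of querying Northwind; the DataSet name, table name, and helper names are our own assumptions. The output goes to strings so that the sketch runs anywhere; both methods also accept a file name.

```vb
Imports System
Imports System.Data
Imports System.IO

Module DataSetWriteSketch
    ' Builds a small DataSet by hand in place of a database query.
    Public Function BuildProducts() As DataSet
        Dim ds As New DataSet("Catalog")
        Dim table As DataTable = ds.Tables.Add("Product")
        table.Columns.Add("ProductID", GetType(Integer))
        table.Columns.Add("ProductName", GetType(String))
        table.Columns.Add("UnitPrice", GetType(Decimal))
        table.Rows.Add(1, "Fly Rod", 129.95D)
        table.Rows.Add(2, "Spinning Reel", 47.76D)
        Return ds
    End Function

    ' Serializes the rows as an XML document.
    Public Function ToXml(ds As DataSet) As String
        Dim writer As New StringWriter()
        ds.WriteXml(writer)
        Return writer.ToString()
    End Function

    ' Serializes the structure (tables, columns, types) as an XSD schema.
    Public Function ToSchema(ds As DataSet) As String
        Dim writer As New StringWriter()
        ds.WriteXmlSchema(writer)
        Return writer.ToString()
    End Function

    Sub Main()
        Dim ds As DataSet = BuildProducts()
        Console.WriteLine(ToXml(ds))
        Console.WriteLine(ToSchema(ds))
    End Sub
End Module
```

The DataSet name becomes the root element of the XML document, and each row becomes an element named after its table.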

Reading an XML Document into a DataSet

Here, we will read back the XML file created in the previous example (as shown in Figure 8.50) and populate a DataSet in the Page_Load event of our .aspx file. We will use the ReadXml method of the DataSet object to accomplish this objective. The output of the application is shown in Figure 8.52. The application has been developed in an .aspx file named DataSet2.aspx. The complete code for this application is shown in Figure 8.53. The code is also available in the accompanying CD. The code is self-explanatory.

Figure 8.52 Output of DataSet2.aspx Application

Figure 8.53 Complete Listing of DataSet2.aspx
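Reading the document back takes a single call. The sketch below loads hypothetical XML text (standing in for the file written in the previous example) and inspects the resulting DataSet; LoadProducts is our own helper name. Since no schema is supplied, we rely on the DataSet to infer the table structure.

```vb
Imports System
Imports System.Data
Imports System.IO

Module DataSetReadSketch
    ' Hypothetical XML, shaped like the output of DataSet.WriteXml.
    Public Const SampleXml As String =
        "<Catalog>" &
        "<Product><ProductID>1</ProductID><ProductName>Fly Rod</ProductName><UnitPrice>129.95</UnitPrice></Product>" &
        "<Product><ProductID>2</ProductID><ProductName>Spinning Reel</ProductName><UnitPrice>47.76</UnitPrice></Product>" &
        "</Catalog>"

    ' Loads the XML text into a new DataSet, inferring the schema.
    Public Function LoadProducts(xml As String) As DataSet
        Dim ds As New DataSet()
        ds.ReadXml(New StringReader(xml))
        Return ds
    End Function

    Sub Main()
        Dim ds As DataSet = LoadProducts(SampleXml)
        For Each row As DataRow In ds.Tables("Product").Rows
            Console.WriteLine(row("ProductName").ToString() & " " & row("UnitPrice").ToString())
        Next
    End Sub
End Module
```

When the schema is inferred rather than loaded, every column is typed as a string; loading the schema first with ReadXmlSchema preserves the original data types.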

Summary

In this chapter, we have introduced the basic concepts of XML, and we have provided a concise overview of the .NET classes available to read, store, and manipulate XML documents. The examples presented in this chapter also serve as good models for developing business applications using XML and ASP.NET.

The .NET Framework’s System.Xml namespace contains probably the richest collection of XML-related classes available thus far on any software development platform. The System.Xml namespace has been further enriched by the recent addition of the XPathDocument and XPathNavigator classes. We have tried to highlight these new features in our examples. Since XML can be enhanced using a family of technologies, there are innumerable techniques a reader should judiciously learn from other sources to design, develop, and implement complex real-world applications.

Solutions Fast Track

An Overview of XML

• XML stands for eXtensible Markup Language. It is a subset of a larger framework named SGML. SGML is an ISO standard (ISO 8879); the W3C developed the XML specification.

• XML provides a universal way for exchanging information between organizations.

• XML cannot be singled out as a stand-alone technology. It is actually a framework for exchanging data. It is supported by a growing family of technologies such as XML parsers, XSLT transformers, XPath, XLink, and Schema Generators.

• An XML document may contain a declaration, comments, elements, and attributes.

• An XML element has a start-tag and an end-tag. An element may contain other elements, or character data, or both.

• An attribute provides an additional way to attach a piece of data to an element. An attribute must always be placed within the start-tag of an element, and its value must be enclosed in quotes (single or double).

• An XML document is said to be well formed when it satisfies a set of syntax-related rules. These rules include the following:

    • The document must have exactly one root element.

    • Each element must have a start-tag and an end-tag (or be written as an empty-element tag).

    • The elements must be properly nested.

• An XML document is case sensitive.

• DTD and schema are essentially two different ways to specify the rules about the contents of an XML document.

• An XML schema contains the structure of an XML document, its elements, the data types of the elements and associated attributes, including the parent-child relationships among the elements.

• VS.NET supports the W3C specification for XML Schema Definition (also known as XSD).

• XML documents store data in a hierarchical fashion, also known as a node tree.

• The top-most node in the node tree is referred to as the root.

• A particular node in a node tree can be of element type or of text type. An element-type node contains other element-type nodes or text-type nodes. A text-type node contains only data.
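The well-formedness rules listed above can be checked by simply handing a document to a parser. The sketch below is one way to do it, assuming that a failed load of XmlDocument is an acceptable test; the helper name IsWellFormed and the sample strings are hypothetical.

```vb
Imports System
Imports System.Xml

Module WellFormedSketch
    ' Hypothetical helper: True when the parser accepts the text as well-formed XML.
    Function IsWellFormed(ByVal xml As String) As Boolean
        Try
            Dim doc As New XmlDocument()
            doc.LoadXml(xml)
            Return True
        Catch ex As XmlException
            ' The parser raises XmlException for any syntax-rule violation.
            Return False
        End Try
    End Function

    Sub Main()
        ' One root, matching tags, proper nesting, quoted attribute value.
        Console.WriteLine(IsWellFormed("<Bank><Account id='1'>A</Account></Bank>"))
        ' Improperly nested elements.
        Console.WriteLine(IsWellFormed("<Bank><Account>A</Bank></Account>"))
        ' Start-tag and end-tag differ in case.
        Console.WriteLine(IsWellFormed("<Bank></bank>"))
        ' Two root elements.
        Console.WriteLine(IsWellFormed("<Bank/><Bank/>"))
    End Sub
End Module
```

Only the first call reports True; each of the other three strings breaks one of the rules in the list.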

Processing XML Documents Using .NET

• The System.Xml namespace contains the XmlTextReader, XmlValidatingReader, and XmlNodeReader classes for reading XML documents. The XmlTextWriter class enables you to write data as XML documents.

• The XmlDocument, XmlDataDocument, and XPathDocument classes can be used to structure XML data in memory and to process it.

• The XPathNavigator and XPathNodeIterator classes enable you to query and retrieve selected data using XPath expressions.

Reading and Parsing Using the XmlTextReader Class

• The XmlTextReader class provides a fast forward-only cursor to pull data from an XML document.

• Some of the frequently used properties of the XmlTextReader class include AttributeCount, Depth, EOF, HasAttributes, HasValue, IsDefault, IsEmptyElement, Item, ReadState, and Value.

• The Read() method of an XmlTextReader object enables you to read data sequentially. The MoveToAttribute() method can be used to iterate through the attribute collection of an element.
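The forward-only reading pattern summarized above might look like the following console sketch. The XML string is made-up sample data; the calls themselves (Read, NodeType, AttributeCount, MoveToAttribute, MoveToElement) are members of XmlTextReader.

```vb
Imports System
Imports System.IO
Imports System.Xml

Module ReaderSketch
    Sub Main()
        ' Assumption: sample data typed in for illustration.
        Dim xml As String = "<Bank><Account id='A1' state='OH'>Smith</Account></Bank>"
        Dim reader As New XmlTextReader(New StringReader(xml))
        Try
            ' Read() advances the cursor one node at a time until the input ends.
            While reader.Read()
                Select Case reader.NodeType
                    Case XmlNodeType.Element
                        Console.WriteLine("Element {0} (depth {1})", reader.Name, reader.Depth)
                        ' Visit each attribute of the current element by index.
                        For i As Integer = 0 To reader.AttributeCount - 1
                            reader.MoveToAttribute(i)
                            Console.WriteLine("  @{0} = {1}", reader.Name, reader.Value)
                        Next
                        ' Return the cursor to the element that owns the attributes.
                        reader.MoveToElement()
                    Case XmlNodeType.Text
                        Console.WriteLine("Text: {0}", reader.Value)
                End Select
            End While
        Finally
            reader.Close()
        End Try
    End Sub
End Module
```

Because the cursor only moves forward, anything that must be revisited later has to be stored by the application as it is read.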

Writing an XML Document Using the XmlTextWriter Class

• The XmlTextWriter class can be used to write data sequentially to an output stream, or to a disk file as an XML document.

• Its major methods and properties include Close, Flush, Formatting, WriteAttributes, WriteAttributeString, WriteComment, WriteElementString, WriteEndAttribute, WriteEndDocument, WriteState, and WriteStartDocument.

• Its constructor accepts a parameter that specifies the encoding of the XML document. If this parameter is set to Nothing, the document is written as UTF-8.
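A small sketch of sequential writing with XmlTextWriter follows. The file name, element names, and values are assumptions chosen for illustration; passing Nothing as the encoding argument relies on the UTF-8 default mentioned above.

```vb
Imports System
Imports System.IO
Imports System.Xml

Module WriterSketch
    Sub Main()
        ' Hypothetical output file in the temporary folder.
        Dim filePath As String = Path.Combine(Path.GetTempPath(), "BankSketch.xml")

        ' Nothing for the encoding argument means the file is written as UTF-8.
        Dim writer As New XmlTextWriter(filePath, Nothing)
        writer.Formatting = Formatting.Indented

        writer.WriteStartDocument()
        writer.WriteComment("Sample data for illustration only")
        writer.WriteStartElement("Bank")
        writer.WriteStartElement("Account")
        writer.WriteAttributeString("id", "A1")
        writer.WriteElementString("Name", "Smith")
        writer.WriteElementString("State", "OH")
        writer.WriteEndElement()     ' closes Account
        writer.WriteEndElement()     ' closes Bank
        writer.WriteEndDocument()
        writer.Flush()
        writer.Close()

        ' Show what was written.
        Console.WriteLine(File.ReadAllText(filePath))
    End Sub
End Module
```

The writer tracks open elements itself, so each WriteStartElement only needs a matching WriteEndElement and the end-tags are generated in the right order.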

Exploring the XML Document Object Model

• The W3C Document Object Model (DOM) is a set of specifications for representing an XML document in the computer’s memory.

• The XmlDocument class implements the W3C DOM Level 1 Core and DOM Level 2 Core specifications.

• An XmlDocument object also allows navigating through the XML node tree using XPath expressions.

• XmlDataDocument is an extension of the XmlDocument class.

• It can be used to generate both the XML view and the relational view of the same XML data.

• XmlDataDocument contains a DataSet property that exposes its data as relational table(s).
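The two views offered by XmlDataDocument can be seen side by side in a short sketch. The Bank/Account data below is invented for illustration, and building the document from an existing DataSet is only one of several ways to populate it.

```vb
Imports System
Imports System.Data
Imports System.Xml

Module DataDocumentSketch
    Sub Main()
        ' Assumption: a hand-built DataSet named Bank with one Account table.
        Dim ds As New DataSet("Bank")
        Dim accounts As DataTable = ds.Tables.Add("Account")
        accounts.Columns.Add("Name", GetType(String))
        accounts.Columns.Add("State", GetType(String))
        accounts.Rows.Add(New Object() {"Smith", "OH"})
        accounts.Rows.Add(New Object() {"Jones", "NY"})

        ' The XmlDataDocument is synchronized with the DataSet it wraps.
        Dim doc As New XmlDataDocument(ds)

        ' XML view: the rows appear as Account elements under a Bank root.
        For Each node As XmlNode In doc.SelectNodes("/Bank/Account[State='OH']")
            Console.WriteLine("XML view: {0}", node.SelectSingleNode("Name").InnerText)
        Next

        ' Relational view: the same data through the DataSet property.
        For Each row As DataRow In doc.DataSet.Tables("Account").Rows
            Console.WriteLine("Relational view: {0}, {1}", row("Name"), row("State"))
        Next
    End Sub
End Module
```

Which view to use is largely a matter of convenience: XPath suits hierarchical searches, while the DataSet suits data binding and row-oriented processing.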

Querying XML Data Using XPathDocument and XPathNavigator

• The XPathDocument class provides a fast, read-only, in-memory store for XML data that is optimized for XPath queries and XSLT, avoiding the overhead of a full editable DOM tree.

• An XPathNavigator object can be used in conjunction with XPathDocument for effective navigation through XML data.

• XPath expressions are used in these classes for selecting nodes and iterating over the selected nodes. Because XPathNavigator is read-only, changes such as copying, moving, or removing nodes must be made through an editable class such as XmlDocument.
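A minimal sketch of the XPathDocument, XPathNavigator, and XPathNodeIterator combination is shown here. The XML string and the XPath expression are assumptions made up for the example.

```vb
Imports System
Imports System.IO
Imports System.Xml.XPath

Module NavigatorSketch
    Sub Main()
        ' Assumption: sample data typed in for illustration.
        Dim xml As String = _
            "<Bank>" & _
            "<Account><Name>Smith</Name><State>OH</State></Account>" & _
            "<Account><Name>Jones</Name><State>NY</State></Account>" & _
            "</Bank>"

        ' Load the data into the read-only XPath store.
        Dim doc As New XPathDocument(New StringReader(xml))
        Dim nav As XPathNavigator = doc.CreateNavigator()

        ' Select() returns an iterator over the nodes that match the expression.
        Dim matches As XPathNodeIterator = nav.Select("/Bank/Account[State='OH']/Name")

        While matches.MoveNext()
            ' Current is a navigator positioned on the matching node.
            Console.WriteLine(matches.Current.Value)
        End While
    End Sub
End Module
```

Several iterators can be created from the same navigator, so one loaded document can serve many independent queries.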

Transforming an XML Document Using XSLT

• You can use XSLT (Extensible Stylesheet Language Transformations) to transform an XML document to another XML document or to documents of other types (e.g., HTML and text).

• XSLT is a template-based declarative language. We can develop and apply various XSLT templates to select, filter, and process various parts of an XML document.

• In .NET, you can use the Transform() method of the XslTransform class to transform an XML document.
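The following sketch applies a small stylesheet to an XML string and prints the HTML result. Note one deliberate substitution: it uses XslCompiledTransform, the successor class in later versions of the framework, rather than the XslTransform class named in this chapter, because the older class is marked obsolete there. The stylesheet and data are invented for illustration.

```vb
Imports System
Imports System.IO
Imports System.Xml
Imports System.Xml.XPath
Imports System.Xml.Xsl

Module TransformSketch
    Sub Main()
        ' Assumption: sample data and stylesheet typed in for illustration.
        Dim xml As String = _
            "<Bank><Account><Name>Smith</Name><State>OH</State></Account></Bank>"
        Dim xslt As String = _
            "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>" & _
            "<xsl:output method='html'/>" & _
            "<xsl:template match='/'>" & _
            "<ul><xsl:for-each select='Bank/Account'>" & _
            "<li><xsl:value-of select='Name'/></li>" & _
            "</xsl:for-each></ul>" & _
            "</xsl:template>" & _
            "</xsl:stylesheet>"

        ' XslCompiledTransform is used here in place of the chapter's XslTransform.
        Dim transformer As New XslCompiledTransform()
        transformer.Load(New XmlTextReader(New StringReader(xslt)))

        ' Transform the source document and capture the output as text.
        Dim source As New XPathDocument(New StringReader(xml))
        Dim output As New StringWriter()
        transformer.Transform(source, Nothing, output)

        Console.WriteLine(output.ToString())
    End Sub
End Module
```

The template matches the document root and emits one list item per Account element, which is the select-and-process style that the bullets above describe.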

Working with XML and Databases

• A DataSet’s ReadXml() method can read XML data as DataTable(s).

• You can create an XML document and its schema from a database query using the DataSet’s WriteXml() and WriteXmlSchema() methods.

• Some of the widely used XML-related methods of the DataSet include ReadXml, WriteXml, GetXml, GetXmlSchema, InferXmlSchema, ReadXmlSchema, and WriteXmlSchema.

Frequently Asked Questions

The following Frequently Asked Questions, answered by the authors of this book, are designed to both measure your understanding of the concepts presented in this chapter and to assist you with real-life implementation of these concepts. To have your questions about this chapter answered by the author, browse to www.syngress.com/solutions and click on the “Ask the Author” form.

Q: What is the difference between DOM Core 1 API and Core 2 API?

A: DOM Level 2 became an official World Wide Web Consortium (W3C) recommendation in November 2000. Although the two specifications do not differ greatly, one of the major additions was support for XML namespaces, which was unavailable in the prior version. Because DOM Level 1 did not support namespaces, it was the responsibility of the application programmer to determine the significance of special prefixed tag names. DOM Level 2 supports namespaces by providing new namespace-aware versions of Level 1 methods.

Q: What are the major features of System.XML in the Beta 2 edition?

A: The most significant change in the Beta 2 edition was the restructuring of the XmlNavigator class. XmlNavigator was initially designed as an alternative to the general implementation of DOM. Because Microsoft felt that there was a mismatch between the XPath data model and the DOM-based data model, XmlNavigator was redesigned as XPathNavigator, employing a read-only mechanism. It is designed to be used with XPathNodeIterator, which acts as an iterator over a node set and can be created many times per XPathNavigator.

    Alternatively, you can work with the DOM implementation through XmlNode, whose SelectNodes() and SelectSingleNode() methods select nodes using XPath expressions; the list returned by SelectNodes() can then be iterated. A typical code fragment would look like this:

    ' Doc is assumed to be an XmlDocument that has already been loaded.
    Dim nodeList As XmlNodeList
    Dim root As XmlElement = Doc.DocumentElement
    nodeList = root.SelectNodes("descendant::account[child::State='OH']")
    Dim entry As XmlNode
    For Each entry In nodeList
        ' Do the requisite operations
    Next

    Although XPathNavigator is a read-only mechanism for working with XML documents, other classes, such as XmlTextWriter, can be used over an XPathNavigator to write out the document.

Q: How is XPath different from XSL Patterns?

A: XSL Patterns is the predecessor of XPath 1.0, which has been recognized as a universal specification. Although the two are similar in syntax, there are some differences between them. The XSL Patterns language does not support the notion of axis types, whereas XPath does. Axis types are general syntax used in XPath, such as descendant, parent, child, and so on. Assume that we have an XML document with the root node named Bank. Further, assume that the Bank element contains many Account elements, which in turn contain account number, name, balance, and state elements. Now, suppose that our objective is to retrieve the Account data for those customers who are from Ohio. We can accomplish the search by using any one of the following alternatives:

• XSL Pattern Alternative: Bank/Account[State='OH']

• XPath 1.0 Alternative: descendant::Account[child::State='OH']

    Which of the above alternatives would you use? That depends on your personal taste and on the structure of the XML document. In the case of a very highly nested XML document, XPath offers a more compact search string.
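    To compare an explicit path with an axis-based search, the sketch below runs both forms against a small Bank document. The data is invented, and note that SelectNodes() evaluates both strings as XPath 1.0; the explicit path is simply written without axis names, which XPath also accepts as abbreviated syntax.

```vb
Imports System
Imports System.Xml

Module XPathAlternativesSketch
    Sub Main()
        ' Assumption: sample data typed in for illustration.
        Dim doc As New XmlDocument()
        doc.LoadXml( _
            "<Bank>" & _
            "<Account><Name>Smith</Name><State>OH</State></Account>" & _
            "<Account><Name>Jones</Name><State>NY</State></Account>" & _
            "</Bank>")

        ' Explicit path, evaluated from the document node.
        Dim byPath As XmlNodeList = doc.SelectNodes("Bank/Account[State='OH']")

        ' Descendant axis, evaluated from the root element.
        Dim byAxis As XmlNodeList = _
            doc.DocumentElement.SelectNodes("descendant::Account[child::State='OH']")

        ' Both searches select the same single Account element.
        Console.WriteLine("{0} {1}", byPath.Count, byAxis.Count)
    End Sub
End Module
```

    With only one level of nesting the two strings are about equally long; the descendant axis starts to pay off when the Account elements sit several levels below the root.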