XML Data Sources

For the most part, the information content of dynamic Web pages comes from databases, because databases are the main repositories of information about enterprises and individuals. Yet other popular external sources of information often make their way onto Web pages. These sources include XML files, text files, and spreadsheet files. This tutorial describes using XML files as sources of information content; follow-up tutorials describe the other data sources.

XML (eXtensible Markup Language) is a popular "data structuring" language that uses tags to organize information into hierarchical data elements. Like XHTML, it is a markup language whose tags describe the structure of data. Unlike XHTML, though, it does not have predefined tags with special meanings. XML tags are created by the data designer to fit the purpose and organization of the data structure. Different sets of tags would be used, for instance, to describe a memo, a letter, an email message, a financial statement, a legal document, a personnel record, or a thousand other data collections.

XML is not limited to describing linear lines of text as is common to text files; neither is it limited to describing relational information organized into the rows and columns of database tables. Virtually any structure of data can be represented by XML to make it accessible for computer processing.

Although XML is not a replacement for database methods of information storage, it is often used to repackage database information for sharing between computers. Database management systems include conversion routines to output tables of database information as XML documents that can be transmitted across the Web. These Web-accessible documents make it possible to share information without compromising server security, which is at risk if outsiders are given permission to access internal databases. Instead, internal data is repackaged as external XML files, and only those files are shared.

XML Data Structures

A database table is easily transformed into an XML data structure. Consider the Books table of the BooksDB.mdb database. To make this information available to outside clients without giving them access to the internal database, it can be repackaged as an XML document. As shown in the partial listing in Figure 3-25, the database table name <books> is used as the enclosing tag for the entire structure. Each record is identified by a <book> tag. Database field names are applied as the XML tag names <bookid>, <booktype>, <booktitle>, <bookauthor>, <bookprice>, and <bookqty> to identify the data elements corresponding to the database fields. (The <bookdescription> and <booksale> fields are not coded in this example in order to reduce the file size.) These tag names are purely arbitrary; however, they should be informative of the data they contain. Although this is not a tutorial on XML syntax, you should be able to see the relationships between the database table, row, and column structure and the XML data structure.

<?xml version="1.0" ?>
<books>
  <book>
    <bookid>DB111</bookid>
    <booktype>Database</booktype>
    <booktitle>Oracle Database</booktitle>
    <bookauthor>K. Loney</bookauthor>
    <bookprice>69.99</bookprice>
    <bookqty>10</bookqty>
  </book>
  <book>
    <bookid>DB222</bookid>
    <booktype>Database</booktype>
    <booktitle>Databases in Depth</booktitle>
    <bookauthor>C. J. Date</bookauthor>
    <bookprice>29.95</bookprice>
    <bookqty>6</bookqty>
  </book>
  <book>
    <bookid>DB333</bookid>
    <booktype>Database</booktype>
    <booktitle>Database Processing</booktitle>
    <bookauthor>D. Kroenke</bookauthor>
    <bookprice>136.65</bookprice>
    <bookqty>12</bookqty>
  </book>
  <book>
    <bookid>DB444</bookid>
    <booktype>Database</booktype>
    <booktitle>Access Database Design</booktitle>
    <bookauthor>S. Roman</bookauthor>
    <bookprice>34.95</bookprice>
    <bookqty>25</bookqty>
  </book>
  <book>
    <bookid>DB555</bookid>
    <booktype>Database</booktype>
    <booktitle>Microsoft SQL Server</booktitle>
    <bookauthor>P. Debetta</bookauthor>
    <bookprice>29.99</bookprice>
    <bookqty>0</bookqty>
  </book>

  ...

</books>
Figure 3-25. XML data structure equivalent to a database table.
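
The table-to-XML mapping described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the conversion routine of any particular database system: an in-memory SQLite table stands in for the Books table of BooksDB.mdb, and the helper name table_to_xml is hypothetical. The table name becomes the enclosing tag, each row becomes a record element, and each field name becomes a child tag.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Assumption: an in-memory SQLite table stands in for the Books table
# of BooksDB.mdb; only two of the records from Figure 3-25 are loaded.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Books (BookID TEXT, BookType TEXT, BookTitle TEXT, "
    "BookAuthor TEXT, BookPrice REAL, BookQty INTEGER)"
)
conn.executemany(
    "INSERT INTO Books VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("DB111", "Database", "Oracle Database", "K. Loney", 69.99, 10),
        ("DB222", "Database", "Databases in Depth", "C. J. Date", 29.95, 6),
    ],
)


def table_to_xml(connection, table, record_tag):
    """Hypothetical helper: table name -> root tag, row -> record tag,
    field name -> child tag (all lowercased, as in Figure 3-25)."""
    # The table name is interpolated directly, so it must come from trusted code.
    cursor = connection.execute(f"SELECT * FROM {table}")
    field_names = [column[0].lower() for column in cursor.description]
    root = ET.Element(table.lower())
    for row in cursor:
        record = ET.SubElement(root, record_tag)
        for name, value in zip(field_names, row):
            ET.SubElement(record, name).text = str(value)
    return root


books = table_to_xml(conn, "Books", "book")
xml_text = ET.tostring(books, encoding="unicode")
print(xml_text)
```

The resulting string has no XML declaration or indentation, but its element structure matches the first two records of the listing in Figure 3-25.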

XML data structures are hierarchical structures; that is, the overall structure of data is a logical unit composed of subunits which in turn are composed of lower-level subunits, working down the hierarchy to the lowest level of individual data items. The above structure, for example, is a three-level hierarchy in which the encompassing <books> element is composed of <book> elements which, in turn, are composed of individual elements of data. A hierarchy can encompass any number of levels, although three are sufficient to represent relational database tables.
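
To make the three levels concrete, the following Python sketch parses a shortened copy of the Books structure and records the depth of every element. The function name outline and the two-record sample are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the structure in Figure 3-25 (two records, three fields each).
BOOKS_XML = """<?xml version="1.0" ?>
<books>
  <book>
    <bookid>DB111</bookid>
    <booktitle>Oracle Database</booktitle>
    <bookprice>69.99</bookprice>
  </book>
  <book>
    <bookid>DB222</bookid>
    <booktitle>Databases in Depth</booktitle>
    <bookprice>29.95</bookprice>
  </book>
</books>"""


def outline(element, depth=1):
    """Return (depth, tag) pairs for an element and all of its descendants."""
    pairs = [(depth, element.tag)]
    for child in element:
        pairs.extend(outline(child, depth + 1))
    return pairs


root = ET.fromstring(BOOKS_XML)
levels = outline(root)
deepest = max(depth for depth, _ in levels)

# Print an indented outline of the hierarchy, then the number of levels.
for depth, tag in levels:
    print("  " * (depth - 1) + tag)
print("levels:", deepest)
```

The same recursive walk works for a hierarchy of any depth, since each element is simply visited one level below its parent.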

Once an XML declaration tag is added at the beginning of the data structure, the structure can be saved as a simple text document to make it available to any Web page that needs this information. The file must be saved with the .xml extension to identify it as an XML file. All that is needed for access to this file is its URL. By clicking the following link, the above XML document, named Books.xml and located in the same directory as the BooksDB.mdb database, is opened in a separate browser window.

View Books.xml XML File
Figure 3-26. Link to Books.xml file.

In this view of the data, elements are preceded by "+" and "-" symbols for expanding and collapsing the structure. This is the default browser view of an XML document. It isn't a particularly attractive or readable view, but that's not the point. More importantly, the information is in a file format for sharing among computers and for computer processing.

Displaying XML Files as Hierarchical Data

One of the ways in which XML files can be imported to a Web page is by combining them with special style sheets and displaying the information through an <asp:Xml> control. The general format for this control is shown below.

<asp:Xml id="string" Runat="Server"
  DocumentSource="path"
  TransformSource="path"
/>
Figure 3-27. General format for <asp:Xml> control.

The DocumentSource property gives the server path to the XML file; the TransformSource property gives the location of an XSLT style sheet to be applied to the XML data to format it for display. An XSLT (eXtensible Stylesheet Language Transformations) style sheet is a document written in the special XSLT language to transform XML data into other formats. You can view an example XSLT style sheet applicable to the Books.xml file by clicking the following link. As you probably can surmise from the code, this style sheet formats the XML data as an XHTML table.

View Books.xsl XSLT File
Figure 3-28. Link to Books.xsl file.

This Books.xsl XSLT file is brought together with the Books.xml XML file by the <asp:Xml> control to produce the following output.


ID     Type      Title                          Author           Price     Qty
DB111  Database  Oracle Database                K. Loney         $ 69.99   10
DB222  Database  Databases in Depth             C. J. Date       $ 29.95   6
DB333  Database  Database Processing            D. Kroenke       $ 136.65  12
DB444  Database  Access Database Design         S. Roman         $ 34.95   25
DB555  Database  SQL Server 2005                P. Debetta       $ 29.99   0
GR111  Graphics  Adobe Photoshop CS2            Adobe            $ 29.99   4
GR222  Graphics  Learning Web Design            J. Niederst      $ 39.95   8
GR333  Graphics  Macromedia Flash Professional  T. Green         $ 44.99   17
GR444  Graphics  Digital Photographer Handbook  M. Freeman       $ 24.95   22
GR555  Graphics  Creating Motion Graphics       T. Meyer         $ 59.95   13
HW111  Hardware  How Computers Work             R. White         $ 29.99   8
HW222  Hardware  Upgrading and Repairing PCs    S. Mueller       $ 59.99   5
HW333  Hardware  USB System Architecture        D. Anderson      $ 49.99   1
HW444  Hardware  Designing Embedded Hardware    J. Catsoulis     $ 44.95   3
HW555  Hardware  Contemporary Logic Design      R. Katz          $ 102.95  2
SW111  Software  Java How to Program            Deitel           $ 98.59   9
SW222  Software  C Programming Language         B. Kernighan     $ 44.25   12
SW333  Software  Programming C#                 J. Liberty       $ 44.95   0
SW444  Software  Programming PHP                R. J. Lerdorf    $ 39.95   17
SW555  Software  Visual Basic.NET Programming   P. Vick          $ 49.99   13
SY111  Systems   Operating Systems Concepts     A. Silberschatz  $ 95.75   1
SY222  Systems   The UNIX Operating System      J. D. Peek       $ 19.95   12
SY333  Systems   Windows Server 2003            W. R. Stanek     $ 29.99   25
SY444  Systems   Linux in a Nutshell            S. Figgins       $ 44.95   14
SY555  Systems   Mastering Active Directory     R. R. King       $ 49.99   8
WB111  Web       Ajax in Action                 D. Crane         $ 22.67   14
WB222  Web       Professional ASP.NET 2.0       B. Evjen         $ 32.99   21
WB333  Web       Cascading Style Sheets         E. A. Meyer      $ 39.95   6
WB444  Web       DOM Scripting                  J. Keith         $ 23.09   8
WB555  Web       Microsoft ASP.NET 2.0          D. Esposito      $ 29.99   12

Figure 3-29. Applying XSLT transformation to an XML file.
<asp:Xml id="XMLTransform" Runat="Server"
  DocumentSource="../Databases/Books.xml"
  TransformSource="../Databases/Books.xsl"/>
Listing 3-36. Code to apply XSLT transformation to an XML file.
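
The contents of Books.xsl are not reproduced here, so the following Python sketch only imitates the kind of transformation such a style sheet performs: it reads the book records and emits an XHTML table with one row per <book>. It is a stand-in written with Python's standard ElementTree module rather than an XSLT processor, and the function name books_to_table, the COLUMNS mapping, and the "$ " price prefix (taken from the rendered output above) are assumptions.

```python
import xml.etree.ElementTree as ET

# Two records from Books.xml, enough to show the row-by-row mapping.
BOOKS_XML = """<books>
  <book>
    <bookid>DB111</bookid>
    <booktype>Database</booktype>
    <booktitle>Oracle Database</booktitle>
    <bookauthor>K. Loney</bookauthor>
    <bookprice>69.99</bookprice>
    <bookqty>10</bookqty>
  </book>
  <book>
    <bookid>DB222</bookid>
    <booktype>Database</booktype>
    <booktitle>Databases in Depth</booktitle>
    <bookauthor>C. J. Date</bookauthor>
    <bookprice>29.95</bookprice>
    <bookqty>6</bookqty>
  </book>
</books>"""

# Assumption: column headings copied from the rendered table in Figure 3-29.
COLUMNS = [
    ("ID", "bookid"), ("Type", "booktype"), ("Title", "booktitle"),
    ("Author", "bookauthor"), ("Price", "bookprice"), ("Qty", "bookqty"),
]


def books_to_table(xml_text):
    """Build a <table> element: one header row, then one row per <book>."""
    books = ET.fromstring(xml_text)
    table = ET.Element("table")
    header = ET.SubElement(table, "tr")
    for heading, _ in COLUMNS:
        ET.SubElement(header, "th").text = heading
    for book in books.findall("book"):
        row = ET.SubElement(table, "tr")
        for _, tag in COLUMNS:
            value = book.findtext(tag, default="")
            if tag == "bookprice":
                value = "$ " + value  # mimic the price format of the output
            ET.SubElement(row, "td").text = value
    return table


table = books_to_table(BOOKS_XML)
html = ET.tostring(table, encoding="unicode")
print(html)
```

An XSLT style sheet expresses the same mapping declaratively, with a template that matches each <book> element, whereas this sketch spells the loop out by hand.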

When using the <asp:Xml> control to apply a style sheet to an XML file, both the DocumentSource and TransformSource paths must be physical or relative paths to local files. They cannot be URLs to remote files. You can, however, save a remote XML file locally through a standard <a> link and then apply a style sheet to this local document through the <asp:Xml> control.

Pairing of an XML file with an XSLT style sheet is one of the ways to process XML data for display. As noted, this method can be applied only to local files, restricting somewhat the ability to conveniently access and process files shared from remote sources. A different method, which does give direct access to remote XML files, is presented later.

XML Web Pages

The argument has been made that in the extreme case there is no need to code any text or its formatting XHTML on a Web page. It all can be maintained in external files and imported onto the page as needed. This scenario is practical when working with Web pages stored as XML documents. By clicking the "XML Web Page" link below you can view an example Web page whose content is organized as an XML document. The accompanying "XSLT Style Sheet" link shows formatting information for the page.

XML Web Page
XSLT Style Sheet
Figure 3-30. XML and XSLT files to create a Web page.

Notice that there are no conventional XHTML tags surrounding the content in the XML file even though the file describes a Web page. Rather, XML tags are defined to represent only the hierarchical structure of this content—its organization and data relationships. Its formatting as a Web page is handled by the separate style sheet file.
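
The separation of content from formatting can be illustrated with a small Python sketch. The tag names used here (<page>, <title>, <section>, <heading>, <paragraph>) are hypothetical and are not taken from the actual Page.xml file; the TAG_MAP dictionary plays the role that the style sheet plays in the real page, deciding which XHTML tag each content element becomes.

```python
import xml.etree.ElementTree as ET

# Hypothetical page content: the tags describe structure only, not appearance.
PAGE_XML = """<page>
  <title>eXtensible Markup Language</title>
  <section>
    <heading>XML Markup</heading>
    <paragraph>XML uses tags to describe the structure of data.</paragraph>
  </section>
</page>"""

# Assumption: formatting rules live apart from the content, as a style sheet would.
TAG_MAP = {"title": "h1", "heading": "h2", "paragraph": "p"}


def render(xml_text):
    """Walk the content in document order and emit the mapped XHTML elements."""
    page = ET.fromstring(xml_text)
    body = ET.Element("div")
    for element in page.iter():
        target = TAG_MAP.get(element.tag)
        if target is not None:
            ET.SubElement(body, target).text = (element.text or "").strip()
    return body


body = render(PAGE_XML)
print(ET.tostring(body, encoding="unicode"))
```

Changing only TAG_MAP restyles the page without touching the content, which is the point of keeping the two in separate files.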

With its content stored in one file and its style sheet stored in a separate file, the Web page itself requires minimal coding to bring these two external documents together. Web page coding to display and style the above XML file is shown below, followed by the actual output produced by this code. The previous Books.xml table also has been added to the page through a second <asp:Xml> control.

<?xml version="1.0" encoding="UTF-8"?>

<!DOCTYPE html 
  PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
	
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
<head>
  <title>XML Page Display</title>
</head>
<body>
<form Runat="Server">

<asp:Xml id="XMLPage" Runat="Server"
  DocumentSource="../Databases/Page.xml" 
  TransformSource="../Databases/Page.xsl"/>

<asp:Xml id="XMLTable" Runat="Server"
  DocumentSource="../Databases/Books.xml"
  TransformSource="../Databases/Books.xsl"/>

</form>
</body>
</html>
Listing 3-37. Code to display a Web page coded as an XML document.


eXtensible Markup Language

The eXtensible Markup Language (XML) is used to describe data and its structure. Its purpose is to provide a method for packaging data and for transmitting it between computers. In one sense XML is no different in purpose from text files and databases: it is used to create data stores. Its significance, though, is in establishing a common data formatting standard that is recognizable and shareable among widely diverse computer systems. In addition, it is text based; therefore, it permits easy data exchange across the Web using the common HTTP protocol.

XML Markup

XML is a data markup language that, like XHTML, uses tags to describe the structure of data. Unlike the XHTML Web page markup language, it does not have predefined tags with special meanings. XML tags are created by the data designer to fit the purpose of the data structure. Different sets of tags would be used, for instance, to describe a memo, a letter, a book chapter, a financial statement, an email message, a legal document, a restaurant menu, a personnel record, a course catalog, a driver's license, a television listing, or a thousand other data collections.

As you might recognize from the above listing, a wide variety of data structures can be represented in XML. It is not limited to linear lines of text as is common to text files; neither is it limited to relational information organized into the rows and columns of database tables. Virtually any structure of data can be represented by XML. Common to all of these data structures, however, are the XML standards that make them accessible by and transmittable between any computers on the Internet.

Much of the information that needs to be shareable between computers does, of course, reside in databases. The following table shows the structure of information in a Books table containing Book records, each of which is composed of a BookID field, a BookType field, a BookTitle field, a BookAuthor field, a BookPrice field, and a BookQty field.

ID     Type      Title                          Author           Price     Qty
DB111  Database  Oracle Database                K. Loney         $ 69.99   10
DB222  Database  Databases in Depth             C. J. Date       $ 29.95   6
DB333  Database  Database Processing            D. Kroenke       $ 136.65  12
DB444  Database  Access Database Design         S. Roman         $ 34.95   25
DB555  Database  SQL Server 2005                P. Debetta       $ 29.99   0
GR111  Graphics  Adobe Photoshop CS2            Adobe            $ 29.99   4
GR222  Graphics  Learning Web Design            J. Niederst      $ 39.95   8
GR333  Graphics  Macromedia Flash Professional  T. Green         $ 44.99   17
GR444  Graphics  Digital Photographer Handbook  M. Freeman       $ 24.95   22
GR555  Graphics  Creating Motion Graphics       T. Meyer         $ 59.95   13
HW111  Hardware  How Computers Work             R. White         $ 29.99   8
HW222  Hardware  Upgrading and Repairing PCs    S. Mueller       $ 59.99   5
HW333  Hardware  USB System Architecture        D. Anderson      $ 49.99   1
HW444  Hardware  Designing Embedded Hardware    J. Catsoulis     $ 44.95   3
HW555  Hardware  Contemporary Logic Design      R. Katz          $ 102.95  2
SW111  Software  Java How to Program            Deitel           $ 98.59   9
SW222  Software  C Programming Language         B. Kernighan     $ 44.25   12
SW333  Software  Programming C#                 J. Liberty       $ 44.95   0
SW444  Software  Programming PHP                R. J. Lerdorf    $ 39.95   17
SW555  Software  Visual Basic.NET Programming   P. Vick          $ 49.99   13
SY111  Systems   Operating Systems Concepts     A. Silberschatz  $ 95.75   1
SY222  Systems   The UNIX Operating System      J. D. Peek       $ 19.95   12
SY333  Systems   Windows Server 2003            W. R. Stanek     $ 29.99   25
SY444  Systems   Linux in a Nutshell            S. Figgins       $ 44.95   14
SY555  Systems   Mastering Active Directory     R. R. King       $ 49.99   8
WB111  Web       Ajax in Action                 D. Crane         $ 22.67   14
WB222  Web       Professional ASP.NET 2.0       B. Evjen         $ 32.99   21
WB333  Web       Cascading Style Sheets         E. A. Meyer      $ 39.95   6
WB444  Web       DOM Scripting                  J. Keith         $ 23.09   8
WB555  Web       Microsoft ASP.NET 2.0          D. Esposito      $ 29.99   12

Figure 3-31. Web page output produced from external XML data sources.