XML Parser Tutorial

 

 

An XML parser reads an XML document and analyzes its structure and data. It breaks the document into parts that other components can use. There are two kinds of XML parsers: non-validating parsers, which only check that a document is well formed, and validating parsers, which also check the document against a DTD or schema.
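To give a feel for the simplest thing a non-validating parser checks, here is a toy sketch that only tests whether element tags are properly nested. The function name isTagBalanced is hypothetical, and the check ignores comments, CDATA sections and most other XML rules, so it is an illustration only and not how MSXML works internally.

```javascript
// Toy illustration only: checks that element tags are properly nested.
// It does not handle comments, CDATA sections, or '>' inside attribute values.
function isTagBalanced(xml) {
  // Matches <name ...>, </name> and <name ... />.
  // Declarations such as <?xml ...?> and <!DOCTYPE ...> are skipped
  // because a tag name must start with a letter or underscore here.
  var tagPattern = /<(\/?)([A-Za-z_][\w.\-]*)[^<>]*?(\/?)>/g;
  var openNames = [];
  var match;
  while ((match = tagPattern.exec(xml)) !== null) {
    var isClosing = match[1] === "/";
    var name = match[2];
    var isSelfClosing = match[3] === "/";
    if (isSelfClosing) {
      continue; // <br/> opens and closes at once
    }
    if (isClosing) {
      // A closing tag must match the most recently opened element.
      if (openNames.length === 0 || openNames.pop() !== name) {
        return false;
      }
    } else {
      openNames.push(name);
    }
  }
  // Every opened element must have been closed.
  return openNames.length === 0;
}

isTagBalanced("<a><b></b></a>"); // true
isTagBalanced("<a><b></a></b>"); // false
```

A validating parser performs this kind of well-formedness check first and then goes further, comparing the elements it found with the rules declared in the DTD.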

MSXML 4.0 is an XML parser developed by Microsoft. It follows W3C standards and provides implementations of the DOM, XPath, XML Schema, XSLT and SAX, so a single component covers parsing, querying, validation and transformation.

To help beginners get started with MSXML 4.0 quickly, this tutorial covers the following topics:

  1. Download of MSXML4.0
  2. Installation of MSXML4.0
  3. Example of using MSXML4.0
 

 

Download of MSXML4.0

You can download MSXML 4.0 from the following Microsoft website:
http://www.microsoft.com/downloads/details.aspx?familyid=3144b72b-b4f2-46da-b4b6-c5d7485f2b42&displaylang=en

At the bottom of the page, click "msxml.msi". The "File Download" dialog appears.

Click "Save" and choose a location on your local PC for the msxml.msi file. The download then completes.

Installation of MSXML4.0

Double-click the file you just downloaded.

The MSXML License Agreement dialog appears. Accept the agreement and click "Next".

In the Customer Information dialog, enter the User Name and Organization information, then click "Next".

The Setup Type dialog appears. Click the "Install Now" button.

The installation of MSXML 4.0 starts. When it completes, click the "Finish" button.

The MSXML 4.0 installation is now done.

Example usage of MSXML4.0

Next, we use MSXML in an HTML file. To use the parser, add this statement to the script in the HTML file:

var xmlDoc = new ActiveXObject("Msxml2.DOMDocument.4.0");
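If MSXML 4.0 is not installed on the machine, the statement above throws an error. The following is a minimal sketch of guarding against that. The helper name createValidatingDocument and its createObject parameter are hypothetical and are not part of MSXML; createObject is passed in only so that the logic can be tried outside Internet Explorer.

```javascript
// Hypothetical helper (not part of MSXML): tries each ProgID in order and
// returns the first document object that can be created, configured for
// synchronous, validating parsing. Returns null if none can be created.
//
// In Internet Explorer, createObject would typically be:
//   function (progId) { return new ActiveXObject(progId); }
function createValidatingDocument(createObject, progIds) {
  for (var i = 0; i < progIds.length; i++) {
    try {
      var doc = createObject(progIds[i]);
      doc.async = false;          // load synchronously
      doc.validateOnParse = true; // validate while parsing
      return doc;
    } catch (e) {
      // This ProgID could not be created; try the next one.
    }
  }
  return null;
}
```

In the page itself, a call might look like createValidatingDocument(function (id) { return new ActiveXObject(id); }, ["Msxml2.DOMDocument.4.0"]), followed by a check for null before the document is used.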

The following HTML file is a sample that parses an XML document and validates it against its DTD. Save it as validator.htm.

<HTML>
<HEAD>
<TITLE>XML Validation</TITLE>
<script language="javascript">
<!--
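  // Parses the text in the textarea and validates it against its DTD.
  function ValidateXML()
  {
   var xmlDoc = new ActiveXObject("Msxml2.DOMDocument.4.0");
   xmlDoc.async = false;           // load synchronously
   xmlDoc.validateOnParse = true;  // validate against the DTD while parsing

   var strXML = document.all("XML").value;
   xmlDoc.loadXML(strXML);

   if (xmlDoc.parseError.errorCode != 0)
   {
    // Report why parsing or validation failed, plus the offending source text.
    alert(xmlDoc.parseError.reason + "\n" + xmlDoc.parseError.srcText);
   }
   else {
    // Show the parsed document as serialized by MSXML.
    document.all("XML").value = xmlDoc.xml;
    alert("This XML file is valid!");
   }
  }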
//--></script>
</HEAD>
<BODY>
 <form ID="MyForm">

 <h1><font color="#0000ff">Input XML File to Validate</font></h1>
 <textarea id="XML" rows="30" cols="75"></textarea>
 <br>
 <input type="button" value="Validate This XML File Against DTD" onclick="ValidateXML()" />
 </form>
</BODY>
</HTML>

We use addressbook.xml as the input to the MSXML parser:


<?xml version = "1.0"?>
<!-- addressbook.xml for the DTD -->
<!DOCTYPE addressbook SYSTEM "addressbook.dtd">
<addressbook>
 <addressrecord>
  <name>
   <firstname> James </firstname>
   <lastname> Smith </lastname>
  </name>
  <address>
   <street> 101 South Street</street>
   <city> Halifax </city>
   <email> james@dal.ca </email>
   <phone> 4940001 </phone>
  </address>
 </addressrecord>

 <addressrecord>
  <name>
   <firstname> Tom </firstname>
   <lastname> White </lastname>
  </name>
  <address>
   <street> 202 Victoria Road </street>
   <city> Dartmouth </city>
   <email> tom@dal.ca</email>
   <phone> 4940002 </phone>
  </address>
 </addressrecord>
</addressbook>

The line <!DOCTYPE addressbook SYSTEM "addressbook.dtd"> indicates that the document's DTD is addressbook.dtd.

The contents of addressbook.dtd are:

<!ELEMENT addressbook (addressrecord+)>
<!ELEMENT addressrecord (name,address)>
<!ELEMENT name (firstname,lastname)>
<!ELEMENT firstname (#PCDATA)>
<!ELEMENT lastname (#PCDATA)>
<!ELEMENT address (street,city,email,phone)>
<!ELEMENT street (#PCDATA)>
<!ELEMENT city (#PCDATA)>
<!ELEMENT email (#PCDATA)>
<!ELEMENT phone (#PCDATA)>

Save this DTD in the same directory as the XML file; otherwise, specify its location in the DOCTYPE declaration of the XML file.
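Each content model in the DTD, such as (street,city,email,phone), lists the child elements that must appear and the order they must appear in. The following toy sketch shows the idea for a plain sequence. The function name matchesSequence is hypothetical, and the sketch does not handle operators such as + or *, so it only illustrates the rule and is not MSXML's validation algorithm.

```javascript
// Toy check for a sequence content model such as (street,city,email,phone):
// the child element names must equal the expected names, in the same order.
function matchesSequence(expectedNames, childNames) {
  if (expectedNames.length !== childNames.length) {
    return false;
  }
  for (var i = 0; i < expectedNames.length; i++) {
    if (expectedNames[i] !== childNames[i]) {
      return false;
    }
  }
  return true;
}

// Content model of <address> taken from addressbook.dtd.
var addressModel = ["street", "city", "email", "phone"];

matchesSequence(addressModel, ["street", "city", "email", "phone"]);       // true
matchesSequence(addressModel, ["street", "city", "email", "phonenumber"]); // false
```

The second call fails because phonenumber is not the name the content model expects in the fourth position.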

After opening validator.htm in Internet Explorer, type or paste the contents of addressbook.xml into the text box.

Click the "Validate This XML File Against DTD" button. The parser tells you that addressbook.xml is valid.

 

Now modify addressbook.xml. For example, rename the "phone" element of the second addressrecord to "phonenumber", then click the button to validate the file again.


The parser checks the document against the definitions in the DTD, detects the invalid line, and reports the error.
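Besides reason and srcText, the parseError object also exposes the line and linepos properties, which help locate the problem in a long document. The following is a minimal sketch of a more detailed message. The helper name formatParseError is hypothetical, and the exact wording of reason depends on MSXML.

```javascript
// Hypothetical helper: builds a readable message from a parseError-like
// object with errorCode, reason, line, linepos and srcText properties.
function formatParseError(parseError) {
  if (parseError.errorCode == 0) {
    return "No error";
  }
  return "Error " + parseError.errorCode +
         " at line " + parseError.line +
         ", column " + parseError.linepos +
         ": " + parseError.reason +
         "\n" + parseError.srcText;
}
```

In validator.htm, the alert in the error branch could then be written as alert(formatParseError(xmlDoc.parseError)).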

 
