ORIGINAL DRAFT
Occasionally, we need to test ourselves against a set of questions to verify our understanding. This may be because we want to prove our value, gauge our ability to learn new material or simply because we enjoy measuring ourselves against the world. For most of us, taking a test can be a difficult experience, though automation can sometimes make it easier. It’s comforting not to have another person hovering around, judging our progress.
Web forms are ideal for presenting the question and answer sessions required in a testing scenario. Not only can users take the test from anywhere, thanks to ubiquitous Web browsers, but our automation can also grade the test, providing useful explanations to teach readers after they’ve answered the questions. In principle this is all very easy to do, especially if you pick the right technologies.
XML is well suited to representing structured data. It’s been used in the Channel Definition Format, to describe chemical structures, and even to embed stock or other information in Web pages. The key strength of XML is that it was designed to represent arbitrary, hierarchical data, lending structure to an otherwise potentially undistinguished piece of information. We’re going to use XML to represent the questions in our test.
When I went looking for an XML parser, I found three almost immediately. Sun has what they call ProjectX posted on the JavaSoft site, which at the time of this writing was in early access. Microsoft is a big proponent and early adopter of XML technology and has its own Java XML parser, which is more than suitable for most applications. The one I chose is from IBM. It seemed to be the most complete, supporting both DOM and SAX. It also had the best documentation. XML4J is the library I used to implement this solution, though the others I mentioned should be equally suitable.
XML, DOM, SAX and DTDs
There certainly isn’t enough room in a short article like this to cover the deeper intricacies of XML. It’s generally simple enough to work with and fairly elegant. This article assumes you already have experience working with Java and that you’re familiar with the basic concepts behind Java servlet implementation. None of the code in this article is particularly complex, so even if this is new to you, it should become accessible after reading a quick tutorial. XML is fairly new to most of us, so I’ll cover the basic concepts from a high-level perspective before we start looking at the code.
XML evolved from the more complex SGML (Standard Generalized Markup Language), which was adopted by serious document management companies but seldom completely implemented. In both SGML and XML, document syntax can be described using a DTD (Document Type Definition), either inline or as an external file. The DTD defines the relationship between elements and the permitted structure of the document. XML parsers can work without a DTD or may be told to validate the document against the DTD.
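To make the difference between validating and non-validating parsing concrete, here is a minimal sketch. It assumes the standard javax.xml.parsers API rather than XML4J, and the note/body grammar is a made-up example, not the DTD from Listing 1.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

public class DtdValidationSketch {

    // Hypothetical two-element grammar, declared inline for brevity.
    static final String DTD =
        "<!DOCTYPE note [\n"
      + "  <!ELEMENT note (body)>\n"
      + "  <!ELEMENT body (#PCDATA)>\n"
      + "]>\n";

    /** Returns true if the XML parses; with validate set, it must also match its DTD. */
    static boolean parses(String xml, boolean validate) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(validate);
            DocumentBuilder builder = factory.newDocumentBuilder();
            // Validity errors are recoverable by default, so rethrow them.
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { }
                public void error(SAXParseException e) throws SAXParseException { throw e; }
                public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
            });
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (Exception e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String valid = DTD + "<note><body>hello</body></note>";
        String invalid = DTD + "<note><title>oops</title></note>";
        System.out.println(parses(valid, true));    // matches the DTD
        System.out.println(parses(invalid, true));  // title is not declared
        System.out.println(parses(invalid, false)); // well formed, not checked
    }
}
```

The third call is the interesting one: a document that breaks the DTD still parses when validation is off, as long as it is well formed.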
The DOM (Document Object Model) uses a tree representation to store the data structure of an XML document. This is a standardized model managed by the W3C that evolved from the earlier Netscape Navigator and Internet Explorer models. The browser models are referred to as level zero and the current DOM standard is level one. A more recent level two standard is also emerging. DOM is a very simple model, centering on the root Document, with a tree of Elements and Attributes, along with a few other supporting concepts.
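The Document, Element and Attribute relationship is easiest to see by building a small tree by hand. The sketch below assumes the standard org.w3c.dom interfaces obtained through javax.xml.parsers; the library and book names are purely illustrative.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

public class DomTreeSketch {

    /** Builds <library name="demo"><book>DOM</book><book>SAX</book></library>. */
    static Document buildTree() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
            Element root = doc.createElement("library");
            root.setAttribute("name", "demo");
            doc.appendChild(root);
            for (String title : new String[] {"DOM", "SAX"}) {
                Element book = doc.createElement("book");
                book.appendChild(doc.createTextNode(title));
                root.appendChild(book);
            }
            return doc;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Counts the Element nodes at or below the given node by walking child links. */
    static int countElements(Node node) {
        int count = (node.getNodeType() == Node.ELEMENT_NODE) ? 1 : 0;
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            count += countElements(child);
        }
        return count;
    }

    public static void main(String[] args) {
        Document doc = buildTree();
        System.out.println(doc.getDocumentElement().getAttribute("name")); // demo
        System.out.println(countElements(doc));                            // 3
    }
}
```

Everything hangs off the Document: it creates the nodes, owns the root element, and serves as the starting point for any traversal.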
SAX (Simple API for XML) is an event model that can be used to avoid building a complete tree representation. This is especially important if documents are expected to be very large or if only some portions of the document require attention. SAX events are generated by the parser when it enters or leaves a document or element; an element’s attributes are delivered along with its start event. Changes can be made on the fly. The model can be used to detect features or relevant data without needing to pay attention to the whole document structure.
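A short sketch shows the event flow. It assumes the standard javax.xml.parsers.SAXParserFactory and org.xml.sax.helpers.DefaultHandler rather than XML4J's own classes, and simply records the order in which element events arrive.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SaxEventSketch {

    /** Records start/end element events in the order the parser reports them. */
    static List<String> events(String xml) {
        final List<String> log = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String local, String qName, Attributes attrs) {
                log.add("start " + qName);
            }
            @Override
            public void endElement(String uri, String local, String qName) {
                log.add("end " + qName);
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return log;
    }

    public static void main(String[] args) {
        // Prints: start a, start b, end b, start c, end c, end a (one per line)
        for (String event : events("<a><b/><c>text</c></a>")) {
            System.out.println(event);
        }
    }
}
```

No tree is ever built here. The handler sees each element once, in document order, and keeps only what it chooses to record.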
Structural Overview
XML documents have a predictable structure, so parsers can loosely handle this structure as long as it is syntactically correct. It’s also possible to impose a limited set of elements and attributes to ensure the data is valid. To do this, XML requires a DTD that can be used to specify what is permissible in a given document.
Listing 1 shows our question DTD. We’ll be using the looser, unvalidated interpretation because we want to accommodate embedded HTML tags, should the author wish to enhance the presentation, but we’ll take a quick look at the DTD because it describes the expected format of our test files. If the files do not adhere to this format, the application will produce unexpected results (possibly failing altogether without much useful feedback).
The DTD first declares that it uses US-ASCII encoding and then defines the elements and attributes permitted or expected for each of the elements. The root of each question document is the test element, which contains question, answer and explain subsections. The test element has a required name attribute, which is the working name, followed by an equal sign and some quoted text. A pair of question tags also contains this type of text data, as does a pair of explain tags.
An answer contains one or more item elements and requires two attributes: the type of answer (which has to be either CHECKBOX or RADIO) and a valid answer string (which must contain a comma-delimited list of values for correct answers, with the first item index starting at zero). The items themselves are plain text surrounded by item tags. This definition syntax is very powerful and much more comprehensive than this example might indicate. We have no need for a more complex definition in our application, but the power to do much more is built into XML.
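The comma-delimited answer string lends itself to a simple grading check. The following is only a sketch of one way to do it: the class and method names are hypothetical, and the all-or-nothing scoring rule is my assumption rather than something the DTD dictates.

```java
import java.util.Set;
import java.util.TreeSet;

public class AnswerGradingSketch {

    /** Parses a comma-delimited list of zero-based item indexes, e.g. "0,2". */
    static Set<Integer> parseIndexes(String list) {
        Set<Integer> indexes = new TreeSet<>();
        for (String part : list.split(",")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                indexes.add(Integer.parseInt(trimmed));
            }
        }
        return indexes;
    }

    /** Assumed rule: a response is correct only when it selects exactly the listed items. */
    static boolean isCorrect(String validAnswers, Set<Integer> selected) {
        return parseIndexes(validAnswers).equals(selected);
    }

    public static void main(String[] args) {
        Set<Integer> selected = new TreeSet<>();
        selected.add(0);
        selected.add(2);
        System.out.println(isCorrect("0,2", selected)); // true
        System.out.println(isCorrect("1", selected));   // false
    }
}
```

The same check covers both answer types: a RADIO question simply has a single index in its list, while a CHECKBOX question may have several.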
Listing 2 shows what a question file actually looks like. You can compare this against Figure 1, which shows the hierarchical relationship for each of the elements in the file. If you look closely at the listing, you’ll see that we use the attributes to define the question name in the test tag, and the type and value for the answer.
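For readers without the listing at hand, the sketch below parses a question document of the general shape described above and pulls out its attributes. The element names come from the DTD description. The attribute names name, type and value, the sample text and the use of javax.xml.parsers instead of XML4J are assumptions made for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class QuestionFileSketch {

    // Hypothetical question file; attribute names and text are assumptions.
    static final String SAMPLE =
        "<test name=\"Sample\">\n"
      + "  <question>Which APIs does the article mention?</question>\n"
      + "  <answer type=\"CHECKBOX\" value=\"0,1\">\n"
      + "    <item>DOM</item>\n"
      + "    <item>SAX</item>\n"
      + "    <item>Neither</item>\n"
      + "  </answer>\n"
      + "  <explain>Both <B>DOM</B> and SAX are covered.</explain>\n"
      + "</test>\n";

    /** Parses without validation, so embedded HTML tags such as <B> are tolerated. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    /** Collects the test name, answer type, answer value and item count. */
    static String summarize(Document doc) {
        Element test = doc.getDocumentElement();
        Element answer = (Element) test.getElementsByTagName("answer").item(0);
        int items = answer.getElementsByTagName("item").getLength();
        return test.getAttribute("name") + " " + answer.getAttribute("type")
            + " " + answer.getAttribute("value") + " items=" + items;
    }

    public static void main(String[] args) {
        System.out.println(summarize(parse(SAMPLE))); // Sample CHECKBOX 0,1 items=3
    }
}
```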
Figure 2 shows what a question looks like in your browser. The QuestionServlet translates the XML representation into an HTML page before serving it up. By doing this dynamically on the server side, we introduce considerable flexibility in that all the mechanics are taken care of for us in the servlet. A test author can create questions and drop files in the questions directory where they become immediately accessible.
Our servlet is made aware of the number and position of each question in the test. Figure 2 shows both a previous and next button, along with the score button. If no preceding questions exist, the previous button will not be shown. The same applies to the next button if no subsequent questions are available.
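The button rules reduce to two index comparisons. This is a minimal sketch with hypothetical names, assuming questions are numbered from zero; it is not taken from the QuestionServlet itself.

```java
public class NavigationSketch {

    /** The previous button appears only when an earlier question exists. */
    static boolean showPrevious(int index) {
        return index > 0;
    }

    /** The next button appears only when a later question exists. */
    static boolean showNext(int index, int total) {
        return index < total - 1;
    }

    /** Lists the buttons for the question at the given zero-based index. */
    static String buttons(int index, int total) {
        StringBuilder sb = new StringBuilder();
        if (showPrevious(index)) {
            sb.append("[Previous] ");
        }
        if (showNext(index, total)) {
            sb.append("[Next] ");
        }
        sb.append("[Score]"); // the score button is always available
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(buttons(0, 3)); // [Next] [Score]
        System.out.println(buttons(1, 3)); // [Previous] [Next] [Score]
        System.out.println(buttons(2, 3)); // [Previous] [Score]
    }
}
```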
Document Object Model
Our implementation makes use of the Document Object Model. To keep our code as clear as possible, we’ll implement a handful of static methods that do a few common things. Listing 3 shows the DOMUtil class, which includes the following methods:
Method | Description |
---|---|
readDocument | Given a filename, return a Document object containing the parsed model in tree form. |
findNode | Given a Node object, locate and return a child Node with the given name. |
getNodeAttribute | Get a Node attribute by name, returning its String representation. |
printSubtree | Write out the subtree from a given Node down to the specified PrintWriter. |

");
HTMLUtil.writeHighlightHeader(writer, show, isValid, isSelected);
DOMUtil.printSubtree(writer, item);
HTMLUtil.writeHighlightFooter(writer, show, isValid, isSelected);
writer.println("

");
writer.println("<H1" + HTMLUtil.formatAttr("ALIGN", "center") + ">" + title + "</H1>");
DOMUtil.printSubtree(writer, question);
writer.println("

");
writer.println("<FONT COLOR=#008000>");
writer.println("ANSWER:");
writer.println("</FONT>");
DOMUtil.printSubtree(writer, explain);
writer.println("
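The four DOMUtil methods in the table can be approximated with the standard Java XML APIs. What follows is a sketch, not Listing 3: it uses javax.xml.parsers in place of XML4J, the class name DomUtilSketch and the demo method are hypothetical, and printSubtree is simplified to write a node's children while omitting the attributes of embedded tags and any character escaping.

```java
import java.io.File;
import java.io.IOException;
import java.io.PrintWriter;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;

public class DomUtilSketch {

    /** Parses the named file into a DOM tree without validation. */
    static Document readDocument(String filename) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new File(filename));
        } catch (Exception e) {
            throw new IllegalStateException("Cannot read " + filename, e);
        }
    }

    /** Returns the first direct child with the given name, or null if none exists. */
    static Node findNode(Node parent, String name) {
        for (Node child = parent.getFirstChild(); child != null; child = child.getNextSibling()) {
            if (name.equals(child.getNodeName())) {
                return child;
            }
        }
        return null;
    }

    /** Returns the named attribute's value, or null if the attribute is absent. */
    static String getNodeAttribute(Node node, String name) {
        NamedNodeMap attrs = node.getAttributes();
        Node attr = (attrs == null) ? null : attrs.getNamedItem(name);
        return (attr == null) ? null : attr.getNodeValue();
    }

    /** Writes the node's children: text as-is, nested elements as bare tags. */
    static void printSubtree(PrintWriter writer, Node node) {
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            if (child.getNodeType() == Node.TEXT_NODE) {
                writer.print(child.getNodeValue());
            } else if (child.getNodeType() == Node.ELEMENT_NODE) {
                writer.print("<" + child.getNodeName() + ">");
                printSubtree(writer, child);
                writer.print("</" + child.getNodeName() + ">");
            }
        }
    }

    /** Round trip through a temporary file so the sketch needs no fixture. */
    static String demo() {
        String xml = "<test name=\"Demo\"><question>Pick <B>one</B>.</question></test>";
        try {
            Path file = Files.createTempFile("question", ".xml");
            try {
                Files.write(file, xml.getBytes(StandardCharsets.UTF_8));
                Document doc = readDocument(file.toString());
                Node test = findNode(doc, "test");
                StringWriter out = new StringWriter();
                PrintWriter writer = new PrintWriter(out);
                printSubtree(writer, findNode(test, "question"));
                writer.flush();
                return getNodeAttribute(test, "name") + ": " + out;
            } finally {
                Files.delete(file);
            }
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // Demo: Pick <B>one</B>.
    }
}
```

Passing embedded tags straight through is what lets an author drop HTML markup into a question, which is the reason for working with the looser, unvalidated interpretation.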