How To Use

The following examples show you how to use NQXML's parsers, writer, and context-sensitive callback mechanism. See also the files in the examples directory.

Parsers

There are two flavors of XML parser. Both check for the well-formedness of documents and create entities representing XML tags and text.

The first kind of parser returns entities one at a time. The best-known parsers of this type are SAX (Simple API for XML) parsers. The NQXML streaming parser isn't a SAX parser because it doesn't use callbacks to return entities. Instead, the streaming parser iterates over the entities via calls to NQXML::StreamingParser.each.

The second kind of parser creates a tree of entity objects in memory and returns a document object containing the document prolog, object tree, and epilogue. The Document Object Model (DOM) is often used by these parsers. The NQXML tree parser isn't a DOM parser because it doesn't use exactly the same class names or hierarchy for the elements contained in the NQXML::Document object.

Creating XML Output

An XML writer can be used to build well-formed XML output. The NQXML::Writer has two ways of doing this. First, there are methods that output tags, attributes, and text bit-by-bit. For this purpose, the writer class has an interface similar to James Clark's com.jclark.xml.output.XMLWriter Java class.

Additionally, the writeDocument method accepts an NQXML::Document and prints out the entire document's XML.

Examples

Checking an XML document for well-formedness

The code in Example 1 shows an NQXML::StreamingParser being used to check the well-formedness of an XML document. First we create the parser and hand it either an XML string or a readable object (for example, IO, File, or Tempfile). Next, we iterate over all of the entities in the document. We ignore them because we are only interested in finding any errors. If an NQXML::ParserError exception is raised, the document is not well-formed.
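
The steps just described can be sketched as a small helper. This is not the original Example 1; it is a minimal illustration that assumes only what the paragraph states: the parser is created from a string or readable object, each yields entities, and an error is raised for documents that are not well-formed. The helper name well_formed? is hypothetical, and the parser and error classes are passed in as arguments so the sketch runs without NQXML installed. The require path shown in the comment is an assumption.

```ruby
# Hypothetical helper (not part of NQXML): returns true when the parser
# reads the whole source without raising the given error class.
def well_formed?(source, parser_class, error_class)
  parser = parser_class.new(source)
  # Iterate over every entity and ignore it; we only care about errors.
  parser.each { |_entity| }
  true
rescue error_class
  false
end

# With NQXML loaded, a call would presumably look like this
# (the require path is an assumption):
#   require 'nqxml/streamingparser'
#   well_formed?(File.open("doc.xml"),
#                NQXML::StreamingParser, NQXML::ParserError)
```

Passing the classes in keeps the check independent of any particular parser; the body is the same create, iterate, rescue sequence described above.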

Using the Dispatcher

The NQXML::Dispatcher class by David Alan Black allows you to register handlers (callbacks) for entering and/or exiting a given context. This section comes from the RDTool documentation found in the source code for NQXML::Dispatcher.

Register Handlers For Various Events

The streaming parser provides a stream of four types of entity: (1) element start-tags, (2) element end-tags, (3) text segments, and (4) comments. You can register handlers for any or all of these. You do this by writing a code block which you want executed every time one of the four types is encountered in the stream in a certain context.

"Context," in this context, means nesting of elements -- for instance, (book(chapter(paragraph))). See the examples, below, for more on this.

The dispatcher passes the entity that triggered a handler to that handler's block, so the block should accept it as a parameter. (See documentation for NQXML::StreamingParser and other components of NQXML for more information on this.)

Note: when you register a handler, you must specify an event, a context, and an action (block). The event must be a symbol. The context may be a list of strings, a list of symbols, an array of strings, or an array of symbols.

Examples:

  1. Register a handler for the start of an element. The arguments are the event, a context, and a block, where the context is an array of element names in order of desired nesting:

    # For every new <chapter> element inside a <book> element:
    nd.handle(:start_element, [ :book, :chapter ] ) { |e|
      puts "Chapter starting"
    }

  2. Register a handler for dealing with text inside an element:

    # Print book chapter titles in bold (LaTeX):
    nd.handle(:text, "book", "chapter", "title" ) { |e|
      puts "\\textbf{#{e.text}}"
    }

  3. Register a handler for the end of an element:

    nd.handle(:end_element, %w{book chapter} ) { |e|
      puts "Chapter over"
    }

  4. Register a handler for all XML comments:

    # Note that this can be done in one of two ways:
    nd.handle(:comment) { |c| puts "Comment: #{c}" }
    nd.handle(:comment, "*") { |c| puts "Comment: #{c}" }
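
To make the idea of context matching concrete, here is a minimal stand-in written for illustration only. MiniDispatcher and its dispatch method are hypothetical and are not NQXML::Dispatcher's implementation. The sketch assumes that a handler fires when its context equals the innermost elements of the current nesting, and that an empty context or "*" matches anywhere; the real dispatcher's matching rules may differ.

```ruby
# Hypothetical stand-in for illustration; not NQXML::Dispatcher itself.
class MiniDispatcher
  def initialize
    @handlers = Hash.new { |hash, event| hash[event] = [] }
  end

  # Accepts the context as a list or an array, of strings or symbols.
  def handle(event, *context, &block)
    names = context.flatten.map(&:to_s)
    @handlers[event] << [names, block]
  end

  # Run every handler for the event whose context matches the innermost
  # elements of stack. An empty context or "*" matches anywhere.
  def dispatch(event, stack, entity)
    @handlers[event].each do |names, block|
      wildcard = names.empty? || names == ["*"]
      block.call(entity) if wildcard || stack.last(names.size) == names
    end
  end
end

log = []
nd = MiniDispatcher.new
nd.handle(:start_element, [:book, :chapter]) { |e| log << "start #{e}" }
nd.handle(:text, "book", "chapter", "title") { |e| log << "title #{e}" }
nd.handle(:comment) { |c| log << "comment #{c}" }

nd.dispatch(:start_element, %w[book chapter], "chapter")
nd.dispatch(:text, %w[book chapter title], "Intro")
nd.dispatch(:text, %w[book title], "Ignored")   # context does not match
nd.dispatch(:comment, %w[book], "note")
```

In this sketch the stack of open element names is supplied by the caller; in NQXML the dispatcher tracks nesting itself as it reads entities from the streaming parser.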

Writing XML

The NQXML::Writer class creates and outputs well-formed XML. There are two ways to use a writer: call methods that create the XML a bit at a time or create an NQXML::Document object and hand it to the writer.

For writing XML a bit at a time, NQXML::Writer has an interface similar to James Clark's com.jclark.xml.output.XMLWriter Java class. For printing entire document trees, there is NQXML::Writer.writeDocument.

A writer's constructor has two arguments. The first is the object to which the XML is written. This argument can be any object that responds to the << method, including IO, File, Tempfile, String, and Array objects.
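
The requirement that the target respond to << is ordinary Ruby duck typing. The sketch below does not use NQXML at all; emit_greeting is a hypothetical helper that shows the same append calls working on a String, an Array, and a StringIO.

```ruby
require 'stringio'

# Hypothetical helper: appends a fragment to anything that responds to <<.
def emit_greeting(out)
  out << "<greeting>" << "hello" << "</greeting>"
  out
end

string = emit_greeting(String.new)   # one concatenated string
array  = emit_greeting([])           # one array element per append
io     = emit_greeting(StringIO.new) # same text, held in an IO-like buffer
```

Note that an Array target collects each appended piece as a separate element rather than joining them.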

The second, optional boolean argument to the constructor activates some simple "prettifying" code that inserts newlines after tags' closing brackets, indents opening tags, and minimizes empty tags. This behavior is turned off by default. The "prettifying" behavior can be turned on or off at any time by modifying the writer's prettify attribute.

Writers check to make sure that tags are nested properly. If there is an error, an NQXML::WriterError exception is raised.
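
One common way to perform such a nesting check is to keep a stack of open tag names. The sketch below is an assumption about the general technique, not NQXML::Writer's code: TinyWriter, NestingError, and the method names start_tag, text, and end_tag are all hypothetical, and the sketch does no escaping of text.

```ruby
# Hypothetical classes for illustration; not NQXML::Writer or
# NQXML::WriterError.
class NestingError < StandardError; end

class TinyWriter
  def initialize(out)
    @out = out     # any object that responds to <<
    @open = []     # names of tags opened but not yet closed
  end

  def start_tag(name)
    @open.push(name)
    @out << "<#{name}>"
  end

  def text(str)
    @out << str
  end

  # Only the most recently opened tag may be closed.
  def end_tag(name)
    expected = @open.last
    unless expected == name
      raise NestingError, "expected </#{expected}>, got </#{name}>"
    end
    @open.pop
    @out << "</#{name}>"
  end
end

out = String.new
w = TinyWriter.new(out)
w.start_tag("book")
w.start_tag("title")
w.text("Ruby")
w.end_tag("title")
w.end_tag("book")
```

Calling end_tag("book") while title is still open would raise NestingError instead of emitting badly nested output.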

When a writer outputs an empty tag such as <foo attr="x"/>, it normalizes the tag by printing <foo attr="x"></foo>.

More Example Scripts

Here are short descriptions of each of the examples found in the examples directory.

There are also a few XML data files in the examples directory.