Package net.sf.saxon.pull



This package provides an experimental pull API for Saxon: that is, it allows an application to read serially through a document, reading "events" such as the start and end of elements, text nodes, comments, and processing instructions, in the order in which they appear. In fact, the API allows access not just to a single document, but to any sequence consisting of nodes and atomic values: when a node is encountered, the pull API does a traversal of the subtree rooted at that node, before moving on to the next item in the sequence.

The API, defined in class PullProvider, is loosely modelled on the StAX XMLStreamReader API. It is not identical, because it is designed as an intimate and efficient interface that integrates with Saxon concepts such as the SequenceIterator and the NamePool. A class StaxBridge is available that provides the PullProvider interface on top of a StAX pull parser. However, pull parsing is not yet a standard feature of the Java platform, and at the time of writing the available StAX parsers appear to be buggy. StaxBridge is therefore not included in the saxon.jar distribution; it is instead supplied as a sample application in the samples directory.
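The pull style described above can be illustrated with the StAX XMLStreamReader itself, which has shipped with the JDK (package javax.xml.stream) since Java 6, some time after this note was written. The following is only a sketch of the general reading loop, not of PullProvider: the class name PullLoopSketch and the event labels are invented for illustration, and Saxon's own interface differs in its details.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical demo class: uses the JDK's StAX reader, not Saxon's PullProvider.
public class PullLoopSketch {

    /** Reads the document serially and returns one label per event, in document order. */
    public static List<String> events(String xml) {
        List<String> out = new ArrayList<>();
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            // Deliver adjacent character data as a single event.
            factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
            XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
            try {
                // The application, not the parser, decides when to ask for the next event.
                while (reader.hasNext()) {
                    switch (reader.next()) {
                        case XMLStreamConstants.START_ELEMENT:
                            out.add("start:" + reader.getLocalName());
                            break;
                        case XMLStreamConstants.END_ELEMENT:
                            out.add("end:" + reader.getLocalName());
                            break;
                        case XMLStreamConstants.CHARACTERS:
                            out.add("text:" + reader.getText());
                            break;
                        case XMLStreamConstants.COMMENT:
                            out.add("comment:" + reader.getText());
                            break;
                        case XMLStreamConstants.PROCESSING_INSTRUCTION:
                            out.add("pi:" + reader.getPITarget());
                            break;
                        default:
                            break; // other events (such as END_DOCUMENT) are ignored here
                    }
                }
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Malformed input", e);
        }
        return out;
    }

    public static void main(String[] args) {
        for (String event : events("<a><!--c--><b>hi</b><?p d?></a>")) {
            System.out.println(event);
        }
    }
}
```

The loop shows the defining property of a pull interface: control stays with the caller, which requests each event when it is ready for it.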

The three main kinds of PullProvider are:

  • StaxBridge, which is an interface to a pull-mode XML parser

  • TreeWalker, which delivers events based on an in-memory tree. There is one general-purpose TreeWalker that can handle any Saxon tree (any tree that implements the NodeInfo interface) and another that is optimized for the TinyTree implementation.

  • VirtualTreeWalker, which delivers events representing the nodes constructed by a stylesheet or query, without actually constructing the nodes in memory. (Note that this doesn't currently work if the constructed nodes need to be schema-validated.)
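To make the idea of a tree walker concrete, here is a minimal sketch of delivering pull events from an in-memory tree. It does not use Saxon's NodeInfo or TreeWalker: the Node and Walker classes and the string event labels are all hypothetical, chosen only to show how a walker can keep a stack of open elements and hand out one event per call.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch: none of these types belong to Saxon.
public class TreeWalkerSketch {

    /** In-memory node: an element with children, or a text node when name is null. */
    public static final class Node {
        final String name;
        final String text;
        final List<Node> children;

        private Node(String name, String text, List<Node> children) {
            this.name = name;
            this.text = text;
            this.children = children;
        }

        public static Node element(String name, Node... children) {
            return new Node(name, null, Arrays.asList(children));
        }

        public static Node text(String text) {
            return new Node(null, text, new ArrayList<Node>());
        }
    }

    /** Pull-style walker: each call to next() returns one event, or null at the end. */
    public static final class Walker {
        private static final class Frame {
            final Node element;
            final Iterator<Node> rest; // children not yet visited

            Frame(Node element) {
                this.element = element;
                this.rest = element.children.iterator();
            }
        }

        private final Deque<Frame> open = new ArrayDeque<>(); // currently open elements
        private Node pending;                                 // root, until it is delivered

        public Walker(Node root) {
            this.pending = root;
        }

        public String next() {
            Node node;
            if (pending != null) {
                node = pending;
                pending = null;
            } else if (open.isEmpty()) {
                return null; // traversal finished
            } else if (open.peek().rest.hasNext()) {
                node = open.peek().rest.next();
            } else {
                // No children left: close the innermost open element.
                return "end:" + open.pop().element.name;
            }
            if (node.name == null) {
                return "text:" + node.text;
            }
            open.push(new Frame(node));
            return "start:" + node.name;
        }
    }

    /** Drains a walker into a list, for demonstration. */
    public static List<String> walk(Node root) {
        List<String> out = new ArrayList<>();
        Walker walker = new Walker(root);
        for (String event = walker.next(); event != null; event = walker.next()) {
            out.add(event);
        }
        return out;
    }
}
```

Because the walker holds its position in an explicit stack rather than in recursive calls, the caller can stop after any event without the walker having visited the rest of the tree.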

Some examples of application code using the pull interface with Saxon are provided in the PullExamples.java file in the samples directory.

Michael H. Kay
Saxonica Limited
30 March 2005

  • Class
    Description
    This is a filter that can be added to a pull pipeline to remove START_DOCUMENT and END_DOCUMENT events.
    This is a filter that can be added to a pull pipeline to remember element names so that they are available immediately after the END_ELEMENT event is notified.
    This class bridges between the JAXP 1.3 NamespaceContext interface and Saxon's equivalent NamespaceResolver interface.
    A PullConsumer consumes all the events supplied by a PullProvider, doing nothing with them.
    PullFilter is a pass-through filter class that links one PullProvider to another PullProvider in a pipeline.
    This class delivers any XPath sequence through the pull interface.
    PullNamespaceReducer is a PullFilter responsible for removing duplicate namespace declarations.
    PullProvider is Saxon's pull-based interface for reading XML documents and XDM sequences.
    This class copies a document by using the pull interface to read the input document, and the push interface to write the output document.
    PullPushTee is a pass-through filter class that links one PullProvider to another PullProvider in a pipeline, copying all events that are read into a push pipeline, supplied in the form of a Receiver.
    A PullSource is a JAXP Source that encapsulates a PullProvider - that is, an object that supplies an XML document as a sequence of events that are read under the control of the recipient.
    This class bridges PullProvider events to XMLStreamReader (StAX) events.
    PullTracer is a PullFilter that can be inserted into a pull pipeline for diagnostic purposes.
    This class implements the Saxon PullProvider API on top of a standard StAX parser (or any other StAX XMLStreamReader implementation).
    This implementation of the Saxon pull interface starts from any NodeInfo, and returns the events corresponding to that node and its descendants (including their attributes and namespaces).
    A document node whose construction is deferred.
    An element node whose construction is deferred.
    This class represents a virtual element node, the result of an element constructor that (in general) hasn't been fully evaluated.
    This class is used to represent unparsed entities in the PullProvider interface.
    A virtual tree walker provides a sequence of pull events describing the structure and content of a tree that is conceptually being constructed by expressions in a query or stylesheet; in fact the tree is not necessarily constructed in memory, and exists only as this stream of pull events.
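
Several of the classes described above are filters that sit between one provider and the next in a pull pipeline. The sketch below shows that arrangement in miniature. It is an assumption-laden illustration rather than Saxon code: the Provider interface, the string event names, and the helper methods are hypothetical stand-ins, and the real PullFilter classes have a richer interface.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch of a pull pipeline; these are not Saxon types.
public class PullPipelineSketch {

    /** Stand-in for a pull provider: returns the next event, or null at end of input. */
    public interface Provider {
        String next();
    }

    /** A source of events backed by a list. */
    public static Provider fromList(List<String> events) {
        final Iterator<String> it = events.iterator();
        return () -> it.hasNext() ? it.next() : null;
    }

    /** Pass-through filter that drops the document start and end events. */
    public static Provider withoutDocumentEvents(final Provider base) {
        return () -> {
            String event = base.next();
            // Skip unwanted events by pulling again from the underlying provider.
            while (event != null
                    && (event.equals("START_DOCUMENT") || event.equals("END_DOCUMENT"))) {
                event = base.next();
            }
            return event;
        };
    }

    /** Diagnostic filter that records every event passing through it. */
    public static Provider traced(final Provider base, final List<String> log) {
        return () -> {
            String event = base.next();
            if (event != null) {
                log.add(event);
            }
            return event;
        };
    }

    /** Consumes all events from a provider and returns them as a list. */
    public static List<String> drain(Provider provider) {
        List<String> out = new ArrayList<>();
        for (String event = provider.next(); event != null; event = provider.next()) {
            out.add(event);
        }
        return out;
    }
}
```

A pipeline is then built by wrapping, for example drain(withoutDocumentEvents(traced(source, log))). Each filter pulls from the provider it wraps only when its own consumer asks for an event, so no stage runs ahead of the reader.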