NCSA
emerge@ncsa.uiuc.edu

The Grunk API

To invoke grunk as a parsing library, you create an instance of it, set a configuration, and invoke it with a data source. Rather than streams, we use org.xml.sax.InputSource, which has the advantage that it accepts a wide variety of source types and gives a unified way of treating them. We do wish to emphasize the text-based nature of grunk: it reads the source looking specifically for end-of-line markers and processes each line in turn, checking it, if desired, until there are no more recognized structures before moving on to the next line. This may be altered by changing configuration parameters to allow, say, a regular expression that recognizes multiple blank lines.
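
To make the point about source types concrete, here is a minimal sketch that wraps a string and a byte array as org.xml.sax.InputSource objects and then reads each one line by line. It does not use grunk at all: the class InputSourceSketch and the helper readLines are hypothetical names invented for this illustration, and the line reading only mimics the end-of-line handling described above. An InputSource that carries only a system identifier (a file name or URL) is deliberately not handled here.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import org.xml.sax.InputSource;

// Hypothetical illustration class; not part of grunk.
class InputSourceSketch {

    // Wrap an in-memory string as a character-stream InputSource.
    static InputSource fromString(String text) {
        return new InputSource(new StringReader(text));
    }

    // Wrap raw bytes as a byte-stream InputSource and declare their encoding.
    static InputSource fromBytes(byte[] bytes) {
        InputSource source = new InputSource(new ByteArrayInputStream(bytes));
        source.setEncoding("UTF-8");
        return source;
    }

    // Read every line from the source, whichever kind of stream it carries.
    static List<String> readLines(InputSource source) throws IOException {
        Reader reader = source.getCharacterStream();
        if (reader == null) {
            if (source.getByteStream() == null) {
                throw new IOException("InputSource has neither a character nor a byte stream");
            }
            // Fall back to the byte stream, assuming UTF-8 if no encoding was declared.
            String encoding = (source.getEncoding() != null) ? source.getEncoding() : "UTF-8";
            reader = new InputStreamReader(source.getByteStream(), encoding);
        }
        List<String> lines = new ArrayList<>();
        try (BufferedReader buffered = new BufferedReader(reader)) {
            String line;
            // readLine() treats \n, \r and \r\n as end-of-line markers.
            while ((line = buffered.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(readLines(fromString("first line\nsecond line")));
        System.out.println(readLines(fromBytes("a\r\nb".getBytes(StandardCharsets.UTF_8))));
    }
}
```

The same readLines call works for both sources, which is the unified treatment that InputSource is meant to provide.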

The Complete Grunk API

This consists of precisely one method, grunk(InputSource), which returns a java.lang.Object.

You will have to make an instance of grunk first. There are three possible constructors to choose from, depending upon your needs.

Setting a configuration is required for grunk to operate. The various formats are recorded in detail elsewhere, but there are two formats to be aware of: the full format, called TOSCA (for The One Syntax for the Configuration Analyzer), and its lightweight cousin, grunkLite. A configuration may be set directly by invoking the appropriate method, such as the setConfiguration(InputSource) call used in the barebones example later in this section.

A grunkLite configuration file is transformed into a TOSCA file. This is done with an XSL transformation, and once you have a transformation you want to use, you may invoke the appropriate method in the utility class ncsa.emerge.grunk.configuration.ConfigurationTransformation. See the javadoc for more information.
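
For readers who have not applied an XSL transformation from Java before, the following sketch shows the general mechanism using only the standard javax.xml.transform API. It does not call ConfigurationTransformation, whose methods are documented in the javadoc. The element names lite, full and rule, and the stylesheet itself, are invented for this illustration and are not the real grunkLite or TOSCA vocabularies.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical illustration class; not part of grunk.
class XslTransformSketch {

    // Invented stylesheet: rewrites a made-up lightweight format into a made-up full format.
    static final String STYLESHEET =
        "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:template match=\"lite\"><full><xsl:apply-templates/></full></xsl:template>"
      + "<xsl:template match=\"rule\"><rule name=\"{@name}\" expanded=\"true\"/></xsl:template>"
      + "</xsl:stylesheet>";

    // Apply the given stylesheet to the given XML text and return the result as a string.
    static String transform(String stylesheet, String inputXml) throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(stylesheet)));
        // Leave out the <?xml ...?> declaration so the output is just the element tree.
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(inputXml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        String lite = "<lite><rule name=\"header\"/></lite>";
        System.out.println(transform(STYLESHEET, lite));
    }
}
```

In practice the stylesheet and the lightweight configuration would come from files rather than string constants; StreamSource also accepts a java.io.File or an input stream.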

The XMLExporter Interface

It is best to read the javadoc for this interface. Writing a custom exporter is not hard, since the interface has just one method, writeNode(TreeNodeInterface), though it is unlikely you will need to do this. You should realize that an exporter returns a java.lang.Object. This permits you to customize the output of grunk into just about anything you can figure out how to program. The two examples, XMLExporter (which is the default that grunk makes unless you tell it something else) and DOMExporter, should be good guides and cover most cases of interest. They yield a java.lang.String and an org.w3c.dom.Document respectively, so you must cast the results before using them.
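
Because the result arrives as a plain java.lang.Object, a wrong cast only shows up at run time as a ClassCastException. One way to get a clearer failure is a small checked-cast helper such as the sketch below. The class ExporterResults and the method expect are hypothetical names of our own, not part of grunk, and the two Object values in main are stand-ins for what the exporters would return.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;

// Hypothetical illustration class; not part of grunk.
class ExporterResults {

    // Cast an exporter result to the expected type, failing with a readable message otherwise.
    static <T> T expect(Object result, Class<T> type) {
        if (!type.isInstance(result)) {
            String actual = (result == null) ? "null" : result.getClass().getName();
            throw new IllegalStateException(
                "Expected " + type.getName() + " but got " + actual);
        }
        return type.cast(result);
    }

    public static void main(String[] args) throws ParserConfigurationException {
        // Stand-ins for the Object values an exporter would hand back.
        Object fromStringExporter = "<doc/>";
        Object fromDomExporter =
            DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();

        String xml = expect(fromStringExporter, String.class);
        Document dom = expect(fromDomExporter, Document.class);
        System.out.println(xml);
        System.out.println(dom.getNodeName());

        try {
            // Asking for the wrong type reports what was actually returned.
            expect(fromStringExporter, Document.class);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

A direct cast, as in the examples that follow, is perfectly adequate when you control which exporter was passed to grunk.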

Here is an example of how to use ncsa.emerge.grunk.io.DOMExporter to get a DOM document of your source. All of this should be in a try ... catch block to intercept any GrunkException that arises; we omit this to keep the example readable.

// ... whatever you need up to this point.
// Grunk needs its configuration, say it lives in the file myConfig.grk
InputSource myConfig = new InputSource(new java.io.FileReader("myConfig.grk"));

// Let's make the grunk instance.
Grunk grunk = new Grunk(myConfig, new DOMExporter());

// And now for the actual data source, assumed to be in the file mySource.dat
InputSource dataSource = new InputSource(new java.io.FileReader("mySource.dat"));

// Let's grunk it
Document myDomDoc = (Document) grunk.grunk(dataSource);

//... whatever else you need to do with it. You now have your data in
// a DOM document!

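Here is a sample barebones invocation. It is assumed that you want to open a configuration file, open a source file and grunk it. This is a complete method for doing so. It returns a string, but remember that the result of grunk must be cast. It is assumed that you have imported org.xml.sax.InputSource into your class as well as ncsa.emerge.grunk.*. We catch the possible errors and return null if one occurs.

String parseSource(String configFileName, String sourceFileName){
    try{
        // The no-argument constructor uses the default XMLExporter, so the result is a String.
        Grunk grunk = new Grunk();
        grunk.setConfiguration(new InputSource(new java.io.FileReader(configFileName)));
        return (String)grunk.grunk(new InputSource(new java.io.FileReader(sourceFileName)));
    }catch(GrunkException ge){
        System.out.println("An informative diagnostic message:\n" + ge.getMessage());
    }catch(java.io.FileNotFoundException fnfe){
        // FileReader throws this checked exception if either file cannot be opened.
        System.out.println("Could not open a file:\n" + fnfe.getMessage());
    }
    // Reached only when one of the exceptions above was caught.
    return null;
}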

ErrorMinder

It is quite likely that grunk will be running in an environment where an error should not be communicated directly to the user, for example if grunk were invoked from a browser to interpret some text downloaded from a database query, with the aim of converting it into some simple HTML. Here good programming would dictate that there be an exception to catch and that some reasonable message be sent to the user, rather than a bewildering stack trace. Grunk aims at upholding decorum at all times, and this is the reason for having a class to mind errors. This class takes a java.io.OutputStream as the single argument for its constructor. All console messages will be sent there.
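
As a rough sketch of how such an output stream might be used, the example below captures console messages in a java.io.ByteArrayOutputStream so that the application can log them privately and show the user something gentler. The nested class StandInMinder is purely hypothetical: it only copies the constructor shape described above (a single OutputStream argument) and is not grunk's ErrorMinder. How the real ErrorMinder is attached to a grunk instance is covered in the javadoc and is not shown here.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.io.PrintStream;

// Hypothetical illustration class; not part of grunk.
class DiagnosticSinkSketch {

    // Stand-in for a component that sends its console messages to a supplied stream.
    // This is NOT grunk's ErrorMinder; it only has the same constructor shape.
    static class StandInMinder {
        final PrintStream out;

        StandInMinder(OutputStream sink) {
            // Auto-flush so every message reaches the sink immediately.
            this.out = new PrintStream(sink, true);
        }

        void report(String message) {
            out.println(message);
        }
    }

    // True if anything at all was written to the sink.
    static boolean hasDiagnostics(ByteArrayOutputStream sink) {
        return sink.size() > 0;
    }

    public static void main(String[] args) {
        // An in-memory sink keeps diagnostics away from the user's console or browser.
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        StandInMinder minder = new StandInMinder(sink);

        minder.report("unrecognized structure in source");

        if (hasDiagnostics(sink)) {
            // The full text stays available for a private log.
            String forTheLog = sink.toString();
            System.err.print(forTheLog);
            // The user only sees a short, polite message.
            System.out.println("Sorry, the data could not be processed.");
        }
    }
}
```

A java.io.FileOutputStream would serve equally well as the sink if the messages should go straight to a log file.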