Saturday, September 17, 2011

Learning Apache Commons Digester 3


Apache Commons Digester 3 is a Java library that translates XML data into Java objects. It makes configuring Java applications with XML files much easier than it would otherwise be. In this tutorial, we are going to create a Family object (Listing 2), an Address object (Listing 3), and three Member objects (Listing 4) corresponding to the XML data in Listing 1.

To master Apache Commons Digester 3, one must really understand three key concepts: rules, matching patterns, and the object stacks.

Rules and Matching Patterns

A rule is an instance of a subclass of the Rule class and represents a set of actions (more on this later). For a rule to have any effect, it must be registered on a matching pattern for XML elements (see the examples in Listing 7 - Registering Rules) and must be associated with a digester (see the DigesterLoader.newLoader() call in Listing 5 - Parsing XML). When the digester walks through the XML element tree during the parsing phase, it invokes the actions of a rule whenever it encounters an element matched by the pattern. More specifically, the digester calls the begin(), body(), and end() methods on the rule object when it encounters the start tag, the content, and the end tag of a matched element, respectively. The actions of a rule are implemented in the bodies of these methods.

For brevity, we are going to refer to the begin(), body(), and end() methods as the begin(), body(), and end() actions. We are also going to refer to an action invoked when the digester encounters a certain XML element as an action for that element, and to the rule that owns the action as a rule for that element. Of course, a rule may be a rule for Element A and Element B at the same time, as long as it is registered on patterns that match both.

The matching pattern syntax is very simple. For the elements in Listing 1,
  • the <family> element can be matched by pattern "family"
  • the <address> element by either "family/address" or "*/address"
  • a <firstname> element by "family/member/firstname" or "*/firstname".
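
The matching behaviour in the list above can be approximated with plain Java. This is a minimal sketch and not Digester's own matcher: the class name PatternSketch and the matches() helper are hypothetical, and the sketch ignores how Digester chooses between several patterns that match the same element.

```java
public class PatternSketch {

    // Simplified matcher: a pattern matches an element path when the two are
    // equal, or when the pattern starts with the wildcard prefix and the path
    // ends with the remainder of the pattern.
    public static boolean matches(String pattern, String path) {
        if (pattern.equals(path)) {
            return true;
        }
        if (pattern.startsWith("*/")) {
            String tail = pattern.substring(2);
            return path.equals(tail) || path.endsWith("/" + tail);
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(matches("family", "family"));
        System.out.println(matches("*/address", "family/address"));
        System.out.println(matches("*/firstname", "family/member/firstname"));
        System.out.println(matches("family/address", "family/member"));
    }
}
```

Here the element path is the chain of element names from the root down to the current element, joined by slashes.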

An action can be anything. The most frequent actions are creating Java objects, setting JavaBean properties from XML element contents or attribute values, and linking one Java object with another. Many built-in subclasses of the Rule class exist for exactly those purposes, for example ObjectCreateRule, SetPropertiesRule, BeanPropertySetterRule, and SetNextRule.

Order of Actions

For rules registered on patterns that match different elements, the order of rule registration does not matter. A begin() action for an XML element is always invoked before any actions for its nested elements; similarly, an end() action for an XML element is always invoked after any actions for its nested elements. If there are two rules for the same element, say Rule A is registered on a pattern that matches the element before Rule B is registered on the same pattern or on another pattern that also matches the element, the order of action execution will be:
  1. Rule A's begin() action
  2. Rule B's begin() action
  3. Rule B's end() action
  4. Rule A's end() action
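
The ordering above can be simulated without Digester. The sketch below is a stand-in, not the library's implementation: LoggingRule and fireElement() are hypothetical names, Rule C plays the role of a rule for a nested element, and the body() action is left out for brevity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class FiringOrderSketch {

    // Hypothetical stand-in for a Digester Rule: it only records that it was called.
    public static class LoggingRule {
        private final String name;
        private final List<String> log;

        public LoggingRule(String name, List<String> log) {
            this.name = name;
            this.log = log;
        }

        void begin() { log.add(name + ".begin"); }
        void end() { log.add(name + ".end"); }
    }

    // Fires begin() in registration order, handles the nested content,
    // then fires end() in reverse registration order.
    static void fireElement(List<LoggingRule> rules, Runnable nestedContent) {
        for (LoggingRule rule : rules) {
            rule.begin();
        }
        nestedContent.run();
        for (int i = rules.size() - 1; i >= 0; i--) {
            rules.get(i).end();
        }
    }

    public static List<String> simulate() {
        final List<String> log = new ArrayList<String>();
        // Rules A and B match the outer element; Rule C matches a nested element.
        final List<LoggingRule> outerRules =
                Arrays.asList(new LoggingRule("A", log), new LoggingRule("B", log));
        final List<LoggingRule> nestedRules =
                Arrays.asList(new LoggingRule("C", log));
        fireElement(outerRules, () -> fireElement(nestedRules, () -> { }));
        return log;
    }

    public static void main(String[] args) {
        // prints [A.begin, B.begin, C.begin, C.end, B.end, A.end]
        System.out.println(simulate());
    }
}
```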

The Object Stacks

A digester maintains several object stacks. One is called the default stack, another is the parameter stack. In addition, it may hold any number of named stacks. Java objects created during the parsing process are pushed onto and popped off the stacks by the rules. Many built-in rules, such as SetPropertiesRule, BeanPropertySetterRule, and CallMethodRule, simply work on the object on top of the default stack. An ObjectCreateRule creates a new Java object and pushes it onto the default stack in its begin() method, and pops it off in its end() method.

Any Rule object can call its getDigester() method to retrieve a reference to the digester it is associated with. Via the digester, a rule can push objects onto, pop objects off, or peek at objects in the default stack, the parameter stack, or any named stack by calling the digester's methods:
  • push() - push onto the default stack
  • pop() - pop from the default stack
  • peek() - peek into the default stack
  • pushParams() - push onto the parameter stack
  • popParams() - pop from the parameter stack
  • peekParams() - peek into the parameter stack
  • push(stackName) - push onto the named stack
  • pop(stackName) - pop from the named stack
  • peek(stackName) - peek into the named stack
When you create your own rule class that pushes to or pops from a stack, you had better use a named stack specific to that rule class to avoid interfering with the built-in rule classes, which use the default stack, or with other rule classes of your own. Of course, a rule class should only pop (in its end() method) an object that it pushed itself (in its begin() method). Be aware that the stacks of a digester are shared by all rules associated with it. If two rules for the same element push to and pop from the same stack, it will be difficult for other rules on that element, or on elements nested in it, to know which object is on top of the stack at the time their actions run. The stacks, particularly the default stack, give developers a lot of convenience. They also, however, introduce tight coupling among rule actions. Treat them as sharp knives that can easily cut fingers by accident.

Calling the parse() method on a digester returns the object at the bottom of the default stack. For example, the call to the parse() method in Listing 5 returns a Family object, because this object is created when the digester encounters the <family> element and is the first object pushed onto the default stack.
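
To make the stack behaviour concrete, here is a minimal simulation in plain Java. It is a sketch, not Digester code: the class DefaultStackSketch and its nested Family and Address stand-ins are hypothetical, and the parse() method below simply replays by hand the pushes and pops that the rules would perform for a <family> element containing one <address> element.

```java
import java.util.ArrayList;
import java.util.List;

public class DefaultStackSketch {

    // Minimal stand-ins for the tutorial's Address and Family classes.
    public static class Address { }

    public static class Family {
        private Address address;
        public void setAddress(Address address) { this.address = address; }
        public Address getAddress() { return address; }
    }

    private final List<Object> stack = new ArrayList<Object>();
    private Object root;

    // The first object ever pushed is remembered as the root.
    void push(Object object) {
        if (root == null) {
            root = object;
        }
        stack.add(object);
    }

    Object pop() {
        return stack.remove(stack.size() - 1);
    }

    // peek(0) is the top of the stack, peek(1) the object next to the top.
    Object peek(int offset) {
        return stack.get(stack.size() - 1 - offset);
    }

    public static Family parse() {
        DefaultStackSketch digester = new DefaultStackSketch();
        digester.push(new Family());    // <family>: object creation pushes a Family
        digester.push(new Address());   // <address>: object creation pushes an Address
        // </address>: the set-next step links the top object to the one below it
        ((Family) digester.peek(1)).setAddress((Address) digester.peek(0));
        digester.pop();                 // </address>: the Address is popped
        digester.pop();                 // </family>: the Family is popped
        return (Family) digester.root;  // the first object pushed is returned
    }

    public static void main(String[] args) {
        Family family = parse();
        System.out.println(family.getAddress() != null);
    }
}
```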

A Simple Example

Listing 1 - family.xml

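<family name='Addison'>
    <address city='New York' state='New York' country='USA'>
        <street>Apt. 3522, 10 West Street</street>
    </address>
    <!-- three <member> elements follow, each nesting <firstname>, <gender>, and <age> -->
</family>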

The three classes in Listings 2, 3, and 4 are simple Java classes with little more than setter and getter methods. The only thing that deserves a little attention is that the Family class has an addMember() method, which adds one member to the family per call (at the end of Listing 2).

Listing 2 - The Family class

package commons.digester3.example;

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class Family {
    private String name;
    private List<Member> members = new ArrayList<Member>();
    private Address address;

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public Address getAddress() {
        return address;
    }

    public void setAddress(Address address) {
        this.address = address;
    }

    // Callers get a read-only view; members are added through addMember().
    public List<Member> getMembers() {
        return Collections.unmodifiableList(members);
    }

    public void addMember(Member member) {
        members.add(member);
    }
}

Listing 3 - The Address class

package commons.digester3.example;

public class Address {
    private String street;
    private String city;
    private String state;
    private String country;

    public String getCity() {
        return city;
    }

    public void setCity(String city) {
        this.city = city;
    }

    public String getState() {
        return state;
    }

    public void setState(String state) {
        this.state = state;
    }

    public String getCountry() {
        return country;
    }

    public void setCountry(String country) {
        this.country = country;
    }

    public String getStreet() {
        return street;
    }

    public void setStreet(String street) {
        this.street = street;
    }
}

Listing 4 - The Member class

package commons.digester3.example;

public class Member {
    private String firstname;
    private char gender;
    private int age;

    public String getFirstname() {
        return firstname;
    }

    public void setFirstname(String firstname) {
        this.firstname = firstname;
    }

    public char getGender() {
        return gender;
    }

    public void setGender(char gender) {
        this.gender = gender;
    }

    public int getAge() {
        return age;
    }

    public void setAge(int age) {
        this.age = age;
    }
}

Listing 5 shows the code that parses the XML and creates the Family, Address, and Member objects. The only Digester-specific code is the three statements that create a DigesterLoader, obtain a Digester from it, and call parse(). The FamilyModule class is a rule module class, which we are going to discuss later in this tutorial. From the code in Listing 5, we can see that a Family object is returned from the call to the parse() method on a Digester object. In fact, the Address and Member objects are also created and associated with the Family object. We can verify this with the unit tests in Listing 6.

Listing 5 - Parsing XML

package commons.digester3.example;

import java.io.IOException;
import java.io.InputStream;

import org.apache.commons.digester3.Digester;
import org.apache.commons.digester3.binder.DigesterLoader;
import org.xml.sax.SAXException;

public class FamilyCreator {
    /**
     * Creates a Family object (and Address, Member objects contained by it) based
     * on XML data.
     * @param source - name of the XML file
     * @return the Family object built from the XML data
     * @throws SAXException
     * @throws IOException
     */
    public static Family createFamily(String source) throws SAXException, IOException {
        Family result = null;
        // The XML file is looked up on the classpath.
        InputStream inputStream = FamilyModule.class.getClassLoader().getResourceAsStream(source);
        DigesterLoader digesterLoader = DigesterLoader.newLoader(new FamilyModule());
        Digester digester = digesterLoader.newDigester();
        result = digester.parse(inputStream);
        return result;
    }
}

Listing 6 - JUnit Tests

package commons.digester3.example;

import java.io.IOException;
import java.util.List;

import org.junit.Assert;
import org.junit.BeforeClass;
import org.junit.Test;
import org.xml.sax.SAXException;

public class SimpleDigesterTest {
    private static Family family = null;

    // Parse the XML once and share the resulting Family among all tests.
    @BeforeClass
    public static void setup() throws IOException, SAXException {
        family = FamilyCreator.createFamily("family.xml");
    }

    @Test
    public void testFamily() {
        Assert.assertNotNull("Family was not created.", family);
        Assert.assertEquals("Incorrect family last name", "Addison", family.getName());
    }

    @Test
    public void testAddress() {
        Address address = family.getAddress();
        Assert.assertNotNull("Address was not created.", address);
        Assert.assertEquals("Incorrect street line", "Apt. 3522, 10 West Street", address.getStreet());
        Assert.assertEquals("Incorrect city", "New York", address.getCity());
        Assert.assertEquals("Incorrect state", "New York", address.getState());
        Assert.assertEquals("Incorrect country", "USA", address.getCountry());
    }

    @Test
    public void testMember() {
        List<Member> members = family.getMembers();
        Assert.assertNotNull("Family members were not created.", members);
        Assert.assertEquals("Incorrect member count.", 3, members.size());
        Member member = members.get(1);
        Assert.assertEquals("Incorrect first name", "Linda", member.getFirstname());
        Assert.assertEquals("Incorrect gender", 'F', member.getGender());
        Assert.assertEquals("Incorrect age", 24, member.getAge());
    }
}

The FamilyModule class in Listing 7 is a rule module class. A rule module class is basically a set of pairs, each consisting of a rule and a matching pattern. A digester takes a rule module (see the DigesterLoader.newLoader() call in Listing 5) to figure out which rule to fire for which element. The in-line comments explain the rules.

Listing 7 - Registering Rules

package commons.digester3.example;

import org.apache.commons.digester3.binder.AbstractRulesModule;

public class FamilyModule extends AbstractRulesModule {
    @Override
    protected void configure() {

        // Register an ObjectCreateRule on matching pattern "family". Later on, in the parsing phase,
        // when it encounters a <family> element, the digester will fire this rule to create a Family object.
        // Also register a SetPropertiesRule on the same pattern. Later on, in the parsing phase,
        // the digester will fire this rule to set properties of the Family object
        // with the attribute values of the <family> element.
        // For setProperties() to work this way, a property name must be the same as the attribute name.
        forPattern("family").createObject().ofType(Family.class)
            .then().setProperties();

        // ... Also register a SetNextRule on matching pattern "family/address" to establish the relationship
        // between the Family and the Address object by calling the setAddress() method on the Family
        // object (expected to be the object next to the top of the default stack) and passing the Address object
        // (expected to be the object on top of the default stack) as the argument.
        forPattern("family/address").createObject().ofType(Address.class)
            .then().setProperties()
            .then().setNext("setAddress");

        // Register a BeanPropertySetterRule on matching pattern "family/address/street" to
        // set the street property of the Address object with the content of the <street>
        // element.
        forPattern("family/address/street").setBeanProperty();

        // ... to establish the relationship between the Family and the Member object by calling
        // the addMember() method on the Family object and passing the Member object as the argument.
        forPattern("family/member").createObject().ofType(Member.class)
            .then().setNext("addMember");
        forPattern("family/member/firstname").setBeanProperty();
        forPattern("family/member/gender").setBeanProperty();
        forPattern("family/member/age").setBeanProperty();
    }
}

Beyond The Simplest

Mismatch Between Attribute and Property Name

In our example above, all element attribute names match the corresponding JavaBean property names. What do we have to do if there is a mismatch, for example, if the <family> element has an attribute named "name" but the Family object has setLastname() and getLastname() methods? All we have to do is make an addAlias(attributeName, propertyName) call after calling setProperties(). For example, instead of the plain setProperties() call on the "family" pattern in Listing 7, we are going to have the following code:

        .then().setProperties().addAlias("name", "lastname");

Mismatch Between Nested Element and Property Name

In our example above, all nested element names match the corresponding JavaBean property names. For example, a <member> element has nested elements <firstname>, <age>, and <gender>, and the corresponding Member object has setFirstname(), setGender(), and setAge(). What do we have to do if there is a mismatch, for example, if the <member> element has a nested <name> element instead of <firstname>? All we have to do is change the matching pattern and make a withName(propertyName) call after calling setBeanProperty(). For example, instead of the rule registered for the <firstname> element in Listing 7, we are going to have the following code:
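
A minimal sketch of that change, assuming the binder DSL's setBeanProperty().withName(...) call described above. The module class name MemberNameModule is hypothetical, and only the renamed binding is shown.

```java
package commons.digester3.example;

import org.apache.commons.digester3.binder.AbstractRulesModule;

// Hypothetical module name; in the tutorial this binding lives in FamilyModule.
public class MemberNameModule extends AbstractRulesModule {
    @Override
    protected void configure() {
        // Match the nested <name> element and write its content
        // to the "firstname" property of the Member on top of the stack.
        forPattern("family/member/name").setBeanProperty().withName("firstname");
    }
}
```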


Using CallMethodRule

Some type mismatches are OK. Digester has default converters to handle them. For example, even though all element attribute values and contents are strings, properties of type char or int do not need explicit conversion. In our example, the age of a member is of type int, and Digester implicitly converts the string into an int.

Other type mismatches are a problem. Suppose that our Member class looks like Listing 8. The getGender() method, instead of returning a character, returns a Gender enum, which can be either F or M, as in Listing 9. Even though the setGender(char) method signature is the same as before, gender is no longer a JavaBean property of Member, because the return type of getGender() is not the same as the parameter type of setGender(). For this reason, a BeanPropertySetterRule will not work in this case. To still call the setGender() method on a Member object with the content of the nested <gender> element, we need to use the CallMethodRule. Below is the new code to replace the rule registered for the <gender> element in Listing 7.
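
A minimal sketch of such a binding, assuming the binder DSL's callMethod(), withParamTypes(), and usingElementBodyAsArgument() calls. The module class name GenderModule is hypothetical, and the choice of char.class as the parameter type is an assumption based on the setGender(char) signature in Listing 8.

```java
package commons.digester3.example;

import org.apache.commons.digester3.binder.AbstractRulesModule;

// Hypothetical module name; in the tutorial this binding lives in FamilyModule.
public class GenderModule extends AbstractRulesModule {
    @Override
    protected void configure() {
        // Call setGender(char) on the Member on top of the default stack,
        // passing the body text of <gender> as the single argument.
        forPattern("family/member/gender")
            .callMethod("setGender")
            .withParamTypes(char.class)
            .usingElementBodyAsArgument();
    }
}
```

The parameter type is declared before usingElementBodyAsArgument() so that the element body is converted to a char rather than passed as a String.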


Listing 8 - A new Version of the Member class

package commons.digester3.example;

public class Member {
    private String firstname;
    private Gender gender;
    private int age;

    public String getFirstname() {
        return firstname;
    }

    public void setFirstname(String firstname) {
        this.firstname = firstname;
    }

    public Gender getGender() {
        return this.gender;
    }

    // Accepts the one-character code from the XML and maps it to the enum.
    public void setGender(char gender) {
        if (gender == 'F') {
            this.gender = Gender.F;
        } else if (gender == 'M') {
            this.gender = Gender.M;
        } else {
            throw new RuntimeException("Invalid gender code " + gender + ". It can only be 'F' or 'M'");
        }
    }

    public int getAge() {
        return age;
    }

    public void setAge(int age) {
        this.age = age;
    }
}

Listing 9 - The Gender enum

package commons.digester3.example;

public enum Gender {
    F, M
}


Anonymous said...

I tried running the application.. but couldn't.. can you clarify on folder structure or how to, suppose, print any family member..
