Sunday, November 11, 2012

Overriding hashCode() in Java May Cause Adverse Effects

One important method of the java.lang.Object class is hashCode(). It is common practice to override hashCode() in classes that extend Object (all Java classes extend Object, directly or indirectly). Joshua Bloch, in his popular book Effective Java, advises Java programmers to "always override hashCode when you override equals" (Item 9, Effective Java, Second Edition). Bloch, however, did not mention the possible adverse effects of overriding the hashCode() method. This article demonstrates such an effect with an example.

Our example is a short program that manages the relationships between students and their tutors. Students are represented by objects of the Student class, shown in Listing 1. Tutors are represented by objects of the Tutor class, shown in Listing 2. The main program is in Listing 3.

Listing 1


package hashcode.issue;

public class Student {
    public String firstName;
    public String lastName;
    public String phoneNumber;
    
    public Student(String firstName, String lastName, String phoneNumber) {
        this.firstName = firstName;
        this.lastName = lastName;
        this.phoneNumber = phoneNumber;
    }
}

Listing 2

package hashcode.issue;

public class Tutor {
    public String firstName;
    public String lastName;
    public String phoneNumber;
    
    public Tutor(String firstName, String lastName, String phoneNumber) {
        this.firstName = firstName;
        this.lastName = lastName;
        this.phoneNumber = phoneNumber;
    }
}

Listing 3

package hashcode.issue;

import java.util.HashMap;
import java.util.Map;

public class HashCodeIssue {
    private static Map<Student, Tutor> relationMap = new HashMap<Student, Tutor>();

    public static void main(String[] args) {
        Student john = new Student("John", "Jones", "8945");    // Create a Student object named john
        Tutor tom = new Tutor("Tom", "Petzold", "4627");    // Create a Tutor object named tom
        
        relate(john, tom);    // record the student-tutor relationship between john and tom
        
        // do many other things ...
        
        Tutor johnsTutor = findTutor(john);    // find the tutor of john
        if (johnsTutor != null) {    // print out what we found
            System.out.printf("John's tutor is %s %s\n", johnsTutor.firstName, johnsTutor.lastName);
        } else {
            System.out.println("John does not have a tutor.");
        }
        
        
        // John changed phone number
        john.phoneNumber = "513-326-5489";
        
        johnsTutor = findTutor(john);    // find the tutor of john again
        if (johnsTutor != null) { // print out what we found
            System.out.printf("John's tutor is %s %s\n", johnsTutor.firstName, johnsTutor.lastName);
        } else {
            System.out.println("John does not have a tutor.");
        }

    }
    
    private static void relate(Student student, Tutor tutor) {
        relationMap.put(student, tutor);
    }
    
    private static Tutor findTutor(Student student) {
        return relationMap.get(student);
    }
}

The program maintains the student-tutor relationships in a HashMap, using Student objects as map keys and Tutor objects as map values. A Student object is mutable: one can change a student's first name, last name, and phone number. After we establish a relationship between the Student object john and the Tutor object tom by calling the relate() method, we call the findTutor() method to find john's tutor, and we find it. Then we change john's phone number and call findTutor() again. We find the tutor again.

Program execution output:

John's tutor is Tom Petzold 
John's tutor is Tom Petzold 

So far, so good.

Now we override the hashCode() method in the Student class, as in Listing 4 (lines 16-19). Then we run the program again. This time, the second call to the findTutor() method fails to find the tutor.

Program execution output:

John's tutor is Tom Petzold
John does not have a tutor.


Listing 4

package hashcode.issue;

import org.apache.commons.lang3.builder.HashCodeBuilder;

public class Student {
    public String firstName;
    public String lastName;
    public String phoneNumber;
    
    public Student(String firstName, String lastName, String phoneNumber) {
        this.firstName = firstName;
        this.lastName = lastName;
        this.phoneNumber = phoneNumber;
    }
    
    @Override
    public int hashCode() {
        return new HashCodeBuilder().append(firstName).append(lastName).append(phoneNumber).build();
    }
}

The failure occurs because, when we use a Student object as a key in a HashMap, the map calls hashCode() on the Student object and uses the returned number to determine the hash bucket where the corresponding value resides. In our case, we overrode hashCode() to return a number that depends on the student's first name, last name, and phone number. When the student's phone number changes, so does its hash code. The entry for john was stored under the old hash code, so the second lookup, which uses the new one, searches in the wrong place and finds nothing.
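The effect described above can be reproduced in isolation. The following is a minimal sketch, not the article's exact program: the names StaleHashDemo, MutableStudent, and mapWithMutatedKey are hypothetical, tutors are reduced to plain strings, and java.util.Objects.hash is swapped in for HashCodeBuilder so the example needs no external library.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

class StaleHashDemo {
    // Hypothetical stand-in for the Student class of Listing 4.
    // Assumption: Objects.hash replaces HashCodeBuilder to avoid the dependency.
    static class MutableStudent {
        String firstName;
        String lastName;
        String phoneNumber;

        MutableStudent(String firstName, String lastName, String phoneNumber) {
            this.firstName = firstName;
            this.lastName = lastName;
            this.phoneNumber = phoneNumber;
        }

        @Override
        public int hashCode() {
            // The hash code depends on all three mutable fields.
            return Objects.hash(firstName, lastName, phoneNumber);
        }
    }

    // Stores one entry keyed by the student, then changes the key's phone number.
    static Map<MutableStudent, String> mapWithMutatedKey(MutableStudent student) {
        Map<MutableStudent, String> map = new HashMap<>();
        map.put(student, "Tom Petzold");
        student.phoneNumber = "513-326-5489";
        return map;
    }

    public static void main(String[] args) {
        MutableStudent john = new MutableStudent("John", "Jones", "8945");
        Map<MutableStudent, String> map = mapWithMutatedKey(john);

        System.out.println("size = " + map.size());
        System.out.println("get(john) = " + map.get(john));
        System.out.println("containsValue = " + map.containsValue("Tom Petzold"));
    }
}
```

The entry is still in the map (the size is 1 and containsValue() finds the tutor by scanning every entry), yet get(john) returns null because the lookup uses the hash code computed from the new phone number.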

The hashCode() method defined in the java.lang.Object class is different: during one run of a program it always returns the same number for the same object, no matter how the object's fields change. In other words, the default hash code is tied to the object's identity rather than to its state (although it is not guaranteed to be unique, since two distinct objects may share a hash code). When we overrode the hashCode() method to return a number that depends on the object's state (in our example, the Student's first name, last name, and phone number), the hash code stopped being stable across changes to that state.
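A short sketch can illustrate the stability of the default hash code. The class names IdentityHashDemo and PlainStudent are hypothetical; PlainStudent deliberately overrides neither equals() nor hashCode(), so it inherits the behavior of java.lang.Object.

```java
class IdentityHashDemo {
    // Hypothetical class that inherits hashCode() from java.lang.Object.
    static class PlainStudent {
        String phoneNumber;

        PlainStudent(String phoneNumber) {
            this.phoneNumber = phoneNumber;
        }
    }

    // Returns true if the inherited hash code is unchanged after a field changes.
    static boolean hashStableAcrossMutation() {
        PlainStudent john = new PlainStudent("8945");
        int before = john.hashCode();
        john.phoneNumber = "513-326-5489";
        int after = john.hashCode();
        return before == after;
    }

    // Returns true if the inherited hashCode() agrees with System.identityHashCode().
    static boolean matchesIdentityHashCode() {
        PlainStudent john = new PlainStudent("8945");
        return john.hashCode() == System.identityHashCode(john);
    }

    public static void main(String[] args) {
        System.out.println("stable across mutation: " + hashStableAcrossMutation());
        System.out.println("matches identityHashCode: " + matchesIdentityHashCode());
    }
}
```

Both checks print true. The actual hash value is chosen by the JVM and may differ from run to run, which is why the sketch compares values instead of printing them.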

Conclusion
If an object is mutable and its hashCode() method returns a number that depends on its state, do not use the object as a hash map key. If you want to use an object as a hash map key, first make sure that either the object is immutable or its hashCode() method always returns the same number for that object.
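One way to follow this advice while keeping the phone number mutable is to base equals() and hashCode() only on a field that never changes. The sketch below assumes a hypothetical immutable studentId field, which is not part of the original Student class; the names StableKeyDemo and KeyedStudent are also made up for illustration, and tutors are reduced to plain strings.

```java
import java.util.HashMap;
import java.util.Map;

class StableKeyDemo {
    // Hypothetical variant of Student with an immutable identifier.
    static final class KeyedStudent {
        final String studentId;   // assumption: never changes after construction
        String phoneNumber;       // still mutable, but not part of the hash code

        KeyedStudent(String studentId, String phoneNumber) {
            this.studentId = studentId;
            this.phoneNumber = phoneNumber;
        }

        @Override
        public boolean equals(Object other) {
            return other instanceof KeyedStudent
                    && studentId.equals(((KeyedStudent) other).studentId);
        }

        @Override
        public int hashCode() {
            // Depends only on the immutable field, so it is stable for the object's lifetime.
            return studentId.hashCode();
        }
    }

    // Looks up the tutor after the key's phone number has changed.
    static String tutorAfterPhoneChange() {
        Map<KeyedStudent, String> tutors = new HashMap<>();
        KeyedStudent john = new KeyedStudent("S-001", "8945");
        tutors.put(john, "Tom Petzold");
        john.phoneNumber = "513-326-5489";
        return tutors.get(john);
    }

    public static void main(String[] args) {
        System.out.println("John's tutor is " + tutorAfterPhoneChange());
    }
}
```

Because the hash code no longer involves phoneNumber, the lookup after the phone number change still finds Tom Petzold. Making every field final would be the stricter alternative; it removes the problem entirely at the cost of creating a new object for each change.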
