Tuesday, November 29, 2011

The Basics of Java Generics


For the sake of generics, Java types (classes and interfaces) can be grouped into three categories:
  • Ordinary type, e.g. String, Integer
  • Generic type, e.g. java.lang.Comparable<T>, java.util.List<E>, and java.util.ArrayList<E>
  • Parameterized type, e.g. java.lang.Comparable<Integer>, java.util.List<String>, and java.util.ArrayList<String>
In Java, there are four kinds of generic constructs:
  • generic interface
  • generic class
  • generic method
  • generic constructor
Constructors are very much like methods, except that a constructor has no return type. For this reason, we omit any separate discussion of generic constructors: everything said about generic methods, apart from what concerns method returns, also applies to generic constructors.

Coding with generics usually involves one or more of the following:
  • Defining a generic interface, class, or method
  • Invoking a generic interface, or class
  • Invoking a generic method
  • Defining a non-generic method whose parameter types or return type use a type parameter (for example E or List<E>). Such a method must be a member of the generic type that declares the type parameter
  • Defining a method with at least a parameter of a parameterized type, or with the return of a parameterized type
  • Invoking a method with at least a parameter of a parameterized type
  • Invoking a method with the return of a parameterized type

Defining Generic Types

Example 1 – Defining a generic interface

public interface Iterable<T>

The T in Iterable<T> is called a type parameter. In the language of Java generics, we say that the generic type Iterable takes a type parameter, T. Conventionally, a single upper-case letter is used as the identifier of a type parameter, and T (for type) is the usual choice.

Example 2 – Defining a generic interface that extends another generic interface

public interface List<E> extends Collection<E>

Here the E in List<E> is the type parameter. Conventionally, E is used as the identifier for the type parameter of collections (E stands for element).

Example 3 – Defining a generic class that implements a generic interface

public class ArrayList<E> implements List<E>
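
As a rough illustration of the declarations in Examples 1 to 3, the sketch below defines a small generic interface and a generic class that implements it. The names Container, Box, and BoxDemo are made up for this example and are not part of the JDK.

```java
// Hypothetical generic interface: T is its type parameter.
interface Container<T> {
    void put(T item);

    T take();
}

// Hypothetical generic class implementing the generic interface.
// Box's own type parameter T is passed on as the type argument of Container.
class Box<T> implements Container<T> {
    private T item; // a type parameter used as the type of an instance field

    @Override
    public void put(T item) {
        this.item = item;
    }

    @Override
    public T take() {
        return item;
    }
}

class BoxDemo {
    public static void main(String[] args) {
        Box<String> box = new Box<String>(); // Box<String> is a parameterized type
        box.put("hello");
        String s = box.take(); // no cast needed
        System.out.println(s);
    }
}
```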

Defining Generic Methods

Example 4 – Defining a generic method

<T> T[] toArray(T[] a);

Above is the declaration of the toArray method in the body of java.util.List. The leading <T> says that this is a generic method that takes a type parameter T. In other words, the method has a hole that will be filled later with a concrete type. The declaration also says that both the return type and the parameter type are T[] (array of T). Essentially, the type parameter establishes a constraint, in terms of type, between the method's return and its parameters. If we want to turn a list into a String[], we must pass a String[] to the toArray method.

Please note that a method whose parameters or return type use a type parameter is not necessarily a generic method. It is generic only if its declaration places <T> (its own type parameter list) before the return type (or void). For example, the methods shown in Example 5 below are not generic methods.

On the other hand, even a non-generic type may have a generic method as its member.
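
To make this concrete, here is a minimal sketch of a non-generic class that owns a generic method. The class ListUtil and the method firstOrDefault are hypothetical names chosen for illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// A non-generic class (hypothetical name) that owns a generic method.
class ListUtil {
    // The <T> before the return type makes this a generic method.
    // T ties the element type of the list to the fallback and to the return type.
    static <T> T firstOrDefault(List<T> list, T fallback) {
        return list.isEmpty() ? fallback : list.get(0);
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Ada", "Grace");
        String first = firstOrDefault(names, "nobody"); // T is inferred as String
        System.out.println(first);

        List<Integer> none = new ArrayList<Integer>();
        Integer n = firstOrDefault(none, 0); // T is inferred as Integer
        System.out.println(n);
    }
}
```

Passing a List<String> together with an Integer fallback would not give the String result one might expect, because the single type parameter T links both arguments to the return type.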

Type Parameter

A type parameter is a placeholder for a concrete type. It is important to understand that a type parameter is always declared (taken) by a generic type, a generic method, or a generic constructor. Conversely, a generic type or method takes at least one type parameter.

Inside the body of a generic interface or class, a type parameter taken by that interface or class can serve as the type of a parameter, the return type, or the type of a local variable of an instance method. It can also serve as the type of an instance field.

A type parameter taken by a generic method can serve as the type of its parameters, its return type, or the type of its local variables. (Note: a generic method may be a static member of a class or interface.)

A type parameter taken by a class or interface cannot be used:
  • As the type of a static field (because a static field is shared by all parameterized types of the class, so there is one class but many possible T)
  • Anywhere in a static member method (same reason)
  • In a static initializer block (same reason)

In addition, no type parameter can be used:
  • in a new T() expression to create a new object (because of erasure)
  • in a new T[size] expression to create a new array (because of erasure)
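
The restrictions above can be seen in a small sketch. The class Slots is a hypothetical name, and passing a Class<T> so that the array can be created with java.lang.reflect.Array.newInstance is one common workaround assumed here, not the only one.

```java
import java.lang.reflect.Array;

// Hypothetical class showing what a class-level type parameter can and cannot be used for.
class Slots<T> {
    private final T[] items;          // OK: type of an instance field
    // private static T shared;       // does not compile: T used in a static field

    @SuppressWarnings("unchecked")
    Slots(Class<T> elementType, int size) {
        // items = new T[size];       // does not compile: generic array creation
        // Workaround: the caller passes a Class<T>, so the element type is known at run time.
        items = (T[]) Array.newInstance(elementType, size);
    }

    void set(int index, T value) {    // OK: type of a method parameter
        items[index] = value;
    }

    T get(int index) {                // OK: return type
        T value = items[index];       // OK: type of a local variable
        return value;
    }

    T[] toArray() {
        return items.clone();
    }

    public static void main(String[] args) {
        Slots<String> slots = new Slots<String>(String.class, 2);
        slots.set(0, "a");
        System.out.println(slots.get(0));
        System.out.println(slots.toArray().length);
    }
}
```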

Example 5 - Defining methods with parameters or return of type parameter

boolean add(E e);
E get(int index);

The above two methods are declared in the body of the generic type List<E>. The type parameter E serves as the type of the parameter named e of the method add, and as the return type of the method get. These two methods are not generic methods. The type parameter E is not taken by the methods but by their owner type (i.e. List<E>).

Generic Type vs. Parameterized Type

It is critical to understand the difference and the relationship between a generic type and a parameterized type. For example, ArrayList<E> is a generic type and ArrayList<String> is a parameterized type. They differ in the following aspects:
  • E in ArrayList<E> is a type parameter, and String in ArrayList<String> is a concrete type (in particular, an ordinary class). In ArrayList<String>, the concrete type String serves as the type argument that fills the place held by the type parameter E of ArrayList<E>.
  • A parameterized type, like an ordinary type, is a concrete type, while a generic type is abstract in the sense that its type parameters still have to be filled in (this is unrelated to the abstract keyword)
  • It is legal to create an object of ArrayList<String> with the expression new ArrayList<String>(). The expression new ArrayList<E>() is illegal wherever E is not a type parameter in scope, because E then names no type.
  • More generally, a parameterized type is used in almost the same way as an ordinary type. It can be used in most places where an ordinary type can be used, e.g. as the type of a variable or of a method return. The variable may be a method parameter, a local variable, or a field. (Because of erasure there are a few exceptions, such as class literals and array creation.)
A parameterized type always has a special relationship with a generic type: a parameterized type is always instantiated out of a generic type. For example, ArrayList<String> is instantiated out of ArrayList<E> by substituting the type argument String for the type parameter E. In order for ArrayList<String> to exist, ArrayList<E> must exist first.

A type parameter is like a hole. When the hole is filled with a concrete type, a parameterized type comes out of the generic type. Replacing a type parameter with a concrete type is called invocation of a generic type. Just as one invokes a method by passing arguments, one invokes a generic type by passing type arguments.

In a parameterized type, the type argument takes every place that was held by its corresponding type parameter. For example, in List<String>, there are effectively
boolean add(String e);
String get(int index);

(For the formal specification, see 4.5.2 Members and Constructors of Parameterized Types, The Java Language Specification, Third Edition, Addison-Wesley, 2004)
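
As an illustration of a parameterized type being used like an ordinary type, the following sketch uses List<String> and Map<String, Integer> as a field type, a parameter type, and a return type. The class WordCounter is a hypothetical name invented for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical class that uses parameterized types like ordinary types.
class WordCounter {
    private final List<String> words = new ArrayList<String>(); // type of a field

    void addAll(List<String> more) {                             // type of a parameter
        for (String w : more) {
            words.add(w); // effectively boolean add(String e)
        }
    }

    Map<String, Integer> counts() {                              // return type
        Map<String, Integer> result = new HashMap<String, Integer>();
        for (int i = 0; i < words.size(); i++) {
            String w = words.get(i); // effectively String get(int index), no cast
            Integer old = result.get(w);
            result.put(w, old == null ? 1 : old + 1);
        }
        return result;
    }

    public static void main(String[] args) {
        WordCounter counter = new WordCounter();
        counter.addAll(Arrays.asList("a", "b", "a"));
        System.out.println(counter.counts().get("a"));
    }
}
```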

Bounded Type Parameter

In a generic type definition, a type parameter may be given an upper bound.

Example 6 – Defining a generic type named SortedSet, a set that keeps its elements sorted (an illustrative declaration; the real java.util.SortedSet<E> declares no bound on E)

public interface SortedSet<E extends Comparable<E>> extends Set<E>

Here <E extends Comparable<E>> indicates that E is a bounded type parameter and Comparable<E> is its upper bound. Any parameterized type instantiated out of this generic type must have a type argument that is a subtype of Comparable of itself. For example, we may have the parameterized type SortedSet<Integer>, since Integer implements Comparable<Integer>. However, we cannot have the parameterized type SortedSet<Object>, because Object does not implement Comparable<Object>. In short, the bound of a type parameter restricts the type arguments that the generic type accepts. Without a bound, any type would be accepted as a legal type argument at compile time, and some of them might lead to a runtime exception.

A few more words about the example: the upper bound, java.lang.Comparable<E>, is itself built from a generic type, and the type argument passed to it is E, the type parameter of SortedSet.

If there are multiple such bounds, separate them with & in the generic type definition, e.g. <T extends Number & Comparable<T>>.
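
A minimal sketch of a type parameter with two bounds follows. The class NumericRange is a hypothetical name for this example; Number and Comparable are the standard java.lang types.

```java
// Hypothetical generic class whose type parameter has two bounds:
// T must be a subclass of Number and must be comparable to itself.
class NumericRange<T extends Number & Comparable<T>> {
    private final T low;
    private final T high;

    NumericRange(T low, T high) {
        this.low = low;
        this.high = high;
    }

    boolean contains(T value) {
        // compareTo is available because of the bound Comparable<T>.
        return low.compareTo(value) <= 0 && value.compareTo(high) <= 0;
    }

    double width() {
        // doubleValue is available because of the bound Number.
        return high.doubleValue() - low.doubleValue();
    }

    public static void main(String[] args) {
        NumericRange<Integer> range = new NumericRange<Integer>(1, 10);
        System.out.println(range.contains(5));
        System.out.println(range.width());
        // NumericRange<String> would not compile: String is not a subclass of Number.
    }
}
```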

Calling a Generic Method

Example 7 - Calling a generic method

         List<String> list = new ArrayList<String>();
         String[] stringArray = list.toArray(new String[]{});

Usually, it is not necessary to specify the type argument (i.e. the concrete type that takes the place of the type parameter) when calling a generic method, as the example above shows, because the compiler can infer the type argument from the type of the argument passed to the method (i.e. String[] in the example). That is, however, not always the case. In some cases the compiler cannot determine the concrete type by inference, and the type argument then has to be specified explicitly. The syntax for specifying the type argument to a generic method is shown in the example below:

String[] stringArray = list.<String>toArray(new String[]{});

The type argument (e.g. String in the example above) is placed between < and >, immediately before the name of the generic method being called.
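
One situation where the explicit form is needed is a chained call, because the receiver of a method call gives the compiler no target type to infer from. The sketch below uses the standard Collections.emptyList method; the class TypeWitnessDemo and its method names are hypothetical.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

class TypeWitnessDemo {
    static Iterator<String> emptyStringIterator() {
        // Without <String>, T would be inferred as Object here, and the resulting
        // Iterator<Object> could not be returned as an Iterator<String>.
        return Collections.<String>emptyList().iterator();
    }

    static List<String> emptyStrings() {
        // Here the return type gives the compiler enough to infer T = String,
        // so no explicit type argument is needed.
        return Collections.emptyList();
    }

    public static void main(String[] args) {
        System.out.println(emptyStringIterator().hasNext());
        System.out.println(emptyStrings().size());
    }
}
```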

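By the way, be aware that the toArray method does not provide complete type safety. The following code fragment compiles but throws an ArrayStoreException at run time once the list contains an element (with an empty list nothing is copied, so no exception occurs).

        List<String> list = new ArrayList<String>();
        list.add("hello");
        Integer[] intArray = list.toArray(new Integer[]{});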
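
The failure can be observed directly with a small sketch that catches the exception. The class ToArrayPitfall is a hypothetical name, and the behaviour described assumes the list is a java.util.ArrayList.

```java
import java.util.ArrayList;
import java.util.List;

class ToArrayPitfall {
    // Returns true if copying the list of strings into an Integer[] fails at run time.
    static boolean failsAtRuntime(List<String> list) {
        try {
            // Compiles: the method's type parameter T is unrelated to the list's E.
            Integer[] ints = list.toArray(new Integer[0]);
            return false;
        } catch (ArrayStoreException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> list = new ArrayList<String>();
        System.out.println(failsAtRuntime(list)); // empty list: nothing is stored
        list.add("hello");
        System.out.println(failsAtRuntime(list)); // a String cannot be stored in an Integer[]
    }
}
```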

Type Wildcard

The type wildcard in Java generics is a complex topic. It is discussed in my other post, Type Wildcard in Java Generics.
