Tuesday, November 29, 2011

The Basics of Java Generics


For the sake of generics, Java types (classes and interfaces) can be grouped into three categories:
  • Ordinary type, e.g. String, Integer
  • Generic type, e.g. java.lang.Comparable<T>, java.util.List<E> , and java.util.ArrayList<E>
  • Parameterized type, e.g. java.lang.Comparable<Integer>, java.util.List<String>, and java.util.ArrayList<String>
In Java, there are four kinds of generic constructs:
  • generic interface
  • generic class
  • generic method
  • generic constructor
Constructors are very much like methods, except that they have no return type. For this reason, generic constructors are not discussed separately here: everything said about generic methods, apart from what concerns the return type, also applies to generic constructors.

Coding with generics usually involves one or more of the following:
  • Defining a generic interface, class, or method
  • Invoking a generic interface, or class
  • Invoking a generic method
  • Defining a non-generic method with at least one parameter typed with a type parameter, or with a return typed with a type parameter. Such a method must be a member of a generic type
  • Defining a method with at least a parameter of a parameterized type, or with the return of a parameterized type
  • Invoking a method with at least a parameter of a parameterized type
  • Invoking a method with the return of a parameterized type

Defining Generic Types

Example 1 – Defining a generic interface

public interface Iterable<T>

The T in Iterable<T> is called a type parameter. In the language of Java generics, we say that the generic type Iterable takes a type parameter, T. By convention, a single uppercase T is used as the identifier for a type parameter (T stands for type).

Example 2 – Defining a generic interface that extends another generic interface

public interface List<E> extends Collection<E>

Here the E in List<E> is the type parameter. By convention, E is used as the identifier for the type parameter of a collection (E stands for element).

Example 3 – Defining a generic class that implements a generic interface

public class ArrayList<E> implements List<E>
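
The same pattern applies to user-defined types. The following is only a minimal sketch: Container, Box, and GenericTypeDemo are hypothetical names chosen for illustration and are not part of the JDK.

```java
// Hypothetical generic interface: T is its type parameter.
interface Container<T> {
    void put(T item);

    T take();
}

// Hypothetical generic class that implements the generic interface.
// Its own type parameter T is passed on as the type argument of Container.
class Box<T> implements Container<T> {
    private T item; // a type parameter used as the type of an instance field

    @Override
    public void put(T item) {
        this.item = item;
    }

    @Override
    public T take() {
        return item;
    }
}

class GenericTypeDemo {
    public static void main(String[] args) {
        // Container<String> and Box<String> are parameterized types.
        Container<String> box = new Box<String>();
        box.put("hello");
        String s = box.take(); // no cast is needed
        System.out.println(s);
    }
}
```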

Defining Generic Method

Example 4 – Defining a generic method

<T> T[] toArray(T[] a);

Above is the declaration of the toArray method in the body of java.util.List. The leading <T> says that this is a generic method that takes a type parameter T. In other words, the method has a hole that will be filled later with a concrete type. The declaration also says that the return type is T[] and that the type of the method parameter is T[] (array of T). Essentially, the type parameter of this method establishes a constraint, in terms of type, between the method return and the method parameter. If we want to turn a list into a String[], we must pass a String[] to the toArray method.

Please note that a method whose parameters or return are typed with a type parameter is not necessarily a generic method. It is generic only if its declaration places <T> before its return type (or void). For example, the methods shown in Example 5 below are not generic methods.

On the other hand, even a non-generic type may have a generic method as its member.
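
As a minimal sketch of that last point, the class below is not generic, yet it declares a generic method. The names GenericMethodDemo and firstOrDefault are hypothetical and serve only as an illustration.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical utility class: the class itself takes no type parameter.
class GenericMethodDemo {

    // The <T> before the return type makes this a generic method.
    // T ties the element type of the list to the type of the fallback and of the return.
    static <T> T firstOrDefault(List<T> items, T fallback) {
        return items.isEmpty() ? fallback : items.get(0);
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Ada", "Grace");
        String first = firstOrDefault(names, "nobody"); // T is inferred as String
        System.out.println(first);
    }
}
```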

Type Parameter

A type parameter is a placeholder for a concrete type. It is important to understand that a type parameter is always taken by a generic type, a generic method, or a generic constructor. Conversely, a generic type or method takes at least one type parameter.

Inside the body of a generic interface or class, a type parameter taken by the interface or class can serve as the type of a parameter, the return type, or the type of a local variable of an instance method. It can also serve as the type of an instance field.

A type parameter taken by a generic method can serve as the type of its parameters, its return type, or the type of its local variables. (Note: it is legal for a generic method to be a static member of a class or interface.)

A type parameter taken by a class or interface cannot be used:
  • As the type of its static fields (because there is only one class vs. many different T)
  • Anywhere in its static member methods (same reason)
  • In a static initializer block (same reason)

In addition, no type parameter can be
  • used in a new T() expression to create a new object (because of type erasure)
  • used in a new T[size] expression to create a new array of objects (because of type erasure)
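
These restrictions can be seen in a small sketch. The class name Holder is hypothetical, and the commented-out lines are the ones a compiler would be expected to reject. Creating the array reflectively from a Class token is one common workaround, shown here as an assumption about how one might proceed rather than as the only way.

```java
import java.lang.reflect.Array;

// Hypothetical generic class illustrating where T may and may not appear.
class Holder<T> {
    // static T shared;                   // does not compile: T as the type of a static field
    // static T echo(T t) { return t; }   // does not compile: the class's T in a static method

    private final T[] items;              // fine: an instance field typed with T

    @SuppressWarnings("unchecked")
    Holder(Class<T> elementType, int size) {
        // items = new T[size];           // does not compile: generic array creation
        // Workaround: build the array reflectively from a Class token.
        items = (T[]) Array.newInstance(elementType, size);
    }

    void set(int index, T value) {
        items[index] = value;
    }

    T get(int index) {
        return items[index];
    }

    T[] asArray() {
        return items;
    }
}
```

A caller would write, for example, new Holder<String>(String.class, 2), so the concrete element type is passed in explicitly instead of being derived from T at run time.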

Example 5 - Defining methods with parameters or return of type parameter

boolean add(E e);
E get(int index);

The above two methods are declared in the body of the generic type List<E>. The type parameter E serves as the type of the parameter named e of the method add, and as the return type of the method get. These two methods are not generic methods: the type parameter E is taken not by the methods but by their owner type (i.e. List<E>).

Generic Type vs. Parameterized Type

It is critical to understand the difference and the relationship between a generic type and a parameterized type. For example, ArrayList<E> is a generic type and ArrayList<String> is a parameterized type. They differ in the following aspects:
  • E in ArrayList<E> is a type parameter, and String in ArrayList<String> is a concrete type (in particular, an ordinary class). In ArrayList<String>, the concrete type String serves as the type argument that fills the place held by the type parameter E, which is taken by ArrayList<E>.
  • A parameterized type, like an ordinary type, is a concrete type, while a generic type is an abstract type in the sense that it still has a hole to be filled.
  • It is legal to create an object of ArrayList<String> via the statement new ArrayList<String>();. The statement new ArrayList<E>(); is, however, illegal unless it appears where a type parameter named E is in scope.
  • More generally, a parameterized type is used in the same way as an ordinary type. It can be used wherever an ordinary type can, i.e. as the type of a variable or of a method return. The variable may be a method parameter, a local variable, or a field.
A parameterized type always has a special relationship with a generic type: a parameterized type is always instantiated out of a generic type. For example, ArrayList<String> is instantiated out of ArrayList<E> by substituting the type argument, String, for the type parameter, E. In order for ArrayList<String> to exist, ArrayList<E> must exist first.

A type parameter is like a hole. When the hole is filled with a concrete type, a parameterized type comes out of the generic type. Replacing a type parameter with a concrete type is called invocation of a generic type. Just as one invokes a method by passing arguments, one invokes a generic type by passing type arguments.

In a parameterized type, a type argument takes every place formerly held by its corresponding type parameter. For example, in List<String>, there are effectively
boolean add(String e);
String get(int index);

(For the formal specification, see 4.5.2 Members and Constructors of Parameterized Types, The Java Language Specification, Third Edition, Addison-Wesley, 2004)
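
A short sketch of what this means in practice follows. The class and method names (ParameterizedTypeDemo, greetings, firstLength) are hypothetical. The parameterized type List<String> is used as a return type, a parameter type, and a local variable type, just as an ordinary type would be.

```java
import java.util.ArrayList;
import java.util.List;

class ParameterizedTypeDemo {

    // A parameterized type used as a return type.
    static List<String> greetings() {
        List<String> list = new ArrayList<String>();
        list.add("hello");   // effectively boolean add(String e)
        // list.add(42);     // does not compile: 42 is not a String
        return list;
    }

    // A parameterized type used as a parameter type.
    static int firstLength(List<String> list) {
        String first = list.get(0); // effectively String get(int index); no cast is needed
        return first.length();
    }

    public static void main(String[] args) {
        System.out.println(firstLength(greetings()));
    }
}
```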

Bounded Type Parameter

In a generic type definition, a type parameter may be given an upper bound.

Example 6 – Defining a generic type named SortedSet which is a set with elements sorted

public interface SortedSet<E extends Comparable<E>> extends Set<E>

Here <E extends Comparable<E>> indicates that E is a bounded type parameter and Comparable<E> is its upper bound. Any parameterized type instantiated out of this generic type must have a type argument that is a subtype of Comparable of itself. For example, we may have a parameterized type SortedSet<Integer>. It is OK since Integer implements Comparable<Integer>. However, we cannot have a parameterized type SortedSet<java.lang.Thread> because Thread does not implement Comparable<Thread>. In short, the bound of a type parameter restricts the type arguments that the generic type accepts. Without a bound, any type is accepted as a legal type argument at compile time, and some of them may lead to a runtime exception. (The declaration above is for illustration only; the actual java.util.SortedSet<E> declares no bound on E.)

A few more words about the example: the upper bound, java.lang.Comparable, is itself a generic type. In Comparable<E>, the type parameter E of SortedSet is used as the type argument to Comparable.

If there are multiple bounds, separate them with & in the type parameter declaration, e.g. <T extends Number & Comparable<T>>.
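
The following sketch shows a single bound on a generic class and two bounds, joined by &, on a generic method. All names in it (BoundsDemo, MaxTracker, largest) are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

class BoundsDemo {

    // Hypothetical generic class with a single upper bound.
    static class MaxTracker<E extends Comparable<E>> {
        private E max;

        void offer(E candidate) {
            // compareTo can be called on an E only because of the bound.
            if (max == null || candidate.compareTo(max) > 0) {
                max = candidate;
            }
        }

        E max() {
            return max;
        }
    }

    // Two bounds separated by &: T must be a Number and comparable to itself.
    static <T extends Number & Comparable<T>> T largest(List<T> values) {
        T best = values.get(0);
        for (T v : values) {
            if (v.compareTo(best) > 0) {
                best = v;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        MaxTracker<Integer> tracker = new MaxTracker<Integer>();
        tracker.offer(3);
        tracker.offer(7);
        System.out.println(tracker.max());
        System.out.println(largest(Arrays.asList(1.5, 0.5, 2.5)));
        // MaxTracker<Thread> would not compile: Thread does not implement Comparable<Thread>.
    }
}
```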

Calling a Generic Method

Example 7 - Calling a generic method

         List<String> list = new ArrayList<String>();
         String[] stringArray = list.toArray(new String[]{});

Usually, it is not necessary to specify the type argument (i.e. the concrete type that takes the place of the type parameter) when calling a generic method, as the example above shows, because the compiler can infer the type argument from the type of the argument passed to the method (i.e. String[] in the example). That is, however, not always the case. When the compiler cannot determine the concrete type by inference, the type argument has to be specified explicitly. The syntax for specifying the type argument to a generic method is shown in the example below:

String[] stringArray = list.<String>toArray(new String[]{});

The type argument (String in the example above) is placed between < and >, immediately before the name of the generic method being called.
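
One situation where the explicit form was needed is sketched below, using Collections.emptyList() from the JDK. The class ExplicitTypeArgumentDemo and its count method are hypothetical. Compilers before Java 8 did not infer the type argument when the generic call appeared as a method argument; newer compilers accept both forms.

```java
import java.util.Collections;
import java.util.List;

class ExplicitTypeArgumentDemo {

    static int count(List<String> items) {
        return items.size();
    }

    public static void main(String[] args) {
        // The type argument is inferred from the assignment target.
        List<String> inferred = Collections.emptyList();

        // The type argument is given explicitly, between the qualifier and the method name.
        int n = count(Collections.<String>emptyList());

        System.out.println(inferred.size() + n);
    }
}
```

Note that the explicit form always needs something before the dot: a class name for a static method, or an object reference (possibly this) for an instance method.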

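By the way, be aware that the toArray method does not bring complete type safety. The following code fragment compiles but throws an ArrayStoreException at run time, because the list holds a String that cannot be stored in an Integer[]. (With an empty list there would be nothing to copy, so no exception would occur.)

        List<String> list = new ArrayList<String>();
        list.add("one");

        Integer[] intArray = list.toArray(new Integer[]{});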
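
The run-time behavior can be checked with a small sketch. ToArrayPitfallDemo and failsAtRuntime are hypothetical names. With java.util.ArrayList, the exception is expected only when the list actually contains an element that cannot be stored in the target array.

```java
import java.util.ArrayList;
import java.util.List;

class ToArrayPitfallDemo {

    // Returns true if copying the list into an Integer[] fails at run time.
    static boolean failsAtRuntime(List<String> list) {
        try {
            // Compiles: T is inferred as Integer from the argument, unrelated to the list's E.
            Integer[] ints = list.toArray(new Integer[0]);
            return false;
        } catch (ArrayStoreException e) {
            // A String element could not be stored in an Integer[].
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> list = new ArrayList<String>();
        System.out.println(failsAtRuntime(list)); // empty list: nothing is copied
        list.add("one");
        System.out.println(failsAtRuntime(list)); // non-empty list: the copy fails
    }
}
```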

Type Wildcard

Type wildcard in Java Generics is a complex topic. It is discussed in my other post Type Wildcard in Java Generics.

Sunday, November 27, 2011

StringTemplate 4 Note for Java Programmers

StringTemplate is a very simple and powerful template engine. The existing documents are fairly complete. As long as one knows where to find those documents, learning StringTemplate is quite easy. This note is intended to help users to quickly find needed documents.

To learn StringTemplate means to learn the following three aspects of it:
  1. The core concepts and the relationship among them
  2. Syntax of templates and template groups
  3. API

In this document, users can find:
  • A brief introduction to StringTemplate 4
  • Instructions to set up Java programs to use StringTemplate 4
  • Syntax of templates and groups
  • StringTemplate 4 Java API
The syntax documentation is quite formal and appears to use a variation of the Backus-Naur Form notation. Programmers who are not used to such formal notation may find the syntax documentation hard to understand. For those programmers, my recommendation is to spend an hour learning the Backus-Naur Form notation. Of course, examples also help.

This article from the creator of StringTemplate provides:
  • An overview of StringTemplate
  • The philosophy of StringTemplate
  • The theoretical foundation of StringTemplate
  • A few examples showing the main features of StringTemplate, namely:
    • attribute (and attribute property) reference
    • map operation (i.e. applying a template to an attribute that is a list of objects, or applying a list of templates in alternation to an attribute that is a list of objects)
    • conditional include
    • recursive template.
Many template users say that certain other template engines are more powerful than StringTemplate. This article helps users understand why the features found in those other engines are purposely excluded from StringTemplate, and for very good reasons.

In this article by the creator of StringTemplate, the author provides a comprehensive introduction, explaining the major concepts in StringTemplate with examples:
  • Template
  • Template group
  • Expression
  • Attribute
  • Multi-valued attribute
  • Implicitly set attribute
  • Template include
  • Conditional include
  • Template application, to a single or multiple attributes
  • Anonymous inline template
  • Recursive template
  • Group inheritance and overriding
  • Template region
  • Group interface
  • Map (dictionary) and list
  • Renderer
This is the most in-depth document about the core concepts of StringTemplate. Consider it a must-read in order to really master StringTemplate. The template syntax it describes is up to date. However, the Java API has changed since the publication of this article, so the Java code examples in it are out of date.

Sunday, November 20, 2011

Classpath and Resource Files in Java Programs

(Last updated on February 2, 2013)

Frequently a Java program needs to read resource files from the file system. Such a resource file may be a .properties file or a .xml file used for program configuration. It is often impractical to hardcode the full path to such a file in the program, because the program would then run correctly only from one specific location in the file system. That is highly undesirable.

A popular practice is to place such a resource file on a classpath and code the program to search the classpath for the resource file. A programmer who adopts this practice must understand well what a classpath is and how to discover it programmatically.

To understand classpaths, one must first understand the concept of a class loader. According to the Java API documentation, “A class loader is an object that is responsible for loading classes”. In general, a Java program uses multiple class loaders, rather than a single one, to load classes. A classpath is the search path of a class loader. In other words, a Java program usually has multiple classpaths. (It is inappropriate to talk about the classpath of a Java program because a Java program has multiple classpaths. On the other hand, it is all right to talk about the classpath of a class loader.) Some of those classpaths can be discovered programmatically; others simply cannot.

For a typical Java program, the class loaders form a hierarchy. When a class loader is asked to find a class or a resource file, it first delegates the request to its parent class loader, which in turn delegates to its own parent, and so on. Only when its parent class loader cannot find the class or resource file does it search its own search path.
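
The hierarchy can be inspected by following getParent() from a class loader upward. The sketch below uses the hypothetical names LoaderChainDemo and chain. It assumes, as is common, that the bootstrap class loader is represented by null, so that entry is added by hand. The class names printed depend on the JVM in use.

```java
import java.util.ArrayList;
import java.util.List;

class LoaderChainDemo {

    // Lists the class names of a loader and all of its ancestors, child first.
    static List<String> chain(ClassLoader start) {
        List<String> names = new ArrayList<String>();
        for (ClassLoader cl = start; cl != null; cl = cl.getParent()) {
            names.add(cl.getClass().getName());
        }
        // The bootstrap loader has no ClassLoader object, so it is named explicitly.
        names.add("bootstrap");
        return names;
    }

    public static void main(String[] args) {
        for (String name : chain(ClassLoader.getSystemClassLoader())) {
            System.out.println(name);
        }
    }
}
```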

Class Loader Hierarchy

At the top of the class loader hierarchy is the JVM’s built-in bootstrap class loader, which loads the standard JDK classes. The search path of the bootstrap class loader can be found programmatically via a call to System.getProperty("sun.boot.class.path"). The search path is platform specific. On a Windows machine, it is <JAVA-HOME>/jre/lib. Typical jar files on this search path are rt.jar, jsse.jar, jce.jar, etc.

The child of the bootstrap class loader is the extension class loader, which loads JDK extension classes. The search path of the extension class loader can be found programmatically via a call to System.getProperty("java.ext.dirs"). The path is platform specific. On a Windows machine, it is <JAVA-HOME>/jre/lib/ext. An example of such a JDK extension is sunjce_provider.jar.

The child of the extension class loader is the system class loader, which loads classes on the path specified by the OS environment variable CLASSPATH or the -classpath option to the JVM. The search path of the system class loader can be found programmatically via a call to System.getProperty("java.class.path").
If the Java program does not create its own user class loader, all classes other than the standard JDK and extension classes will be loaded by the system class loader. As we just said, the search path of this class loader can easily be found programmatically.
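
The three properties can be read and split into individual entries as sketched below. SearchPathDemo and entries are hypothetical names. The sketch tolerates a missing property, since on Java 9 and later sun.boot.class.path and java.ext.dirs are typically not set.

```java
import java.io.File;
import java.util.regex.Pattern;

class SearchPathDemo {

    // Splits the value of a path-like system property into its entries.
    // Returns an empty array if the property is not set.
    static String[] entries(String propertyName) {
        String value = System.getProperty(propertyName);
        if (value == null || value.isEmpty()) {
            return new String[0];
        }
        // The separator is ';' on Windows and ':' on Unix-like systems.
        return value.split(Pattern.quote(File.pathSeparator));
    }

    public static void main(String[] args) {
        String[] properties = { "sun.boot.class.path", "java.ext.dirs", "java.class.path" };
        for (String property : properties) {
            System.out.println(property + ":");
            for (String entry : entries(property)) {
                System.out.println("  " + entry);
            }
        }
    }
}
```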

If the Java program creates its own user class loaders, there is no general way to find their search paths programmatically, unless the class loaders are instances of a custom class loader class with a method to retrieve the search path. If the user classes are executed in a JEE container, the JEE container is the bootstrap program and it always creates user class loaders to load user classes (usually one for each war or ear). Similarly, Maven always loads plugin classes with user class loaders. Therefore, if the user classes are executed as a Maven plugin, don’t expect to find the search path for those classes by calling System.getProperty("java.class.path"). It is worth noting that many applications and application servers actually use instances of java.net.URLClassLoader as their user class loaders. In such cases, the classpath of a class loader can be found by calling the getURLs() method on the class loader instance (after casting it from java.lang.ClassLoader to java.net.URLClassLoader). The getURLs() method returns an array of URLs. Each URL returned refers to a directory or a jar file on the classpath. In other words, the class loader will search those locations to find the classes wanted.
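
A cautious version of that cast is sketched below. UrlLoaderDemo, searchPath, and loaderFor are hypothetical names. The instanceof check matters because not every class loader is a URLClassLoader; on Java 9 and later, for instance, the system class loader is not one.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;

class UrlLoaderDemo {

    // Returns the search path if the loader is a URLClassLoader, otherwise an empty array.
    static URL[] searchPath(ClassLoader loader) {
        if (loader instanceof URLClassLoader) {
            return ((URLClassLoader) loader).getURLs();
        }
        return new URL[0];
    }

    // Creates a user class loader whose search path is a single directory.
    static URLClassLoader loaderFor(File directory) {
        try {
            return new URLClassLoader(new URL[] { directory.toURI().toURL() });
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("Not a usable directory: " + directory, e);
        }
    }

    public static void main(String[] args) {
        ClassLoader loader = loaderFor(new File(System.getProperty("java.io.tmpdir")));
        for (URL url : searchPath(loader)) {
            System.out.println(url);
        }
    }
}
```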

Whether or not the search path can be found programmatically, a program can always ask a class loader to find a resource file by calling the getResource(String name) method on the class loader object (or get an InputStream connected to the resource file by calling the getResourceAsStream(String name) method on the class loader object). The class loader will find the resource file if it is on the search path or in a jar file on the search path. By the way, the getSystemResource(String name) and getSystemResourceAsStream(String name) methods look for the resource on the search path of the system class loader.

By the way, given any object, one can call the getClass() method on it to obtain the Class object representing its class. Then one can find the class loader by calling the getClassLoader() method on the Class object. In short, obj.getClass().getClassLoader() returns the class loader that loaded the class of the object obj. (For classes loaded by the bootstrap class loader, getClassLoader() may return null.)
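
A minimal sketch of this lookup follows, with the hypothetical names LoaderOfObjectDemo and describeLoader. It treats a null result as the bootstrap class loader, which is what a JDK class such as String typically yields.

```java
class LoaderOfObjectDemo {

    // Describes the class loader that loaded the class of the given object.
    static String describeLoader(Object obj) {
        ClassLoader loader = obj.getClass().getClassLoader();
        // null conventionally stands for the bootstrap class loader.
        return loader == null ? "bootstrap class loader" : loader.getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println(describeLoader("some text"));              // a JDK class
        System.out.println(describeLoader(new LoaderOfObjectDemo())); // an application class
    }
}
```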

It is usually preferable to use the getResource(String name) method rather than the getResourceAsStream(String name) method, for the sake of logging: the returned URL can be logged. If there are inadvertently multiple resource files with the same name at different locations on the search path, the logged URL helps with debugging. Even if there is only one resource file with that name, if there is some difficulty reading from or writing to it, the logged URL can point the programmer quickly to the file that needs a fix.
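
The practice can be sketched as follows. ResourceLookupDemo, locate, and size are hypothetical names, and System.err stands in for whatever logging framework the program actually uses. The resource is first located with getResource so that its URL can be reported, and only then opened through that URL.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;

class ResourceLookupDemo {

    // Finds the resource and reports where it was found; returns null if it is absent.
    static URL locate(ClassLoader loader, String name) {
        URL url = loader.getResource(name);
        if (url == null) {
            System.err.println("Resource not found: " + name);
        } else {
            System.err.println("Resource " + name + " found at " + url);
        }
        return url;
    }

    // Counts the bytes of the resource, reading it through the reported URL.
    // Returns -1 if the resource cannot be found.
    static long size(ClassLoader loader, String name) {
        URL url = locate(loader, name);
        if (url == null) {
            return -1;
        }
        long total = 0;
        try (InputStream in = url.openStream()) {
            byte[] buffer = new byte[4096];
            int n;
            while ((n = in.read(buffer)) != -1) {
                total += n;
            }
        } catch (IOException e) {
            // The URL in the message tells the programmer which file caused trouble.
            throw new IllegalStateException("Cannot read " + url, e);
        }
        return total;
    }

    public static void main(String[] args) {
        ClassLoader loader = ResourceLookupDemo.class.getClassLoader();
        System.out.println(size(loader, "java/lang/Object.class"));
    }
}
```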