In recent years I have become a fan of the final keyword for variables and members in Java. But many people reading my source code have opposed this view: they find that using final for every non-reassignable variable clutters the code, because most variables are never reassigned anyway. When I recently talked to Uncle Bob and he voiced the same opinion, I decided to search the web for arguments for and against the use of final.

Most of the arguments stem from these sources:

Most of the time, I refer to final as a modifier for class members and local variables.

When final is inevitable or strongly recommended

  • Constants
    By convention constants should have static and final modifiers. 

    private static final int CONSTANT = 1;
  • Inner classes:
    Anonymous and local inner classes can only access local variables that are final (since Java 8, effectively final suffices); a short sketch follows this list.
  • Utility classes:
    Utility classes should be final and have a private constructor: 

    public final class CollectionUtils {
        private CollectionUtils() {
            throw new UnsupportedOperationException("utility class");
        }
    }
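
To illustrate the inner-class rule, here is a minimal sketch: a local variable (or parameter) can only be captured by an anonymous inner class if it is final (or, since Java 8, effectively final).

public Runnable createGreeter(final String name) {
    // name must be final (or effectively final) to be captured
    // by the anonymous inner class below.
    return new Runnable() {
        @Override
        public void run() {
            System.out.println("Hello, " + name);
        }
    };
}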

Pro final

  • Easier to understand during maintenance and debugging
    It is clear which variables remain the same within the current scope. final reduces complexity.
  • Avoids NullPointerExceptions
    To check whether a final variable can be null, you only have to look at its initialization.
  • Color patterns
    After a while, the color patterns that result from the frequent final keywords may help you navigate through the code.
  • Compiler can optimize
    Some people argue that the compiler is able to optimize when final is used.
  • Immutability
    Using final is necessary (but not sufficient) to enforce immutability. 

    • Use Collections.unmodifiable… to make immutable collections
    • final fields need to be set in the constructor. Some frameworks (such as UIMA) expect variables to be initialized in an initialize method. In this case, we cannot use final.
  • Fosters thread-safety
    Synchronizing on final variables is safer.
  • Extension points:
    Marking methods as final lets you quickly see which methods serve as extension points. Joshua Bloch’s Effective Java even suggests making as many methods (and classes) final as possible.
  • Discourages overbroad scoping
    Every local variable should serve one purpose. Using final, we can avoid the reuse of dummy variables such as in the following example (taken from Stack Overflow); a version using final follows the snippet:

    String msg = null;
    for(int i = 0; i < 10; i++) {
        msg = "We are at position " + i;
        System.out.println(msg);
    }
    msg = null;
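
    With final, the variable has to be declared inside the loop, is scoped to a single iteration, and cannot be reused afterwards:

    for (int i = 0; i < 10; i++) {
        final String msg = "We are at position " + i;
        System.out.println(msg);
    }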

Contra final

  • Clutters the code
    Normally, most (local) variables are assigned only once and are thus eligible to be declared final.
  • Hard-to-read method signatures
    When you make a method and all of its parameters final, the method signature is unlikely to fit on one line, even with only two parameters. This makes the signature hard to read.
  • final can be replaced with static code checkers…
    … at least in some places: IDEs and code checkers (PMD, Checkstyle,…) can verify that method parameters are not re-assigned. However, some of them will suggest making method parameters final :-)
  • final is not const
    Objects declared final can still be modified if they have mutable members. If you are accustomed to the const keyword in C/C++, this behavior is misleading.
  • May slow down development
    While coding, you may change your mind about whether a certain variable should be final. Every time you change your mind, you have to add or remove the keyword. However, many IDEs support you in both directions: Eclipse can add final where possible on save, and, vice versa, when you try to re-assign a final variable, it offers to make it non-final.

It happened to me that my Wicket web application did not reliably reload static resources such as CSS stylesheets or JavaScript code. The problem is not necessarily Wicket itself; it may also result from the browser’s caching policy. Wicket can circumvent this problem by adding version information, such as the lastModified timestamp, to the URLs of static resources.

Wicket 1.5 and above

Wicket 1.5 introduced an advanced caching configuration that even allows you to implement your own caching strategies (see here). Basically, you now have to create an instance of IResourceCachingStrategy that uses a particular IResourceVersion. The IResourceCachingStrategy determines how the filename is modified, and the IResourceVersion determines how the current version information is calculated (e.g. last modified timestamp, checksum,…). A minimal working example looks like this:

import org.apache.wicket.protocol.http.WebApplication;
import org.apache.wicket.request.resource.caching.FilenameWithVersionResourceCachingStrategy;
import org.apache.wicket.request.resource.caching.IResourceCachingStrategy;
import org.apache.wicket.request.resource.caching.version.LastModifiedResourceVersion;

public class MyApplication extends WebApplication {
    @Override
    protected void init() {
        IResourceCachingStrategy strategy = new FilenameWithVersionResourceCachingStrategy(
                new LastModifiedResourceVersion());
        this.getResourceSettings().setCachingStrategy(strategy);
    }
}

The FilenameWithVersionResourceCachingStrategy is the suggested strategy. The following implementations of IResourceVersion are available:

  • LastModifiedResourceVersion uses the last modified timestamp.
  • CachingResourceVersion limits the lifetime of a cached resource to the lifetime of the object and a configured cache size.
  • MessageDigestResourceVersion calculates a hash of the cached resource. It can use various algorithms from the Java Cryptography Architecture, by default MD5.
  • RequestCycleCachedResourceVersion caches the resources over one HTTP request lifecycle.
  • StaticResourceVersion uses a static resource version.
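
For example, to version resources by a content hash instead of the last-modified timestamp, the strategy from the snippet above could be set up like this (a sketch; it assumes MessageDigestResourceVersion’s no-argument constructor, which defaults to MD5):

IResourceCachingStrategy strategy = new FilenameWithVersionResourceCachingStrategy(
        new MessageDigestResourceVersion());
getResourceSettings().setCachingStrategy(strategy);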

Before Wicket 1.5

Before Wicket 1.5, it was fairly easy: You simply needed to set a flag in the application’s settings:

public class MyApplication extends WebApplication {
    @Override
    protected void init() {
        getResourceSettings().setAddLastModifiedTimeToResourceReferenceUrl(true);
    }
}

Resources

  • [1] Solution for Wicket < 1.5
  • [2] Notes on the changes introduced in Wicket 1.5
  • [3] Javadoc of FilenameWithVersionResourceCachingStrategy

When I deployed a Maven project that uses the LanguageToolSegmenter (wrapped by DKPro Core), I encountered the following stack trace:

Caused by: java.lang.ExceptionInInitializerError
    at org.languagetool.language.German.getSentenceTokenizer(German.java:96)
    at de.tudarmstadt.ukp.dkpro.core.languagetool.LanguageToolSegmenter.process(LanguageToolSegmenter.java:47)
    at de.tudarmstadt.ukp.dkpro.core.api.segmentation.SegmenterBase.process(SegmenterBase.java:124)
    at org.apache.uima.analysis_component.JCasAnnotator_ImplBase.process(JCasAnnotator_ImplBase.java:48)
    at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess(PrimitiveAnalysisEngine_impl.java:378)
    at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.processAndOutputNewCASes(PrimitiveAnalysisEngine_impl.java:298)
    at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.processUntilNextOutputCas(ASB_impl.java:568)
    at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.<init>(ASB_impl.java:410)
    at org.apache.uima.analysis_engine.asb.impl.ASB_impl.process(ASB_impl.java:343)
    at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.processAndOutputNewCASes(AggregateAnalysisEngine_impl.java:265)
    at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.processUntilNextOutputCas(ASB_impl.java:568)
    at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.<init>(ASB_impl.java:410)
    at org.apache.uima.analysis_engine.asb.impl.ASB_impl.process(ASB_impl.java:343)
    at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.processAndOutputNewCASes(AggregateAnalysisEngine_impl.java:265)
    at org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.process(AnalysisEngineImplBase.java:267)
    at org.apache.uima.fit.pipeline.SimplePipeline.runPipeline(SimplePipeline.java:170)
    at org.apache.uima.fit.pipeline.SimplePipeline.runPipeline(SimplePipeline.java:191)
    at de.tudarmstadt.ukp.rk.mt.api.pipeline.io.JsonAnnotatedCorpusReader.createIndexToTokenMapping(JsonAnnotatedCorpusReader.java:415)
    at de.tudarmstadt.ukp.rk.mt.api.pipeline.io.JsonAnnotatedCorpusReader.getNext(JsonAnnotatedCorpusReader.java:327)
    at de.tudarmstadt.ukp.rk.mt.classification.io.UserFilteredClassificationCorpusReader.getNext(UserFilteredClassificationCorpusReader.java:45)
    at org.apache.uima.fit.component.JCasCollectionReader_ImplBase.getNext(JCasCollectionReader_ImplBase.java:72)
    at de.tudarmstadt.ukp.dkpro.lab.uima.engine.simple.SimpleExecutionEngine.run(SimpleExecutionEngine.java:139)
    ... 38 more
Caused by: java.lang.UnsupportedOperationException: This parser does not support specification "null" version "null"
    at javax.xml.parsers.SAXParserFactory.setSchema(SAXParserFactory.java:419)
    at net.sourceforge.segment.srx.io.Srx2SaxParser.<init>(Srx2SaxParser.java:181)
    at org.languagetool.tokenizers.SRXSentenceTokenizer.createSrxDocument(SRXSentenceTokenizer.java:60)
    at org.languagetool.tokenizers.SRXSentenceTokenizer.<clinit>(SRXSentenceTokenizer.java:47)
    ... 60 more

It turned out that some (actually unnecessary) transitive dependencies ship with an old version of the SAX parser, namely:

  • xercesImpl-2.6.2.jar
  • xalan-2.7.1.jar

In Eclipse, these dependencies can be excluded as follows: open the dependency tree and search for “xalan” or “xerces”. You can exclude these artifacts via the context menu or by placing exclusion tags below one of the parent artifacts further up in the dependency tree. In my case, it looks like this:

<dependency>
  <groupId>de.tudarmstadt.ukp.dkpro.tc</groupId>
  <artifactId>
      de.tudarmstadt.ukp.dkpro.tc.core-asl
  </artifactId>
  <exclusions>
      <exclusion>
          <artifactId>xalan</artifactId>
          <groupId>xalan</groupId>
      </exclusion>
  </exclusions>
</dependency>
<dependency>
  <groupId>de.tudarmstadt.ukp.dkpro.tc</groupId>
  <artifactId>
      de.tudarmstadt.ukp.dkpro.tc.features-asl
  </artifactId>
  <exclusions>
      <exclusion>
          <artifactId>xercesImpl</artifactId>
          <groupId>xerces</groupId>
      </exclusion>
  </exclusions>
</dependency>

References

  • [1] Source of this solution
  • [2] Description of the problem

This article describes how to serialize and deserialize JCas objects using DKPro Core’s XmiWriter and XmiReader components. A runnable Maven project can be found on GitHub.

Dependencies

Only one dependency is necessary, which is available on Maven Central:

<dependency>
  <groupId>de.tudarmstadt.ukp.dkpro.core</groupId>
  <artifactId>de.tudarmstadt.ukp.dkpro.core.io.xmi-asl</artifactId>
  <version>1.5.0</version>
</dependency>

As usual in the context of DKPro Core, it is better to omit the version tag and to configure the version of DKPro Core centrally:

<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>de.tudarmstadt.ukp.dkpro.core</groupId>
      <artifactId>de.tudarmstadt.ukp.dkpro.core-asl</artifactId>
      <version>1.5.0</version>
      <type>pom</type>
      <scope>import</scope>
    </dependency>
  </dependencies>
</dependencyManagement>

Serialization

The basic code for serialization looks as follows:

// import org.apache.uima.analysis_engine.AnalysisEngineDescription;
// import org.apache.uima.fit.factory.AnalysisEngineFactory;
// import de.tudarmstadt.ukp.dkpro.core.io.xmi.XmiWriter;
final AnalysisEngineDescription xmiWriter =
        AnalysisEngineFactory.createEngineDescription(
                XmiWriter.class,
                XmiWriter.PARAM_TARGET_LOCATION, "./target/cache");

The target location is the folder where the cached JCas objects will be stored. You may pass either a String or a File object. Each JCas needs a DocumentMetaData feature structure so that the writer knows the target filename. The filename can be configured either via DocumentMetaData.setDocumentId(String) or via setBaseURI(String) and setURI(String). For details, look at the provided sample project.
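
As an illustration, setting the metadata on a newly created JCas might look like this sketch (it assumes the DocumentMetaData type from the DKPro Core metadata API with its static create(JCas) factory method, and uimaFit’s JCasFactory; the document id is arbitrary):

// import de.tudarmstadt.ukp.dkpro.core.api.metadata.type.DocumentMetaData;
// import org.apache.uima.fit.factory.JCasFactory;
final JCas jcas = JCasFactory.createJCas();
jcas.setDocumentLanguage("en");
jcas.setDocumentText("Some document text.");

// Attach the metadata from which the XmiWriter derives the target filename.
final DocumentMetaData meta = DocumentMetaData.create(jcas);
meta.setDocumentId("document-0001");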

Deserialization

Deserialization works analogously, but of course the XmiReader is not a consumer but a reader component and has to be the first component in the pipeline:

// import org.apache.uima.collection.CollectionReaderDescription;
// import org.apache.uima.fit.factory.CollectionReaderFactory;
// import de.tudarmstadt.ukp.dkpro.core.io.xmi.XmiReader;
final CollectionReaderDescription xmiReader =
        CollectionReaderFactory.createReaderDescription(
                XmiReader.class,
                XmiReader.PARAM_SOURCE_LOCATION, "./target/cache",
                XmiReader.PARAM_PATTERNS, "[+]*.xmi");

The source location is identical to the target location of the writer. Additionally, the reader requires a pattern that describes the files to include (“[+]”) or exclude (“[-]”). The patterns follow the format of Ant patterns [1].
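
To actually run the reader, you can, for example, iterate over the deserialized documents with uimaFit’s JCasIterable (a sketch; any analysis engines could be appended after the reader description):

// import org.apache.uima.fit.pipeline.JCasIterable;
// import org.apache.uima.jcas.JCas;
for (final JCas cachedJCas : new JCasIterable(xmiReader)) {
    System.out.println(cachedJCas.getDocumentText());
}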

Download

If you are interested in a “minimal working example”, you can find a Maven project on GitHub.

References

  • [1] Ant patterns

 

In a previous post, I described how to simplify the task of creating initialize and collectionProcessComplete methods using Eclipse templates. A similar technique can be applied to generate the idiomatic code for a uimaFit configuration parameter.

The shape of UIMA configuration parameters

An example of configuring UIMA components using uimaFit’s @ConfigurationParameter looks like this:

public static final String PARAM_PARAMETER = "parameter";
@ConfigurationParameter(name = PARAM_PARAMETER, mandatory = false, description = "Description", defaultValue = "true")
private boolean parameter;

This snippet demonstrates some best practices, established either by the community or by myself:

  • The name-giving constant (here: PARAM_PARAMETER) is always public and bears the name of the parameter field as its value (here: “parameter”)
  • You should always provide a default value for non-mandatory parameters. The defaultValue attribute is always a String; since annotation attributes must be compile-time constants, you normally cannot call toString on an object here.
  • If a defaultValue attribute is given, uimaFit injects the default value automatically. That means that after the initialize method is complete, parameter has the value true.

For mandatory configuration parameters, only two parts have to be changed:

  1. There is no more defaultValue.
  2. The attribute mandatory is set to true.
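
Applied to the snippet above, the mandatory variant looks like this:

public static final String PARAM_PARAMETER = "parameter";
@ConfigurationParameter(name = PARAM_PARAMETER, mandatory = true, description = "Description")
private boolean parameter;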

Template for optional parameters

In order to create a new code template, open Window -> Preferences -> Java/Editor/Templates and use the following metadata:

  • Name: uima_optional_param
  • Context: Java
  • Description: Generates an optional uimaFit configuration parameter

Pattern:

${imp:import(org.apache.uima.fit.descriptor.ConfigurationParameter)}
public static final String PARAM_${paramNameCapitalized} = "${paramName}";
@ConfigurationParameter(name = PARAM_${paramNameCapitalized}, mandatory = false, 
    description = "${description}", defaultValue = "${defaultValue}")
private ${type} ${paramName};${cursor}

When you now type uima_optional_param in an editor, you will be prompted for the different parts of the template:

  • type is the type of the new parameter. You may choose any primitive type (int, boolean,…), any class that has a ‘single String’ constructor (e.g. File), enum types, and Locale/Pattern.
  • paramName is the parameter name. In Java, you should camel-case it, e.g. shallDeleteAll would be a valid parameter name.
  • paramNameCapitalized is the constant-style (upper snake case) form of the parameter name. You should follow Java coding conventions, as illustrated in the following example: the parameter shallDeleteAll yields the constant PARAM_SHALL_DELETE_ALL.
  • defaultValue is the default value of the configuration parameter. It will be injected into the class member by uimaFit.
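
For example, filling the template with type File, paramName outputDirectory, and a suitable description and default value (the names here are just placeholders) produces:

public static final String PARAM_OUTPUT_DIRECTORY = "outputDirectory";
@ConfigurationParameter(name = PARAM_OUTPUT_DIRECTORY, mandatory = false, 
    description = "Directory the results are written to", defaultValue = "target/output")
private File outputDirectory;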

Template for mandatory configuration parameters

Most things that I described about optional configuration parameters also apply to mandatory configuration parameters. The template for mandatory parameters can be created as follows: Create a new code template in Window -> Preferences -> Java/Editor/Templates and use the following metadata:

  • Name: uima_mandatory_param
  • Context: Java
  • Description: Generates a mandatory uimaFit configuration parameter

Pattern:

${imp:import(org.apache.uima.fit.descriptor.ConfigurationParameter)}
public static final String PARAM_${paramNameCapitalized} = "${paramName}";
@ConfigurationParameter(name = PARAM_${paramNameCapitalized}, mandatory = true, description = "${description}")
private ${type} ${paramName};${cursor}

The variables in the template have the exact same meaning as in the template for optional parameters.

References

  • [1] uimaFit wiki on @ConfigurationParameter

WordNet is an invaluable resource for NLP research. John Didion has developed JWNL, a Java library for accessing WordNet data programmatically. To access WordNet from Java, the following steps are necessary:

  1. Download WordNet
  2. Add a dependency on JWNL to your project or download the library.
  3. Configure properties.xml so that JWNL knows where to find WordNet and which version is used.
  4. Create a Dictionary instance for querying WordNet.

Configuration

The configuration is stored in an XML file that sets the path where WordNet can be found. If you use a standard WN distribution, then the path should end in dict as the following minimalistic properties.xml illustrates:

<?xml version="1.0" encoding="UTF-8"?>
<jwnl_properties language="en">
  <version publisher="Princeton" number="3.0" language="en"/>
  <dictionary>
    <param name="dictionary_element_factory" 
      value="net.didion.jwnl.princeton.data.PrincetonWN17FileDictionaryElementFactory"/>
    <param name="file_manager" value="net.didion.jwnl.dictionary.file_manager.FileManagerImpl">
      <param name="file_type" value="net.didion.jwnl.princeton.file.PrincetonRandomAccessDictionaryFile"/>
      <param name="dictionary_path" value="path-to-dict"/>
    </param>
  </dictionary>
  <resource/>
</jwnl_properties>

On GitHub, you find two prepared properties files:

  • properties_min.xml uses only a minimum of the possible settings
  • properties.xml includes a rule-based morphological stemmer that allows you to query for inflected forms, e.g., houses, runs, dogs

Boilerplate code

A singleton instance of Dictionary is used to query WordNet with JWNL. In fact, setting up the dictionary is very easy:

// import net.didion.jwnl.JWNL;
// import net.didion.jwnl.dictionary.Dictionary;
JWNL.initialize(new FileInputStream("src/main/resources/properties.xml"));
final Dictionary dictionary = Dictionary.getInstance();

Afterwards, you can easily query the dictionary for a lemma of your choice (try house, houses, dog). For each lemma, you also specify one of the four possible part-of-speech classes that you are looking for, that is, one of POS.ADJECTIVE, POS.ADVERB, POS.NOUN, or POS.VERB. For house you would choose POS.NOUN or POS.VERB. The whole process looks rather clumsy, so I have listed the steps below; a combined sketch follows the list:

  1. Lookup: Is the lemma in the dictionary?
    final IndexWord indexWord = dictionary.lookupIndexWord(pos, lemma);

    • If the lookup fails, indexWord is null.
  2. What different senses may the lemma have?
    final Synset[] senses = indexWord.getSenses();
  3. For each sense, we may get a short description of the sense, called the gloss.
    final String gloss = synset.getGloss();
  4. What other lemmas are in a synset?
    final Word[] words = synset.getWords();

    • For each word, we may get its lemma and its POS: word.getLemma(); and word.getPOS().getKey();
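
Put together, a minimal sketch of the whole query (assuming the data classes reside in the net.didion.jwnl.data package, as in JWNL 1.4, and using the setup shown above) could look like this:

import java.io.FileInputStream;

import net.didion.jwnl.JWNL;
import net.didion.jwnl.data.IndexWord;
import net.didion.jwnl.data.POS;
import net.didion.jwnl.data.Synset;
import net.didion.jwnl.data.Word;
import net.didion.jwnl.dictionary.Dictionary;

public class WordNetQueryExample {
    public static void main(final String[] args) throws Exception {
        JWNL.initialize(new FileInputStream("src/main/resources/properties.xml"));
        final Dictionary dictionary = Dictionary.getInstance();

        // 1. Lookup: is the lemma in the dictionary?
        final IndexWord indexWord = dictionary.lookupIndexWord(POS.NOUN, "house");
        if (indexWord == null) {
            System.out.println("Lemma not found.");
            return;
        }

        // 2. Iterate over the senses of the lemma.
        for (final Synset synset : indexWord.getSenses()) {
            // 3. Print the gloss of the sense.
            System.out.println("Gloss: " + synset.getGloss());

            // 4. Print all lemmas contained in the synset.
            for (final Word word : synset.getWords()) {
                System.out.println("  " + word.getLemma() + "/" + word.getPOS().getKey());
            }
        }
    }
}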

Where to get it

The code for this tutorial is available on GitHub. You need to copy the template properties file(s) to src/main/resources before you can run the code. Given a lemma and part of speech, the program returns the list of synsets that contain the lemma. For house/v the output looks like this:

Aug 23, 2013 9:13:40 AM net.didion.jwnl.dictionary.Dictionary doLog
INFO: Installing dictionary net.didion.jwnl.dictionary.FileBackedDictionary@6791d8c1
 1 Lemmas: [house/v] (Gloss: contain or cover; “This box houses the gears”)
 2 Lemmas: [house/v, put_up/v, domiciliate/v] (Gloss: provide housing for; “The immigrants were housed in a new development outside the town”)

Maven dependencies for JWNL and the required logging library:

<dependency>                        
  <groupId>net.didion.jwnl</groupId>
  <artifactId>jwnl</artifactId>     
  <version>1.4.0.rc2</version>      
</dependency>
<dependency>
  <groupId>commons-logging</groupId>
  <artifactId>commons-logging</artifactId>
  <version>1.1.3</version>
</dependency>

Links

  • [1] JWNL Sourceforge site
  • [2] JWNL Sourceforge wiki with much more information
  • [3] WordNet 3.0 download

Naively, I assumed that every varargs parameter in Java could be treated just like a normal array. As I had to learn, this is not the case for primitive arrays!

Using the standard library method Arrays.asList(T…), which converts arrays/varargs of objects into a java.util.List, an idiomatic code snippet might look like this:

final int[] ints = { 3, 2, 8, 1, 1, 5 };
final List<Integer> list = Arrays.asList(ints);
Collections.sort(list);

However, the Java compiler complains about the second line:

Type mismatch: cannot convert from List<int[]> to List<Integer>

Obviously, the type parameter T is resolved to int[]. Apache Commons-Lang offers a remedy for this problem: its ArrayUtils.toObject() method takes a primitive array and converts it to the corresponding object array. The following, modified listing demonstrates this:

final int[] ints = { 3, 2, 8, 1, 1, 5 };
// final List<Integer> list = Arrays.asList(ints);
final Integer[] intObjects = ArrayUtils.toObject(ints);
final List<Integer> list = Arrays.asList(intObjects);
Collections.sort(list);

Where to get it

Maven Dependency (Download):

<dependency>
    <groupId>commons-lang</groupId>
    <artifactId>commons-lang</artifactId>
    <version>2.6</version>
</dependency>

Links

  • [1] Related stackoverflow question
  • [2] Arrays.asList(T…) docu (Java 6)
  • [3] ArrayUtils.toObject() docu (Commons-Lang 2.6)

If your software produces costly objects, object serialization may be an option to spare you some bootstrapping time, e.g., when you repeatedly restart your application during development. Apache Commons-Lang offers an implementation of serialization that is an epitome of ease of use: SerializationUtils. The core methods are serialize and deserialize.

Given an object, the actual process of serialization is a one-line statement (split up here):

// The object passed to serialize must implement java.io.Serializable.
final File targetFile = new File("./target/serializedObject.ser");
final BufferedOutputStream outStream = new BufferedOutputStream(new FileOutputStream(targetFile));
SerializationUtils.serialize(object, outStream);

The same holds for the deserialization process. In the subsequent lines, we assume that the object to be deserialized is an instance of java.lang.String:

final BufferedInputStream inStream = new BufferedInputStream(new FileInputStream(targetFile));
final String string = (String) SerializationUtils.deserialize(inStream);

A complete executable example can be found on GitHub.

Maven Dependency (a Jar file for download can be found here):

<dependency>
    <groupId>commons-lang</groupId>
    <artifactId>commons-lang</artifactId>
    <version>2.2</version>
</dependency>

Links

  • [1] SerializationUtils Javadoc
  • [2] Executable example code (Maven project)

Log4J is a widely used logging library for Java. With the release of version 1.2.15, some dependencies were introduced that some people do not deem necessary [1]. The developers did not – for whatever reason – mark these dependencies as optional, even though they were not available in some common Maven repositories.

The following snippet demonstrates how to exclude transitive dependencies in Maven, using the transitive dependencies com.sun.jdmk:jmxtools and com.sun.jmx:jmxri as a case study.

<dependency>
  <groupId>log4j</groupId>
  <artifactId>log4j</artifactId>
  <version>1.2.15</version>
  <exclusions>
    <exclusion>
      <groupId>com.sun.jdmk</groupId>
      <artifactId>jmxtools</artifactId>
    </exclusion>
    <exclusion>
      <groupId>com.sun.jmx</groupId>
      <artifactId>jmxri</artifactId>
    </exclusion>
  </exclusions>
</dependency>

Links

  • [1] Mailing list entry stating that some dependencies are unnecessary
  • [2] Maven reference: Optional dependencies and exclusions

There are very sophisticated timer libraries for Java. Most of the time, however, I only need a tiny class that essentially stores a start and an end point in time, and that’s it – such as the following:

public class Timer
{
  private long startTime;
  private long endTime;

  public void start()
  {
      this.startTime = System.currentTimeMillis();
  }

  public void stop()
  {
      this.endTime = System.currentTimeMillis();
  }

  public long getTimeInMillis()
  {
      return this.endTime - this.startTime;
  }

  public double getTimeInSeconds()
  {
      return this.getTimeInMillis() / 1000.0;
  }
}
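
Usage is as simple as the class itself:

final Timer timer = new Timer();
timer.start();
// ... code to be measured ...
timer.stop();
System.out.println("Took " + timer.getTimeInSeconds() + " seconds");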