TreeTagger configuration with DKPro and Maven

DKPro Core contains a component that wraps the popular TreeTagger [1].

Unfortunately, only the core component de.tudarmstadt.ukp.dkpro.core.treetagger-asl is directly available as a Maven artifact. License restrictions prohibit redistributing the binaries (de.tudarmstadt.ukp.dkpro.core.treetagger-bin) and the models (de.tudarmstadt.ukp.dkpro.core.treetagger-model-{de,en,fr,…}). The DKPro Core developer team provides instructions on how to create these two kinds of artifacts yourself, using an Ant build.xml script [2].

The Maven dependencies of the TreeTagger component look as follows. The dependency entries themselves carry no version; instead, the dependency management section imports the versions, which keeps the three artifacts consistent with each other.

<dependencies>
  <dependency>
    <groupId>de.tudarmstadt.ukp.dkpro.core</groupId>
    <artifactId>de.tudarmstadt.ukp.dkpro.core.treetagger-asl</artifactId>
  </dependency>
  <dependency>
    <groupId>de.tudarmstadt.ukp.dkpro.core</groupId>
    <artifactId>de.tudarmstadt.ukp.dkpro.core.treetagger-bin</artifactId>
  </dependency>
  <dependency>
    <groupId>de.tudarmstadt.ukp.dkpro.core</groupId>
    <artifactId>de.tudarmstadt.ukp.dkpro.core.treetagger-model-de</artifactId>
  </dependency>
</dependencies>
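<dependencyManagement>
  <!-- Maven requires the managed entries to be wrapped in <dependencies> -->
  <dependencies>
    <dependency>
      <groupId>de.tudarmstadt.ukp.dkpro.core</groupId>
      <artifactId>de.tudarmstadt.ukp.dkpro.core.treetagger-asl</artifactId>
      <version>1.5.0</version>
      <!-- import the managed versions declared by this POM -->
      <type>pom</type>
      <scope>import</scope>
    </dependency>
  </dependencies>
</dependencyManagement>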
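
To tag more than one language, further model artifacts can be listed next to the German one inside the <dependencies> section. The following is only a sketch: it assumes that the English model artifact (de.tudarmstadt.ukp.dkpro.core.treetagger-model-en, one of the names mentioned above) has already been built and installed into the local repository, and that its version is supplied by the imported dependency management. Both assumptions should be checked against the DKPro Core release in use.

<!-- Assumption: the English model was packaged and installed locally beforehand -->
<dependency>
  <groupId>de.tudarmstadt.ukp.dkpro.core</groupId>
  <artifactId>de.tudarmstadt.ukp.dkpro.core.treetagger-model-en</artifactId>
  <!-- no <version>: expected to come from the imported dependency management -->
</dependency>

If Maven reports a missing version for this entry, the model is probably not covered by the imported POM, and an explicit <version> matching the locally installed artifact would be needed.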

 References

  • [1] TreeTagger project site
  • [2] Instructions on packaging the binary and model artifacts
