When I deployed a Maven project that uses the LanguageToolSegmenter (wrapped by DKPro Core), I encountered the following stack trace:

Caused by: java.lang.ExceptionInInitializerError
    at org.languagetool.language.German.getSentenceTokenizer(German.java:96)
    at de.tudarmstadt.ukp.dkpro.core.languagetool.LanguageToolSegmenter.process(LanguageToolSegmenter.java:47)
    at de.tudarmstadt.ukp.dkpro.core.api.segmentation.SegmenterBase.process(SegmenterBase.java:124)
    at org.apache.uima.analysis_component.JCasAnnotator_ImplBase.process(JCasAnnotator_ImplBase.java:48)
    at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess(PrimitiveAnalysisEngine_impl.java:378)
    at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.processAndOutputNewCASes(PrimitiveAnalysisEngine_impl.java:298)
    at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.processUntilNextOutputCas(ASB_impl.java:568)
    at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.<init>(ASB_impl.java:410)
    at org.apache.uima.analysis_engine.asb.impl.ASB_impl.process(ASB_impl.java:343)
    at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.processAndOutputNewCASes(AggregateAnalysisEngine_impl.java:265)
    at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.processUntilNextOutputCas(ASB_impl.java:568)
    at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.<init>(ASB_impl.java:410)
    at org.apache.uima.analysis_engine.asb.impl.ASB_impl.process(ASB_impl.java:343)
    at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.processAndOutputNewCASes(AggregateAnalysisEngine_impl.java:265)
    at org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.process(AnalysisEngineImplBase.java:267)
    at org.apache.uima.fit.pipeline.SimplePipeline.runPipeline(SimplePipeline.java:170)
    at org.apache.uima.fit.pipeline.SimplePipeline.runPipeline(SimplePipeline.java:191)
    at de.tudarmstadt.ukp.rk.mt.api.pipeline.io.JsonAnnotatedCorpusReader.createIndexToTokenMapping(JsonAnnotatedCorpusReader.java:415)
    at de.tudarmstadt.ukp.rk.mt.api.pipeline.io.JsonAnnotatedCorpusReader.getNext(JsonAnnotatedCorpusReader.java:327)
    at de.tudarmstadt.ukp.rk.mt.classification.io.UserFilteredClassificationCorpusReader.getNext(UserFilteredClassificationCorpusReader.java:45)
    at org.apache.uima.fit.component.JCasCollectionReader_ImplBase.getNext(JCasCollectionReader_ImplBase.java:72)
    at de.tudarmstadt.ukp.dkpro.lab.uima.engine.simple.SimpleExecutionEngine.run(SimpleExecutionEngine.java:139)
    ... 38 more
Caused by: java.lang.UnsupportedOperationException: This parser does not support specification "null" version "null"
    at javax.xml.parsers.SAXParserFactory.setSchema(SAXParserFactory.java:419)
    at net.sourceforge.segment.srx.io.Srx2SaxParser.<init>(Srx2SaxParser.java:181)
    at org.languagetool.tokenizers.SRXSentenceTokenizer.createSrxDocument(SRXSentenceTokenizer.java:60)
    at org.languagetool.tokenizers.SRXSentenceTokenizer.<clinit>(SRXSentenceTokenizer.java:47)
    ... 60 more

The cause turned out to be that some (actually unnecessary) dependencies bring along an old SAX parser implementation whose SAXParserFactory does not support the setSchema call seen in the trace, namely:

  • xercesImpl-2.6.2.jar
  • xalan-2.7.1.jar
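
To see which parser implementation actually wins on the classpath, a small check like the following can help. This is only a minimal sketch and not part of the original fix: the class name SaxParserCheck and its helper methods are made up for illustration, and it relies on standard JAXP calls only. It prints the SAXParserFactory implementation that gets picked up, where it was loaded from, and whether it accepts a schema (the call that fails in the stack trace above).

```java
import java.security.CodeSource;
import javax.xml.XMLConstants;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

// Hypothetical diagnostic helper; the name is an assumption for illustration.
class SaxParserCheck {

    /** Returns the jar or directory a class was loaded from, if known. */
    static String locationOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        // Classes that ship with the JDK itself usually have no code source.
        return (source == null || source.getLocation() == null)
                ? "JDK built-in (no code source)"
                : source.getLocation().toString();
    }

    /** Returns true if the factory accepts setSchema, false if it rejects it. */
    static boolean supportsSchema(SAXParserFactory factory) {
        try {
            Schema schema = SchemaFactory
                    .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema();
            factory.setSchema(schema);
            return true;
        } catch (UnsupportedOperationException | SAXException e) {
            // Factories that do not override setSchema throw
            // UnsupportedOperationException from the JAXP base class.
            return false;
        }
    }

    public static void main(String[] args) {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        System.out.println("Implementation:  " + factory.getClass().getName());
        System.out.println("Loaded from:     " + locationOf(factory.getClass()));
        System.out.println("Supports schema: " + supportsSchema(factory));
    }
}
```

If the reported location points at an old xercesImpl jar and schema support comes back false, the exclusions described in this post are probably what is needed. Run the check with the same classpath as the deployed project, otherwise the result says little.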

In Eclipse, these dependencies can be excluded as follows: open the dependency tree and search for “xalan” or “xerces”. You can exclude these artifacts via the context menu, or by placing exclusion tags below one of the parent artifacts further up in the dependency tree. In my case, it looks like this:

<dependency>
  <groupId>de.tudarmstadt.ukp.dkpro.tc</groupId>
  <artifactId>de.tudarmstadt.ukp.dkpro.tc.core-asl</artifactId>
  <exclusions>
    <exclusion>
      <groupId>xalan</groupId>
      <artifactId>xalan</artifactId>
    </exclusion>
  </exclusions>
</dependency>
<dependency>
  <groupId>de.tudarmstadt.ukp.dkpro.tc</groupId>
  <artifactId>de.tudarmstadt.ukp.dkpro.tc.features-asl</artifactId>
  <exclusions>
    <exclusion>
      <groupId>xerces</groupId>
      <artifactId>xercesImpl</artifactId>
    </exclusion>
  </exclusions>
</dependency>
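
After adding exclusions like these, it can be worth confirming that the old classes are really gone from the runtime classpath. The following is a rough sketch under my own assumptions: the class ClasspathCheck and its method are hypothetical, and the two probed class names are the factory classes that the standalone Xerces and Xalan jars normally contain.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not part of DKPro or LanguageTool.
class ClasspathCheck {

    /** Returns true if the named class can be resolved without initializing it. */
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed marker classes of the standalone Xerces and Xalan jars.
        List<String> suspects = Arrays.asList(
                "org.apache.xerces.jaxp.SAXParserFactoryImpl",
                "org.apache.xalan.processor.TransformerFactoryImpl");
        for (String name : suspects) {
            System.out.println(name + " on classpath: " + isOnClasspath(name));
        }
    }
}
```

If one of the classes is still reported as present, another dependency is probably pulling the jar in as well, and it needs its own exclusion.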

References

  • [1] Source of this solution
  • [2] describes the problem