A German part-of-speech dictionary that can be used from Java. This repo contains no code, only Morfologik binary files for looking up part-of-speech data. As a developer, consider using LanguageTool instead. If you really want to use the dictionary directly, see the unit tests for examples.
Also use LanguageTool to export the data in these dictionaries, as documented here.
The POS tags are documented here.
If you update the tagger dictionary, always make sure to also update the synth dictionary with the same data and vice versa. LanguageTool expects the two to be in sync.
To prepare a release (note this will only add forms, not remove them):
- (optional) move readings from do-not-synthesize.txt to filter-archaic.txt (in the execution path of SynthDictionaryBuilder)
- call ./download-data.sh
- set DBUSER, DBPASS, and LT_PASS in ./data-to-dict.sh
- call ./data-to-dict.sh
- increase version in pom.xml
- call mvn install
- test it from the software that integrates it (including a regression test)
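
The scriptable part of the preparation steps above could be wrapped roughly as follows. This is only a sketch, not something that ships with this repo: the DRY_RUN switch, the helper names, and the build/install subcommands are hypothetical. It assumes it is run from the repository root and that DBUSER, DBPASS, and LT_PASS have already been set inside ./data-to-dict.sh. The optional move of readings, the version bump in pom.xml, and the final integration test remain manual.

```shell
#!/bin/sh
# Sketch of the scriptable preparation steps. DRY_RUN, run, build_dict and
# install_dict are hypothetical names, not part of this repo.
set -eu

DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Download the data and build the dictionaries.
# Assumes DBUSER, DBPASS and LT_PASS are already set in ./data-to-dict.sh.
build_dict() {
  run ./download-data.sh
  run ./data-to-dict.sh
}

# Install the artifact locally; increase the version in pom.xml by hand first.
install_dict() {
  run mvn install
}

case "${1:-}" in
  build)   build_dict ;;
  install) install_dict ;;
  *)       echo "usage: $0 build|install" ;;
esac
```

Splitting the work into two subcommands keeps the manual version bump in pom.xml between the dictionary build and mvn install, in the same order as the list. With the default DRY_RUN=1 the commands are only printed, so the sequence can be reviewed before anything is executed.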
To make a release:
- set the version in pom.xml to not include SNAPSHOT
- rm src/main/resources/org/languagetool/resource/de/SynthDictionaryBuilder*tags.txt
- mvn clean test
- mvn clean deploy -P release
- go to https://oss.sonatype.org/#stagingRepositories
- scroll to the bottom, select latest version, and click Release
- git tag vx.y
- git push origin vx.y
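
The release steps above might be scripted along these lines. Again this is a sketch under assumptions rather than a script from this repo: VERSION, DRY_RUN, the helper names, and the deploy/tag subcommands are hypothetical, and the SNAPSHOT check is a crude text search over pom.xml. Clicking Release on the Sonatype staging page stays a manual step between the two subcommands.

```shell
#!/bin/sh
# Sketch of the release steps. VERSION, DRY_RUN, run, deploy_release and
# tag_release are hypothetical names, not part of this repo.
set -eu

VERSION="${VERSION:-x.y}"   # replace x.y with the version being released
DRY_RUN="${DRY_RUN:-1}"     # 1 = only print the commands, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Steps up to and including the deployment to the staging repository.
deploy_release() {
  # Crude guard: refuse to continue while pom.xml still mentions SNAPSHOT.
  if grep -q 'SNAPSHOT' pom.xml 2>/dev/null; then
    echo "pom.xml still contains SNAPSHOT" >&2
    return 1
  fi
  run rm src/main/resources/org/languagetool/resource/de/SynthDictionaryBuilder*tags.txt
  run mvn clean test
  run mvn clean deploy -P release
  echo "Next: release the staged version at https://oss.sonatype.org/#stagingRepositories"
}

# Steps after the staged version has been released by hand.
tag_release() {
  run git tag "v$VERSION"
  run git push origin "v$VERSION"
}

case "${1:-}" in
  deploy) deploy_release ;;
  tag)    tag_release ;;
  *)      echo "usage: $0 deploy|tag" ;;
esac
```

Because of set -eu, a failing mvn clean test stops the script before anything is deployed when DRY_RUN=0. The guard matches any occurrence of SNAPSHOT in pom.xml, so it would also trigger on a SNAPSHOT dependency; treat it as a reminder, not a real version check.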