Germanet_Dart is a Dart equivalent of the Python package germanetpy, an API for GermaNet, a German lexical-semantic network similar in functionality to WordNet.
Features
This package can do everything the Python package can do. You can retrieve Synsets (word meanings) for almost every adjective, noun and verb. The system also uses Wiktionary data as well as ILI records. Besides the lexical system, the package also provides algorithms to compute the semantic similarity between words (Synsets).
Getting started
First, you need the GermaNet dataset, which is owned and licensed by the University of Tübingen.
Install the package using
$ dart pub add germanet_dart
or using
$ flutter pub add germanet_dart
Usage
Import the Germanet class
import 'package:germanet_dart/src/germanet.dart';
Initialize the Germanet class
Germanet germanet = Germanet("PATH_TO_GERMANET_XML_FILES", loadDataDirectly: false);
await germanet.loadData(); //can take up to 10 seconds
Getting Synsets for a specific word (text)
List<Synset> synsets = germanet.getSynsetsByOrthform("Haus");
print(synsets);
/*
result:
[Synset(id=s29522), lexunits={Sternzeichen,Sternbild,Tierkreiszeichen,Haus}),
Synset(id=s74611), lexunits={Haus}),
Synset(id=s24202), lexunits={Haus,Geschlecht,Dynastie,Familiendynastie,Familienclan}),
Synset(id=s74612), lexunits={Haus}),
Synset(id=s9439), lexunits={Haus})]
*/
Each Synset has one or more LexUnits (lexical units) that represent the words behind the Synset. Each Synset has a different meaning (for the word 'Haus', there are 5 different meanings). The details of each element can be extracted as follows:
List<Synset> s = germanet.getSynsetsByOrthform("Haus");
Synset s0 = s[0];
print(s0.lexunits.first.orthform);
print(s0.lexunits.first.examples);
print(s0.lexunits.first.sense);
print(s0.lexunits.first.source);
print(s0.word_class);
print(s0.category);
/*
result:
Sternzeichen //specific meaning
[] //no examples
1 //the sense number of the lexical unit
core //core is the origin (the core dataset)
WordClass.Motiv
WordCategory.nomen
*/
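To look at all meanings of a word at once, the same fields can be read in a loop. The following is a minimal sketch that uses only the calls shown above (`getSynsetsByOrthform`, `lexunits`, `orthform`, `category`, `word_class`). It assumes that `lexunits` is an ordinary Dart iterable and that the GermaNet XML files are available at the placeholder path.

```dart
import 'package:germanet_dart/src/germanet.dart';

Future<void> main() async {
  // Placeholder path, as in the snippets above; point it at your GermaNet XML files.
  final germanet =
      Germanet("PATH_TO_GERMANET_XML_FILES", loadDataDirectly: false);
  await germanet.loadData();

  final synsets = germanet.getSynsetsByOrthform("Haus");
  for (final synset in synsets) {
    // Collect the orthform of every lexical unit in this Synset.
    // Assumption: lexunits is an Iterable, so map/join are available.
    final orthforms = synset.lexunits.map((unit) => unit.orthform).join(', ');
    print('${synset.category} / ${synset.word_class}: $orthforms');
  }
}
```

Each printed line corresponds to one Synset, so for 'Haus' there should be one line per meaning.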
There are 6 different semantic similarity measures, three path-based and three based on information content (IC):
// a and b are the two Synsets to compare
PathBasedRelatedness pathBasedRelatedness = PathBasedRelatedness(germanet, WordCategory.nomen);
print(SemRelMeasure.SimplePath.name+": "+pathBasedRelatedness.simple_path(a, b).toString());
print(SemRelMeasure.LeacockAndChodorow.name+": "+pathBasedRelatedness.leacock_chodorow(a, b).toString());
print(SemRelMeasure.WuAndPalmer.name+": "+pathBasedRelatedness.wu_and_palmer(a, b).toString());
ICBasedSimilarity icBasedSimilarity = ICBasedSimilarity(germanet, WordCategory.nomen, "PATH_TO_YOUR_CORPORA");
print(SemRelMeasure.Lin.name+": "+icBasedSimilarity.lin(a, b).toString());
print(SemRelMeasure.Resnik.name+": "+icBasedSimilarity.resnik(a, b).toString());
print(SemRelMeasure.JiangAndConrath.name+": "+icBasedSimilarity.jiang_and_conrath(a, b).toString());
/*
result:
Checking similarities...
SimplePath: 0.62857
LeacockAndChodorow: 0.47712
WuAndPalmer: 0.0
Lin: 0.0
Resnik: 0.0
JiangAndConrath: 2.5390760987927763
*/
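For orientation, these are the usual textbook definitions of the six measures. They are given here as background and are not taken from this package's source code. The package may normalize or rescale the values, so the numbers printed above need not match these formulas directly. Notation (all of it assumed for illustration): $\mathrm{len}(a,b)$ is the length of the shortest path between Synsets $a$ and $b$, $L_{\max}$ is the longest such shortest path within the word category, $D$ is the maximum depth of the hierarchy, $\mathrm{lcs}(a,b)$ is the least common subsumer of $a$ and $b$, and $\mathrm{IC}(c) = -\log p(c)$ is the information content of a Synset $c$, estimated from corpus frequencies.

```latex
\begin{align*}
\text{Simple path:} \quad
  & \mathrm{sim}(a,b) = \frac{L_{\max} - \mathrm{len}(a,b)}{L_{\max}} \\
\text{Leacock and Chodorow:} \quad
  & \mathrm{sim}(a,b) = -\log \frac{\mathrm{len}(a,b)}{2D} \\
\text{Wu and Palmer:} \quad
  & \mathrm{sim}(a,b) = \frac{2\,\mathrm{depth}(\mathrm{lcs}(a,b))}
                             {\mathrm{depth}(a) + \mathrm{depth}(b)} \\
\text{Resnik:} \quad
  & \mathrm{sim}(a,b) = \mathrm{IC}(\mathrm{lcs}(a,b)) \\
\text{Lin:} \quad
  & \mathrm{sim}(a,b) = \frac{2\,\mathrm{IC}(\mathrm{lcs}(a,b))}
                             {\mathrm{IC}(a) + \mathrm{IC}(b)} \\
\text{Jiang and Conrath:} \quad
  & \mathrm{dist}(a,b) = \mathrm{IC}(a) + \mathrm{IC}(b)
                         - 2\,\mathrm{IC}(\mathrm{lcs}(a,b))
\end{align*}
```

Note that Jiang and Conrath is defined as a distance in this form, so smaller values mean more closely related Synsets, while the other five are similarities.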
Additional information
The author of the original Python source code is the University of Tübingen.
Libraries
- germanet_dart