germanet_dart - Dart API docs

Germanet_Dart is a Dart equivalent of the Python package germanetpy, an API for GermaNet, a German lexical-semantic network similar in functionality to WordNet.

Features

This package can do everything the Python package can do. You can look up Synsets (word meanings) for almost every adjective, noun, and verb. The package also uses Wiktionary data and ILI (Interlingual Index) records. Besides the lexical network, it provides algorithms for computing the semantic similarity between words (Synsets).

Getting started

First, you need the GermaNet dataset, which is owned and licensed by the University of Tübingen.

Install the package using

$ dart pub add germanet_dart

or using

$ flutter pub add germanet_dart

Usage

Import the Germanet class

import 'package:germanet_dart/src/germanet.dart';

Initialize the Germanet class

Germanet germanet = Germanet("PATH_TO_GERMANET_XML_FILES", loadDataDirectly: false);
await germanet.loadData(); //can take up to 10 seconds

Get the Synsets for a specific word (its orthographic form)

List<Synset> synsets = germanet.getSynsetsByOrthform("Haus");
print(synsets);

/*
result:
[Synset(id=s29522), lexunits={Sternzeichen,Sternbild,Tierkreiszeichen,Haus}), 
Synset(id=s74611), lexunits={Haus}), 
Synset(id=s24202), lexunits={Haus,Geschlecht,Dynastie,Familiendynastie,Familienclan}), 
Synset(id=s74612), lexunits={Haus}), 
Synset(id=s9439), lexunits={Haus})]
*/

Each Synset has one or more LexUnits (lexical units), the words that express the Synset's meaning. Each Synset stands for a different meaning (the word 'Haus' has 5 different meanings). The details of each element can be read from its fields:

List<Synset> s = germanet.getSynsetsByOrthform("Haus");
Synset s0 = s[0];
print(s0.lexunits.first.orthform);
print(s0.lexunits.first.examples);
print(s0.lexunits.first.sense);
print(s0.lexunits.first.source);
print(s0.word_class);
print(s0.category);

/*
result:
Sternzeichen //specific meaning
[] //no examples
1 //the sense number of the lexical unit
core //core is the origin (the core dataset)
WordClass.Motiv
WordCategory.nomen
*/
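
To make the relationship between words and Synsets more concrete, here is a minimal, self-contained sketch of why one word can return several Synsets. It does not use the germanet_dart API: ToySynset and buildIndex are hypothetical names invented for this illustration, and the three entries are copied from the 'Haus' output shown earlier. The package's real internals may work differently.

```dart
/// Hypothetical stand-in for a Synset: an id plus the orthforms of its LexUnits.
class ToySynset {
  final String id;
  final Set<String> orthforms;
  ToySynset(this.id, this.orthforms);
}

/// Builds a lookup table from each orthform to every Synset that contains it.
Map<String, List<ToySynset>> buildIndex(List<ToySynset> synsets) {
  final index = <String, List<ToySynset>>{};
  for (final synset in synsets) {
    for (final orthform in synset.orthforms) {
      index.putIfAbsent(orthform, () => []).add(synset);
    }
  }
  return index;
}

/// Three of the 'Haus' Synsets from the example output (toy subset).
List<ToySynset> sampleSynsets() => [
      ToySynset('s29522',
          {'Sternzeichen', 'Sternbild', 'Tierkreiszeichen', 'Haus'}),
      ToySynset('s24202',
          {'Haus', 'Geschlecht', 'Dynastie', 'Familiendynastie', 'Familienclan'}),
      ToySynset('s9439', {'Haus'}),
    ];

void main() {
  final index = buildIndex(sampleSynsets());
  // 'Haus' appears in all three toy Synsets, 'Dynastie' in only one.
  for (final synset in index['Haus'] ?? <ToySynset>[]) {
    print('${synset.id}: ${synset.orthforms.join(", ")}');
  }
  print(index['Dynastie']?.map((s) => s.id).toList());
}
```

An ambiguous word such as 'Haus' belongs to several Synsets, while each Synset groups the words that share one meaning.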

There are six semantic similarity measures, three path-based and three based on information content (IC):

// germanet is the loaded Germanet instance from above;
// a and b are the two Synset objects to compare, e.g. taken from getSynsetsByOrthform(...)
PathBasedRelatedness pathBasedRelatedness = PathBasedRelatedness(germanet, WordCategory.nomen);
print(SemRelMeasure.SimplePath.name+": "+pathBasedRelatedness.simple_path(a, b).toString());
print(SemRelMeasure.LeacockAndChodorow.name+": "+pathBasedRelatedness.leacock_chodorow(a, b).toString());
print(SemRelMeasure.WuAndPalmer.name+": "+pathBasedRelatedness.wu_and_palmer(a, b).toString());

ICBasedSimilarity icBasedSimilarity = ICBasedSimilarity(germanet, WordCategory.nomen, "PATH_TO_YOUR_CORPORA");
print(SemRelMeasure.Lin.name+": "+icBasedSimilarity.lin(a, b).toString());
print(SemRelMeasure.Resnik.name+": "+icBasedSimilarity.resnik(a, b).toString());
print(SemRelMeasure.JiangAndConrath.name+": "+icBasedSimilarity.jiang_and_conrath(a, b).toString());

/*
result:
Checking similarities...
SimplePath: 0.62857
LeacockAndChodorow: 0.47712
WuAndPalmer: 0.0
Lin: 0.0
Resnik: 0.0
JiangAndConrath: 2.5390760987927763
*/
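
As a rough illustration of what a path-based measure computes, the sketch below implements the textbook Wu and Palmer formula, 2 * depth(lcs) / (depth(a) + depth(b)), on a made-up hypernym tree. It is not the package's implementation: the tree, the node names, and all function names are assumptions for this example, and germanet_dart may normalize its scores differently.

```dart
/// Toy hypernym tree: each node maps to its parent; the root maps to null.
/// The nodes are invented for illustration and are not GermaNet data.
const Map<String, String?> parent = {
  'Entität': null,
  'Lebewesen': 'Entität',
  'Tier': 'Lebewesen',
  'Hund': 'Tier',
  'Katze': 'Tier',
  'Pflanze': 'Lebewesen',
};

/// Path from [node] up to the root, starting with [node] itself.
List<String> pathToRoot(String node) {
  final path = <String>[];
  String? current = node;
  while (current != null) {
    path.add(current);
    current = parent[current];
  }
  return path;
}

/// Depth counted in nodes, so the root has depth 1.
int depth(String node) => pathToRoot(node).length;

/// Lowest common subsumer: the deepest ancestor shared by [a] and [b].
String lowestCommonSubsumer(String a, String b) {
  final ancestorsOfB = pathToRoot(b).toSet();
  return pathToRoot(a).firstWhere(ancestorsOfB.contains);
}

/// Number of edges on the path between [a] and [b] through their LCS.
int pathLength(String a, String b) {
  final lcsDepth = depth(lowestCommonSubsumer(a, b));
  return (depth(a) - lcsDepth) + (depth(b) - lcsDepth);
}

/// Textbook Wu and Palmer similarity.
double wuAndPalmer(String a, String b) =>
    2 * depth(lowestCommonSubsumer(a, b)) / (depth(a) + depth(b));

void main() {
  print(lowestCommonSubsumer('Hund', 'Katze')); // Tier
  print(pathLength('Hund', 'Katze')); // 2
  print(wuAndPalmer('Hund', 'Katze')); // 0.75
}
```

The deeper the shared ancestor sits relative to the two nodes, the closer the score gets to 1. The IC-based measures follow the same idea but weight the shared ancestor by its information content in a corpus instead of its depth.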

Additional information

The original Python source code was written by the University of Tübingen.
