The main purpose of the library is to give developers interested in both the Dart language and data science a native Dart implementation of machine learning algorithms. The library targets the Dart VM, so for the smoothest experience, please do not use it in a browser.
The following algorithms are implemented:
Linear regression
Linear classifier
Nonparametric regression
An algorithm that performs linear binary classification.
LogisticRegressor.gradient. Logistic regression with gradient ascent optimization of the log-likelihood cost function. To use this kind of classifier, your data has to be linearly separable.
LogisticRegressor.coordinate. Not implemented yet. Logistic regression with coordinate descent optimization of the negated log-likelihood cost function. Coordinate descent allows feature selection (aka L1 regularization). To use this kind of classifier, your data has to be linearly separable.
An algorithm that performs linear multiclass classification.
SoftmaxRegressor.gradient. Softmax regression with gradient ascent optimization of the log-likelihood cost function. To use this kind of classifier, your data has to be linearly separable.
SoftmaxRegressor.coordinate. Not implemented yet. Softmax regression with coordinate descent optimization of the negated log-likelihood cost function. As in the case of logistic regression, coordinate descent allows feature selection (aka L1 regularization). To use this kind of classifier, your data has to be linearly separable.
LinearRegressor.gradient. A well-known algorithm that performs linear regression using the gradient vector of a cost function.
LinearRegressor.coordinate. An algorithm that uses coordinate descent to find the optimal value of a cost function. Coordinate descent allows feature selection to be performed along with the regression process (this technique is often called Lasso regression).
An algorithm that makes a prediction for each new observation based on the k closest observations from training data. It has quite high computational complexity, but at the same time it can easily capture nonlinear patterns in the data.
Let's classify records from a well-known dataset, the Pima Indians Diabetes Database, via logistic regression.
Import all necessary packages. First, make sure that you have the ml_preprocessing package in your dependencies:
dependencies:
  ml_preprocessing: ^3.2.0
We need this package to parse raw data so that we can use it further. For more details, please visit the ml_preprocessing repository page.
import 'dart:async';
import 'package:ml_algo/ml_algo.dart';
import 'package:ml_preprocessing/ml_preprocessing.dart';
Download the dataset from the Pima Indians Diabetes Database and read it (of course, you should provide a proper path to your downloaded file):
final data = DataFrame.fromCsv('datasets/pima_indians_diabetes_database.csv',
    labelName: 'class variable (0 or 1)');
// Normalize the matrix column-wise to reach computational stability and
// provide a uniform scale for all the values in the column
final features = (await data.features)
    .mapColumns((column) => column.normalize());
final labels = await data.labels;
The data in this file consists of 768 records and 8 features. The 9th column is the label column; it contains either 0 or 1 in each row. This column is our target: we should predict a class label for each observation. Therefore, we should point out where to get the label values. Let's use the labelName parameter for that (the label column name, 'class variable (0 or 1)' in our case).
Processed features and labels are contained in data structures of the Matrix type. To get more information about the Matrix type, please visit the ml_linalg repo.
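The column-wise normalization applied to the features above can be pictured with a small plain-Dart sketch. It assumes that each column is divided by its Euclidean norm; the norm actually used by column.normalize() may differ, and the helper name normalizeColumns is hypothetical, not part of ml_algo or ml_linalg.

```dart
import 'dart:math' as math;

/// Hypothetical helper: scales every column of [rows] to unit Euclidean length.
/// Columns whose norm is zero are left unchanged to avoid division by zero.
List<List<double>> normalizeColumns(List<List<double>> rows) {
  if (rows.isEmpty) return <List<double>>[];
  final columnCount = rows.first.length;
  // Accumulate the sum of squares of every column.
  final norms = List<double>.filled(columnCount, 0.0);
  for (final row in rows) {
    for (var j = 0; j < columnCount; j++) {
      norms[j] += row[j] * row[j];
    }
  }
  for (var j = 0; j < columnCount; j++) {
    norms[j] = math.sqrt(norms[j]);
  }
  // Divide every value by the norm of its column.
  return rows
      .map((row) => List<double>.generate(
          columnCount, (j) => norms[j] == 0 ? row[j] : row[j] / norms[j]))
      .toList();
}

void main() {
  final normalized = normalizeColumns([
    [3.0, 0.0],
    [4.0, 2.0],
  ]);
  print(normalized); // [[0.6, 0.0], [0.8, 1.0]]
}
```

After this step all columns live on a comparable scale, which is what keeps a constant learning rate usable across features.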
Then, we should create an instance of the CrossValidator class for fitting hyperparameters of our model:
final validator = CrossValidator.kFold(numberOfFolds: 5);
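For readers unfamiliar with k-fold cross validation, the splitting step can be sketched in plain Dart as follows. This is only an illustration of the idea: the function name kFoldIndices is hypothetical, and whether ml_algo shuffles observations or sizes its folds this way is an assumption, not something the library documentation above states.

```dart
/// Hypothetical helper: splits the indices 0..observations-1 into
/// [numberOfFolds] contiguous folds of near-equal size.
List<List<int>> kFoldIndices(int observations, int numberOfFolds) {
  if (numberOfFolds < 2 || numberOfFolds > observations) {
    throw ArgumentError('numberOfFolds must be between 2 and observations');
  }
  final folds = <List<int>>[];
  final baseSize = observations ~/ numberOfFolds;
  final remainder = observations % numberOfFolds;
  var start = 0;
  for (var fold = 0; fold < numberOfFolds; fold++) {
    // The first `remainder` folds receive one extra observation.
    final size = baseSize + (fold < remainder ? 1 : 0);
    final first = start;
    folds.add(List<int>.generate(size, (i) => first + i));
    start += size;
  }
  return folds;
}

void main() {
  // 768 observations, as in the diabetes dataset, split into 5 folds.
  final folds = kFoldIndices(768, 5);
  print(folds.map((fold) => fold.length).toList()); // [154, 154, 154, 153, 153]
}
```

Each fold serves once as the test set while the remaining folds form the training set, and the reported score is the average over all folds.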
Everything is set, so we can do our classification.
Evaluate our model via the accuracy metric:
final accuracy = validator.evaluate((trainFeatures, trainLabels) =>
    LogisticRegressor.gradient(
        trainFeatures, trainLabels,
        initialLearningRate: .8,
        iterationsLimit: 500,
        batchSize: 768,
        fitIntercept: true,
        interceptScale: .1,
        learningRateType: LearningRateType.constant),
    features, labels, MetricType.accuracy);
Let's print the score:
print('accuracy on classification: ${accuracy.toStringAsFixed(2)}');
We will see something like this:
accuracy on classification: 0.77
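The accuracy score printed above is simply the share of correctly predicted labels. A minimal plain-Dart sketch of that computation, with a hypothetical function name and made-up labels, might look like this:

```dart
/// Hypothetical helper: fraction of positions where the predicted label
/// equals the actual label.
double accuracyScore(List<int> predicted, List<int> actual) {
  if (predicted.length != actual.length || actual.isEmpty) {
    throw ArgumentError('Lists must be non-empty and of equal length');
  }
  var correct = 0;
  for (var i = 0; i < actual.length; i++) {
    if (predicted[i] == actual[i]) correct++;
  }
  return correct / actual.length;
}

void main() {
  // Three of the four predictions match the actual labels.
  final score = accuracyScore([1, 0, 1, 1], [1, 0, 0, 1]);
  print(score.toStringAsFixed(2)); // 0.75
}
```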
All the code above, put together:
import 'dart:async';
import 'package:ml_algo/ml_algo.dart';
import 'package:ml_preprocessing/ml_preprocessing.dart';
Future main() async {
  final data = DataFrame.fromCsv('datasets/pima_indians_diabetes_database.csv',
      labelName: 'class variable (0 or 1)');
  final features = (await data.features).mapColumns((column) => column.normalize());
  final labels = await data.labels;
  final validator = CrossValidator.kFold(numberOfFolds: 5);
  final accuracy = validator.evaluate((trainFeatures, trainLabels) =>
      LogisticRegressor.gradient(
          trainFeatures, trainLabels,
          initialLearningRate: .8,
          iterationsLimit: 500,
          batchSize: 768,
          fitIntercept: true,
          interceptScale: .1,
          learningRateType: LearningRateType.constant),
      features, labels, MetricType.accuracy);
  print('accuracy on classification: ${accuracy.toStringAsFixed(2)}');
}
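To make the idea behind gradient-based logistic regression more concrete, here is a minimal sketch of batch gradient ascent on the log-likelihood, written in plain Dart. It does not use ml_algo and is not the library's implementation; the names sigmoid, dot and fitLogistic, the default learning rate and the toy data are all assumptions made for illustration.

```dart
import 'dart:math' as math;

/// Maps a raw score to a probability in (0, 1).
double sigmoid(double z) => 1.0 / (1.0 + math.exp(-z));

/// Dot product of two equally sized vectors.
double dot(List<double> a, List<double> b) {
  var sum = 0.0;
  for (var i = 0; i < a.length; i++) {
    sum += a[i] * b[i];
  }
  return sum;
}

/// Hypothetical helper: fits weights by batch gradient ascent on the
/// log-likelihood of a logistic model.
List<double> fitLogistic(List<List<double>> x, List<double> y,
    {double learningRate = 0.1, int iterations = 1000}) {
  final weights = List<double>.filled(x.first.length, 0.0);
  for (var iteration = 0; iteration < iterations; iteration++) {
    final gradient = List<double>.filled(weights.length, 0.0);
    for (var i = 0; i < x.length; i++) {
      // Gradient of the log-likelihood: (label - predicted probability) * x.
      final error = y[i] - sigmoid(dot(weights, x[i]));
      for (var j = 0; j < weights.length; j++) {
        gradient[j] += error * x[i][j];
      }
    }
    for (var j = 0; j < weights.length; j++) {
      // Ascent step along the gradient averaged over all observations.
      weights[j] += learningRate * gradient[j] / x.length;
    }
  }
  return weights;
}

void main() {
  // Toy data: the first column is a constant 1.0 acting as the intercept term.
  final x = [
    [1.0, -1.5],
    [1.0, -1.0],
    [1.0, 1.0],
    [1.0, 1.5],
  ];
  final y = [0.0, 0.0, 1.0, 1.0];
  final weights = fitLogistic(x, y);
  final predictions =
      x.map((row) => sigmoid(dot(weights, row)) > 0.5 ? 1 : 0).toList();
  print(predictions); // [0, 0, 1, 1]
}
```

The sketch processes the whole dataset in every step, which corresponds to setting the batch size equal to the number of observations, as the example above does with batchSize: 768.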
Let's classify another famous dataset, the Iris dataset. The data in this csv is separated into 3 classes, so we need a different approach to data classification: softmax regression.
As usual, start with data preparation. Before we start, we should update our pubspec's dependencies with the xrange library:
dependencies:
  ...
  xrange: ^0.0.5
  ...
Download the file and read it:
final data = DataFrame.fromCsv('datasets/iris.csv',
  labelName: 'Species',
  columns: [ZRange.closed(1, 5)],
  categories: {
    'Species': CategoricalDataEncoderType.oneHot,
  },
);
final features = await data.features;
final labels = await data.labels;
The csv database has 6 columns, but we need to get rid of the first column, because it contains just the ID of every observation, which carries no useful information for classification. So, as you may notice, we provided a column range to exclude the ID column:
columns: [ZRange.closed(1, 5)]
Also, since the label column 'Species' has categorical data, we encoded it into a numerical format:
categories: {
  'Species': CategoricalDataEncoderType.oneHot,
},
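One-hot encoding replaces every category with a vector that has a single 1 at the position of that category and 0 everywhere else. The following plain-Dart sketch illustrates the idea only; the function name oneHotCodes and the label strings are hypothetical, and the order in which the library assigns positions to categories is an assumption.

```dart
/// Hypothetical helper: maps every distinct label to a one-hot vector.
/// Positions follow the order in which labels first appear.
Map<String, List<double>> oneHotCodes(Iterable<String> labels) {
  // toSet() on a list keeps first-occurrence order.
  final categories = labels.toSet().toList();
  final codes = <String, List<double>>{};
  for (var i = 0; i < categories.length; i++) {
    codes[categories[i]] = List<double>.generate(
        categories.length, (j) => j == i ? 1.0 : 0.0);
  }
  return codes;
}

void main() {
  // Illustrative label values, not necessarily the ones stored in iris.csv.
  final species = ['setosa', 'versicolor', 'setosa', 'virginica'];
  final codes = oneHotCodes(species);
  print(codes['setosa']); // [1.0, 0.0, 0.0]
  print(codes['versicolor']); // [0.0, 1.0, 0.0]
  print(codes['virginica']); // [0.0, 0.0, 1.0]
}
```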
Next step: create a cross validator instance:
final validator = CrossValidator.kFold(numberOfFolds: 5);
Evaluate the quality of prediction:
final accuracy = validator.evaluate((trainFeatures, trainLabels) =>
    LinearClassifier.softmaxRegressor(
        trainFeatures, trainLabels,
        initialLearningRate: 0.03,
        iterationsLimit: null,
        minWeightsUpdate: 1e-6,
        randomSeed: 46,
        learningRateType: LearningRateType.constant
    ), features, labels, MetricType.accuracy);
print('Iris dataset, softmax regression: accuracy is '
    '${accuracy.toStringAsFixed(2)}'); // It yields 0.93
All the code above, put together:
import 'dart:async';
import 'package:ml_algo/ml_algo.dart';
import 'package:ml_preprocessing/ml_preprocessing.dart';
import 'package:xrange/zrange.dart';
Future main() async {
  final data = DataFrame.fromCsv('datasets/iris.csv',
    labelName: 'Species',
    columns: [ZRange.closed(1, 5)],
    categories: {
      'Species': CategoricalDataEncoderType.oneHot,
    },
  );
  final features = await data.features;
  final labels = await data.labels;
  final validator = CrossValidator.kFold(numberOfFolds: 5);
  final accuracy = validator.evaluate((trainFeatures, trainLabels) =>
      LinearClassifier.softmaxRegressor(
          trainFeatures, trainLabels,
          initialLearningRate: 0.03,
          iterationsLimit: null,
          minWeightsUpdate: 1e-6,
          randomSeed: 46,
          learningRateType: LearningRateType.constant
      ), features, labels, MetricType.accuracy);
  print('Iris dataset, softmax regression: accuracy is '
      '${accuracy.toStringAsFixed(2)}');
}
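Softmax regression turns one raw score per class into a probability distribution over the classes. The core function can be sketched in plain Dart as shown below. This is an illustration under assumptions, not the library's code: the function name softmax and the sample scores are made up, and subtracting the maximum score is a common numerical-stability trick that the library may or may not use.

```dart
import 'dart:math' as math;

/// Hypothetical helper: converts raw class scores into probabilities
/// that are positive and sum to one.
List<double> softmax(List<double> scores) {
  // Subtracting the maximum keeps exp() from overflowing on large scores.
  final maxScore = scores.reduce(math.max);
  final exps = scores.map((s) => math.exp(s - maxScore)).toList();
  final sum = exps.reduce((a, b) => a + b);
  return exps.map((e) => e / sum).toList();
}

void main() {
  final probabilities = softmax([1.0, 2.0, 3.0]);
  print(probabilities.map((p) => p.toStringAsFixed(3)).toList());
  // [0.090, 0.245, 0.665]

  // The predicted class is the index with the highest probability.
  final best = probabilities.reduce(math.max);
  print(probabilities.indexOf(best)); // 2
}
```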
Let's do some prediction with a well-known nonparametric regression algorithm, k nearest neighbours. Let's take a well-known dataset, Boston housing.
As usual, import all necessary packages
import 'dart:async';
import 'package:ml_algo/ml_algo.dart';
import 'package:ml_preprocessing/ml_preprocessing.dart';
import 'package:xrange/zrange.dart';
and download and read the data
final data = DataFrame.fromCsv('lib/_datasets/housing.csv',
  headerExists: false,
  fieldDelimiter: ' ',
  labelIdx: 13,
);
As you can see, the dataset is headless, which means there is no descriptive line at the beginning of the file, so we can just use the index-based approach to point out where the outcomes column resides (index 13 in our case).
Extract features and labels
// As in the example above, normalize the matrix column-wise to reach computational stability and provide
// a uniform scale for all the values in the column
final features = (await data.features).mapColumns((column) => column.normalize());
final labels = await data.labels;
Create a cross-validator instance
final validator = CrossValidator.kFold(numberOfFolds: 5);
Let the k parameter be equal to 4.
Assess a knn regressor with the chosen k value using the MAPE metric
final error = validator.evaluate((trainFeatures, trainLabels) =>
    ParameterlessRegressor.knn(trainFeatures, trainLabels, k: 4), features, labels, MetricType.mape);
Let's print our error
print('MAPE error on k-fold validation: ${error.toStringAsFixed(2)}%'); // it yields approx. 6.18
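The two ingredients of this example, k nearest neighbours regression and the MAPE metric, can be sketched in plain Dart as follows. This is a simplified illustration rather than the library's implementation: the function names, the Euclidean distance, the unweighted average of the neighbours' outcomes and the toy data are all assumptions.

```dart
import 'dart:math' as math;

/// Euclidean distance between two equally sized points.
double distance(List<double> a, List<double> b) {
  var sum = 0.0;
  for (var i = 0; i < a.length; i++) {
    final d = a[i] - b[i];
    sum += d * d;
  }
  return math.sqrt(sum);
}

/// Hypothetical helper: predicts the outcome for [query] as the plain
/// average of the outcomes of its [k] closest training observations.
double knnPredict(List<List<double>> trainX, List<double> trainY,
    List<double> query, int k) {
  // Sort training indices by their distance to the query point.
  final order = List<int>.generate(trainX.length, (i) => i)
    ..sort((a, b) =>
        distance(trainX[a], query).compareTo(distance(trainX[b], query)));
  var sum = 0.0;
  for (final index in order.take(k)) {
    sum += trainY[index];
  }
  return sum / k;
}

/// Mean absolute percentage error, in percent.
double mape(List<double> predicted, List<double> actual) {
  var total = 0.0;
  for (var i = 0; i < actual.length; i++) {
    total += ((actual[i] - predicted[i]) / actual[i]).abs();
  }
  return 100 * total / actual.length;
}

void main() {
  final trainX = [
    [1.0],
    [2.0],
    [3.0],
    [10.0],
  ];
  final trainY = [10.0, 20.0, 30.0, 100.0];
  // The two closest points to 2.5 are 2.0 and 3.0, so the result is 25.0.
  final prediction = knnPredict(trainX, trainY, [2.5], 2);
  print(prediction); // 25.0
  // Against an actual value of 20.0 the error is |20 - 25| / 20 = 25%.
  print(mape([prediction], [20.0])); // 25.0
}
```

Sorting all training observations for every query is what makes the approach computationally expensive on large datasets.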
If you have questions, feel free to write me on
Changelog
ScoreToProbMapperFactory removed
ScoreToProbMapperType enum removed
ScoreToProbMapper: the entity renamed to LinkFunction
Predictor: fit method removed, fitting is happening while a model is being created
Predictor: interface replaced with Assessable, redundant properties removed
LinearClassifier reorganized
InterceptPreprocessor replaced with a helper function addInterceptIf
NoNParametricRegressor.nearestNeighbour: added possibility to specify the kernel function
NoNParametricRegressor class added
KNNRegressor class added
ml_linalg v9.0.0 supported
ml_linalg v7.0.0 support
normalize method used)
CategoricalDataEncoderType: one-hot encoding documentation corrected
LinearClassifier.logisticRegressor: numerical stability improved
LinearClassifier.logisticRegressor: probabilityThreshold parameter added
DataFrame.fromCsv: parameter fieldDelimiter added
DataFrame: labelName parameter added
ml_linalg v6.0.2 supported
Classifier: type of weightsByClasses changed from Map to Matrix
SoftmaxRegressor: more detailed unit tests for softmax regression added
DataFrame introduced (former MLData)
LinearClassifier.softmaxRegressor implemented
Metric interface refactored (getError renamed to getScore)
SoftmaxMapper added (aka Softmax activation function)
ConvergenceDetector added (this entity stops the optimizer when it is needed)
ml_algo entry
LinkFunction renamed to ScoreToProbMapper
ScoreToProbMapper accepts vector and returns vector instead of a scalar
MLData corrected
test_api.dart
Float32x4CsvMlData significantly extended
rows parameter added to Float32x4CsvMlData
ml_linalg removed from export file
datasets directory created
ml_linalg ^4.0.0 supported
dartfmt tool applied to all necessary files
ml_linalg 2.0.0 supported
dartfmt tool applied
ml_linalg supported
linalg package
fitIntercept and interceptScale parameters)
One versus all refactored, tests for logistic regression added
part ... part of directives removed
findMaxima and findMinima methods were added to Optimizer interface
README.md updated
README.md updated
simd_vector dependency url fixed
Float32x4Vector class was added (from dart_vector library)
List for label (target) list replaced with Float32List (in Predictor.train() and Optimizer.optimize())
Vector and enum Norm were extracted to separate library (https://github.com/gyrdym/dart_vector.git)
README.md was updated
Randomizer class was added
CrossValidation class
TypedVector -> Vector
new instantiation
KFoldCrossValidation)
LpoCrossValidation)
DataTrainTestSplitter was removed
copy, fill methods were added to Vector
Vector class was added as a base for typed and regular vector classes
README file was extended and clarified
BGDOptimizer, MBGDOptimizer and GradientOptimizer were added
OptimizerInterface was added
sum, abs, fromRange methods of the TypedVector were added
DataTrainTestSplitter was added
, *, / operators and all vectors methods added to the TypedVector
example/main.dart
import 'dart:async';
import 'package:ml_algo/ml_algo.dart';
import 'package:ml_linalg/matrix.dart';
/// A simple usage example using synthetic data. To see more complex examples,
/// please, visit other directories in this folder
Future main() async {
  // Let's create a feature matrix (a set of independent variables)
  final features = Matrix.fromList([
    [2.0, 3.0, 4.0, 5.0],
    [12.0, 32.0, 1.0, 3.0],
    [27.0, 3.0, 0.0, 59.0],
  ]);
  // Let's create dependent variables vector. It will be used as `true` values
  // to adjust regression coefficients
  final labels = Matrix.fromList([
    [4.3],
    [3.5],
    [2.1],
  ]);
  // Let's create a regressor itself and train it
  final regressor = LinearRegressor.gradient(
      features, labels,
      iterationsLimit: 100,
      initialLearningRate: 0.0005,
      learningRateType: LearningRateType.constant);
  // Let's see adjusted coefficients
  print('Regression coefficients: ${regressor.coefficients}');
}
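For intuition about what a gradient-based linear regressor does with such data, here is a minimal plain-Dart sketch of batch gradient descent on the mean squared error. It is not the code behind LinearRegressor.gradient; the function name fitLinear, the default learning rate, the iteration count and the toy data are assumptions chosen so that the result is easy to check by hand.

```dart
/// Hypothetical helper: fits linear regression coefficients by batch
/// gradient descent on the mean squared error.
List<double> fitLinear(List<List<double>> x, List<double> y,
    {double learningRate = 0.05, int iterations = 2000}) {
  final coefficients = List<double>.filled(x.first.length, 0.0);
  for (var iteration = 0; iteration < iterations; iteration++) {
    final gradient = List<double>.filled(coefficients.length, 0.0);
    for (var i = 0; i < x.length; i++) {
      var prediction = 0.0;
      for (var j = 0; j < coefficients.length; j++) {
        prediction += coefficients[j] * x[i][j];
      }
      final error = prediction - y[i];
      for (var j = 0; j < coefficients.length; j++) {
        // Derivative of the mean squared error with respect to coefficient j.
        gradient[j] += 2 * error * x[i][j] / x.length;
      }
    }
    for (var j = 0; j < coefficients.length; j++) {
      // Descent step: move against the gradient.
      coefficients[j] -= learningRate * gradient[j];
    }
  }
  return coefficients;
}

void main() {
  // Toy data generated from y = 1 + 2 * x; the first column is a constant
  // 1.0 that plays the role of the intercept term.
  final x = [
    [1.0, 0.0],
    [1.0, 1.0],
    [1.0, 2.0],
    [1.0, 3.0],
  ];
  final y = [1.0, 3.0, 5.0, 7.0];
  final coefficients = fitLinear(x, y);
  print(coefficients.map((c) => c.toStringAsFixed(2)).toList());
  // [1.00, 2.00]
}
```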
Add this to your package's pubspec.yaml file:
dependencies:
  ml_algo: ^12.0.2
You can install packages from the command line:
with pub:
$ pub get
with Flutter:
$ flutter pub get
Alternatively, your editor might support pub get or flutter pub get. Check the docs for your editor to learn more.
Now in your Dart code, you can use:
import 'package:ml_algo/ml_algo.dart';
Version  Uploaded
12.0.2   May 24, 2019
12.0.1   May 24, 2019
12.0.0   May 21, 2019
11.0.1   Apr 23, 2019
11.0.0   Apr 21, 2019
10.3.0   Apr 20, 2019
10.2.1   Apr 17, 2019
10.2.0   Apr 16, 2019
10.1.0   Apr 3, 2019
10.0.0   Mar 28, 2019
Popularity: 48. Describes how popular the package is relative to other packages.
Health: 99. Code health derived from static analysis.
Maintenance: 90. Reflects how tidy and up-to-date the package is.
Overall: 72. Weighted score of the above.
We analyzed this package on Jun 11, 2019, and provided a score, details, and suggestions below. Analysis was completed with status completed.
Detected platforms: Flutter, web, other
No platform restriction found in primary library package:ml_algo/ml_algo.dart.
Fix lib/src/regressor/linear_regressor.dart. (1 points)
Analysis of lib/src/regressor/linear_regressor.dart reported 2 hints:
line 21 col 3: Prefer using /// for doc comments.
line 86 col 3: Prefer using /// for doc comments.
Fix lib/src/optimizer/convergence_detector/convergence_detector.dart. (0.50 points)
Analysis of lib/src/optimizer/convergence_detector/convergence_detector.dart reported 1 hint:
line 1 col 1: Prefer using /// for doc comments.
Format lib/src/algorithms/knn/kernel.dart.
Run dartfmt to format lib/src/algorithms/knn/kernel.dart.
Fix additional 28 files with analysis or formatting issues.
Additional issues in the following files (run dartfmt to format each of them):
lib/src/algorithms/knn/kernel_type.dart
lib/src/algorithms/knn/knn.dart
lib/src/classifier/linear_classifier_mixin.dart
lib/src/classifier/logistic_regressor/gradient_logistic_regressor.dart
lib/src/classifier/logistic_regressor/logistic_regressor.dart
lib/src/classifier/softmax_regressor/gradient_softmax_regressor.dart
lib/src/classifier/softmax_regressor/softmax_regressor.dart
lib/src/cost_function/log_likelihood.dart
lib/src/cost_function/squared.dart
lib/src/helpers/add_intercept_if.dart
lib/src/helpers/get_probabilities.dart
lib/src/link_function/logit/float32_inverse_logit_link_function_mixin.dart
lib/src/link_function/logit/inverse_logit_link_function.dart
lib/src/link_function/softmax/float32_softmax_link_function_mixin.dart
lib/src/link_function/softmax/softmax_link_function.dart
lib/src/metric/regression/mape.dart
lib/src/model_selection/cross_validator/cross_validator.dart
lib/src/model_selection/cross_validator/cross_validator_impl.dart
lib/src/model_selection/data_splitter/k_fold.dart
lib/src/optimizer/coordinate/coordinate.dart
lib/src/optimizer/gradient/gradient.dart
lib/src/optimizer/optimizer.dart
lib/src/optimizer/optimizer_factory.dart
lib/src/optimizer/optimizer_factory_impl.dart
lib/src/regressor/coordinate_regressor.dart
lib/src/regressor/gradient_regressor.dart
lib/src/regressor/knn_regressor.dart
lib/src/regressor/parameterless_regressor.dart
The package description is too short. (10 points)
Add more detail to the description field of pubspec.yaml. Use 60 to 180 characters to describe the package, what it does, and its target use case.
Package            Constraint        Resolved  Available
Direct dependencies
Dart SDK           >=2.3.0 <3.0.0
ml_linalg          ^10.0.0           10.3.2
quiver             ^2.0.2            2.0.3
tuple              ^1.0.2            1.0.2
xrange             ^0.0.5            0.0.6
Transitive dependencies
matcher                              0.12.5
meta                                 1.1.7
path                                 1.6.2
stack_trace                          1.9.3
Dev dependencies
benchmark_harness  >=1.0.0 <2.0.0
build_runner       ^1.1.2
build_test         ^0.10.2
mockito            ^3.0.0
pedantic           1.1.0
test               ^1.2.0