ml_algo 4.3.1

Machine learning algorithms written in pure Dart, without bindings to any popular ML libraries.

Machine learning algorithms with Dart

Current state of the library

The main purpose of the library is to give developers interested in both the Dart language and data science a native Dart implementation of machine learning algorithms. The library targets the Dart VM, so for the smoothest experience, please do not use it in a browser.

The following algorithms are implemented:

  • Linear regression:

    • Gradient descent algorithm (batch, mini-batch, stochastic) with ridge regularization
    • Lasso regression (feature selection model)
  • Linear classifier:

    • Logistic regression (with "one-vs-all" multinomial classification)
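As a rough illustration of the first item, here is a minimal sketch of batch gradient descent for linear regression with a ridge (L2) penalty. It uses plain Dart lists rather than the library's own data types, and the function name `ridgeGradientDescent` and its parameters are assumptions made for this example, not part of the ml_algo API.

```dart
// Batch gradient descent for linear regression with an L2 (ridge) penalty.
// Hypothetical helper for illustration; it does not mirror the ml_algo API.
List<double> ridgeGradientDescent(
  List<List<double>> x,
  List<double> y, {
  double learningRate = 0.05,
  double lambda = 0.0,
  int iterations = 2000,
}) {
  final n = x.length;
  final d = x.first.length;
  final w = List<double>.filled(d, 0.0);
  for (var it = 0; it < iterations; it++) {
    // Average gradient of the squared error over the whole dataset.
    final grad = List<double>.filled(d, 0.0);
    for (var i = 0; i < n; i++) {
      var prediction = 0.0;
      for (var j = 0; j < d; j++) {
        prediction += w[j] * x[i][j];
      }
      final residual = prediction - y[i];
      for (var j = 0; j < d; j++) {
        grad[j] += residual * x[i][j] / n;
      }
    }
    for (var j = 0; j < d; j++) {
      // Gradient of (1/2n) * sum(residual^2) + (lambda/2) * ||w||^2.
      // For brevity the penalty is applied to every weight, including
      // the column of ones that plays the role of the intercept.
      w[j] -= learningRate * (grad[j] + lambda * w[j]);
    }
  }
  return w;
}

void main() {
  // y = 1 + 2 * x, with a leading column of ones acting as the intercept.
  final x = [
    [1.0, 0.0],
    [1.0, 1.0],
    [1.0, 2.0],
    [1.0, 3.0],
  ];
  final y = [1.0, 3.0, 5.0, 7.0];
  print(ridgeGradientDescent(x, y));
}
```

The mini-batch and stochastic variants differ only in how many rows contribute to `grad` before each weight update.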

Key entities of the library

To cover the main machine learning tasks, the library exposes the following classes:

  • Float32x4CsvMLData. A factory that creates instances of a CSV reader. The reader simplifies working with CSV data: you only need to point to where your dataset resides, and you get features and labels in a convenient, data-science-friendly format. As the name suggests, the reader converts data into sequences of numbers of the Float32x4 type, which speeds up the machine learning process.

  • Float32x4CrossValidator. A factory that creates instances of a cross validator. In short, this entity lets researchers tune the hyperparameters of machine learning algorithms by assessing prediction quality on different parts of a dataset. See the Wikipedia article on cross-validation.

  • LogisticRegressor. A class that performs the simplest linear classification. If you want to use this classifier on your data, please make sure that your data is linearly separable.

  • GradientRegressor. A class that performs geometry-based linear regression using the gradient vector of a cost function.

  • LassoRegressor. A class that performs feature selection along with regression. Instead of the gradient descent optimization and gradient vector used in GradientRegressor, it uses coordinate descent optimization and a subgradient vector. If you want to determine which features are less important, use this regressor.
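To make the idea behind lasso feature selection more concrete, here is a minimal sketch of coordinate descent with soft thresholding, the update that follows from the subgradient of the L1 penalty. The names `softThreshold` and `lassoCoordinateDescent` are hypothetical and chosen for this example only; how LassoRegressor is implemented internally is not shown here.

```dart
// Soft-thresholding operator: shrinks a value toward zero and clamps it to
// exactly zero when its magnitude does not exceed the threshold.
double softThreshold(double value, double threshold) {
  if (value > threshold) return value - threshold;
  if (value < -threshold) return value + threshold;
  return 0.0;
}

// Coordinate descent for (1/2n) * ||y - Xw||^2 + lambda * ||w||_1.
// Hypothetical helper for illustration; it does not mirror the ml_algo API.
List<double> lassoCoordinateDescent(
  List<List<double>> x,
  List<double> y, {
  double lambda = 0.1,
  int sweeps = 100,
}) {
  final n = x.length;
  final d = x.first.length;
  final w = List<double>.filled(d, 0.0);
  for (var sweep = 0; sweep < sweeps; sweep++) {
    for (var j = 0; j < d; j++) {
      var rho = 0.0; // correlation of feature j with the partial residual
      var norm = 0.0; // mean squared value of feature j
      for (var i = 0; i < n; i++) {
        // Residual computed without the contribution of feature j.
        var partial = y[i];
        for (var k = 0; k < d; k++) {
          if (k != j) partial -= w[k] * x[i][k];
        }
        rho += x[i][j] * partial / n;
        norm += x[i][j] * x[i][j] / n;
      }
      w[j] = norm == 0.0 ? 0.0 : softThreshold(rho, lambda) / norm;
    }
  }
  return w;
}

void main() {
  // The target depends only on the first feature; the second is weak noise.
  final x = [
    [1.0, 0.1],
    [2.0, -0.1],
    [3.0, 0.1],
    [4.0, -0.1],
  ];
  final y = [2.0, 4.0, 6.0, 8.0];
  print(lassoCoordinateDescent(x, y, lambda: 0.1));
}
```

Weights that end up at exactly zero correspond to features the model considers unimportant for the chosen `lambda`.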

Usage

Real life example

Let's classify records from a well-known dataset, the Pima Indians Diabetes Database, with the logistic regressor.

Import all necessary packages:

import 'dart:async';

import 'package:ml_algo/ml_algo.dart';

Read the CSV file pima_indians_diabetes_database.csv with test data. You can use the CSV from the library's datasets directory:

final data = Float32x4CsvMLData.fromFile('datasets/pima_indians_diabetes_database.csv');
final features = await data.features;
final labels = await data.labels;

The data in this file consists of 768 records and 8 features. Processed features are stored in a data structure of type MLMatrix<Float32x4>, and processed labels in a data structure of type MLVector<Float32x4>. For more information about these types, please visit the ml_linalg repo.
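To give a feel for why Float32x4 can speed things up, here is a minimal sketch of a SIMD dot product built only on `dart:typed_data`. The helpers `pack` and `simdDot` are hypothetical names for this example; they are not claimed to reflect how MLMatrix or MLVector store or process data internally.

```dart
import 'dart:typed_data';

// Packs a plain list into Float32x4 lanes, padding the tail with zeros.
Float32x4List pack(List<double> values) {
  final lanes = Float32x4List((values.length + 3) ~/ 4);
  for (var i = 0; i < lanes.length; i++) {
    double at(int k) => k < values.length ? values[k] : 0.0;
    lanes[i] =
        Float32x4(at(4 * i), at(4 * i + 1), at(4 * i + 2), at(4 * i + 3));
  }
  return lanes;
}

// Dot product over SIMD lanes: four float32 values are multiplied and
// accumulated per operation, then the four partial sums are added up.
double simdDot(Float32x4List a, Float32x4List b) {
  var acc = Float32x4.zero();
  for (var i = 0; i < a.length; i++) {
    acc += a[i] * b[i];
  }
  return acc.x + acc.y + acc.z + acc.w;
}

void main() {
  final a = pack([1.0, 2.0, 3.0, 4.0, 5.0]);
  final b = pack([2.0, 2.0, 2.0, 2.0, 2.0]);
  print(simdDot(a, b));
}
```

Note that the lanes hold 32-bit floats, so results can differ slightly from the same computation done with Dart's 64-bit doubles.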

Then we should create an instance of the CrossValidator class for fitting the hyperparameters of our model:

final validator = Float32x4CrossValidator.kFold(numberOfFolds: 7);
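For readers unfamiliar with k-fold cross validation, here is a minimal sketch of how a dataset's row indices can be split into folds. The function `kFoldIndices` is a hypothetical helper written for this example; the library's validator may split or shuffle the data differently. The numbers used are the 768 records of this dataset and the 7 folds from the complete listing.

```dart
// Splits the indices 0..n-1 into k contiguous folds of near-equal size.
// Hypothetical helper for illustration; it does not mirror the ml_algo API.
List<List<int>> kFoldIndices(int n, int k) {
  if (k < 2 || k > n) {
    throw ArgumentError('k must be between 2 and n');
  }
  final folds = <List<int>>[];
  final base = n ~/ k;
  final remainder = n % k;
  var start = 0;
  for (var fold = 0; fold < k; fold++) {
    // The first `remainder` folds receive one extra element.
    final size = base + (fold < remainder ? 1 : 0);
    final offset = start;
    folds.add(List<int>.generate(size, (i) => offset + i));
    start += size;
  }
  return folds;
}

void main() {
  const records = 768;
  final folds = kFoldIndices(records, 7);
  for (final testFold in folds) {
    // Every index outside the held-out fold is used for training.
    final trainSize = records - testFold.length;
    print('train: $trainSize, test: ${testFold.length}');
  }
}
```

Each fold serves as the test set once, and the metric is typically averaged over all folds.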

Everything is set up, so we can perform the classification. To fit the hyperparameter better, let's create a loop that tries each value of a chosen hyperparameter in a defined range:

final step = 0.001;
final limit = 0.6;
double minError = double.infinity;
double bestLearningRate = 0.0;
for (double rate = step; rate < limit; rate += step) {
  // ...
}

Let's create a logistic regression classifier instance with a stochastic gradient descent optimizer in the loop's body:

final logisticRegressor = LogisticRegressor(
        iterationLimit: 100,
        learningRate: rate,
        batchSize: 1,
        learningRateType: LearningRateType.constant,
        fitIntercept: true);
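To illustrate what these settings mean conceptually, here is a minimal sketch of binary logistic regression trained one sample at a time with a constant learning rate and a separate intercept. The functions `fitLogisticSgd` and `predict` are hypothetical; in particular, this sketch treats `iterationLimit` as a number of passes over the data and visits samples in a fixed order, which may differ from what LogisticRegressor does.

```dart
import 'dart:math';

double sigmoid(double z) => 1.0 / (1.0 + exp(-z));

// One-sample-per-update gradient descent for binary logistic regression.
// Hypothetical helper for illustration; it does not mirror the ml_algo API.
List<double> fitLogisticSgd(
  List<List<double>> x,
  List<int> y, {
  double learningRate = 0.1,
  int iterationLimit = 100,
}) {
  final d = x.first.length;
  // Index 0 holds the intercept; indices 1..d hold the feature weights.
  final w = List<double>.filled(d + 1, 0.0);
  for (var pass = 0; pass < iterationLimit; pass++) {
    for (var i = 0; i < x.length; i++) {
      var z = w[0];
      for (var j = 0; j < d; j++) {
        z += w[j + 1] * x[i][j];
      }
      // Gradient of the log loss for a single sample.
      final error = sigmoid(z) - y[i];
      w[0] -= learningRate * error;
      for (var j = 0; j < d; j++) {
        w[j + 1] -= learningRate * error * x[i][j];
      }
    }
  }
  return w;
}

// Returns class 1 when the predicted probability is at least 0.5.
int predict(List<double> w, List<double> features) {
  var z = w[0];
  for (var j = 0; j < features.length; j++) {
    z += w[j + 1] * features[j];
  }
  return sigmoid(z) >= 0.5 ? 1 : 0;
}

void main() {
  final x = [
    [-2.0],
    [-1.0],
    [1.0],
    [2.0],
  ];
  final y = [0, 0, 1, 1];
  final w = fitLogisticSgd(x, y);
  print([for (final row in x) predict(w, row)]);
}
```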

Evaluate the model with the accuracy metric:

final error = validator.evaluate(logisticRegressor, features, labels, MetricType.accuracy);
if (error < minError) {
  minError = error;
  bestLearningRate = rate;
}
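For reference, accuracy is usually defined as the fraction of correct predictions, so a higher value is better, while the error rate is its complement. The sketch below shows that relationship with a hypothetical `accuracy` helper. Whether `evaluate` returns an accuracy or an error rate for `MetricType.accuracy` is not stated here, so it is worth checking before relying on the minimization in the loop.

```dart
// Fraction of predictions that match the true labels.
// Hypothetical helper for illustration; it does not mirror the ml_algo API.
double accuracy(List<int> predicted, List<int> actual) {
  if (predicted.length != actual.length || predicted.isEmpty) {
    throw ArgumentError('lists must be non-empty and of equal length');
  }
  var correct = 0;
  for (var i = 0; i < predicted.length; i++) {
    if (predicted[i] == actual[i]) correct++;
  }
  return correct / predicted.length;
}

void main() {
  final score = accuracy([1, 0, 1, 1], [1, 0, 0, 1]);
  print('accuracy: $score, error rate: ${1 - score}');
}
```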

Let's print the score:

print('best error on classification: ${(minError * 100).toStringAsFixed(2)}');
print('best learning rate: ${bestLearningRate.toStringAsFixed(3)}');

The search for the best model parameters currently takes a long time, so be patient. After the search is over, we will see something like this:

best error on classification: 35.5%
best learning rate: 0.155

All the code above put together:

import 'dart:async';

import 'package:ml_algo/ml_algo.dart';

Future<double> logisticRegression() async {
  final data = Float32x4CsvMLData.fromFile('datasets/pima_indians_diabetes_database.csv');
  final features = await data.features;
  final labels = await data.labels;

  final validator = Float32x4CrossValidator.kFold(numberOfFolds: 7);

  final step = 0.001;
  final limit = 0.6;

  double minError = double.infinity;
  double bestLearningRate = 0.0;

  for (double rate = step; rate < limit; rate += step) {
    final logisticRegressor = LogisticRegressor(
        iterationLimit: 100,
        learningRate: rate,
        batchSize: 1,
        learningRateType: LearningRateType.constant,
        fitIntercept: true);
    final error = validator.evaluate(logisticRegressor, features, labels, MetricType.accuracy);
    if (error < minError) {
      minError = error;
      bestLearningRate = rate;
    }
  }

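  print('best error on classification: ${(minError * 100).toStringAsFixed(2)}');
  print('best learning rate: ${bestLearningRate.toStringAsFixed(3)}');

  // The function is declared as Future<double>, so return the best score.
  return minError;
}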

For more examples, please see the examples folder.

Contacts

If you have questions, feel free to write me on

Publisher

verified publisher: ml-algo.com

License

unknown (LICENSE)

Dependencies

csv, ml_linalg
