LogisticRegressor.SGD constructor

LogisticRegressor.SGD(
  DataFrame trainingData,
  String targetName, {
  required LearningRateType learningRateType,
  int iterationsLimit = iterationLimitDefaultValue,
  double initialLearningRate = initialLearningRateDefaultValue,
  double decay = decayDefaultValue,
  int dropRate = dropRateDefaultValue,
  double minCoefficientsUpdate = minCoefficientsUpdateDefaultValue,
  double probabilityThreshold = probabilityThresholdDefaultValue,
  double lambda = lambdaDefaultValue,
  bool fitIntercept = fitInterceptDefaultValue,
  double interceptScale = interceptScaleDefaultValue,
  InitialCoefficientsType initialCoefficientsType = initialCoefficientsTypeDefaultValue,
  num positiveLabel = positiveLabelDefaultValue,
  num negativeLabel = negativeLabelDefaultValue,
  bool collectLearningData = collectLearningDataDefaultValue,
  DType dtype = dTypeDefaultValue,
  Vector? initialCoefficients,
  int? seed,
})
Creates a LogisticRegressor instance based on the Stochastic Gradient Descent algorithm
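In standard logistic regression notation (a sketch; the symbols below are not taken from the library's internals), the classifier models the probability of the positive class as

$$P(y = 1 \mid x) = \sigma(w^\top x + b) = \frac{1}{1 + e^{-(w^\top x + b)}},$$

and stochastic gradient descent updates the coefficients using one randomly selected observation $(x_i, y_i)$ per iteration:

$$w_{t+1} = w_t - \eta_t \, \nabla_w J(w_t;\, x_i, y_i),$$

where $\eta_t$ is the learning rate governed by learningRateType and initialLearningRate.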
Parameters:
trainingData
Observations that will be used by the classifier to learn the
coefficients. Must contain a targetName column.
targetName
A string that serves as the name of the target column (the
column that contains class labels or outcomes for the associated
features).
learningRateType
A value defining a strategy for the learning rate
behaviour throughout the whole fitting process.
iterationsLimit
The maximum number of fitting iterations. Used as a convergence
criterion in the optimization algorithm. Default value is 100.
initialLearningRate
The initial value defining the velocity of convergence of the
gradient descent optimizer. Default value is 1e-3.
decay
A value defining the speed of the learning rate decrease. Applicable only
to the LearningRateType.timeBased, LearningRateType.stepBased, and
LearningRateType.exponential strategies.
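The strategies mentioned above are commonly defined as follows (textbook formulas given as a sketch; the library's exact definitions may differ):

$$\eta_t = \frac{\eta_0}{1 + d\,t} \;\;\text{(time-based)}, \qquad \eta_t = \eta_0\, d^{\lfloor (1 + t)/r \rfloor} \;\;\text{(step-based)}, \qquad \eta_t = \eta_0\, e^{-d\,t} \;\;\text{(exponential)},$$

where $\eta_0$ is initialLearningRate, $d$ is decay, $r$ is dropRate, and $t$ is the iteration number.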
dropRate
The number of learning iterations after which the learning rate
is decreased. The value is applicable only to the
LearningRateType.stepBased strategy; it is ignored for the other
learning rate strategies.
minCoefficientsUpdate
The minimum distance between coefficient vectors from
two consecutive iterations. Used as a convergence criterion in the
optimization algorithm: if the difference between the two vectors is small
enough, there is no reason to continue fitting. Default value is 1e-12.
probabilityThreshold
A probability threshold used to decide whether an observation belongs
to the positive class (see the positiveLabel parameter) or to the
negative class (see the negativeLabel parameter). The greater the
threshold, the stricter the classifier. Default value is 0.5.
lambda
A regularization coefficient. Used to prevent the regressor from
overfitting: the greater the value of lambda, the more the coefficients
of the equation of the predicting hyperplane are shrunk. An extremely
large lambda may shrink the coefficients to nothing, while a too small
lambda may lead to excessively large absolute values of the coefficients,
which is also undesirable.
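Since the constructor passes RegularizationType.L2 under the hood (see the Implementation section), lambda scales an L2 penalty on the coefficients. A common form of such a loss, given as a sketch (the exact scaling of the penalty term inside the library is an assumption):

$$J(w) = -\frac{1}{m} \sum_{i=1}^{m} \Bigl[ y_i \log \sigma(w^\top x_i) + (1 - y_i) \log\bigl(1 - \sigma(w^\top x_i)\bigr) \Bigr] + \lambda \lVert w \rVert_2^2$$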
seed
A seed value used to generate random indices for selecting rows from
trainingData. Specify this value if the classifier must produce the same
result on every training run.
fitIntercept
Whether or not to fit an intercept term. In 2-dimensional space, the
intercept is the bias of the line relative to the X-axis. Default value
is false.
interceptScale
A value defining the scale of the intercept.
initialCoefficientsType
Defines how the coefficients are autogenerated for the first
optimization iteration. By default, all the autogenerated coefficients
are equal to zero. If initialCoefficients are provided, this parameter
is ignored.
initialCoefficients
Coefficients to be used in the first iteration of the optimization
algorithm. initialCoefficients is a vector whose length must be equal
to the number of features in trainingData. In the case of logistic
regression, exactly one column of trainingData serves as the prediction
target, so the number of features equals the number of columns in
trainingData minus 1 (the target column). Keep in mind that if the
model fits an intercept term, initialCoefficients must contain an extra
element at the beginning of the vector denoting the intercept
coefficient.
positiveLabel
A value that will be used for the positive class label.
Default value is 1.
negativeLabel
A value that will be used for the negative class label.
Default value is 0.
collectLearningData
Whether or not to collect learning data, for instance the cost
function value per iteration. Significantly affects performance. If
collectLearningData is true, one may access the costPerIteration getter
to evaluate the learning process more thoroughly. Default value is
false.
dtype
A data type for all the numeric values used by the algorithm. Can
affect the performance and accuracy of the computations. Default value
is DType.float32.
Example:
import 'package:ml_algo/ml_algo.dart';
import 'package:ml_dataframe/ml_dataframe.dart';

void main() {
  final samples = getPimaIndiansDiabetesDataFrame().shuffle();
  final model = LogisticRegressor.SGD(
    samples,
    'Outcome',
    seed: 10,
    iterationsLimit: 50,
    initialLearningRate: 1e-4,
    learningRateType: LearningRateType.constant,
  );
}
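To inspect the fitting process, the collectLearningData flag described above can be combined with the costPerIteration getter; a sketch extending the example (same dataset and parameters):

```dart
import 'package:ml_algo/ml_algo.dart';
import 'package:ml_dataframe/ml_dataframe.dart';

void main() {
  final samples = getPimaIndiansDiabetesDataFrame().shuffle();
  final model = LogisticRegressor.SGD(
    samples,
    'Outcome',
    seed: 10,
    iterationsLimit: 50,
    initialLearningRate: 1e-4,
    learningRateType: LearningRateType.constant,
    // collecting learning data affects performance, so enable it
    // only when the cost dynamics are actually needed
    collectLearningData: true,
  );

  // costPerIteration is populated only when collectLearningData is true;
  // a (roughly) decreasing sequence indicates that fitting converges
  print(model.costPerIteration);
}
```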
Keep in mind that you need to select a proper learning rate strategy for
every particular model. For more details, refer to LearningRateType;
also consider the decay and dropRate parameters.
Implementation
factory LogisticRegressor.SGD(
  DataFrame trainingData,
  String targetName, {
  required LearningRateType learningRateType,
  int iterationsLimit = iterationLimitDefaultValue,
  double initialLearningRate = initialLearningRateDefaultValue,
  double decay = decayDefaultValue,
  int dropRate = dropRateDefaultValue,
  double minCoefficientsUpdate = minCoefficientsUpdateDefaultValue,
  double probabilityThreshold = probabilityThresholdDefaultValue,
  double lambda = lambdaDefaultValue,
  bool fitIntercept = fitInterceptDefaultValue,
  double interceptScale = interceptScaleDefaultValue,
  InitialCoefficientsType initialCoefficientsType =
      initialCoefficientsTypeDefaultValue,
  num positiveLabel = positiveLabelDefaultValue,
  num negativeLabel = negativeLabelDefaultValue,
  bool collectLearningData = collectLearningDataDefaultValue,
  DType dtype = dTypeDefaultValue,
  Vector? initialCoefficients,
  int? seed,
}) =>
    initLogisticRegressorModule().get<LogisticRegressorFactory>().create(
          trainData: trainingData,
          targetName: targetName,
          optimizerType: LinearOptimizerType.gradient,
          iterationsLimit: iterationsLimit,
          initialLearningRate: initialLearningRate,
          decay: decay,
          dropRate: dropRate,
          minCoefficientsUpdate: minCoefficientsUpdate,
          probabilityThreshold: probabilityThreshold,
          lambda: lambda,
          regularizationType: RegularizationType.L2,
          randomSeed: seed,
          batchSize: 1,
          fitIntercept: fitIntercept,
          interceptScale: interceptScale,
          isFittingDataNormalized: false,
          learningRateType: learningRateType,
          initialCoefficientsType: initialCoefficientsType,
          initialCoefficients: initialCoefficients ?? Vector.empty(dtype: dtype),
          positiveLabel: positiveLabel,
          negativeLabel: negativeLabel,
          collectLearningData: collectLearningData,
          dtype: dtype,
        );