LinearRegressor constructor
LinearRegressor(
  DataFrame fittingData,
  String targetName, {
  LinearOptimizerType optimizerType = LinearOptimizerType.closedForm,
  int iterationsLimit = iterationLimitDefaultValue,
  LearningRateType learningRateType = learningRateTypeDefaultValue,
  InitialCoefficientsType initialCoefficientsType = initialCoefficientsTypeDefaultValue,
  double initialLearningRate = initialLearningRateDefaultValue,
  double decay = decayDefaultValue,
  int dropRate = dropRateDefaultValue,
  double minCoefficientsUpdate = minCoefficientsUpdateDefaultValue,
  double lambda = lambdaDefaultValue,
  bool fitIntercept = fitInterceptDefaultValue,
  double interceptScale = interceptScaleDefaultValue,
  int batchSize = batchSizeDefaultValue,
  bool isFittingDataNormalized = isFittingDataNormalizedDefaultValue,
  bool collectLearningData = collectLearningDataDefaultValue,
  DType dtype = dTypeDefaultValue,
  RegularizationType? regularizationType,
  int? randomSeed,
  Matrix? initialCoefficients,
})
Parameters:
fittingData
A DataFrame with observations that the regressor uses to learn the
coefficients of the predicting hyperplane. Must contain the targetName
column.
targetName
The name of the target column, which contains the observation labels.
optimizerType
The optimization algorithm used to find the best coefficients. It also
determines which regularization type (L1 or L2) can be used to fit the
linear regressor. Default value is LinearOptimizerType.closedForm.
iterationsLimit
The maximum number of fitting iterations. Used as a convergence condition
in the optimization algorithm. Default value is 100.
initialLearningRate
The initial learning rate, which controls how quickly gradient
descent-based optimizers converge. Default value is 1e-3.
decay
The speed at which the learning rate decreases. Applicable only to the
LearningRateType.timeBased, LearningRateType.stepBased, and
LearningRateType.exponential strategies.
dropRate
The number of learning iterations after which the learning rate is
decreased. Applicable only to the LearningRateType.stepBased strategy; it
is ignored for other learning rate strategies.
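The sketch below illustrates how initialLearningRate, decay, and dropRate typically interact under the three decaying strategies. It uses common textbook formulas, which are an assumption here; the library's exact schedules may differ. The function names timeBasedRate, stepBasedRate, and exponentialRate are hypothetical and are not part of the library.

```dart
import 'dart:math' as math;

/// Textbook time-based decay (assumed formula, not taken from the library):
/// the rate shrinks as 1 / (1 + decay * iteration).
double timeBasedRate(double initialRate, double decay, int iteration) =>
    initialRate / (1 + decay * iteration);

/// Textbook step-based decay (assumed formula): the rate is multiplied by
/// `decay` once every `dropRate` iterations.
double stepBasedRate(
        double initialRate, double decay, int dropRate, int iteration) =>
    initialRate * math.pow(decay, iteration ~/ dropRate);

/// Textbook exponential decay (assumed formula): the rate shrinks as
/// exp(-decay * iteration).
double exponentialRate(double initialRate, double decay, int iteration) =>
    initialRate * math.exp(-decay * iteration);

void main() {
  const initialRate = 1e-3;
  for (final iteration in [0, 5, 10, 20]) {
    print('iteration $iteration: '
        'time=${timeBasedRate(initialRate, 0.5, iteration)} '
        'step=${stepBasedRate(initialRate, 0.5, 10, iteration)} '
        'exp=${exponentialRate(initialRate, 0.5, iteration)}');
  }
}
```

In this sketch dropRate only matters for the step-based schedule, which matches the note above that other strategies ignore it.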
minCoefficientsUpdate
The minimum distance between the coefficient vectors of two consecutive
iterations. Used as a convergence condition in the optimization algorithm:
if the difference between the two vectors is small enough, there is no
reason to continue fitting. Default value is 1e-12.
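To make the two stopping conditions concrete, here is a minimal sketch of a gradient descent loop that stops either when iterationsLimit is reached or when the coefficient update falls below minCoefficientsUpdate. It minimizes the toy function (w - 3)^2; the helpers distance and minimizeToy are hypothetical, and the use of Euclidean distance is an assumption rather than a statement about the library's internals.

```dart
import 'dart:math' as math;

/// Euclidean distance between two coefficient vectors of equal length.
double distance(List<double> a, List<double> b) {
  var sum = 0.0;
  for (var i = 0; i < a.length; i++) {
    final diff = a[i] - b[i];
    sum += diff * diff;
  }
  return math.sqrt(sum);
}

/// Minimizes (w - 3)^2 by gradient descent, starting from w = 0.
/// Stops when the iteration limit is hit or when two consecutive
/// coefficient vectors are closer than `minCoefficientsUpdate`.
List<double> minimizeToy({
  int iterationsLimit = 100,
  double minCoefficientsUpdate = 1e-12,
  double learningRate = 0.1,
}) {
  var coefficients = [0.0];
  for (var iteration = 0; iteration < iterationsLimit; iteration++) {
    final gradient = 2 * (coefficients[0] - 3);
    final updated = [coefficients[0] - learningRate * gradient];
    final update = distance(coefficients, updated);
    coefficients = updated;
    if (update < minCoefficientsUpdate) break; // converged early
  }
  return coefficients;
}

void main() {
  print('strict threshold: ${minimizeToy()}');
  print('loose threshold:  ${minimizeToy(minCoefficientsUpdate: 0.1)}');
}
```

A looser threshold ends the loop earlier and leaves the coefficient further from the optimum, which is the trade-off this parameter controls.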
lambda
The regularization coefficient. Used to prevent the regressor from
overfitting. The larger the value of lambda, the more strongly the
coefficients of the predicting hyperplane are regularized. An extremely
large lambda may shrink the coefficients to nothing, whereas a lambda that
is too small may lead to coefficients with overly large absolute values.
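The shrinking effect of lambda can be seen in a one-feature example. The sketch below assumes L2 (ridge) regularization without an intercept, where the closed-form solution is w = sum(x * y) / (sum(x * x) + lambda). The function ridgeCoefficient is hypothetical, and the library may scale lambda differently in its objective.

```dart
/// Closed-form ridge solution for a single feature without an intercept:
/// w = sum(x * y) / (sum(x * x) + lambda).
double ridgeCoefficient(List<double> x, List<double> y, double lambda) {
  var xy = 0.0;
  var xx = 0.0;
  for (var i = 0; i < x.length; i++) {
    xy += x[i] * y[i];
    xx += x[i] * x[i];
  }
  return xy / (xx + lambda);
}

void main() {
  final x = [1.0, 2.0, 3.0];
  final y = [2.0, 4.0, 6.0]; // generated as y = 2 * x
  for (final lambda in [0.0, 14.0, 100.0, 1e6]) {
    print('lambda=$lambda -> w=${ridgeCoefficient(x, y, lambda)}');
  }
}
```

With lambda equal to 0 the true slope of 2 is recovered; as lambda grows, the coefficient is pulled toward zero.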
regularizationType
The way the regressor's coefficients are regularized to prevent the model
from overfitting.
randomSeed
A seed value passed to the random value generator used by stochastic
optimizers. Ignored if the solver is not stochastic. Note that each time
you run a stochastic regressor with the same parameters but without
specifying randomSeed, you will get different results. To avoid this,
specify randomSeed.
batchSize
The number of rows used for fitting per iteration. Not applicable to all
optimizers. If a gradient-based optimizer is used: if batchSize == 1,
stochastic mode is activated; if 1 < batchSize < total number of rows,
mini-batch mode is activated; if batchSize == total number of rows,
full-batch mode is activated.
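The three cases above can be written as a small decision function. This is only a sketch of the rule as described; the BatchMode enum and batchModeFor function are hypothetical names, and throwing a RangeError for out-of-range values is an assumption, since the text does not say how such values are handled.

```dart
/// Hypothetical names for the three modes described above.
enum BatchMode { stochastic, miniBatch, fullBatch }

/// Maps a batch size to a mode for a data set with `totalRows` rows.
BatchMode batchModeFor(int batchSize, int totalRows) {
  // Assumption: values outside 1..totalRows are rejected.
  if (batchSize < 1 || batchSize > totalRows) {
    throw RangeError.range(batchSize, 1, totalRows, 'batchSize');
  }
  // For a single-row data set, stochastic and full-batch coincide;
  // this sketch reports it as stochastic.
  if (batchSize == 1) return BatchMode.stochastic;
  if (batchSize == totalRows) return BatchMode.fullBatch;
  return BatchMode.miniBatch;
}

void main() {
  for (final size in [1, 32, 100]) {
    print('batchSize=$size of 100 rows -> ${batchModeFor(size, 100)}');
  }
}
```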
fitIntercept
Whether or not to fit the intercept term. Default value is false. In
2-dimensional space, the intercept is the bias of the line (relative to
the X-axis).
interceptScale
A value defining the size of the intercept.
isFittingDataNormalized
Defines whether the fittingData is normalized or not. Normalization should
be performed column-wise. Normalized data may be required by some
optimizers (e.g., LinearOptimizerType.coordinate).
learningRateType
The strategy that governs how the learning rate behaves throughout the
whole fitting process.
initialCoefficientsType
Defines the coefficients that are autogenerated at the first iteration of
optimization. By default, all autogenerated coefficients are zero at the
start. If initialCoefficients are provided, this parameter is ignored.
initialCoefficients
The coefficients to use during the first iteration of the optimization
algorithm. The length of initialCoefficients should equal the number of
features in the fittingData.
collectLearningData
Whether or not to collect learning data, for instance the cost function
value at each iteration. This noticeably affects performance. If
collectLearningData is true, the costPerIteration getter can be used to
evaluate the learning process more thoroughly.
dtype
The data type for all numeric values used by the algorithm. May affect the
performance or accuracy of the computations. Default value is
DType.float32.
Implementation
factory LinearRegressor(
  DataFrame fittingData,
  String targetName, {
  LinearOptimizerType optimizerType = LinearOptimizerType.closedForm,
  int iterationsLimit = iterationLimitDefaultValue,
  LearningRateType learningRateType = learningRateTypeDefaultValue,
  InitialCoefficientsType initialCoefficientsType =
      initialCoefficientsTypeDefaultValue,
  double initialLearningRate = initialLearningRateDefaultValue,
  double decay = decayDefaultValue,
  int dropRate = dropRateDefaultValue,
  double minCoefficientsUpdate = minCoefficientsUpdateDefaultValue,
  double lambda = lambdaDefaultValue,
  bool fitIntercept = fitInterceptDefaultValue,
  double interceptScale = interceptScaleDefaultValue,
  int batchSize = batchSizeDefaultValue,
  bool isFittingDataNormalized = isFittingDataNormalizedDefaultValue,
  bool collectLearningData = collectLearningDataDefaultValue,
  DType dtype = dTypeDefaultValue,
  RegularizationType? regularizationType,
  int? randomSeed,
  Matrix? initialCoefficients,
}) =>
    initLinearRegressorModule().get<LinearRegressorFactory>().create(
      fittingData: fittingData,
      targetName: targetName,
      optimizerType: optimizerType,
      iterationsLimit: iterationsLimit,
      learningRateType: learningRateType,
      initialCoefficientsType: initialCoefficientsType,
      initialLearningRate: initialLearningRate,
      decay: decay,
      dropRate: dropRate,
      minCoefficientsUpdate: minCoefficientsUpdate,
      lambda: lambda,
      regularizationType: regularizationType,
      fitIntercept: fitIntercept,
      interceptScale: interceptScale,
      randomSeed: randomSeed,
      batchSize: batchSize,
      initialCoefficients: initialCoefficients,
      isFittingDataNormalized: isFittingDataNormalized,
      collectLearningData: collectLearningData,
      dtype: dtype,
    );