LinearRegressor constructor

LinearRegressor(
  DataFrame fittingData,
  String targetName, {
  LinearOptimizerType optimizerType = LinearOptimizerType.closedForm,
  int iterationsLimit = iterationLimitDefaultValue,
  LearningRateType learningRateType = learningRateTypeDefaultValue,
  InitialCoefficientsType initialCoefficientsType = initialCoefficientsTypeDefaultValue,
  double initialLearningRate = initialLearningRateDefaultValue,
  double decay = decayDefaultValue,
  int dropRate = dropRateDefaultValue,
  double minCoefficientsUpdate = minCoefficientsUpdateDefaultValue,
  double lambda = lambdaDefaultValue,
  bool fitIntercept = fitInterceptDefaultValue,
  double interceptScale = interceptScaleDefaultValue,
  int batchSize = batchSizeDefaultValue,
  bool isFittingDataNormalized = isFittingDataNormalizedDefaultValue,
  bool collectLearningData = collectLearningDataDefaultValue,
  DType dtype = dTypeDefaultValue,
  RegularizationType? regularizationType,
  int? randomSeed,
  Matrix? initialCoefficients,
})

Parameters:

fittingData A DataFrame with observations that is used by the regressor to learn the coefficients of the predicting hyperplane. Must contain the targetName column.

targetName The name of the target column containing the observation labels.
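A minimal usage sketch for the two required arguments. The dataset, column names, and printed getter shown here are illustrative assumptions, not part of this reference:

import 'package:ml_algo/ml_algo.dart';
import 'package:ml_dataframe/ml_dataframe.dart';

void main() {
  // A tiny, made-up dataset; the first row is treated as the header.
  final fittingData = DataFrame([
    ['feature_1', 'feature_2', 'price'],
    [1.0, 2.0, 10.0],
    [2.0, 3.0, 21.0],
    [3.0, 4.0, 29.0],
    [4.0, 5.0, 41.0],
  ]);

  // 'price' is the target column; it must be present in fittingData.
  final regressor = LinearRegressor(fittingData, 'price');

  // The learned coefficients of the predicting hyperplane.
  print(regressor.coefficients);
}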

optimizerType Defines the optimization algorithm that will be used to find the best coefficients. It also determines which regularization type (L1 or L2) may be used while learning the regressor. Default value is LinearOptimizerType.closedForm.
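For instance, one may switch from the default closed-form solution to gradient descent. This sketch reuses the fittingData and the 'price' target from the example above; the concrete limits and rates are arbitrary assumptions:

final regressor = LinearRegressor(
  fittingData,
  'price',
  optimizerType: LinearOptimizerType.gradient,
  iterationsLimit: 200,
  initialLearningRate: 1e-2,
);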

iterationsLimit The maximum number of fitting iterations. Used as a convergence condition in the optimization algorithm. Default value is 100.

initialLearningRate The initial learning rate, which defines how fast gradient descent-based optimizers converge. Default value is 1e-3.

decay The value that defines how quickly the learning rate decreases. Applicable only to the LearningRateType.timeBased, LearningRateType.stepBased, and LearningRateType.exponential strategies.

dropRate The number of learning iterations after which the learning rate will be decreased. Applicable only to the LearningRateType.stepBased strategy; it is ignored for other learning rate strategies.
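A sketch combining decay and dropRate with the step-based strategy, again reusing fittingData and 'price' from the first example; the concrete numbers are arbitrary assumptions:

final regressor = LinearRegressor(
  fittingData,
  'price',
  optimizerType: LinearOptimizerType.gradient,
  learningRateType: LearningRateType.stepBased,
  initialLearningRate: 1e-2,
  decay: 0.5,   // halve the learning rate on every drop
  dropRate: 10, // drop the learning rate every 10 iterations
);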

minCoefficientsUpdate The minimum distance between the coefficient vectors of two consecutive iterations. Used as a convergence condition in the optimization algorithm: if the difference between the two vectors is small enough, there is no reason to continue fitting. Default value is 1e-12.

lambda The regularization coefficient. Used to prevent the regressor from overfitting. The greater the value of lambda, the more the coefficients of the predicting hyperplane are regularized. An extremely large lambda may shrink the coefficients to nothing, whereas a lambda that is too small may result in coefficients with overly large absolute values.

regularizationType The way the coefficients of the regressor are regularized to prevent the model from overfitting.
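A sketch of L2 regularization combined with a gradient-based optimizer, reusing fittingData and 'price' from the first example; the lambda value is an arbitrary assumption:

final regressor = LinearRegressor(
  fittingData,
  'price',
  optimizerType: LinearOptimizerType.gradient,
  regularizationType: RegularizationType.L2,
  lambda: 0.1,
);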

randomSeed A seed value that will be passed to the random value generator used by stochastic optimizers. Ignored if the solver is not stochastic. Keep in mind that each time you run a stochastic regressor with the same parameters but with an unspecified randomSeed, you will receive different results. To avoid this, define randomSeed.

batchSize The amount of data (in rows) that will be used for fitting per iteration. Not applicable to all optimizers. If a gradient-based optimizer is used: batchSize == 1 activates stochastic mode; 1 < batchSize < total number of rows activates mini-batch mode; batchSize == total number of rows activates full-batch mode.
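A sketch of stochastic mode (batchSize of 1) with a fixed seed for reproducibility, reusing fittingData and 'price' from the first example; the seed value is arbitrary:

final regressor = LinearRegressor(
  fittingData,
  'price',
  optimizerType: LinearOptimizerType.gradient,
  batchSize: 1,   // stochastic gradient descent
  randomSeed: 10, // makes repeated runs reproducible
);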

fitIntercept Whether or not to fit the intercept term. Default value is false. In 2-dimensional space the intercept is the bias of the line (relative to the X-axis).

interceptScale A value defining the scale of the intercept.
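A sketch of fitting with an intercept term, reusing fittingData and 'price' from the first example; the interceptScale value of 1.0 is simply an assumed common choice:

final regressor = LinearRegressor(
  fittingData,
  'price',
  fitIntercept: true,
  interceptScale: 1.0,
);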

isFittingDataNormalized Defines whether the fittingData is normalized or not. Normalization should be performed column-wise. Normalized data may be required by some optimizers (e.g., for LinearOptimizerType.coordinate).

learningRateType A value defining a strategy for the learning rate behaviour throughout the whole fitting process.

initialCoefficientsType Defines how the coefficients are autogenerated for the first iteration of optimization. By default, all the autogenerated coefficients are zeroes. If initialCoefficients are provided, this parameter is ignored.

initialCoefficients Coefficients to be used during the first iteration of the optimization algorithm. initialCoefficients should have a length equal to the number of features in the fittingData.
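A sketch of passing explicit initial coefficients, reusing fittingData and 'price' from the first example. It assumes fittingData has exactly two feature columns, uses Matrix from the ml_linalg package, and the column-matrix shape and values shown are assumptions for illustration:

import 'package:ml_linalg/matrix.dart';

// One value per feature, laid out as a column matrix (assumed shape).
final initialCoefficients = Matrix.fromList([
  [0.5],
  [0.5],
]);

final regressor = LinearRegressor(
  fittingData,
  'price',
  optimizerType: LinearOptimizerType.gradient,
  initialCoefficients: initialCoefficients,
);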

collectLearningData Whether or not to collect learning data, for instance the cost function value at each iteration. Note that this significantly affects performance. If collectLearningData is true, one may access the costPerIteration getter in order to evaluate the learning process more thoroughly.
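A sketch of inspecting the collected learning data via the costPerIteration getter mentioned above, reusing fittingData and 'price' from the first example:

final regressor = LinearRegressor(
  fittingData,
  'price',
  optimizerType: LinearOptimizerType.gradient,
  collectLearningData: true,
);

// Cost function values, one per iteration, e.g. to check convergence.
print(regressor.costPerIteration);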

dtype The data type for all the numeric values used by the algorithm. Can affect the performance or accuracy of the computations. Default value is DType.float32.
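A sketch of switching to double precision when accuracy matters more than speed, reusing fittingData and 'price' from the first example; DType comes from the ml_linalg package:

import 'package:ml_linalg/dtype.dart';

final regressor = LinearRegressor(
  fittingData,
  'price',
  dtype: DType.float64,
);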

Implementation

factory LinearRegressor(
  DataFrame fittingData,
  String targetName, {
  LinearOptimizerType optimizerType = LinearOptimizerType.closedForm,
  int iterationsLimit = iterationLimitDefaultValue,
  LearningRateType learningRateType = learningRateTypeDefaultValue,
  InitialCoefficientsType initialCoefficientsType =
      initialCoefficientsTypeDefaultValue,
  double initialLearningRate = initialLearningRateDefaultValue,
  double decay = decayDefaultValue,
  int dropRate = dropRateDefaultValue,
  double minCoefficientsUpdate = minCoefficientsUpdateDefaultValue,
  double lambda = lambdaDefaultValue,
  bool fitIntercept = fitInterceptDefaultValue,
  double interceptScale = interceptScaleDefaultValue,
  int batchSize = batchSizeDefaultValue,
  bool isFittingDataNormalized = isFittingDataNormalizedDefaultValue,
  bool collectLearningData = collectLearningDataDefaultValue,
  DType dtype = dTypeDefaultValue,
  RegularizationType? regularizationType,
  int? randomSeed,
  Matrix? initialCoefficients,
}) =>
    initLinearRegressorModule().get<LinearRegressorFactory>().create(
          fittingData: fittingData,
          targetName: targetName,
          optimizerType: optimizerType,
          iterationsLimit: iterationsLimit,
          learningRateType: learningRateType,
          initialCoefficientsType: initialCoefficientsType,
          initialLearningRate: initialLearningRate,
          decay: decay,
          dropRate: dropRate,
          minCoefficientsUpdate: minCoefficientsUpdate,
          lambda: lambda,
          regularizationType: regularizationType,
          fitIntercept: fitIntercept,
          interceptScale: interceptScale,
          randomSeed: randomSeed,
          batchSize: batchSize,
          initialCoefficients: initialCoefficients,
          isFittingDataNormalized: isFittingDataNormalized,
          collectLearningData: collectLearningData,
          dtype: dtype,
        );