DecisionTreeClassifier constructor

DecisionTreeClassifier(
  DataFrame trainData,
  String targetName, {
  num minError = 0.5,
  int minSamplesCount = 1,
  int maxDepth = 10,
  DType dtype = dTypeDefaultValue,
  TreeAssessorType assessorType = TreeAssessorType.gini,
})

Parameters:

trainData A DataFrame with observations that will be used to build a decision tree. Must contain the targetName column.

targetName The name of the column in trainData that contains class labels.

minError A value in the range 0..1 (both inclusive). The value is the minimal error on a single decision tree node and is used as a stop criterion to avoid further node splitting: if the node is already good enough, there is no need to split it, so it becomes a leaf.

minSamplesCount The minimal number of samples (observations) in a decision tree node. The value is used as a stop criterion to avoid further node splitting: if the node contains minSamplesCount observations or fewer, the node becomes a leaf.

maxDepth A maximum number of decision tree levels.

assessorType Defines the assessment type that is applied to the data to decide how to split a subset while building the tree. Default value is TreeAssessorType.gini.

Possible values of assessorType:

TreeAssessorType.gini The algorithm decides how to split a subset of data based on the Gini index.

TreeAssessorType.majority The algorithm decides how to split a subset of data based on the majority class.
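
To make the parameters above more concrete, here is a rough, self-contained sketch of the textbook quantities they refer to. It is not the library's internal code: giniImpurity, majorityError and shouldBecomeLeaf are hypothetical helper names, the error definitions are the standard textbook ones, and the inclusive comparisons for minError and maxDepth are assumptions (only the minSamplesCount comparison is stated in the descriptions above).

```dart
// Counts how many times each class label occurs.
Map<int, int> _countLabels(List<int> labels) {
  final counts = <int, int>{};
  for (final label in labels) {
    counts[label] = (counts[label] ?? 0) + 1;
  }
  return counts;
}

// Textbook Gini impurity: 1 - sum(p_k^2) over class proportions p_k.
// Assumption: this only illustrates the idea behind TreeAssessorType.gini.
double giniImpurity(List<int> labels) {
  if (labels.isEmpty) return 0.0;
  var sumOfSquares = 0.0;
  for (final count in _countLabels(labels).values) {
    final p = count / labels.length;
    sumOfSquares += p * p;
  }
  return 1.0 - sumOfSquares;
}

// Share of samples that do not belong to the most frequent class.
// Assumption: this only illustrates the idea behind TreeAssessorType.majority.
double majorityError(List<int> labels) {
  if (labels.isEmpty) return 0.0;
  final majorityCount =
      _countLabels(labels).values.reduce((a, b) => a > b ? a : b);
  return 1.0 - majorityCount / labels.length;
}

// Hypothetical stop check combining the three stop criteria.
// The defaults mirror the constructor defaults documented above.
bool shouldBecomeLeaf({
  required double nodeError,
  required int samplesCount,
  required int depth,
  num minError = 0.5,
  int minSamplesCount = 1,
  int maxDepth = 10,
}) =>
    nodeError <= minError ||
    samplesCount <= minSamplesCount ||
    depth >= maxDepth;

void main() {
  print(giniImpurity([0, 0, 1, 1])); // 0.5
  print(majorityError([0, 0, 0, 1])); // 0.25
}
```

Under these assumptions, lowering minError or raising maxDepth lets the tree grow deeper, while raising minSamplesCount stops splitting earlier.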

Implementation

factory DecisionTreeClassifier(
  DataFrame trainData,
  String targetName, {
  num minError = 0.5,
  int minSamplesCount = 1,
  int maxDepth = 10,
  DType dtype = dTypeDefaultValue,
  TreeAssessorType assessorType = TreeAssessorType.gini,
}) =>
    initDecisionTreeModule().get<DecisionTreeClassifierFactory>().create(
          trainData,
          targetName,
          dtype,
          minError,
          minSamplesCount,
          maxDepth,
          assessorType,
        );
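
A minimal usage sketch of this factory follows. It assumes that package:ml_algo/ml_algo.dart exports DecisionTreeClassifier and TreeAssessorType, that package:ml_dataframe/ml_dataframe.dart exports DataFrame, and that the resulting classifier exposes a predict method accepting a DataFrame of features. The column names and the toy data are made up for illustration.

```dart
import 'package:ml_algo/ml_algo.dart';
import 'package:ml_dataframe/ml_dataframe.dart';

void main() {
  // Toy training data: the first row is the header, and 'label' is the
  // target column holding the class labels.
  final trainData = DataFrame([
    ['feature_1', 'feature_2', 'label'],
    [1.0, 2.0, 0],
    [1.2, 1.8, 0],
    [5.0, 6.0, 1],
    [5.5, 6.5, 1],
  ]);

  // Positional arguments come first; everything else is a named parameter
  // and falls back to the defaults listed in the signature when omitted.
  final classifier = DecisionTreeClassifier(
    trainData,
    'label',
    minError: 0.3,
    minSamplesCount: 2,
    maxDepth: 4,
    assessorType: TreeAssessorType.gini,
  );

  // Unlabelled observations with the same feature columns as the training data.
  final unlabelled = DataFrame([
    ['feature_1', 'feature_2'],
    [1.1, 2.1],
  ]);

  print(classifier.predict(unlabelled));
}
```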