confused about random_state in decision tree of scikit learn

This is explained in the documentation: the problem of learning an optimal decision tree is known to be NP-complete under several aspects of optimality and even for simple concepts. Consequently, practical decision-tree learning algorithms are based on heuristic algorithms such as the greedy algorithm, where locally optimal decisions are made at each node. Such algorithms … Read more
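
For illustration, a minimal sketch (using the iris dataset purely as stand-in data) of how fixing `random_state` makes the greedy tree construction reproducible:

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Two trees built with the same seed break ties between equally good splits
# the same way, so they end up identical.
tree_a = DecisionTreeClassifier(random_state=0).fit(X, y)
tree_b = DecisionTreeClassifier(random_state=0).fit(X, y)
print((tree_a.predict(X) == tree_b.predict(X)).all())  # True

# Without a fixed seed, tie-breaking (and any feature subsampling) can differ
# from run to run, so the learned tree may vary.
```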

What does `sample_weight` do to the way a `DecisionTreeClassifier` works in sklearn?

Some quick preliminaries: let’s say we have a classification problem with K classes. In a region of feature space represented by a node of the decision tree, recall that the “impurity” of the region is measured by quantifying its inhomogeneity, using the class probabilities in that region. Normally, we estimate: Pr(Class=k) = #(examples … Read more
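
A rough sketch of the idea (assuming Gini impurity; `weighted_gini` is a hypothetical helper, not a scikit-learn function): with `sample_weight`, the class probabilities are estimated from summed weights rather than raw example counts.

```python
import numpy as np

def weighted_gini(y, sample_weight):
    """Gini impurity of one node, with per-example weights replacing raw counts."""
    classes = np.unique(y)
    total = sample_weight.sum()
    # Pr(Class=k) estimated from summed weights instead of example counts.
    p = np.array([sample_weight[y == k].sum() / total for k in classes])
    return 1.0 - np.sum(p ** 2)

y = np.array([0, 0, 1, 1, 1])
unweighted = weighted_gini(y, np.ones_like(y, dtype=float))          # 0.48
upweighted = weighted_gini(y, np.array([3.0, 3.0, 1.0, 1.0, 1.0]))   # ~0.44, class 0 counts triple
print(unweighted, upweighted)
```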

R: Obtaining Rules from a Function

This isn’t my area of expertise, but perhaps this function (from https://www.togaware.com/datamining/survivor/Convert_Tree.html) will do what you want: library(rpart) car.test.frame$Reliability = as.factor(car.test.frame$Reliability) z.auto <- rpart(Reliability ~ ., car.test.frame) plot(z.auto, margin = 0.25) text(z.auto, pretty = TRUE, cex = 0.8, splits = TRUE, use.n = TRUE, all = FALSE) list.rules.rpart <- function(model) { if (!inherits(model, … Read more

Why is Random Forest with a single tree much better than a Decision Tree classifier?

Isn’t a random forest with a single estimator just a decision tree? Well, this is a good question, and the answer turns out to be no; the Random Forest algorithm is more than a simple bag of individually grown decision trees. Apart from the randomness induced by ensembling many trees, the Random Forest (RF) algorithm also … Read more
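
A rough illustration (exact defaults depend on the scikit-learn version): even with a single estimator, a random forest bootstraps the training set and may restrict the features considered at each split, unlike a plain decision tree.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

tree = DecisionTreeClassifier(random_state=0).fit(X, y)
forest_1 = RandomForestClassifier(n_estimators=1, random_state=0).fit(X, y)

# To actually mimic a single decision tree, the extra sources of randomness
# have to be switched off explicitly:
forest_like_tree = RandomForestClassifier(
    n_estimators=1, bootstrap=False, max_features=None, random_state=0
).fit(X, y)
```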

How to extract the decision rules from scikit-learn decision-tree?

I believe that this answer is more correct than the other answers here: from sklearn.tree import _tree def tree_to_code(tree, feature_names): tree_ = tree.tree_ feature_name = [ feature_names[i] if i != _tree.TREE_UNDEFINED else "undefined!" for i in tree_.feature ] print("def tree({}):".format(", ".join(feature_names))) def recurse(node, depth): indent = "  " * depth if tree_.feature[node] != _tree.TREE_UNDEFINED: … Read more
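
If a hand-rolled function isn’t required, recent scikit-learn versions (0.21+) also provide `sklearn.tree.export_text`, which prints the rules of a fitted tree directly; a minimal sketch, again using iris as stand-in data:

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
clf = DecisionTreeClassifier(max_depth=2, random_state=0).fit(iris.data, iris.target)

# export_text walks the fitted tree and prints one indented rule per split.
print(export_text(clf, feature_names=list(iris.feature_names)))
```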