In this lecture, we will discuss how overfitting occurs with decision trees and how it can be avoided. After this video, you will be able to discuss overfitting in the context of decision tree models, explain how overfitting is addressed in decision tree induction, and define pre-pruning and post-pruning.

In our lecture on decision trees, we discussed that during the construction of a decision tree, also referred to as tree induction, the tree repeatedly splits the data in a node in order to get successively purer subsets of data. Note that a decision tree classifier can potentially expand its nodes until it perfectly classifies the samples in the training data. But if the tree grows nodes to fit the noise in the training data, then it will not classify a new sample well. This is because the tree has partitioned the input space according to the noise in the data instead of the true structure of the data. In other words, it has overfit.

How can overfitting be avoided in decision trees? There are two ways. One is to stop growing the tree before it is fully grown to perfectly fit the training data. This is referred to as pre-pruning. The other way is to grow the tree to its maximum size and then prune it back by removing parts of the tree. This is referred to as post-pruning.

In general, overfitting occurs because the model is too complex. For a decision tree model, model complexity is determined by the number of nodes in the tree, so addressing overfitting in decision trees means controlling the number of nodes. Both pruning methods control the growth of the tree and, consequently, the complexity of the resulting model.

With pre-pruning, the idea is to stop tree induction before a fully grown tree is built that perfectly fits the training data. To do this, restrictive stopping conditions for growing nodes must be used. For example, a node stops expanding if the number of samples in the node is less than some minimum threshold. Another example is to stop expanding a node if the improvement in the impurity measure falls below a certain threshold.

In post-pruning, the tree is grown to its maximum size, and then the tree is pruned by removing nodes using a bottom-up approach. That is, the tree is trimmed starting with the leaf nodes. The pruning is done by replacing a subtree with a leaf node if this improves the generalization error, or if there is no change to the generalization error with this replacement. In other words, if removing a subtree does not have a negative effect on the generalization error, then the nodes in that subtree only add to the complexity of the tree and not to its overall performance, so those nodes should be removed.

In practice, post-pruning tends to give better results. This is because pruning decisions are based on information from the full tree. Pre-pruning, on the other hand, may stop the tree-growing process prematurely. However, post-pruning is more computationally expensive, since the tree has to be expanded to its full size.

In summary, to address overfitting in decision trees, tree pruning is used. There are two pruning methods, pre-pruning and post-pruning. Both methods control the complexity of the tree model.
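
To make the two pruning strategies concrete, here is a minimal sketch using scikit-learn's DecisionTreeClassifier; it is not part of the lecture, and the dataset, thresholds, and choice of pruning strength are illustrative assumptions. Note that scikit-learn's built-in post-pruning is minimal cost-complexity pruning (the ccp_alpha parameter), a specific post-pruning strategy rather than the generalization-error test described above.

```python
# Sketch: pre-pruning vs. post-pruning with scikit-learn.
# Dataset and parameter values are illustrative assumptions, not recommendations.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Pre-pruning: restrictive stopping conditions keep nodes from growing
# to fit noise (minimum samples per leaf, minimum impurity improvement).
pre_pruned = DecisionTreeClassifier(
    min_samples_leaf=10,         # stop if a node would have too few samples
    min_impurity_decrease=0.01,  # stop if the impurity improvement is too small
    random_state=0,
).fit(X_train, y_train)

# Post-pruning: grow the full tree first, then prune it back.
# Here this is done via cost-complexity pruning; the alpha chosen below
# is arbitrary and only meant to show the effect of pruning.
full_tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
path = full_tree.cost_complexity_pruning_path(X_train, y_train)
post_pruned = DecisionTreeClassifier(
    ccp_alpha=path.ccp_alphas[-2],  # a strong pruning level, short of the root-only tree
    random_state=0,
).fit(X_train, y_train)

print("full tree nodes:       ", full_tree.tree_.node_count)
print("pre-pruned tree nodes: ", pre_pruned.tree_.node_count)
print("post-pruned tree nodes:", post_pruned.tree_.node_count)
print("test accuracy (full):       ", full_tree.score(X_test, y_test))
print("test accuracy (pre-pruned): ", pre_pruned.score(X_test, y_test))
print("test accuracy (post-pruned):", post_pruned.score(X_test, y_test))
```

In practice, the pruning strength (for example ccp_alpha, or the pre-pruning thresholds) would be chosen with a validation set or cross-validation rather than fixed by hand as in this sketch.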