Let's talk about what it means to build a classification model and how building a model differs from applying a model. After this video, you will be able to discuss what building a classification model means, explain the difference between building and applying a model, and summarize why the parameters of a model need to be adjusted.
A machine learning model is a mathematical model. In the general sense, this means that the model has parameters and uses equations to determine the relationship between its inputs and outputs. The parameters are used by the model to modify the inputs to generate the outputs. The model adjusts its parameters in order to correct or refine this input-output relationship.
Here's an example of a simple model: y = mx + b. This mathematical model represents a line. y is the output, x is the input, m determines the slope of the line, and b determines the y-intercept, or where the line crosses the y-axis. m and b are the model's parameters. Given a specific value for x, the model uses its parameters along with x to determine y. By adjusting the values of the parameters m and b, the model can adjust how the input x is mapped to the output y.
Here we see how the output y changes for the same value of input x when parameter b changes. Recall that b is the y-intercept, or where the line crosses the y-axis. The value of b is +1 for the red line and -1 for the blue line. For the input x=1, the value of y is 3 for the red line, as indicated by the red arrow. For the blue line, when the parameter b changes from +1 to -1, for x=1, the value of y is 1, as indicated by the blue arrow. So we see that with just a simple change in one model parameter, the input-to-output mapping changes.
A machine learning model works in a similar way. It maps input values to output values, and it adjusts its parameters in order to correct or refine this input-output mapping. The parameters of a machine learning model are adjusted or estimated from the data using a learning algorithm.
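To make the line example concrete, here is a minimal Python sketch. The slope m = 2 is an assumption inferred from the values quoted above (y is 3 at x=1 when b is +1); the names line_model and fit_line and the sample data are hypothetical. The fitting step uses ordinary least squares as one possible learning algorithm, not necessarily the one this course has in mind.

```python
def line_model(x, m, b):
    """Map input x to output y using parameters m (slope) and b (y-intercept)."""
    return m * x + b


# Assumed slope: inferred from the lecture's numbers, not stated in it.
M = 2.0

y_red = line_model(1.0, M, b=1.0)    # red line, b = +1
y_blue = line_model(1.0, M, b=-1.0)  # blue line, b = -1
print(y_red, y_blue)  # 3.0 1.0


def fit_line(xs, ys):
    """Estimate m and b from data by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    b = mean_y - m * mean_x
    return m, b


# Hypothetical data generated from the red line, so the fit should
# recover parameters close to m = 2 and b = 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [line_model(x, M, 1.0) for x in xs]
m_hat, b_hat = fit_line(xs, ys)
print(m_hat, b_hat)
```

The first half shows that changing only b changes the output for the same input. The second half shows parameters being estimated from data rather than set by hand.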
Adjusting the parameters from data in this way is, in essence, what is involved in building a model. This process is referred to by many terms, such as model building, model creation, model training, and model fitting. In building a model, we want to adjust the parameters in order to reduce the model's error. In the case of supervised tasks, such as classification, this means getting the model's outputs to match the targets, or desired outputs, as closely as possible.
Since the classification task is to predict the correct category or class given the input variables, you can think of the classification problem visually as carving up the input space into regions corresponding to the different class labels. In this diagram, for example, the classification model needs to form the boundaries that define the regions separating red triangles from blue diamonds, from green circles, from yellow squares. In this example, if a sample falls within the region in the upper right corner, it will be classified as a blue diamond. Classification decisions are based on these regions, and the regions are defined by the boundaries, as indicated by the dashed lines in the diagram. So these boundaries are referred to as decision boundaries. Building a classification model, then, means using the data to adjust the model's parameters in order to form decision boundaries that separate the target classes. Note that the term classifier is often used to mean classification model.
In general, building a classification model, as well as other machine learning models, involves two phases. The first is the training phase, in which the model is constructed and its parameters are adjusted using what is referred to as training data. Training data is the data set used to train or create a model. The second is the testing phase. This is where the learned model is applied to new data, that is, data not used in training the model. Here's another way to look at the two phases.
In the training phase, the learning algorithm uses the training data to adjust the model's parameters to minimize errors. At the end of the training phase, you get the trained model. In the testing phase, the trained model is applied to test data. Test data is separate from the training data and is previously unseen by the model. The model is then evaluated on how it performs on the test data. The goal in building a classification model is to have the model perform well on the training data as well as on the test data. We will discuss the use of training and test data sets in more detail in the next module, when we discuss model evaluation.
To adjust a model's parameters, we need to apply a learning algorithm. We will discuss the specific algorithms used to build a classification model in the next few lectures.
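As a rough illustration of the two phases, here is a minimal Python sketch. It assumes a nearest-centroid classifier, which is a stand-in chosen for brevity and not an algorithm named in this lecture. The data points, class names, and function names are all hypothetical.

```python
def train_nearest_centroid(samples, labels):
    """Training phase: estimate one centroid per class (the model's parameters)."""
    sums, counts = {}, {}
    for (x1, x2), label in zip(samples, labels):
        s = sums.setdefault(label, [0.0, 0.0])
        s[0] += x1
        s[1] += x2
        counts[label] = counts.get(label, 0) + 1
    return {
        label: (s[0] / counts[label], s[1] / counts[label])
        for label, s in sums.items()
    }


def predict(centroids, sample):
    """Apply the trained model: choose the class with the closest centroid."""
    def sq_dist(c):
        return (sample[0] - c[0]) ** 2 + (sample[1] - c[1]) ** 2
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def accuracy(centroids, samples, labels):
    """Fraction of samples whose predicted class matches the target."""
    correct = sum(predict(centroids, s) == y for s, y in zip(samples, labels))
    return correct / len(labels)


# Hypothetical training data: two classes in a 2-D input space.
train_X = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5), (6.0, 6.5), (7.0, 6.0), (6.5, 7.0)]
train_y = ["triangle"] * 3 + ["diamond"] * 3

# Hypothetical test data: kept separate and never used during training.
test_X = [(1.2, 1.8), (6.8, 6.2), (2.5, 2.0), (5.5, 6.0)]
test_y = ["triangle", "diamond", "triangle", "diamond"]

centroids = train_nearest_centroid(train_X, train_y)  # training phase
train_acc = accuracy(centroids, train_X, train_y)
test_acc = accuracy(centroids, test_X, test_y)        # testing phase
print(train_acc, test_acc)
```

With this kind of model, the decision boundary between two classes is the set of points equally distant from both centroids. Samples on one side fall in the triangle region and samples on the other side fall in the diamond region.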