Excel and sheets include a large collections of statistical functions. To the most useful in developing models to help predict future events are correlation and regression. Correlation helps us make sense out of the data we collect in our business. For example, does the length of service of our employees of field sales department correlate to success in sales? A correlation exists when two sets of data change together. Regression is a statistical technique for using one data set to estimate a second correlated data set. For example, in the market for diamonds, the market price is correlated with its weight. Using regression analysis, we can predict the probable price of a new diamond using its weight. In this lecture we will use statistical functions to measure correlations between variables and our business models. Will review what the correlation coefficient tells us about the strength of the relationship between two variables. We'll explore the diamond market example again to practice the use of regression. For predicting a particular diamonds market price and we'll use multiple regression to improve our forecast. Now let's turn to this spread sheet. One of the most common of the classic business models is the regression analysis model. To demonstrate regression, let's use a dataset that describes the sales of diamonds in Singapore. The weight of each diamond in a sale is located in column A. The corresponding price is in column B. If we plot the two data sets using the scatter plot as I have done here. The data appear to be almost a straight line but not perfectly. I can use the Excel correlation function to see how closely changes in weight correlate to changes in price that is how much to the two numbers vary together. So, to do that I'll used the Excel function CORREL. CORREL asked for two arrays, in this case those are A4 to A51. Which is the weight. And secondly B4, to B51. Which is the price. When I run that correlation, I get the result, 0.989. Correlations run from minus 1 to plus 1. A zero would indicate no correlation, that is the two numbers don't vary together at all. This number 0.98 is a very high correlation. Let me clear away chart one. Regression analysis takes this one step further. It asks the question, to what extent can the change in one variable be predicted by the change in another? Now, this presumes that we have logic or theory suggesting that one number drives the change in another number. In this case are theory is that wait is driving price. To run our regression choose data. Data analysis and roll down to the regression option clicking down OK For the y variable, that is the price we're predicting, choose the range B4 to B51. For the x variable, choose the weight A4 to A51. Click here to indicate that are ranges include headers. Click the Labels option to indicate that the ranges we have identified include labels then click Ok. Once you click Ok, a regression report will appear in another sheet. The summary output as a regression appears in this second tab. The key numbers for here are the three regressions that are listed at the top. This R represents the correlation we saw before. R squared is the covariance, the amount of change in y that is due to a change in x according to the regression. The adjusted R squared makes allowances for the relatively small sample size as you here only 47 observations. But, this model suggests that 97 or 98% of the change in price can be attributed to a change in the weight. That's the classic regression analysis model used in many business scenarios. Those include scenarios covered by other courses in the business and financial modeling specialization. Let's summarize Module 3. In this module, we used some statistical functions to review some historical sales data. Those included means and standard deviations, and we then applied what we learned from that analysis to a sales forecast. We also used functions for including uncertainty in our forecast model. We used the rand() and randbetween() functions for generating random numbers. We talked about the difference between uniform and normal distributions and we used the data analysis tool pack to generate random numbers that were pulled from the normal distribution. We looked at forecast models that were structured according to discrete time with columns or rows indicating the passage of time. We then compare that to model structures that allow for the treatment of time as a continuous variable. We looked at linear and proportionate growth in metrics over time. We begin to address the topic of nonlinear functions for showing either growth or declined. We practice the spreadsheet functions EXP as well as the forecast and growth functions. We created the model that link the probability of the events in the series and calculated the probability of different series of events using a probability tree. We also incorporated some revenues and expenses into those trees in order to create a decision tree. Modeling the expected value of any given branch. Finally, we examined two related sets of data using spreadsheet functions for measuring the strength of correlation between the two sets. We also use tools for running a regression analysis using values from one dataset to predict the value of another.