Hello and welcome back, everyone. In this lesson, I will introduce the topic of risk stratification. The term risk stratification is often associated with the healthcare industry, but it is really a general topic used in almost all industries. For example, market researchers like to identify and categorize people who have a higher or lower probability of buying a product. They can then target their marketing outreach at these subpopulations. Political scientists stratify potential voters into groups associated with the likelihood of voting for particular candidates. We could probably come up with hundreds of examples from nearly all industries. The key concept is that it can be useful to put people or things into groups with various risks of experiencing some type of event or outcome.
Being able to categorize risk levels for patients or health plan members can create impressive value for an organization. For example, you can focus efforts on the patients who are most likely to need a particular treatment. At the end of this lesson, you will be able to describe to an analytical team how risk stratification can categorize patients who might have specific needs or problems. Although the benefits can be large, you will also learn how to communicate scenarios in which risk stratification can fail because of regression to the mean or data quality issues.
So, what is risk stratification? First, let me define risk and stratification separately. Concerning risk, we can think about patients or health plan members being at risk for specific types of events that might occur. Examples include future high costs or mortality. These are outcomes that impact health and finances. Thus, it is important to understand who is at greater risk for the outcomes of interest. It is then possible to find opportunities to prevent or reduce these negative outcomes. Stratification is putting individuals into actionable groups or strata based on the outcomes of interest.
For example, patients can be categorized as being low, medium, or high risk. Putting these concepts together, risk stratification is a way of categorizing individuals based on their risk of experiencing an outcome.
So, an important question comes up: what is the value to organizations of stratifying patients by risk? To answer this, consider how different types of patients might respond better or worse to treatments or interventions. This follows the idea that people have different levels of risk for specific outcomes. Thus, it might be more effective to focus scarce resources on those who might have the greatest need. In sum, stratification is putting individuals into groups or strata so that they can be targeted for actionable interventions.
I offer a few examples of the application of risk stratification to healthcare. First, disease management programs and care coordination programs. The objective of these programs is to proactively help people with chronic health conditions better manage their conditions. In addition, preventing chronic disease in the first place also falls within this domain. Second, cost containment programs attempt to control the rapidly rising healthcare costs in the United States. Many of those involved in the healthcare industry, especially the payers for services, have a big incentive to try to slow the rapid increase in healthcare spending.
In addition to effectively using interventions, it is often important to be able to predict future events. For example, an organization interested in cost containment might ask: which patients might be expensive next year? Risk stratification models that identify groups likely to have higher costs in the future can help address this critical question.
Another old idea fits well with this conceptual approach to risk stratification. This is Pareto's principle. The idea traces back to Vilfredo Pareto, an Italian economist who worked in the late 19th and early 20th centuries.
He observed that 80 percent of the wealth in his country was owned by about 20 percent of the people. This observation is the origin of the 80-20 rule that you've probably heard of. Just as with economic inequality, healthcare costs are often disproportionate. A small group of people often uses the most healthcare resources. For example, as I mentioned in the last lesson, researchers have identified super-utilizers. These are the five percent of patients who use between 30 and 50 percent of total medical resources. In sum, Pareto's principle is an important concept for stratification. Some people are very different from the rest of the population, and if outcomes can be predicted for these specific individuals, outcomes can be improved.
When seeking to stratify risk, analysts need to consider selection bias and regression to the mean, since both affect the results of risk stratification. Selection bias refers to the idea that specific members are selected because they cross a threshold in the current timeframe. Thus, outliers are selected based on the current timeframe. For example, a member is selected for a disease management program if their annual costs reach, say, $10,000.
Selection bias is related to another, potentially more damaging problem, known as regression to the mean. Regression to the mean suggests that an unusually high or low measurement will likely be followed by measurements closer to the average. For example, consider how people who are sick this year are likely to be healthier or less expensive next year. This is actually good news: people who get sick often get better. But regression to the mean can create big problems for analysts creating risk stratification information.
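To make the interaction between threshold-based selection and regression to the mean concrete, here is a minimal simulation sketch. It is a toy illustration and not a method from this lesson: the cost distribution, its parameters, and the function names `simulate_costs` and `summarize_selected` are all assumptions chosen for the example. Each simulated member has a stable underlying cost level plus independent year-to-year noise, and nobody receives any intervention. Members are selected when their year-one cost reaches the $10,000 threshold used in the example above, and the sketch then compares what the selected group costs in year one and in year two. Under these assumptions, the selected group's average cost falls in year two even though nothing was done for them.

```python
import random
from statistics import mean

THRESHOLD = 10_000  # selection cutoff in dollars, echoing the example in the text


def simulate_costs(n_members=20_000, seed=42):
    """Simulate two years of annual costs with NO intervention in between.

    Assumption (illustrative only): each member has a stable underlying
    cost level, and each year's observed cost is that level multiplied by
    independent random noise. The distribution parameters are arbitrary.
    """
    rng = random.Random(seed)
    members = []
    for _ in range(n_members):
        baseline = rng.lognormvariate(8.0, 0.8)          # stable cost level
        year1 = baseline * rng.lognormvariate(0.0, 0.7)  # observed in year one
        year2 = baseline * rng.lognormvariate(0.0, 0.7)  # independent noise in year two
        members.append((year1, year2))
    return members


def summarize_selected(members, threshold=THRESHOLD):
    """Select members whose year-one cost reached the threshold, then
    summarize the same members' costs in both years."""
    selected = [(y1, y2) for y1, y2 in members if y1 >= threshold]
    return {
        "n_selected": len(selected),
        "mean_year1": mean(y1 for y1, _ in selected),
        "mean_year2": mean(y2 for _, y2 in selected),
        # Share of selected members who fall back under the cutoff on their own
        "share_below_threshold_year2": mean(
            1.0 if y2 < threshold else 0.0 for _, y2 in selected
        ),
    }


summary = summarize_selected(simulate_costs())
for name, value in summary.items():
    print(f"{name}: {value:,.2f}")
```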
For example, a disease management program might stratify patients into disease severity groups and then invest resources in a supposedly high-risk group. Yet due to regression to the mean, many of these patients might get better on their own. A second problem relates to the evaluation of the program. It can be difficult for researchers or analysts to control for selection bias and regression to the mean. As a result, program managers might attribute benefits to the program when individuals were actually getting better on their own.
Now let's talk about another critical problem for risk stratification: data quality. Data quality is almost always a critical aspect of analytical work, and risk stratification is no exception. As with all modeling endeavors, if the input data have problems, then the model and associated information might also be problematic. In the context of risk stratification, data quality issues could lead to inaccurate targeting of the most at-risk members. This is more problematic when using administrative healthcare data, but there can also be problems with clinical data. There are many concepts associated with data quality, but a few critical ones are completeness, accuracy, and timeliness.
First, completeness. New enrollees or patients may not have enough data in their records to correctly assess their risk.
Second, accuracy, which has to do with how well data values are captured for specific fields. As data analysts quickly learn, incorrect data values are present in large administrative datasets and in clinical databases. One reason for inaccurate information is that clinicians or other providers document with notes and free-text fields but do not correctly fill in the structured forms that produce higher-quality discrete data. If the data related to the risk outcomes of interest are inaccurate, the models will also be inaccurate.
For example, if disease severity is not reliably documented in the data, the risk strata output from the models will also be inaccurate.
Finally, timeliness is important. The data may be complete and accurate yet slow to be compiled into the data warehouse or analytical systems. As an example, claims and encounter data often have lags, especially for high-cost hospital patients. If a researcher is trying to use recent data for modeling expensive hospital patients, they may not have enough information to correctly create the risk strata.
Okay. Excellent. I hope this has provided you with a good foundation for our next lesson, where we will go into more detail about how to perform risk stratification. See you soon.