Social Determinants of Health: The Importance of Data, Part 1

Tune into our fifth episode in the Avalere Health Essential Voice podcast series focused on social determinants of health (SDOH). In Part 1 of this segment, our experts from Avalere’s Health Economics and Advanced Analytics practice discuss the importance of SDOH data, how health plans are increasingly utilizing that data, and the ongoing limitations to data access.
Please note: This is an archived post. Some of the information and data discussed in this article may be out of date. It is preserved here for historical reference but should not be used as the basis for business decisions. Please see our main Insights section for more recent posts.
“The evidence is becoming clearer that SDOH impact utilization of health services like hospitalizations and emergency room visits, which in turn impact healthcare costs. But social determinants also impact quality measures, and in this world of value-based purchasing where you're getting penalties or bonuses based on your quality measure outcomes, there’s skin in the game.” Christie Teigland


Guest Speaker
John E. Linnehan, SVP, Customer Delivery, Inovalon
John E. Linnehan leads a team of consultants, strategists, and data scientists focused on driving healthcare improvement through value demonstration, stakeholder engagement, predictive and applied analytics, and data innovations.
Guest Speaker
Christie Teigland, VP, Research Science and Advanced Analytics, Inovalon
Christie Teigland, PhD, leads the design and implementation of studies focused on comparative effectiveness, predictive analytics, and performance measure development and testing.

This interview was originally published as a podcast. The audio is no longer available, but you can read the transcript below. For updates on our newly released content, visit our Insight Subscription page.

If you would like to watch the video version, please visit our video page.


John: Hello, and welcome to the Avalere Health Essential Voice series focused on social determinants of health (SDOH). My name is John Linnehan. I’m Practice Director of the Health Economics and Advanced Analytics team here at Avalere.

I’m joined by Dr. Christie Teigland, a Principal on our team. Christie is a member of the National Quality Forum (NQF) Disparities Standing Committee, and co-chair of the NQF Scientific Methods Panel. She’s a resident expert on the topic of SDOH, health disparities, and the use of data sources to bring these data points into research in a meaningful way.

Welcome, Christie. This is a critical topic for us. Avalere has observed a lot of activity around SDOH in the marketplace over the past few years, first focused on our health and health plan clients, with some interesting trends emerging over the past year in transitioning this research into other sectors, such as life sciences.

Today’s discussion will look at some historical trends of how SDOH have been used in research. We’ll talk about the data points that allow us to look at these inputs, and then we’ll talk about some of the more recent activity that we’ve seen, shifting focus from the health plan orientation to a life sciences and value demonstration orientation.

Christie: Thank you. I’m really happy to participate today. As you know, I’ve been focusing on the impact of SDOH on health outcomes for more than a decade and it’s one of my passions.

This is a really timely discussion because this issue, which was getting a lot of attention, is now getting even more because of the heightened focus on health disparities that have come to light during this COVID-19 crisis.

John: There’s certainly a lot of activity and I think it’s a perfect time to talk about how different players in the healthcare system can look at these data points in new ways, incorporate them in research and value demonstration efforts, and really play a meaningful role in addressing these disparities.

We’ve heard plenty of statistics, such as 80% of factors contributing to health outcomes are related to SDOH, things like social and economic factors, personal behaviors, and physical environment, which means that only 20% of health outcomes are related to clinical care or treatment.

I’ve heard you speak in several public settings and you often say that your zip code is just as important, if not more important, than your genetic code. This is a fascinating topic that I want to address further. I’d like to start by talking about how we’ve been incorporating this type of work in our research and how you’ve observed it being incorporated in the marketplace, especially from a health plan perspective. Health plans were the earliest organizations to look at SDOH and I’d love some background on the work that you’ve been doing with plans and what they’ve been undertaking.

Christie: Yeah, health plans are increasingly focused on this topic. The evidence is becoming clearer that SDOH impact utilization of health services like hospitalizations and emergency room visits, which in turn impact healthcare costs. But social determinants also impact quality measures, and in this world of value-based purchasing where you’re getting penalties or bonuses based on your quality measure outcomes, there’s skin in the game. Medicare Advantage plans now offer more non-medical services to their members. They’re doing things like transportation, fresh food delivery, and in-home aides. They’re even providing housing, which they’ve shown to provide a big return on investment because it can reduce hospitalizations and readmissions, and that’s an important quality measure.

Let me give you an example of one study we conducted recently with a large Medicare Advantage health plan. It was eye opening because, through our data analytics, we were able to link SDOH at a very granular level, at the 9-digit zip code level. We were able to identify a group of their members who were not dual eligible, which describes low-income individuals covered by both Medicare and Medicaid. These members were not dual eligible, but they were poor. These low-income non-duals had worse outcomes than the dual-eligible members of that plan who had the extra benefits that Medicaid provides. They had higher readmission rates and lower adherence to medications, which are important in maintaining your overall health.

This shows the importance of looking at a wide range of social determinants, not just dual-eligible status, which is often used as a proxy for all of the other social determinants. It doesn’t tell us the full story. We really need more data on social determinants.

John: Right. And speaking of more data, health plans’ focus on social determinants to do things like design benefits and understand how their members consume healthcare and respond to healthcare interventions, so that the right benefits can be made available to drive outcomes and control costs, is an interesting takeaway.

It does stand to reason that key interventions that health plans can utilize to influence patient outcomes are pharmaceutical drugs, choice of device or procedures to cover, or allowing certain diagnostic tests. So, there are several interventions that health plans can provide that life sciences companies, drug companies, device companies, and biotech diagnostics have an opportunity to influence.

Given this focus on social determinants, there’s an opportunity for life sciences companies to incorporate these elements into research to stratify results or control for key elements, like living situation, income, education level, and geography, and better target value demonstration efforts to plans so the plans can in turn make the appropriate access decisions on behalf of their patients.

One of the challenges that we’ve seen is that the primary currency of real-world evidence is health insurance claims, and to a lesser degree, electronic health records (EHR). In many cases, these datasets don’t have that granular social determinants data that we would expect to see in our work. Given this, what are the types of data sources that exist in social determinants that one could incorporate along with claims or EHR data?

Christie: That is the million-dollar question. Access to good data on social risk factors is still the biggest barrier to using them in all the ways you described. They’re so important, but you need to have the data. You have to know what it is.

There are several sources of data, but they all have limitations, and they’re not standardized to allow them to be used across various populations. So, as you mentioned, claims and EHR data are often used, but they typically don’t have good or complete data, even on things like race or ethnicity of members. For example, there’s been a lot of talk about the use of Z codes. These are ICD-10 codes in insurance claims that capture data on SDOH. They document things like low education status, literacy, low income, and inadequate housing. They ask all those questions, but unfortunately, they’re not used very often. I did a poll recently on a webinar with 600 participants and only 4% said they were using these codes to document social determinants.
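As an illustration of how Z codes might be pulled out of claims data, here is a minimal Python sketch. It assumes the commonly cited SDOH range of ICD-10-CM categories Z55 through Z65; the function names and the sample claims are hypothetical and are not drawn from the dataset or methods discussed in this episode.

```python
import re

# Assumption: SDOH Z codes are taken to be ICD-10-CM categories Z55-Z65.
SDOH_Z_PATTERN = re.compile(r"^Z(5[5-9]|6[0-5])")


def is_sdoh_z_code(code: str) -> bool:
    """Return True if a diagnosis code falls in the assumed Z55-Z65 range."""
    normalized = code.strip().upper().replace(".", "")
    return bool(SDOH_Z_PATTERN.match(normalized))


def share_of_claims_with_sdoh(claims: list[list[str]]) -> float:
    """Fraction of claims carrying at least one SDOH Z code."""
    if not claims:
        return 0.0
    flagged = sum(
        1 for diagnoses in claims if any(is_sdoh_z_code(c) for c in diagnoses)
    )
    return flagged / len(claims)


# Hypothetical claims: each inner list holds the diagnosis codes on one claim.
sample_claims = [
    ["E11.9", "Z59.0"],    # includes a housing-related Z code
    ["I10"],               # no Z code
    ["Z55.0", "J45.909"],  # includes a literacy-related Z code
    ["Z00.00"],            # a Z code, but outside the SDOH range
]

sdoh_share = share_of_claims_with_sdoh(sample_claims)
print(f"Share of claims with an SDOH Z code: {sdoh_share:.0%}")
```

A low share from a check like this would reflect how rarely the codes are documented, not how rarely the social needs occur, which is the limitation described above.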

John: I’ve heard similar things. The good news is that we’ve started to look at this in our work. We have the benefit of the MORE2 registry dataset in house, which is everything from fee-for-service Medicare and Medicare Advantage to commercial and managed Medicaid. Certainly, we’ve taken advantage of the multi-payer, diverse, and large nature of the dataset to look at Z codes, but we still need to augment the data that we utilize to get to that level of granularity that you mentioned.

So, outside of claims, what are some other sources that one could consider when looking to bring these elements into research?

Christie: The majority of data that the health plans use on social needs comes from health plan screening. During an office visit, they will ask these questions or do a survey. In fact, in 2017, I saw a survey that said about 88% of healthcare organizations are now doing some kind of screening like this with their members. I’m guessing that’s higher now given the focus on disparities.

The problem with using these data is that they’re not consistently collected. You can’t use them across organizations or geographic areas. For example, you might have a question about income, but the buckets will be different. So how do you unpack that and compare one organization to another? It just doesn’t work. There are some efforts underway to standardize the collection of this type of data on social risk factors. There’s a big project going on called the Gravity Project, but it’s not on the immediate horizon.

There are also several public and private data sources that are available. Researchers most often use the American Community Survey, which is census data that’s collected on an ongoing annual basis. They also use the Medical Expenditure Panel Survey, which is another longitudinal survey that’s conducted on a regular basis. So, we see these kinds of sources being used regularly in published research when they’re trying to adjust quality measures for social risk factors.

What we found is that the data in these public sources are just not granular enough to really capture social risk factors at the patient level. For example, if you think about census data, which you can link to your patients based on their 5-digit zip code, that only gives you 42,000 geographic areas across the United States. If we think of a 5-digit zip code in Manhattan, we know that it has very wealthy people and very poor people. When you average that out, it washes out the effect. You lose that granularity, and you’re not going to see an impact of social risk factors on outcomes. Even the data from the American Community Survey, which is at what they call the census block group level, only covers 220,000 geographic areas. It might sound like a lot, but those are still big areas if you think of the United States.
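The washing-out effect of averaging over a large area can be shown with a small Python sketch. Everything in it is hypothetical: the ZIP-style identifiers, incomes, and readmission flags are invented for illustration, and grouping on the first 5 versus all 10 characters of the identifier simply stands in for linking at the 5-digit versus 9-digit zip code level.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical patients: (9-digit zip identifier, household income, readmitted flag).
patients = [
    ("99999-0001", 250_000, 0),
    ("99999-0001", 230_000, 0),
    ("99999-0002", 20_000, 1),
    ("99999-0002", 24_000, 1),
]


def summarize(records, key_length):
    """Group by the first `key_length` characters of the zip identifier and
    return {area: (mean income, readmission rate)}."""
    groups = defaultdict(list)
    for zip9, income, readmitted in records:
        groups[zip9[:key_length]].append((income, readmitted))
    return {
        area: (mean(i for i, _ in rows), mean(r for _, r in rows))
        for area, rows in groups.items()
    }


by_zip5 = summarize(patients, 5)   # one large area: wealthy and poor averaged together
by_zip9 = summarize(patients, 10)  # small areas: the income gap stays visible

for label, summary in (("5-digit", by_zip5), ("9-digit", by_zip9)):
    for area, (avg_income, readmit_rate) in sorted(summary.items()):
        print(f"{label} {area}: mean income {avg_income:,.0f}, "
              f"readmission rate {readmit_rate:.0%}")
```

In this toy data, the 5-digit grouping assigns every patient the same average income, so income cannot explain any difference in readmissions, while the 9-digit grouping separates the two neighborhoods.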

What Avalere has been using for the last 8 or 10 years is a private source of data that we can link to our members at the 9-digit zip code level. Just to put that in perspective, that’s 30 million geographic areas compared to 220,000. That’s an average of 5 households, so if you think about the 4 households around you, you probably look pretty much like them. We found the data at that very granular level is highly predictive of health behaviors. It’s not quite at the patient level, but it’s about as good as we can get. And so that’s the data that we are using and have found to be very helpful.

John: That level of granularity is fascinating. With the benefit of a claims dataset as a base that spans over 324 or 325 million members, being able to bring that data on education, geography, living situations, spending habits, race, and ethnicity into the research allows us to look at such a nuanced level and prepare evidence in a way that really resonates and speaks to those populations and supports value stories within those populations.

What I want to do now is say “thank you.” That will conclude Part 1 of this episode focused on SDOH, but Christie and I will be back for Part 2, where we’ll make the transition from how we take advantage of the trends we’ve seen in the health plan space, the very granular sources of social determinant data that exist, to how we have been seeing other companies bring these data elements into their research, especially from the perspective of life sciences organizations, pharmaceuticals, and medical devices.

If you’d like to talk further about any of these topics, you can feel free to reach out to Christie or myself. Our contact information is on Avalere’s website where you can get a recording of this podcast at Thank you.
