Big Data

Thoughts on the Strata Rx Healthcare Conference

By Bill Schmarzo, CTO, Dell EMC Services (aka “Dean of Big Data”), October 23, 2012

Last week several colleagues from EMC and I attended the Strata Rx Healthcare Conference here in San Francisco, and I thought it was a huge success. I also expect that for some folks, it was a very troublesome conference. Why would I say that? Because the number of massive data sets available today, and the number that are coming on-line in very short order, is almost debilitating. It’s one of the reasons why many of the entrepreneurs and venture capitalists who had success in the internet and financial services industries have now trained their sights on healthcare, a $1 trillion industry (in the US alone) where severe processing, analytic, and business inefficiencies exist (see Figure 1).

Figure 1:  US Healthcare Spend 1997-2017

Healthcare Data Challenges and Opportunities

There are many massive, big data sources – both structured and unstructured – available that can yield new information and insights about patients, procedures, medical treatments, medicine testing, clinical studies, drug research, and the payer-provider relationship (see Figure 2).

Figure 2:  Tsunami of Current and New Healthcare Data Sources

And new massive, big data sources are on their way, such as:

  • Genomics or gene sequencing, which contains over 2.3B snips of data per human strand of DNA.  And not only has the price of DNA and genomics testing dropped to the level of the common man, but some organizations are working to “liberate” your DNA so that it is easier to share with other genomics organizations and to pool your genomics with the genomics of others in order to identify causes of diabetes, cancer, blindness, and even baldness.
  • Mobile apps and social media are creating vibrant communities around specific health issues.  Startups such as WebMD are driving much innovation in the mobile healthcare space to simplify and encourage the capture and sharing of your personal health data.
  • “Intelligent” personal home health monitoring devices (blood glucose, blood pressure, medication monitoring, smart toilets) that will unleash a tsunami of data and insights about your current health conditions and flag patterns or trends of personal health concerns.

Healthcare Big Data Enabling Technologies

As I’ve discussed in the past, the advent of new, more detailed data sources typically requires new technologies designed to acquire, store, manipulate, and analyze these new data sources.  Big data technologies, many of which have been perfected in other industries like digital media (ad serving, real time bidding, attribution analysis) and financial services (algorithmic trading, fraud detection), are now available to the healthcare industry to merge, integrate, synchronize, and tease out the insights buried in these new data sources.

Figure 3:  Big Data Enabling Technologies

Big Data Healthcare Use Cases

A number of healthcare use cases, enabled by these new sources of data and innovative big data technologies, are starting to emerge.  Here are a few examples:

Figure 4:  Healthcare Big Data Use Cases

  • Detecting fraud in real-time by using Hadoop to match historical claims and payment data, with in-memory computing to analyze current transactions and flag or score potentially fraudulent activities.
  • Reducing hospital readmissions by using MPP databases and data virtualization to access and integrate past admissions and outcomes with current patient data, and in-database analytics to create re-admission scores at the time of patient check-in that can suggest personalized hospitalization plans for at-risk patients.
  • Improving patient care using Hadoop and data virtualization to synchronize all of a patient’s history of treatments, procedures, lifestyle changes, therapy – and even DNA data in the near future – with advanced analytics to attribute the effectiveness of different medications, treatments, and lifestyle changes upon a patient’s health score.
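To make the fraud detection use case a bit more concrete, here is a minimal sketch in plain Python of the "historical baseline plus real-time scoring" idea. It is an illustration under stated assumptions, not how any particular payer or vendor does it: the field names (`provider_id`, `amount`), the function names, and the z-score threshold of 3.0 are all hypothetical, and a simple z-score stands in for whatever scoring model an organization would actually use. In practice the baseline step would run as a batch job over the full claims history (on Hadoop, for example) and the scoring step would run in memory against each incoming transaction.

```python
from statistics import mean, stdev


def build_provider_baselines(historical_claims):
    """Batch step: summarize historical claim amounts per provider.

    Returns {provider_id: (mean_amount, stdev_amount)} for every provider
    with at least two historical claims. Field names are hypothetical.
    """
    amounts_by_provider = {}
    for claim in historical_claims:
        amounts_by_provider.setdefault(claim["provider_id"], []).append(claim["amount"])

    baselines = {}
    for provider_id, amounts in amounts_by_provider.items():
        if len(amounts) >= 2:  # stdev needs at least two data points
            baselines[provider_id] = (mean(amounts), stdev(amounts))
    return baselines


def score_claim(claim, baselines, threshold=3.0):
    """Real-time step: score one incoming claim against its provider baseline.

    Returns (z_score, flagged). A provider with no usable history gets
    (None, True) so the claim is routed to manual review.
    The threshold of 3.0 is an assumed, illustrative cutoff.
    """
    baseline = baselines.get(claim["provider_id"])
    if baseline is None:
        return None, True

    mu, sigma = baseline
    if sigma == 0:
        # Every historical claim was identical; any deviation is unusual.
        z_score = 0.0 if claim["amount"] == mu else float("inf")
    else:
        z_score = (claim["amount"] - mu) / sigma
    return z_score, abs(z_score) > threshold


# Made-up sample data, for illustration only.
history = [
    {"provider_id": "P1", "amount": amount}
    for amount in (100.0, 110.0, 90.0, 105.0, 95.0)
]
baselines = build_provider_baselines(history)

typical = score_claim({"provider_id": "P1", "amount": 102.0}, baselines)
unusual = score_claim({"provider_id": "P1", "amount": 500.0}, baselines)
print("typical claim flagged:", typical[1])
print("unusual claim flagged:", unusual[1])
```

The same two-step shape (precompute from history, score at the point of decision) applies to the readmission use case, where the score would be computed at patient check-in instead of at claim submission.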

“Data Is Good.  More Data Is Better!”

Few industries have the variety of data sources, many of them publicly available, to provide unique, actionable insights into the quality and cost of healthcare.  The potential is almost endless, as healthcare organizations look to take the next step in pooling data across patients, treatments, procedures, studies, and more to preempt disease outbreaks or identify the potential causes of life-threatening health conditions like diabetes (see Figure 5).

Figure 5:  Pooling Data to Yield New Healthcare Insights

To quote my colleague Hulya Emir-Farinas, a data scientist within our Greenplum division, “Data is good.  More data is better.”  More detailed and diverse data sets can yield new insights and perspectives on some of our healthcare problems, and enable new solutions to fulfill that goal of providing better patient care at lower costs.   More data can enable healthcare organizations to uncover the real causes of healthcare problems so that actionable, cost-effective solutions and care can be directed at those problems.

By the way, if you’d like to see my Strata Rx presentation, you can check it out here.


About Bill Schmarzo


Bill Schmarzo, author of “Big Data: Understanding How Data Powers Big Business” and “Big Data MBA: Driving Business Strategies with Data Science”, is responsible for setting strategy and defining the Big Data service offerings for Dell EMC’s Big Data Practice. As a CTO within Dell EMC’s 2,000+ person consulting organization, he works with organizations to identify where and how to start their big data journeys. He’s written white papers, is an avid blogger and is a frequent speaker on the use of Big Data and data science to power an organization’s key business initiatives. He is a University of San Francisco School of Management (SOM) Executive Fellow where he teaches the “Big Data MBA” course. Bill also just completed a research paper on “Determining The Economic Value of Data”. Onalytica recently ranked Bill as #4 Big Data Influencer worldwide.

Bill has over three decades of experience in data warehousing, BI and analytics. Bill authored the Vision Workshop methodology that links an organization’s strategic business initiatives with their supporting data and analytic requirements. Bill serves on the City of San Jose’s Technology Innovation Board, and on the faculties of The Data Warehouse Institute and Strata.

Previously, Bill was vice president of Analytics at Yahoo where he was responsible for the development of Yahoo’s Advertiser and Website analytics products, including the delivery of “actionable insights” through a holistic user experience. Before that, Bill oversaw the Analytic Applications business unit at Business Objects, including the development, marketing and sales of their industry-defining analytic applications.

Bill holds a Master of Business Administration from the University of Iowa and a Bachelor of Science degree in Mathematics, Computer Science and Business Administration from Coe College.
