
Go Big Data Lake or Go Home

By Frank Coleman, Senior Director, DELL EMC Services, September 10, 2015

Remember, when it comes to Data Lakes, I’m not just an EMC employee; I’m also a client.

I was recently interviewed about analytics at EMC and was asked what I would say to other companies thinking about a Data Lake. That’s when Sy Sperling and this old Hair Club for Men commercial from 1986 popped into my head. See, I’m an EMC employee, but I’m also a client since I use EMC’s Data Lake.

The decision to build a Data Lake isn’t just an IT decision. It’s a C-level decision that answers the real question: “Do you want to have advanced analytics / Data Science capabilities that don’t force users to jump through hoops to get at the data?” The biggest challenge is getting at all the data sets required to do your analysis.

You can do Data Science without a Data Lake, but it’s very time-consuming. If you want to enable a culture of analytics, a Data Lake is what you want. Bill Schmarzo’s recent blog, Why Do I Need A Data Lake, is a great starting point. He always does a great job talking about data ingestion and the hub & spoke service architecture that enables analytics, so be sure to check it out.

What other issues is the Data Lake addressing?

  • BI used for ETL – Traditional IT delivers data to the business through BI tools. These tools rarely meet our business needs because they are typically single-topic data visualizations. Many people, including me back in the day, resort to Shadow IT solutions to meet business requirements. We often have to blend several data sets together and provide simple models or basic reporting, and the IT-native data alone may not be enough for the issue we are addressing.
  • Reducing risk of Shadow IT – There are no extra copies of the data, there is visibility into what is done with the data, and sharing and scale are no longer challenges. With a Data Lake, we can see and listen to how the data is being used as opposed to being told how to consume it.
  • Enabling “Sharing” by feeding Analytical Sandboxes – This enables rapid data discovery and the ability to share findings with other groups. Without this “sharing” environment, much of what gets discovered remains with the team who found it. Teams are typically reluctant to share, either for fear of taking down their environment or of being held accountable for providing copies of their data to other groups. It’s boring work and not of value to the team who created the insight.
  • A Cloud Feed – Many companies are moving to cloud-based solutions. Where is your data going? Just like Traditional IT BI, these cloud-based solutions have point data sets. For deep analytics you will need to merge this data with other data. If your cloud solutions aren’t already feeding a data lake, make sure you are capturing this data in your data lake. Many cloud providers offer BI & analytic solutions, so it’s not in their interest to feed you the data for use outside of their solution. Without that feed, you are back to BI used for ETL.

Are you thinking about a Data Lake, or do you already have one? Do you see the same issues being solved? If you are struggling to make the business case, I think the strongest one is around Cloud Solutions. It’s your data, so make sure you are maximizing the use of it.


About Frank Coleman


Senior Director, DELL EMC Services

Frank is a Senior Director of Business Operations for Dell EMC Services. He is living the world of Big Data in this role, as he is responsible for using advanced data analytics to improve the customer experience with Dell EMC’s services organization.

This role keeps Frank immersed in Big Data, and he is at the cutting edge of using Big Data to solve real business problems. Frank has a strong blend of technical knowledge and business understanding, and has spent the last nine years focused on the business of service.

Under his leadership, EMC was honored in mid-2012 for the third consecutive year with the Technology Services Industry Association (TSIA) STAR Award for “Excellence in the Use of Metrics and Business Intelligence.” Prior to joining EMC, Frank worked in various field and remote technical support roles.


One thought on “Go Big Data Lake or Go Home”

  1. Your comments really resonate with me, Frank. In GBS, we support so many differing processes and functional areas of the company that disparate data sources are commonplace. We are embarking on a Data Lake project, led by Jay Flanagan, to merge these data sources and unlock the value they can provide to the executive level of the company. From spend activity in a simple dashboard for Controllers to a predictive analytics model that makes our cash collections more efficient, the Data Lake strategy is the enabler.