Frank Coleman – InFocus Blog | Dell EMC Services

Data Visualization with No Results? Could vs Should
Mon, 03 Apr 2017 – https://infocus.dellemc.com/frank_coleman/data-visualization/

I often hear people and teams complaining about their data. They have people digging in but are not getting the results they expected from a quality or quantity standpoint.

What’s wrong? Infrastructure, Tools, Skills, or all of them? Which of the following are part of your experience?

Infrastructure:

  • Stone Carvings – No centralized systems or common process
  • The 80s – More centralized but no access to the data unless you are IT
  • Y2K – Glad your computer still works. Many canned reports that are just columns and rows but still no database access
  • Drowning in Data – All the data in the world, just limited results

Tools:

  • Excel – This is equivalent to the Stone Carvings above. You can survive but that’s about it.
  • You will use what I give you in Excel.
  • Reporting tools that allow you to dump the data to Excel. Starting to get there but still limited in ability to scale and automate. High risk of Shadow IT.
  • Lots of options just not sure when to use which tool and why

Skills:

  • Excel – Nothing wrong with having Excel skills but you need more to advance
  • Data Subject Matter Expert (SME) – Knows all things about the data, what to use and what is broken and even sometimes how to get around it.
  • Technical but no understanding of the business – This can be a dangerous place because many folks with technical skills can create information that could be incorrect or not exactly what the business meant.
  • Analytical and business SME – Understanding the business you support and the end-to-end process in which the business operates

I listed the different stages of Infrastructure and Tools mostly to point out where many people are now, and to poke a little fun at how far behind some groups are. My timelines and dates above may be a little off : ). The further along you are with infrastructure, common platforms, and common processes, the closer you are to doing more with your data. The same is true for your tools. IT can play a huge role in making this easy on you or keeping you in Y2K. I am spoiled when it comes to infrastructure because that's what Dell EMC does. Feel free to check it out, but I'm not here to sell.

Skills is the area people have the most control over, but when it comes to knowing your business, raw skill isn't enough. If you don't understand the business, you will never create value. Knowing that different teams have different processes and use different tools is part of that knowledge. When bringing that information together, you may have a suggestion back to the business about which process and/or tool delivers the most and/or best data. But we often find ourselves with too much work and not enough time to actually think. Then we end up with tons of charts and dashboards but no results.

So what should you do?

Step back and talk with the business you support

  1. Interview various members of the business and understand their top 5-10 pain points
  2. Meet regularly with key stakeholders to ensure everyone agrees on the major pain points, or work out the disagreements. This will keep your resources on point and not distracted with just doing.
  3. Make sure there is a clear issue and impact based on your analysis. What questions are we trying to answer, or what issue do we want to expose with this project? Most importantly, what actions can you take with this data, or do you need more data to drive action? We often create data visualization that highlights a pain point but doesn't enable the business to see what's behind it.
  4. Driver Metrics are often needed to expose what’s behind the dashboard view. But creating more metrics off metrics still doesn’t get you there. What specific actions can address what you have uncovered? What action needs to be taken? Once your data gives you this you’re on your way.
  5. When the business asks you to "tell me something I don't know," first find out more about what they want to accomplish. What are their challenges and issues? What do they wish they could do? If you can find a project that aligns with their objectives and goals, you will have a better chance of creating value.

You can create value for your business whether you’re Captain Caveman or George Jetson but you need to start with an understanding of the business drivers and issues and not just blindly jump in.

Does any of this sound familiar?

New Challenges on the Big Data Journey
Tue, 13 Sep 2016 – https://infocus.dellemc.com/frank_coleman/new-challenges-on-the-big-data-journey/

No matter where you are on your journey to leveraging Big Data, you have challenges to overcome. My team and I have been doing this awhile now. While I’m happy to report we’ve solved many traditional data problems, new ones are always popping up.

If you’ve just decided to build a Data Lake / go big data / whatever you want to call it or just a little further along in your journey, here are some of the challenges we faced early on and how leveraging Big Data solved them.

Problem: Can’t get access to the data

The data doesn’t exist in a way that IT can get it to you as they are used to doing a ton of work on the data then pushing it to you via BI tools. Get in line for an IT budget request.

Resolution: I still have to request access to a data set, but it's no longer limited to what IT has put into BI tools. I can self-serve on the complete data set.


Problem: Can’t get the data at the level I need

The data is just too big. Seriously, did someone just say that?

Resolution: Once the data is available you can view it at all levels and create new levels if you want.
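
That "view it at all levels" resolution can be sketched in a few lines of pandas. The service-case columns below are invented for illustration and stand in for whatever detail data your lake exposes:

```python
import pandas as pd

# Hypothetical case-level detail data; the column names are made up.
cases = pd.DataFrame({
    "region":  ["AMER", "AMER", "EMEA", "EMEA", "APJ"],
    "product": ["X100", "X200", "X100", "X100", "X200"],
    "cost":    [120.0, 300.0, 150.0, 90.0, 210.0],
})

# Roll the same detail up to whatever level the question needs.
by_region = cases.groupby("region")["cost"].sum()
by_region_product = cases.groupby(["region", "product"])["cost"].sum()

# Or create a brand-new level that no canned report ever shipped.
cases["cost_band"] = pd.cut(cases["cost"], bins=[0, 100, 200, 1000],
                            labels=["low", "mid", "high"])
by_band = cases.groupby("cost_band", observed=True)["cost"].count()
```

The point isn't the library; it's that once the full data set is accessible, every aggregation level is a one-liner instead of an IT request.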

Problem: Data can’t process because it’s too large

Processing the data would take days, or you would have to cut it into smaller chunks and then merge it back together.

Resolution: Data runs in seconds, sometimes a minute or two; game-changing for decision support

Problem: Can’t merge the data with other IT-maintained data

Using Shadow IT, since IT didn't have a way for you to do this. If you've been around a while, you're lying if you say you haven't done some Shadow IT work, using a server in a lab or, even worse, some desktop.

Resolution: All IT sourced data is available to me. Of course I have to ask for access first.

Problem: Can’t merge the data with outside data

Similar to the last one but a different twist as we often need to blend our data with 3rd party data or industry data.

Resolution: I can dump / feed my 3rd party data into my Analytical Sandbox to merge with the IT-sourced data instead of using a Shadow IT solution.

Problem: Can’t add in business rules to the data

Getting to the data via BI tools was the only way, and the tools were very limited. Many times we just used the BI tool for Extract, Transform, Load (ETL) and dumped that output into a Shadow IT table for the next step.

Resolution: At a database level it’s just a few lines of code and a new column with our data shows up.
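
As a rough sketch of that "few lines of code" claim, here is the same idea against SQLite from Python; the table and the escalation rule are hypothetical stand-ins for real business rules:

```python
import sqlite3

# In-memory database standing in for the warehouse.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cases (case_id INTEGER, hours REAL)")
conn.executemany("INSERT INTO cases VALUES (?, ?)",
                 [(1, 2.5), (2, 9.0), (3, 5.0)])

# The "few lines": add a column, then one rule-driven update.
conn.execute("ALTER TABLE cases ADD COLUMN escalated TEXT")
conn.execute("UPDATE cases SET escalated = "
             "CASE WHEN hours > 8 THEN 'yes' ELSE 'no' END")

rows = conn.execute(
    "SELECT case_id, escalated FROM cases ORDER BY case_id").fetchall()
```

Compare that to chaining BI extracts through Shadow IT tables: the rule lives next to the data, and the new column simply shows up for every consumer.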

Problem: Can’t process many months of data, never mind years

If you haven’t already pulled and aggregated all of history, it will take days just to put the data together.

Resolution: We can literally say, “Sit down, let’s take a look”. To be honest, we can’t make it look pretty that quickly but sometimes quick and dirty is good enough. Even on bigger projects getting the data is not the problem anymore.

Everything sounds great, right? But don't get ahead of yourself. With these traditional data problems solved come "new" problems and questions:

Where is my return on investment?

Let's face it, it wasn't cheap to get here, so you had better have some ideas on how you're going to create value. And let me say this: you saving days on pulling data is not the return they're looking for. The return is the new insights and the changes in business workflow you are about to deliver.

How do I ask the right questions before attacking a problem so I don’t create cool reports with no impact?

This was one of the hardest ones for me. We made a ton of mistakes turning on the new system because we didn't know what was possible. Don't boil the ocean on your first projects or they'll fail. But don't just focus on the BI benefit of speed, because no one cares. Well, they'll care if you plan to cut your BI staff; otherwise, no one cares. Most importantly, don't give up!

How do I get Shadow IT teams or other BI users on board with a new way of thinking?

I’ve said this before: they are your best friends and can be your worst foe. Get them engaged and feeling like part of the team. They usually have tons of business knowledge and are often data SMEs in many areas. Initially they will view you as a threat looking to put the IT handcuffs on them. Change is hard and not everyone will get there. Embrace the ones that are willing to try.

How do I share this information, securely?

You now have some serious data out there and it needs serious rules about access to it. You must be able to grant access to the right people and make sure they don’t share it with unauthorized people.

How do I feed this insight into applications to impact workflow?

If you have access to the data and are able to model the data well, you're almost there. Data visualization is cool but not good enough. I want to be able to feed this insight into our workflow applications to impact the way we work in real time. Sorry folks, I don't have a silver bullet for this one. But making sure your projects clearly outline who is going to use your data and how they'll use it can help. For example, you might plan to feed your Data Science model output into an application that enables Sales and/or Service. This is when you are truly killing it, impacting workflow via Data Science! You don't need a fancy chart, just the right information in front of the person making a decision when they need it.

Having these new problems is a sign of progress, and that makes me happy. Many of our historical problems are gone, and as we break new ground we will always run into new obstacles. I'm curious if anyone else has had a similar experience or other challenges I may not have listed.

Let me know!

Lost in the Lake? 5 Keys to Data Lake Success
Thu, 02 Jun 2016 – https://infocus.dellemc.com/frank_coleman/5-keys-data-lake-success/

I had a cup of coffee with EMC Chief Data Governance Officer Barbara Latulippe recently. We talked about how more and more people tell us they have access to analytical sandboxes attached to a Data Lake but still can’t find the information they need.

Is this a Data Governance problem? A Skill problem? A Technology problem? A Tools problem?

The answer is yes, it’s all of that!

When you build a Data Lake you most likely have structured and unstructured data in it. For this post I'm only going to talk about the structured data, because it's the fastest and easiest to get value from, and a larger audience will benefit.

Structured Data

Biggest Complaint: I can’t find my data!

Reply: “You have everything you need. Why are you complaining?”

So what’s the problem?

OK, many of us are used to using reporting tools and having nice clean flat tables fed from an EDW/GDW database. Now I have thousands or more tables with very little connection. I blogged about this problem before, likening it to dumping a bag of Legos on your desk and saying "Here you go".

Keys to Success

  1. Light Data Governance
    • You need some form of Data Governance or you create chaos. Please read Rachel Haines’ blog around just enough Data Governance. You don’t want the lake buried in red tape but you can’t just dump all the data in one place and expect value to magically jump out of it.
  2. Data SMEs
    • You need the help of your data SMEs to make some order of this chaos and then document, record, and explain what they did. These data SMEs are the wizards who can make the magic jump out of the lake. Capturing what they do and making it available to the masses is where the value starts piling up.
  3. Leverage your Reporting Tools for help – See if the Reporting tools can show you the SQL or get IT to help
    • When you first start out, many people don’t know what columns to grab or what they are called because they are used to working with reporting tools. Many reporting tools can show you the XML or SQL being created when you grab the data.
  4. Focus on Team Skills
    • When we first got the Data Lake we had some skills issues. Most of my team were BI people and needed to skill up on SQL and then Hadoop. Being totally honest, not everyone was able to make that transition and new hires were targeted with those skills.
    • It’s important to partner with your IT teams and have regular knowledge sharing events. Both sides can benefit as you probably have the Data SME knowledge and they have more technical knowledge. The more you collaborate the better you understand each other’s needs and how to work more effectively.
  5. It’s hard work. Wishful thinking and complaining doesn’t make it better.
    • Sorry, I had to throw that in : ). Regular meetings with your IT teams on what is and isn't working are key. These are not complaint sessions bashing IT; we show real use cases that we're struggling to get going. Early on it may be access to data, just finding the data, or query restrictions on your roles.

If you are on the journey or just thinking about getting a Data Lake, I hope you found this useful.  Please let me know if you found any other lessons that enabled your success leveraging a Data Lake.

Best Place to Start Your Big Data/Data Science Journey is…Finance?
Mon, 01 Feb 2016 – https://infocus.dellemc.com/frank_coleman/best-place-to-start-your-big-data-data-science-journey-isfinance/

If you work in Finance you love MS Excel. Don’t lie. You know you do. Many businesses, very big businesses, have Excel models they’ve grown up with. We never had a better way to work and Excel was the best way to put together our assumptions and model to our heart’s content.

But these models can be very manually intensive to maintain and aren't truly auditable. Or the person who built it doesn't work there anymore and no one knows how to modify it. Don't worry, you can still use Excel. But what if I told you that you could process millions of rows of data in seconds, merging your many datasets together without VLOOKUP or, dare I say, without using MS Access? At EMC, we use Pivotal's Big Data Suite. It's light years ahead of what we were doing with Excel.
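
As a hedged illustration of what replaces the VLOOKUP-and-merge routine at scale (the deal and product tables here are invented, and our actual platform was Pivotal's Big Data Suite, not pandas):

```python
import pandas as pd

# Two hypothetical data sets that would live on separate Excel tabs.
bookings = pd.DataFrame({"deal_id": [1, 2, 3],
                         "amount":  [10_000, 25_000, 7_500]})
products = pd.DataFrame({"deal_id": [1, 2, 3],
                         "product": ["X100", "X200", "X100"]})

# One merge replaces a per-row VLOOKUP and scales to millions of rows.
merged = bookings.merge(products, on="deal_id", how="left")
by_product = merged.groupby("product")["amount"].sum()
```

The same join expressed in the data layer, rather than as lookup formulas, is also auditable: anyone can re-run it and get the same answer.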

So if you’re starting a Big Data project at your company, Finance should be one of your first stops. They have tons of models leveraging disparate data sets and they are used to assisting in many major business decisions. What better way to create a Return On Investment (ROI for you Finance people) for your project than to do it with the Finance team? You know, the ones that write the checks?

If you are in Finance or support Finance the dust is just settling from rev 106 of the 2016 Plan. Here are a few ideas. Some may not apply to your specific business but I tried to call out where you can get some quick, easy wins, secure funding, and grow your Big Data program.

  • Install base – Whatever’s happening to your product(s), this is a major driver in your Financial Planning for the current and next 3-5 years
  • Revenue and/or Bookings – Sales predictions can be leveraged with business forecasts to see if there are gaps or blind spots.
  • Staffing models – Every Finance team has a staffing model if they support a people business. Staffing can be ~70 percent of their spend. Anything you can do to improve hiring will help the bottom line. Similar to Sales, you can use the prediction in combination with business forecasts to see if there are gaps or just to challenge business assumptions in their forecast.
  • Skill Gaps – Similar or combined with Staffing, make sure you are hiring the right people in the right locations.
  • Service Costs – Which products are more costly to support and why?
  • Accrual Balances – Accrual example: some companies have to keep a balance due to product warranty. I would put a big star on this one; reducing an accrual increases bottom line profitability. **Great way to fund your next project**. My only caution is the Accrual could increase with your new model. But most Finance people are very conservative with their assumptions so this shouldn’t be an issue.

It’s a safe bet your Finance team already has a model that does this in Excel. The Big BUT is their models likely don’t scale because they are built at a high level and are full of assumptions. Don’t get hung up on the term Data Science. Sometimes you don’t need a Data Scientist, just the ability to pull together large disparate data sets.

There will always be an opportunity to improve a model with a Data Scientist, but bringing together disparate data sets at the lowest level can dramatically improve your models on its own. This saves time, reduces human error, and increases visibility into what was masked by assumptions or product-mix shifts within a hierarchy, while remaining auditable to the lowest level.

Whichever project you choose, make sure you size up the Dollar Impact, Risk Reduction, Efficiency Gained and get buy in from the current Excel model owners to work with you.

I will put another gold star on “get buy in from the current Excel model owners” because nothing kills a project more than a group who doesn’t want the help.

Once you get buy in, don’t just replicate the Excel model. Think bigger and expand it to the lowest level then build it bottom-up. If you don’t you’re just automating a flawed process. This still has value but you are leaving a ton on the table and it may not be enough to fund your next project.

I hope this helps. I really believe Finance is a great place to start your Big Data journey as it has many models. Finance folks are data-driven by nature, and their models assist in major business/financial decisions.

4 Key Steps on Your Journey to the Data Lake
Mon, 26 Oct 2015 – https://infocus.dellemc.com/frank_coleman/4-key-steps-on-your-journey-to-the-data-lake/

In my last blog I talked about why you need a Data Lake. Now I’m going to share a few helpful steps on this journey and highlight some “gotchas” to avoid.

Step 1 – Feed the Lake

Understand all the data needs of your company/customer. If you don’t have the data, you are dead in the…wait for it—yes, I’m going there since we’re talking about data lake—water.

I can’t count the number of times I’ve requested data only to find it was missing an integral column in: Quotes, Billings, Bookings, Install Base, Contracts, Logistics, Case Management, Headcount, Expenses, Web traffic, Mobile, Telephony, Training, Industry data, DUNS….

With the data lake I finally have a place to ask questions about my business where I can see what is happening end to end without having to use 10 different BI tools.

  • Mistake #1: Don’t structure the data into what you think they need. Feed all the data, structured and unstructured. If not, you will always be asked to feed more and paying by the drip is expensive. Also, your customers will always be unhappy and work around your expensive data ingest model by using shadow solutions.

Step 2 – Care for the Lake

The initial feed to the data lake is awesome for asking questions and data discovery. But what happens when you discover something? What if you already knew something and now 10 other groups are re-creating the wheel? Or asking the same question but getting different answers?

This is where I use my Lego example. Let’s say you just dumped a bag of Legos on my desk, showed me a picture of the Death Star with no instructions or pre-packaged bags, and said, “Build it, you have everything you need”.

  • Mistake #2: Oops – you may have a skill gap. I hope your data lake is sitting on one of EMC’s solutions. Sorry I couldn’t help shamelessly plugging our EMC equipment. All kidding aside, this is what we use at EMC. If your team doesn’t know Hadoop, you probably need to learn it or pay for it “as a Service” to get you the unstructured data.
  • Mistake #3: Create a community not a competition. Data SMEs are your friends, not the enemy. Everyone wants to say they are in an advanced analytics / data science team, which is great, but you don’t need to discover the earth is round. Take advantage of data SMEs learnings and share yours. This can later feed into data governance.

Step 3 – Use the Lake with Analytical Sandboxes

Analytical sandboxes are provisioned spaces for users to discover and build new insight. This is finally a place where you can access the data without a BI layer in the way. You can build and merge many different data sets in ways that were previously impossible.

  • Mistake #4: Where is that column? A huge frustration I’ve run into is only knowing our data through traditional BI tools. These BI tools often transform the data or create custom calculations that don’t exist natively. Understanding what data exists and how the BI data was created can save you an enormous amount of frustration and time.

Heads up: you will run into resistance from people saying you are duplicating efforts and now have to maintain two copies of the logic. Forcing BI tools to do ETL is a huge mistake, as the BI tool will hold you hostage. By building the logic into the data layer and then dropping it into BI, it will run faster and take better advantage of your infrastructure.

Step 4: Feed Back into the Lake

Again, you want to be part of a community. Once you discover or create a model that adds value, feed it back. If you keep it only in your sandbox, you limit the amount of value this insight can produce. Many groups in my company look at similar information with a very different lens, and sharing our findings helps reduce duplicated efforts or even dueling data. Success and value, to me, come when we operationalize our findings, build them into workflow and applications, and/or change the way we work, not just when we produce a report with cool visualization. Your initial discovery or model may not be the final mile. Feeding it back can help it serve other solutions or use cases. As an example, the data science model you created in your sandbox most likely isn't going to feed a production app.

  • Mistake #5: I’m taking my data and going home. Creating insight or a fantastic data science model in your sandbox is awesome; but you are limiting your value. Many of your initial sandbox users may be former shadow IT groupies, where sharing data is not natural or encouraged. Create incentives and reward sharing or you are limiting value.

I hope you found these steps useful and learned how to avoid some land mines. If your experience is different or you have other suggestions, please comment below.

Go Big Data Lake or Go Home
Thu, 10 Sep 2015 – https://infocus.dellemc.com/frank_coleman/go-big-data-lake-or-go-home/

Remember, when it comes to Data Lakes, I'm not just an EMC employee, I'm also a client.

I was recently interviewed about analytics at EMC and was asked what I would say to other companies thinking about a Data Lake. That's when Sy Sperling and this old Hair Club for Men commercial from 1986 popped into my head. See, I'm an EMC employee, but I'm also a client since I use EMC's Data Lake.

The decision to build a Data Lake isn’t just an IT decision. It’s a C-Level decision answering the real question “Do you want to have advanced analytics / Data Science capabilities that don’t force users to jump through hoops to get at the data?” The biggest challenge is getting at all the data sets required to do your analysis.

You can do Data Science without a Data Lake but it’s very time consuming. If you want to enable a culture of analytics, a Data Lake is what you want. Bill Schmarzo’s recent blog, Why Do I Need A Data Lake, is a great starting point. He always does a great job talking about data ingestion, and hub & spoke service architecture enabling analytics so be sure to check it out.

What other issues is the Data Lake addressing?

  • BI used for ETL – Traditional IT solutions for providing data to the business are via BI tools. Our business needs are rarely met with these tools, as they are typically single-topic data visualizations. Many people, including me back in the day, resort to Shadow IT solutions to meet business requirements. We often have to blend several data sets together and provide simple models or basic reporting; the IT-native data alone may not be enough to address the issue.
  • Reducing risk of Shadow IT – No extra copies of the data; visibility into what is done with the data; sharing and scale are no longer challenges. With a Data Lake, we can see and listen to how the data is being used as opposed to being told how to consume it.
  • Enabling “Sharing” by feeding Analytical Sandboxes – This enables rapid data discovery and the ability to share findings with other groups. Without this “sharing” environment much of what gets discovered remains with the team who found it. They typically are less likely to share or want to share for risk of taking down their environment or being held accountable for providing copies of their data to other groups. It’s boring work and not of value to the team who created the insight.
  • A Cloud Feed – Many companies are moving to cloud-based solutions. Where is your data going? Just like Traditional IT BI, these cloud-based solutions have point data sets. For deep analytics you will need to merge this data with other data. If your cloud solutions don’t have a data lake that you’re feeding, make sure you are capturing this data in your data lake. Many cloud providers offer BI & analytic solutions so it’s not in their interest to feed you the data for use outside of their solution. Now you are back to BI used for ETL.

Are you thinking about a Data Lake, or do you already have one? Do you see the same issues being solved? If you are struggling to make the business case, I think the strongest one is around Cloud Solutions. It's your data, so make sure you are maximizing the use of it.

Simple Ways to Use Statistics for Your Business
Tue, 23 Jun 2015 – https://infocus.dellemc.com/frank_coleman/simple-ways-to-use-statistics-for-your-business/

We all want to have super data science models that predict the future and can even prescribe what we should do next.  I’m on this journey and will get there but there are nuggets of value on the way that you shouldn’t overlook. I have a few examples of simple things you can do today, quickly and easily.

Everyone has survey data. We all look at our scores and measure increases or decreases over time, mostly looking at the Top 2 box. If your job is to analyze this data, you go way deeper but most of the “users” just look at the trend to see if they’re on goal or not.
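To make that concrete, here is a minimal Python sketch (not from the original post, with made-up 1–5 satisfaction scores) of the Top 2 box metric most users trend against goal:

```python
import pandas as pd

# Hypothetical 1-5 satisfaction scores from one survey wave.
scores = pd.Series([5, 4, 3, 5, 2, 4, 5, 1, 4, 3])

# "Top 2 box" = the share of respondents answering 4 or 5.
top2 = (scores >= 4).mean()
print(f"Top 2 box: {top2:.0%}")
```

Compute this per survey wave and the trend line falls out directly.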

Most surveys have an “Overall…” question. This often is the main question everyone zooms in on and then there are several other questions for more detail. Have you ever checked the correlation between the “Overall…” question and the other questions?

Even a quick look at correlation can tell you whether you’re asking the right question, or whether the respondent really cares about this area.

If you find a question that has high correlation and is trending down or below target, you have exposed a potential focus area. Stack ranking each question can help prioritize where you can have the most impact on the “Overall” score. It’s just math. While it’s not perfect, it’s a simple way to take a different angle on something we’ve looked at for years.
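As a rough illustration of that stack-ranking idea, here is a short pandas sketch; the question names and scores are invented for the example:

```python
import pandas as pd

# Hypothetical survey responses: an "Overall" score plus detail
# questions, each on a 1-5 scale.
df = pd.DataFrame({
    "overall":         [5, 4, 2, 5, 3, 1, 4, 2],
    "response_time":   [5, 4, 1, 5, 3, 2, 4, 1],
    "product_quality": [4, 5, 3, 4, 2, 1, 5, 3],
    "billing":         [3, 2, 4, 5, 1, 3, 2, 4],
})

# Correlate every detail question with the "Overall" question and
# stack-rank them from strongest to weakest relationship.
ranked = (df.corr()["overall"]
            .drop("overall")
            .sort_values(ascending=False))
print(ranked)
```

Questions at the top of the ranking that are also trending down are the candidate focus areas.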

Bookings, revenue, and workload/staffing models are great examples of time-series data where simple statistics can assist. There is a little complexity in evaluating which models fit these data sets best; I’m fortunate to have a Data Scientist to help me with that.

However, a simple trend can help create a statistical baseline. Most businesses have very complex ways to come up with their forecasts or headcount planning process. I’m not suggesting you throw that out, but you can challenge those models to get the color behind their forecasts. Again, it’s just math. It won’t work in every case, so use it where it does.
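A minimal sketch of such a baseline, assuming nothing more than a linear trend over hypothetical quarterly bookings numbers:

```python
import numpy as np

# Hypothetical quarterly bookings (in $M) for the last 8 quarters.
bookings = np.array([10.2, 10.8, 11.1, 11.9, 12.4, 12.8, 13.5, 14.1])
quarters = np.arange(len(bookings))

# Fit a simple linear trend as the statistical baseline.
slope, intercept = np.polyfit(quarters, bookings, 1)

# Project the next quarter from the trend.
next_q = len(bookings)
baseline = slope * next_q + intercept
print(f"Trend baseline for next quarter: ${baseline:.1f}M")
```

Comparing this naive projection against the official forecast is the “challenge” step: a large gap is worth a conversation, even if the trend itself is too simple to ship.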

So start here, and then you can get more complex, expanding your models to become more accurate and aligning them more tightly to the business drivers. I still feel that this alone can be very powerful. You don’t have to wait for the super complex model to start having impact. It’s more about how you position your analysis with the stakeholders.

You don’t have to wait for that huge database or that perfect model either. The platform doesn’t need to be complex or expensive; these types of analyses can be implemented in any analytics platform IT already has available.

If possible, you should use a Data Scientist to help this evolve. Here are a few videos and links to models you may find useful if you are just getting started.

The post Simple Ways to Use Statistics for Your Business appeared first on InFocus Blog | Dell EMC Services.

]]>
https://infocus.dellemc.com/frank_coleman/simple-ways-to-use-statistics-for-your-business/feed/ 0
Are Your Big Data Projects Set Up to Fail? https://infocus.dellemc.com/frank_coleman/are-your-big-data-projects-set-up-to-fail/ https://infocus.dellemc.com/frank_coleman/are-your-big-data-projects-set-up-to-fail/#comments Thu, 30 Apr 2015 18:33:15 +0000 https://infocus.dellemc.com/?p=23554 Are Your Big Data Projects Set Up to Fail? I recently read an article called “Where Big Data Projects Fail” by Bernard Marr that resonated with me. Bernard summarized his experiences by saying, “One thing they have in common is they are all caused by a lack of adequate planning.” He predicts that half of […]

The post Are Your Big Data Projects Set Up to Fail? appeared first on InFocus Blog | Dell EMC Services.

]]>
I recently read an article called “Where Big Data Projects Fail” by Bernard Marr that resonated with me. Bernard summarized his experiences by saying, “One thing they have in common is they are all caused by a lack of adequate planning.”

He predicts that half of all Big Data projects will fail to meet expectations and highlights 5 major causes of failure:

  1. Not starting with clear business objectives
  2. Not making a good business case
  3. Management Failure
  4. Poor Communication
  5. Not having the right skills for the job

Bernard goes into detail on each cause of failure, and I highly recommend you give it a read.

When I think about my experience, this list hits home. Many companies and people don’t fully grasp Big Data and get distracted by its technical components rather than defining an objective.

I’ve been fortunate…or unfortunate, depending on how you look at it…to have several project failures under my belt. Luckily, they haven’t been multi-million-dollar failures. Failure is part of the process, but you don’t want a colossal one.

Below are a few questions I always now use to reduce my failure rate.

  • Who is my customer?
    • Executive sponsor – They can help you navigate any management barriers and ensure alignment if you have a cross-functional team.
    • Action takers – Get “buy in” from the group who will take action. This is often overlooked. Our team may agree, but if no one has talked to the people who have to use the solution, you have failure just waiting to happen.
  • What will they do with what I’m building?
    • Understanding what they will do with this solution to achieve their business objective is key.
    • Business Intelligence (BI) layered on top of a Big Data solution is often dumped on the end users without an understanding of the action takers’ needs. This is another failure waiting to happen.
  • Can I define clear action items and owners of those items?
    • You should be able to define and assign a few very specific action items. For each action, understand how its owner would best like to use the solution.
    • Building this into the workflow may be a better solution than forcing a process change and leveraging a BI solution. BI is often easier and faster, so we often jump to it as the answer, but adding a flag or action right into the workflow may be the way to success. This gets back to Bernard’s point about lack of planning.
  • Are your owners signed up?
    • Measure success – Once you have all these questions answered, go back to your customer, review this list, and make sure they are signed up.
    • Wherever possible, quantify the impact of their actions so you have success metrics defined and agreed to by your action takers. Ensure they know the Executive Sponsor is expecting readouts on these success metrics. Don’t do this to instill fear but to make sure they truly buy in.
    • Many times this conversation will expose potential failure areas for the project. I like to call this the “Show Me the Money” step. If you don’t have a good business case, keep at it, or maybe this isn’t the right project to work on.

Poor communication and not having the right skills for the job both come back to proper planning. Run your project with a Project Manager; don’t just throw a few techs and a Data Scientist together and expect results.

As I’ve said before, this is a team sport. Having people on your team who can translate technical jargon into business speak is essential. Mapping out clear roles and responsibilities will also ensure you don’t have a Data Scientist wasting their time setting up joins on tables when they should be building a predictive model on another project.

I completely agree with Bernard that Data Scientists are rare and expensive. You don’t want them to quit because of your poor planning and having them do tasks that a Data Engineer should be doing.

If you have any other lessons learned from your failed projects please share them in the comments below or with me directly.

The post Are Your Big Data Projects Set Up to Fail? appeared first on InFocus Blog | Dell EMC Services.

]]>
https://infocus.dellemc.com/frank_coleman/are-your-big-data-projects-set-up-to-fail/feed/ 3
It’s Not the Size of Your Data But How You Use It https://infocus.dellemc.com/frank_coleman/size-use/ https://infocus.dellemc.com/frank_coleman/size-use/#comments Tue, 24 Feb 2015 16:05:19 +0000 https://infocus.dellemc.com/?p=22600 I often hear people brag about the size of their data. Don’t get me wrong. I work for EMC so you know I love this. But you don’t need huge databases to have significant business impact. As sexy as unstructured data is, there’s a wealth of information sitting in your structured data too. The biggest […]

The post It’s Not the Size of Your Data But How You Use It appeared first on InFocus Blog | Dell EMC Services.

]]>
I often hear people brag about the size of their data. Don’t get me wrong. I work for EMC so you know I love this. But you don’t need huge databases to have significant business impact. As sexy as unstructured data is, there’s a wealth of information sitting in your structured data too. The biggest issue I see is people don’t know how to use their data to make good business decisions. Sometimes a simple solution with a little scale can get the desired result.

As I’ve said in the past, I’m a business guy first. Value, not coolness, is the goal. Coolness is a byproduct of your success! Excel models are a good example of something the business created to add value. But because the creators didn’t have the right skills or environment, the models have limited value. They are often very manual and do not scale, and many times they only work thanks to cut corners or summarized data. Now these flawed models stay in use only because your people don’t have the skills to build them differently, which provides the perfect opening for your first Big Data project. If you’re well on your way in Advanced Analytics, don’t overlook these small gems as opportunities. Excel models can often be broken down:

  • Data Sources: how many are you using?
    • Many times they are raw data tabs in Excel
  • Joins: do I need to merge these data sources at some level?
    • Many times you need multiple tabs to come together or have to summarize many sources before they can be joined together.
  • Indexes: Do you need to add/manipulate those sources?
    • How often does the native data lack a view or rollup your business wants, forcing you to add it?
  • Formulas: Do you need to do any calculations off these data sets?
    • This can get very complex depending on the Modeler’s skill set
  • Summarizations: Do I need to summarize the data, or some of it, in order to join or use it?
    • Many times pivots are used off raw tabs and then joined with other tabs or pivots because the data only joins cleanly at summarized levels.
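The same breakdown maps almost one-to-one onto a scripted solution. Here is a hedged sketch with invented data, where a groupby replaces the pivot tab and a merge replaces the manual join:

```python
import pandas as pd

# Hypothetical raw "tabs": service cases and bookings, which an Excel
# model could only join after manual pivot-table summarization.
cases = pd.DataFrame({
    "account": ["A", "A", "B", "C", "C", "C"],
    "hours":   [2.0, 3.5, 1.0, 4.0, 2.5, 3.0],
})
bookings = pd.DataFrame({
    "account": ["A", "B", "C"],
    "revenue": [120.0, 80.0, 200.0],
})

# Summarize at the join level (what a pivot tab did manually)...
case_load = cases.groupby("account", as_index=False)["hours"].sum()

# ...then join the sources: automated, repeatable, and able to scale
# down to the lowest level of detail the database holds.
merged = case_load.merge(bookings, on="account", how="left")
merged["hours_per_dollar"] = merged["hours"] / merged["revenue"]
print(merged)
```

Once this runs on a database instead of raw Excel tabs, the summarization step becomes optional rather than forced, which is where the new insight tends to appear.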

Building a database solution rather than an Excel model opens the door to:

  • Automation: Drastically reduce manual effort of maintaining the model
  • Scale: Will not have to cut corners or use summarizations. You can go to the lowest level needed.
  • New Insight: Many times the improved scale exposes information not seen in summary level data.

Multi-million-dollar decisions are made off these models. Now you can experience the full power of the Dark Side…I mean of your Big Data Solution…even if the data is structured and not gigantic. Once you automate and scale this up, you have a database environment that can be leveraged. Now bring in the Data Scientist for even more insights that were not possible before!

The post It’s Not the Size of Your Data But How You Use It appeared first on InFocus Blog | Dell EMC Services.

]]>
https://infocus.dellemc.com/frank_coleman/size-use/feed/ 1
Data Science – How Do I Turn Insight Into Value? https://infocus.dellemc.com/frank_coleman/data-science-turn-insight-value/ https://infocus.dellemc.com/frank_coleman/data-science-turn-insight-value/#respond Tue, 02 Dec 2014 19:30:32 +0000 https://infocus.dellemc.com/?p=21725 In many of my past blogs I focused on building a data science team and getting started down the path of advanced analytics / Data Science. I originally used the slides below at EMC World this past May. They show the team and a high level processes flow. My MythBusters blog talked about our journey […]

The post Data Science – How Do I Turn Insight Into Value? appeared first on InFocus Blog | Dell EMC Services.

]]>
In many of my past blogs I focused on building a data science team and getting started down the path of advanced analytics / Data Science. I originally used the slides below at EMC World this past May. They show the team and a high-level process flow.

Build the Team

My MythBusters blog talked about our journey to the Prototype and Validate stage, or “Time to Insight”. Now I’m sharing my thoughts on getting value from your work: the Operationalize phase of that process.

Establish a Process

I refer to the Prototype & Validate phase of this process slide as “Time to Insight”. If you are fortunate enough to get this far down the path, you know it wasn’t easy. Well, get ready, because getting “Value” is even harder. The technical challenges of acquiring the data, although sometimes painful, are no match for the show you’re about to put on to get value out of your insight.

Operationalize or “Time to Value”

The second part of that process flow, Operationalize, is where you start to see the value of the insight. Here are a few lessons I’ve learned on the way:

  1. IT infrastructure – You created a model that produces an output you believe can have real business impact. But the infrastructure you used to create this “Insight” may not be the same infrastructure you use to operationalize your model or output. Here are a few questions to help you figure out what level of support you’ll need:
    1. Cadence – How often do you need this model/output run? Real-time, daily, weekly?
    2. What other data do you want to join with this data? Where are these data sets located, and how often are they updated?
    3. What applications are impacted? As an example, if you’re trying to impact workflow, most likely you will be adding something or changing logic that feeds the workers.
  2. Create a list of all the use cases for your output. You may have started with one use case in mind, but four or five more will likely pop up along the way. As an example, you create a data set that helps customer service proactively address customer problem areas. Wouldn’t this information also be helpful to Professional Services, Engineering, and others? Each group may form entirely different initiatives to operationalize your results. For each use case you should have an ROI calculated to help create a buzz and get funding to implement.
  3. Executive Sponsorship – I’ve said this a million times in my posts around “Time to Insight”. It’s even more critical for “Operationalize” because this will cost money to implement. IT budgets are tight. If your model is considered “nice to have” you are dead in the water. This is why the ROI is important. Hopefully you thought about this before you started the project in the “Time to Insight” phase.
  4. Consider the team required to operationalize your results. It may not be the same team that got you this far. You may also want to consider transitioning this to an Operations / Program Management team.
    1. You need a really strong Project/Program Manager who can “sell” the value and influence the teams to make change:
      1. Change to process
      2. Change to IT infrastructure / tools
      3. Change to culture
    2. Refocus the Data Science team on what’s next, or on expanding the existing model, so they don’t get stuck in the waiting game of “making stuff happen”.

These are just a few lessons I’ve learned along the way. I hope you found them helpful. If you are operationalizing your results I’d love to hear the challenges you may be facing that may be different from my list.

The post Data Science – How Do I Turn Insight Into Value? appeared first on InFocus Blog | Dell EMC Services.

]]>
https://infocus.dellemc.com/frank_coleman/data-science-turn-insight-value/feed/ 0