How to Become a Data Scientist

By Frank Coleman, Senior Director, DELL EMC Services, March 13, 2014

Since I’m fortunate to work with several Data Scientists, all with varying backgrounds and work histories, I’m often asked for career path advice by aspiring Data Scientists. What better way to advise than to share the experiences of the Data Scientists I work with most closely? I sat down recently with Oshry Ben-Harush, and here is what he shared:

FC: How did you become a Data Scientist?

OBH: After spending time in the military, I decided to obtain a degree in Computer Science, specifically electrical/computer engineering. Following my BS, I worked in Engineering at Intel. From there I decided to get my Master’s in speech processing/time series signal processing and worked as a teaching assistant. I then started my PhD, but after a year of research decided to take a role at CheckPoint (a software security company), creating text monitoring software and developing algorithms aimed at preventing sensitive information from leaking.

In May 2012 I was recruited to EMC to build a Data Science team. I was curious, as I’d never heard of Data Science. We were using machine learning, but I hadn’t heard this term before. Building this capability using business data and adding value was very intriguing to me.

FC: What is your education background?

OBH: I have a BS and a Master’s in Electrical and Computer Engineering from Beer Sheva Technical College and Ben-Gurion University of the Negev, respectively.

FC: What skills do you think are essential for a successful Data Scientist?

OBH: The following skills are required to be a Data Scientist:

  • Integration – You need to know more than the science: you must understand the business and its problems. You need to talk with data experts and understand how to manage a project. You must have a broad view of the business.
  • Rapidly learn and adapt – Projects sometimes have little to no similarity in content and domain knowledge. The technology and algorithms are your toolbox, and these evolve with you, but you have to learn the domain rapidly in order to map the business problem to a form that this set of algorithms can be applied to.
  • Simplify your findings – Present your findings simply but informatively. Being good at this requires devotion. Excellent technical skills aren’t enough; you have to be able to explain your work well, or it will fall flat.

FC: What are the common tools you use in your everyday work?

OBH: I use the following on a regular basis:

In Memory Scripting languages: R, Python, Scala, Linux, Bash Shell or any other scripting mechanism for Linux

Big Data Platforms: Pivotal Greenplum Database (MADLib, HAWQ), Hadoop (MapReduce, HBase, Hive, Pig)

Visualization Tools: Tableau, ggplot for R, Python matplotlib

Text Editors: Vim

FC: What are the most important characteristics needed to be a Data Scientist?

OBH: Curiosity, persistence and a “data-loving” nature.

FC: What is the coolest or most impactful project you have worked on?

OBH: Two projects come to mind. When I first started out, we were assigned to predict when servers would crash. We were trying to describe the behavior of the server to see when it was misbehaving and then predict when it was going to happen. This was a very successful project and was a lot of fun.

I’m working on the second project right now. It’s so exciting because we are exposing so much potential; the sky’s the limit. I can’t share the details since it’s not complete, but stay tuned.
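
As a rough illustration of the idea behind the first project (describe a server's normal behavior, then flag readings that depart from it), here is a minimal Python sketch. It is not the method Oshry's team used, which the interview does not describe. The function name `flag_anomalies`, the CPU-load metric, the window size, and the threshold are all assumptions chosen for illustration.

```python
from statistics import mean, stdev

def flag_anomalies(readings, window=5, threshold=3.0):
    """Return indices of readings that deviate from the mean of the
    preceding `window` readings by more than `threshold` standard deviations.

    Hypothetical helper for illustration only.
    """
    flagged = []
    for i in range(window, len(readings)):
        history = readings[i - window:i]   # trailing window = "normal" behavior
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            continue                       # flat history: z-score is undefined, skip
        z_score = abs(readings[i] - mu) / sigma
        if z_score > threshold:
            flagged.append(i)
    return flagged

# Made-up CPU load samples with one obvious spike at index 6.
cpu_load = [0.30, 0.32, 0.31, 0.29, 0.30, 0.31, 0.95, 0.30]
print(flag_anomalies(cpu_load))  # [6]
```

A real crash-prediction system would likely combine many metrics and a trained model. This sketch only shows the "describe normal, then detect misbehavior" step on a single series.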

FC: How do you keep current with industry trends?

OBH: Personal development is a requirement and we ensure the team continues its education via formal training and industry conferences. Our Data Science team has the benefit of having a tight relationship with a local university. One team member maintains this relationship. We frequently attend training and run our own weekly seminars taught by my internal team.

FC: What guidance would you give aspiring Data Scientists?

OBH: Focus your attention on these three areas:

1) Get Educated – If your background is math-based, start with Statistics, go to machine learning algorithms and data analysis, then big data platforms. While taking these courses you can combine learning R and Python. You can find decent courses on Coursera and take them for free.

2) Get Real Examples – Look around your business unit (BU), because you already understand its challenges. Identify those challenges, try to apply machine learning algorithms to them, and show the results to your management chain. It won’t be easy, but it could be convenient because you know this domain. Many of the challenges that take time involve domain knowledge. Show two examples of how you solved a problem using Data Science. If you have a group of Data Scientists in your organization, offer up some of your time for free (with your manager’s approval, of course).

3) Get Involved in the Data Science Community – Participate in Meetup groups to discuss data science and data science projects.

 

A special “Thank You” to Oshry for sharing his experiences and suggestions! Hopefully this will help aspiring Data Scientists get started in a career in Advanced Analytics/Data Science. In future posts I will interview other Data Scientists to demonstrate their varying backgrounds and highlight their similarities and differences.

About Frank Coleman


Frank is a Senior Director of Business Operations for Dell EMC Services. He is living the world of Big Data in this role, as he is responsible for using advanced data analytics to improve the customer experience with Dell EMC’s services organization.

This role keeps Frank immersed in Big Data, and he is at the cutting edge of using Big Data to solve real business problems. Frank has a strong blend of technical knowledge and business understanding, and has spent the last nine years focused on the business of service.

Under his leadership, EMC was honored in mid-2012 for the third consecutive year with the Technology Services Industry Association (TSIA) STAR Award for “Excellence in the Use of Metrics and Business Intelligence.” Prior to joining EMC, Frank worked in various field and remote technical support roles.


8 thoughts on “How to Become a Data Scientist”

  1. Fantastic Article!!
    It’s so insightful and inspiring for aspiring Data Scientists like us.
    Thanks so much for bringing this out, OBH and Frank.

  2. Very Interesting and informative indeed…Thanks for such an effort. It is the best article for someone who wants to kick start in Data science and analytics field…

  3. Great article! I had been doing a couple of these already and it’s really good to get some validation and learn the other areas to focus on.

    Thanks Frank and Oshry!