Don’t Do Data Science, Solve Business Problems
The term ‘Data Scientist’ has become shorthand in modern business speak for an individual with almost every data-centric skill there is. Organizations that want to hire Data Scientists look for the ‘Unicorn’: a professional with a skill set so wide and deep that such a person practically doesn’t exist. The Data Science Venn Diagram (popularized by Data Scientist Drew Conway) helps visualize this broad set of skills.
When most people look at this diagram, they instantly think that it defines a Data Scientist: that anyone who claims the title should be an expert at Hacking and at Math & Statistics, AND have Substantive Expertise in the field they work in. In practice, individuals who are truly experts in all three areas are few and far between (if you are a true expert in all three, you’re probably working at Google, Microsoft, or Facebook).
The diagram was never meant to define a Data Scientist (the person); it defines Data Science (the field). To be a Data Scientist, one must be exceptional in at least one or two of the categories and have sufficient knowledge of the others to accomplish the objective at hand.
Some companies are getting better at sorting Data Scientists into more appropriate job categories. Airbnb, which employs one of the most mature Data Science teams in the world, recently split its Data Science organization into three tracks (Analytics, Algorithms, and Inference) to communicate and structure value more effectively. Many companies now list job titles such as “Machine Learning Engineer” or “Research Scientist” to emphasize how different the roles across the Data Science continuum are.
But while we’re getting better at defining Data Science and segmenting its many sub-domains into appropriate job titles, functions, and tasks, we’re still missing the point. Data Science is not about algorithms, advanced technical skills, or specialized degrees — it’s about solving problems.
Baking Bread vs. Building the Oven
Organizations are more concerned with the academic and technical complexity of their Data Science teams than with the value those teams bring to the business. In fact, I would bet that 2 out of 3 of you reading this blog post can think of a Data Science project at your company that has received significant investment but has yet to show business value. Why is that?
Businesses want fresh baked bread, but are hiring electrical engineers instead of bakers.
Cassie Kozyrkov, Chief Decision Intelligence Engineer @ Google, explains the analogy:
Imagine hiring a chef to build you an oven or an electrical engineer to bake bread for you. When it comes to machine learning, that’s the kind of mistake I see businesses making over and over.
If you’re opening a bakery, it’s a great idea to hire an experienced baker well-versed in the nuances of making delicious bread and pastry. You’d also want an oven. While it’s a critical tool, I bet you wouldn’t charge your top pastry chef with the task of knowing how to build that oven; so why is your company focused on the equivalent for machine learning?
Are you in the business of making bread? Or making ovens?
Ms. Kozyrkov goes on to explain that businesses so often fail at machine learning because they poorly understand the difference between research and application, i.e. building the oven vs. baking bread.
Research experts with highly specialized degrees can be incredibly valuable in the right situation (for example, if your business or product is the algorithm). Most businesses, however, don’t need one. What they need is a ‘baker’: someone who can ‘bake bread’, sell it, and distribute it effectively using a kitchen that has already been built somewhere else. If you manage a team and want to hire the research expert, go right ahead, but make sure you pair her or him with someone who can truly solve your problem. Otherwise, don’t complain when you’re not getting the value you thought you would. If you’re saying to yourself, “I need someone who’s a great baker AND an engineer,” good luck. Those individuals do exist, but they’re probably working at Google or Facebook and making more than you do. Instead of hunting for the “Unicorn,” build a team designed for the problems specific to your business.
If you’re currently in a Data Science role — what challenges does your current company face? Do you understand those challenges extremely well? Can you clearly measure the business value of the projects you’re working on?
For aspiring Data Scientists — what types of problems are you interested in? Healthcare? Business? Self-driving cars? Before you take a deep dive into a technology course or a PhD, do you understand the problems that you would like to solve?
Do you want to build ovens or bake bread?
You Don’t Need AI
Contrary to popular belief, AI is not a magic bullet. In Harvard Business Review’s July 2017 cover story, The Business of Artificial Intelligence, Facebook’s AI boss, Joaquin Candela (speaking of Unicorns), vents his frustrations on this point:
“What frustrates me,” he says, “is that everybody knows what a statistician is and what a data analyst can do. If I want to know ‘Hey, what age segment behaves in what way?’ I get the data analyst.
“So when people skip that, and they come to us and say, ‘Hey, give me a machine learning algorithm that will do what we do,’ I’m like, ‘What is it that I look like? What problem are you trying to solve? What’s your goal? What are the trade-offs?’” Sometimes they’re surprised that there are trade-offs. “If that person doesn’t have answers to those questions, I’m thinking, ‘What the hell are you thinking AI is?’”
They are thinking it’s magic.
“But it’s not. That’s the part where I tell people, ‘You don’t need machine learning. You need to build a data science team that helps you think through a problem and apply the human litmus test. Sit with them. Look at your data. If you can’t tell what’s going on, if you don’t have any intuition, if you can’t build a very simple, rule-based system — like, Hey, if a person is younger than 20 and living in this geography, then do this thing — if you can’t do that, then I’m extremely nervous even talking about throwing AI at your problem.’
Mr. Candela’s insights are especially pointed given our current obsession with all things AI and ML. The hard truth is this: you probably don’t need AI, at least not yet. Before you can get to machine learning that actually has an impact on your business, you need a very specific, well-defined business problem.
In my personal experience, most businesses haven’t defined the problem well enough to apply even a set of simple rules to it. As Mr. Candela emphatically states, if you haven’t gotten that far, how can we begin discussing AI? The bottom line is that we can’t.
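As a rough illustration of the “very simple, rule-based system” Mr. Candela describes, here is a minimal Python sketch. Only the age threshold comes from his example; the Customer record, the region name, and the action labels are hypothetical placeholders, not taken from any real system.

```python
from dataclasses import dataclass


# Hypothetical customer record; the two fields mirror the rule in the quote
# ("younger than 20 and living in this geography").
@dataclass
class Customer:
    age: int
    region: str


AGE_LIMIT = 20           # threshold taken from the quoted example
TARGET_REGION = "north"  # assumed geography, for illustration only


def choose_action(customer: Customer) -> str:
    """Apply one hand-written business rule and return an action label."""
    if customer.age < AGE_LIMIT and customer.region == TARGET_REGION:
        return "send_offer"  # hypothetical action
    return "do_nothing"


customers = [Customer(18, "north"), Customer(35, "north"), Customer(19, "south")]
actions = [choose_action(c) for c in customers]
print(actions)  # ['send_offer', 'do_nothing', 'do_nothing']
```

If you can’t write down even one rule like this for your problem, that is a sign the problem isn’t defined well enough yet.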
Don’t Do Data Science, Solve Business Problems
Forget about Data Science for a minute and make a concerted effort to unravel problems and plan how to solve them. If you do this, a funny thing will happen: the technology, algorithm, or technique you need to apply will make itself apparent. You’ll become good, or even expert, at it because you won’t just be hacking or calculating; you’ll be solving a real, practical problem.
Here are a few suggestions for Data Scientists and those on Analytics teams who want to apply this idea more fully:
- Become a scientist of the business. Spend a little less time learning new algorithms and Python packages and more time learning the levers that make your specific business go up or down and the variables that impact those levers. Identify the data sources that feed those variables; the intersection is usually where you will find high-value opportunities.
- Be ruthless in prioritizing and accepting projects. Before moving forward on a DS project, evaluate 1) the action that will be taken with the output and 2) the business value that action will create. If the action isn’t clear and the value isn’t high, don’t waste your time. Side note: Data Science is NOT Business Intelligence. BI is an important IT function that maintains the integrity of data sources and dashboards; your job as a Data Scientist is to solve problems in the business.
- Don’t expect stakeholders to always (or ever) be able to define the problem. In my opinion, the ability to clearly evaluate and define a problem is the most important skill for a Data Scientist, above any technical expertise. Most business stakeholders have problems but haven’t thought about them long enough to define the process behind them. This is where you will make Machine Learning and AI work for your organization: by translating the needs of the business into a process where Data Science can be applied effectively.
- Make yourself part of the business. Do not under any circumstances become siloed. Proactively get involved with the business unit as a partner, not a support function.
Good luck solving problems.
