Design the Right Job Functions to Develop Your Data Science Team

Do you constantly search for the best data science talent? Do you often analyze job descriptions in order to design the most effective data science team? Are you looking for ways to use collaboration to capture economies of scale? In this post I describe the job functions that an effective data science team must have, and I provide the essential job descriptions and qualifications for you to modify. Whichever industry your data science team serves, these job descriptions apply to most industries and specialties.

On the other hand, if you are a novice data science job seeker exploring different types of job functions, the job descriptions below will be useful to you as well. For example, some jobs require “a good understanding of data science modeling cycle and architects projects through implementation”. If you are not familiar with the modeling process of a data science project, please visit my previous post “Data Science Modeling Process & Six Consultative Roles”.

I have written articles on a variety of data science topics. For ease of use, you can bookmark my summary post “Dataman Learning Paths — Build Your Skills, Drive Your Career”, which lists the links to all articles.

Rule 1: Job Specialization but Not Monotony

Job specialization lets employees master significant tasks quickly through learning by doing. It pays off wherever you observe economies of scale. For example, it is inefficient to ask a data scientist to master ten specialized programming languages for hours of heads-down coding while constantly distracting them with general administrative work. A data scientist who can work uninterrupted for eight hours will produce far better work, which in turn supports better data quality.

However, the downside of job specialization is a lack of widely applicable skills. Your employees may complain that their jobs are dull and monotonous. Before they voice that frustration, offer short-term rotational programs so that they can broaden their skill sets.

Rule 2: Do a Job Analysis

As the team manager, you constantly evaluate new work demands that challenge your team. Does a new demand require skill sets that your existing team does not have? How do you evaluate it? The best way is to observe and interview employees to find out how tasks are performed. Then you need to evaluate the knowledge, skills and abilities of your team against what the new work requires.
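
Once the interviews are done, the comparison between what the team knows and what the new work needs can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the names, the skill labels and the two helper functions (`skill_gaps`, `single_points_of_failure`) are hypothetical, and a real job analysis would also weigh proficiency levels and workload.

```python
from collections import Counter

# Hypothetical skill inventory gathered from interviews and observation.
team_skills = {
    "Ana": {"python", "sql", "spark"},
    "Ben": {"python", "nlp"},
    "Chen": {"sql", "etl", "airflow"},
}

# Hypothetical skills that a new piece of work demands.
required_skills = {"python", "spark", "nlp", "graph algorithms"}


def skill_gaps(team, required):
    """Return the required skills that no team member currently has."""
    covered = set().union(*team.values())
    return required - covered


def single_points_of_failure(team, required):
    """Return the required skills held by exactly one team member."""
    counts = Counter(skill for skills in team.values() for skill in skills)
    return {skill for skill in required if counts[skill] == 1}


gaps = skill_gaps(team_skills, required_skills)
fragile = single_points_of_failure(team_skills, required_skills)

print("Missing skills:", sorted(gaps))
print("Held by only one person:", sorted(fragile))
```

Skills in the first set point to hiring or training needs. Skills in the second set are candidates for the short-term rotational programs mentioned under Rule 1.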

The Essential Roles in a Data Science Team

Three job roles are essential in a data science team: (1) Product Adoption & Implementation, (2) Data Engineers, and (3) Data Scientists. The product adoption lead focuses on the partnership with customers and their leadership to drive growth. He/she should be sensitive to any potential demand for data science models and should keep delivering use cases to increase adoption. The data engineers focus on data quality and speed of delivery. To do so, they should be proficient in multiple languages and data platforms. The data scientists are responsible for modeling algorithms, creative feature engineering, model predictability and performance monitoring. All three roles interact closely with Information Technology (IT) to ensure system reliability and a good overall user experience.

Product Adoption & Implementation Lead — Job Description Example

Partner with the Senior and Executive Leadership team to drive product vision and strategies
Demonstrate a strong understanding of the users’ needs and the roadmap for growth
Establish and track KPIs focusing on customer adoption and needs
Commercialize product offerings with a strong go-to-market value and stance
Balance short-term, incremental improvements with long-term, game-changing initiatives
Work closely with User Experience (UX) and Engineering to deliver on product strategy
Be the go-to person for all partners who have questions
Define product improvements in order to integrate and offer new services with partners
Be accountable for customers from kick-off to go-live and maintenance, including building project plans, coordinating each plan to completion, creating requirements, and writing weekly status reports
Evaluate any “gaps” viewed by the customers in the process or system.

Qualifications

XX years of experience in product operations, sales engineering or similar
Exceptional people and communication skills to build strong relationships with both internal and external senior stakeholders
Great problem-solving and analytical abilities to identify, evaluate and manage opportunities
Experience working in cross-functional teams combining engineering, operations and product
Able to cope with change and to work in an agile environment

Data Scientist/Sr. Data Scientist — Job Description Example

Develop data mining, machine learning, statistical and graph-based algorithms designed to analyze massive data sets for business insights and partner with the data engineering team to ensure proper implementation and usage of algorithms.
Apply intermediate to expert-level proficiency with statistical and probabilistic modeling techniques such as regression, tree-based methods (Random Forest, GBM), neural networks, support vector machines, supervised/unsupervised clustering techniques (k-means, DBSCAN, Expectation Maximization), principal component and factor analysis, etc.
Perform Natural Language Processing (word categorization, topic modeling, application of machine learning to NLP).
Determine appropriate methods, prove viability of selected method and educate internal teams as to the analytical foundation.
Lead large scale projects that utilize online & offline data, structured & unstructured data.
Guide complex projects using a wide breadth of data science and advanced techniques.
Mentor junior team members on analytical projects or on cross-functional teams.
Review and approve methodologies used for advanced analysis projects (predictive models, clustering/segmentation, etc.) by junior team members and others.
Revise and maintain existing internal procedures to ensure quality and efficiency.
Consistently exercise independent judgment and discretion in matters of significance.
Use analytical rigor and statistical methods to analyze large amounts of data, extracting actionable insights with advanced techniques such as data mining, optimization, and machine learning (e.g., predictive models, LTV, propensity models).

Qualifications

Advanced degree (Ph.D. or M.S.) in a quantitative field such as Computer Science, Engineering, Machine Learning, or a related discipline.
XX years of experience with Python, Scala, or Java, database technologies, and distributed frameworks (Hadoop, Spark, etc.)
Ability to explain complex statistical problems and solutions to laymen.
A good understanding of the overall business, including financial acumen, the ability to convert complex data into insights and action plans, and a demonstrated in-depth understanding of the predictive modeling life cycle, with the ability to architect projects through implementation.
Excellent problem-solving skills, critical thinking and conceptual thinking abilities. Strong ability to communicate technical concepts and implications to business partners.

Data Engineer — Job Description Example

Design data marts and data models to support Data Science and other internal customers. Develop complex ETL (Extract / Transform / Load) processes. Integrate data from a variety of sources and assure data quality standards.
Develop frameworks, standards & reference material for architecture and associated products.
Design database systems and tools for real-time and offline analytic processing.
Build robust data pipelines and dynamic systems using programming skills in Python, Java or any of the major languages.
Design and develop complex, large-scale data structures and pipelines to collect, organize and standardize data in order to generate insights and address reporting needs.
Design and implement scalable data pipelines from multiple data sources such as e-commerce platforms and third-party services.

Qualifications

XX years of experience with distributed data storage systems/formats & data stores such as Snowflake, Redshift or other big data systems
In-depth understanding of the architecture of different databases
Experience working with batch processing/real-time systems using various open source technologies like Spark, MapReduce, NoSQL, Hive, etc.
Have worked with a major cloud provider such as AWS or Azure
Knowledge of data modeling, data access, and data storage techniques for big data platforms
Understand Continuous Integration/Continuous Deployment & Test Driven Development
Data modeling, data warehousing, and unstructured data skills
Collaborate with talented designers, product managers, and fellow engineers to build and plan scalable data pipelines supporting our business requirements
Implement systems to catch bugs and monitor data quality, ensuring the production data is always accurate and available