Acing the Data Science Take-home Assignment
Whether take-home assignments serve their purpose, and how to conquer them.

Nowadays, many companies include take-home assignments in their Data Scientist recruitment process. In this post, I will share my thoughts about these tests and some tips that I found handy when facing them.
What is a take-home assignment?
Similar to the coding test for Software Engineers, a take-home assignment is a preliminary filter that helps recruiters find the candidates with the best skills and commitment to proceed to the next rounds of interviews. It usually comes after the screening round and before the first technical interview.
In the assignment, the candidate is given a dataset and is asked to do “something” with it, normally within a week. That “something” can be clear, with transparent evaluation metrics, but can sometimes be very ambiguous. The topic and difficulty level vary greatly, depending on the preferences and standards of the company.
As an example, here are some assignments that I encountered when looking for a Data Scientist job last year:
- Build a model and visualize the classified regions on the Iris Dataset.
- Build a housing image classification model and a demo service.
- From the Seattle Airbnb Dataset, think of a business problem for Airbnb, and solve it.
- Given IDs of previous online advertisements and the customers who clicked on them, propose a strategy for the next advertising campaign.
The deliverables are normally the source code and a few short slides that explain the work.
Is the take-home assignment a good thing?
I will say it upfront: I personally think that, overall, the take-home assignment costs both candidates and employers too much time and effort, while not always serving its purpose well. In short, it is bad.
It is bad for the candidate, for three reasons:
- It will take the candidate roughly 10 hours or more to properly complete the assignment. That’s a big investment, considering that it’s only an early stage of the recruitment process.
- After putting a huge effort into the assignment, the candidate more often than not receives only a short rejection email without any feedback. This discouragement hurts the candidate’s motivation and confidence.
- With such risks, the assignment creates an entry barrier that stops candidates from exploring new opportunities, or from applying to multiple companies simultaneously. Imagine this: if you applied to 4 companies and received their assignments in the same week, that would mean 40 hours’ worth of work, just like a full-time job.
It is also bad for the recruiter, because:
- Finding the dataset, crafting the assignment, and evaluating the submitted answers will take a huge amount of time and effort.
- Many outstanding candidates would not want to work on such time-consuming tasks. So, by trying to filter out unqualified candidates, the recruiter might also have sent away the elite ones.
Compared to the Data Science assignment, the coding test for Software Engineers is much better. It is easy to set up (nowadays many platforms provide this service), easy to judge (if the code runs, it runs; if it doesn’t, it doesn’t), and costs both sides only a single session of no more than 90 minutes.
My bag of tricks to pass the Data Science assignment
The take-home assignment is bad. However, since Data Scientist is still a relatively new role, many organizations have not yet devised a standardized procedure to properly test candidates’ skills. Thus, the take-home assignment has become the standard. And you, as the candidate, will need to comply.
In this section, I will share some actionable tips that I found very useful for passing the assignment round with ease.
Before the assignment

- Practice by doing Kaggle or personal Data Science projects. The assignment is very similar to a Kaggle competition, and sometimes easier, so those are the best places to do your mock tests. Practicing on them will equip you with both the confidence and the experience to conquer this round.
- Prepare your working framework. Regardless of the dataset, most Data Science solutions follow the same pattern: exploratory analysis, data processing, feature engineering, model building, hyperparameter tuning, training and validating. Knowing that, you can prepare a templated notebook with all those sections ready. When the data arrives, instead of being overwhelmed and panicking, you will know exactly which steps to follow.
- Prepare utility functions. When practicing Data Science, you will realize that some functions are used again and again. Those can be functions for loading and visualizing data, handling missing values, generating features, etc. You should try to generalize those functions and store them on your GitHub. During the assignment, instead of coding everything from scratch, you can simply reuse them. This will save you significant time and effort.
- Research the hiring company. Coming up with a Data Science assignment is hard, so many companies tend to stick to a small pool of problems. Thus, researching the company on Glassdoor might give you some insights. Hints about the possible questions might also come from the nature of the company’s business. For example, an e-commerce firm may give a recommendation-related problem, a consulting firm may give a vanilla classical machine learning problem, etc.
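To make the utility-function tip above more concrete, here is a minimal sketch of the kind of helpers you might keep in your own toolbox. Everything in it is an assumption for illustration: the function names (missing_value_report, fill_missing_with_median), the list-of-dicts data layout, and the sample rows are all made up, and the code uses only the Python standard library so that it runs anywhere. In practice you would probably write the same helpers on top of whichever data library you normally use.

```python
from statistics import median


def missing_value_report(rows, columns):
    """Count missing (None) entries per column in a list of dict rows."""
    return {
        col: sum(1 for row in rows if row.get(col) is None)
        for col in columns
    }


def fill_missing_with_median(rows, column):
    """Return new rows where None in `column` is replaced by the column median.

    The input rows are not modified, so the raw data stays available.
    """
    observed = [row[column] for row in rows if row.get(column) is not None]
    if not observed:
        raise ValueError(f"column {column!r} has no observed values")
    fill_value = median(observed)
    return [
        {**row, column: fill_value if row.get(column) is None else row[column]}
        for row in rows
    ]


# Made-up sample data, loosely shaped like a housing dataset.
rows = [
    {"price": 100, "beds": 2},
    {"price": None, "beds": 3},
    {"price": 300, "beds": None},
]

report = missing_value_report(rows, ["price", "beds"])
cleaned = fill_missing_with_median(rows, "price")
```

Returning new rows instead of editing the input in place is a deliberate choice here: during an assignment you often want to compare several cleaning strategies against the same raw data.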
During the assignment

- Clarify the dataset and expectations. Don’t jump straight to coding right after receiving the data. Ask questions on how the assignment will be graded or if you have any doubts about the dataset. You need to ensure you are super clear about the problem before spending your next 10 hours working on it.
- Go straight to the point. In order to impress the hiring manager, you might be tempted to do extensive exploratory analysis, plot beautiful figures, thoroughly research the pros and cons of each possible algorithm, etc. DON’T. This will distract you from reaching the ultimate goal: providing the solution. After understanding the dataset, go on to the processing step, then the model, then the prediction. When all the core steps are done and the model achieves satisfactory performance, then you can go for the extras. Otherwise, your work might end up with many beautiful figures and a horrible accuracy.
- Follow software engineering coding standards. Being a Data Scientist doesn’t mean you can write crappy code. After you are done experimenting and have finalized your solution, refactor your code. Make it readable, give variables and functions proper names, write explanatory comments, etc.
- Present your work properly and tell a good story. You have done a good job completing the assignment; now is the time to sell it. Explain your workflow, your choice of model, why you think your model has achieved good performance, what other things you have tried, etc. Keep this in mind: the recruiter has very limited time to look at your work, so make your explanation as short and clear as possible, and make it a pleasant reading experience.
- Use a notebook whenever possible. A notebook will allow you to present your code, analysis, plots, and explanations, with a nice flow and format. This will make it much easier for the employer to understand and appreciate your work.
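As a small illustration of the “go straight to the point” tip, one way to get an end-to-end result early is to start from a trivial baseline and measure it before investing in anything fancier. The sketch below is only an assumption of how that could look: the function names, the labels, and the tiny train/test split are all hypothetical and are not taken from any real assignment. It uses only the Python standard library.

```python
from collections import Counter


def train_majority_baseline(labels):
    """Return the most frequent label seen in the training data."""
    if not labels:
        raise ValueError("cannot fit a baseline on empty labels")
    return Counter(labels).most_common(1)[0][0]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on empty inputs")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Made-up labels, loosely shaped like an ad-click problem.
train_labels = ["no_click", "no_click", "click", "no_click"]
test_labels = ["no_click", "click", "no_click", "no_click"]

majority = train_majority_baseline(train_labels)
predictions = [majority] * len(test_labels)
baseline_accuracy = accuracy(test_labels, predictions)
```

Any real model you build afterwards should beat this number. If it does not, that is a sign to revisit the data processing before spending time on extra plots.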
As an example, here is one of my submissions that passed the assignment round.
After the assignment

- Thoroughly review your work. Having passed the assignment round does not mean that you have parted ways with it for good. There’s a good chance that your work will be a topic of discussion during the next round of interviews. Thus, be prepared to smoothly explain the what and why of every step you took. In particular, make sure you understand the algorithm behind your model to the bone. Using something you don’t understand is even worse than not knowing it at all.
- See if you enjoyed solving the problem given to you, because it might very well reflect what your future tasks would be like should you join the company. To test the candidate’s compatibility, many companies use their real data and real problems in the assignment.
- Ask for feedback. Don’t be overhyped if you pass this round, nor run off crying if you are unfortunately rejected. Keep your cool and ask for feedback about your work so that you can grow technically. You have worked hard and submitted a respectable answer, so you deserve equally respectful feedback.
Conclusion
To sum up, I think that a take-home assignment is not ideal to include in the interview process, for both the company and the candidate. Yet, if you have to face one, there are things you can do to conquer it with optimized time and effort:
- Before the assignment: gain experience by doing Kaggle, get your framework and utility functions ready, and do some research about the company you applied to.
- During the assignment: clarify your doubts, then proceed straight to the solution before doing extra exploration, follow good coding standards, and present your work properly.
- After the assignment: thoroughly review your work, see if you would enjoy doing similar tasks in the future, and ask for feedback.
I hope that some of my tips will be useful to you in acing the take-home assignment.
Thank you for reading.
