Opinion
Why I Became a Data Scientist over a Data Engineer
To help you to decide on your career

Table of Contents
- Introduction
- Product Stakeholder Collaboration
- Preference for Python over SQL
- Experimentation Oriented
- Summary
- References
Introduction
Long story short, I prefer being a data scientist over being a data engineer, but why? Perhaps the reasons I share from my personal experience will relate to what you are thinking, or at least help you decide whether to pursue either path or switch to the other role. Although the roles share the same first word, there are a ton of differences between these two positions.
A simple way to think about it is that data engineering is the before and data science is the after. We will dive deeper into this statement, but essentially, data engineers provide the base and structure that data science builds on and benefits from. It is important to note that some of the job requirements overlap, and some companies even combine these two roles. However, I think it is best to treat them as separate roles, as their focuses are vastly different. With that being said, let’s look into the reasons why I prefer data science over data engineering.
Product Stakeholder Collaboration

In a way, a data scientist is somewhat of a product manager, which can be a pro or a con in your decision. I enjoy the whole process: defining a problem statement, identifying where the data is and how it can be ingested (usually aided by a data engineer), feature engineering, model comparison, final model deployment, and analyzing the impact on product users.
Here are some of the experiences for data scientists regarding product stakeholder collaboration that you may or may not enjoy:
- Identify pitfalls of the product that users face
- Develop solutions with algorithms
- See your product changes front and center on an app
- Analyze and be proud of your impact on the business and its users
- Work more closely with the product team
- Work more on product strategy
Of course, there are always going to be some overlaps between these roles, even with these experiences discussed above. However, data engineers focus more on the data itself, whereas data scientists tend to focus more on product projects.
The users for data engineers also tend to be employees of the same workplace; for example, a data scientist might be a user of a data engineering product or project. The opposite is usually true for data scientists, whose products tend to face the outside user, the customer, although data scientists can still work on internal tools.
Preference for Python over SQL

These coding languages are pretty different, and most companies expect you to be proficient in both, whether you are a data scientist or a data engineer. But the focus is usually on SQL for data engineers, whereas for data scientists it is Python (or R). With that being said, if you do not enjoy SQL, or querying, but still like working with data, then you are most likely in the data science camp.
Here are some examples of how data scientists will use Python, and when they would use SQL:
- Python gives you access to the popular data science libraries
- Those libraries cover most of the core data science work: the machine learning algorithms
- Python can be used for deployment as well
- SQL is usually used to query the dataset beforehand, or to query the results of the model; however, some of this querying can be done in Python with pandas instead
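As a minimal sketch of that last point, the example below runs the same filter and aggregation twice: once as a SQL query and once with pandas. The `orders` table, its columns, and the in-memory SQLite database are assumptions made up for illustration, not something taken from a real project.

```python
import sqlite3

import pandas as pd

# Hypothetical toy data standing in for a real orders table.
orders = pd.DataFrame(
    {
        "user_id": [1, 1, 2, 3, 3, 3],
        "amount": [20.0, 35.0, 15.0, 50.0, 5.0, 40.0],
    }
)

# SQL route: load the frame into an in-memory SQLite database and query it.
conn = sqlite3.connect(":memory:")
orders.to_sql("orders", conn, index=False)
sql_result = pd.read_sql_query(
    """
    SELECT user_id, SUM(amount) AS total_amount
    FROM orders
    WHERE amount >= 10
    GROUP BY user_id
    ORDER BY user_id
    """,
    conn,
)
conn.close()

# pandas route: the same filter and aggregation without leaving Python.
pandas_result = (
    orders.query("amount >= 10")
    .groupby("user_id", as_index=False)["amount"]
    .sum()
    .rename(columns={"amount": "total_amount"})
    .sort_values("user_id")
    .reset_index(drop=True)
)

print(sql_result)
print(pandas_result)
```

Both routes should produce the same per-user totals, so which one you reach for is mostly a matter of where the data already lives and which language you would rather write.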
Once again, you will probably use both in either career, but the difference is which you would like to use as a majority in your day-to-day work. Sometimes you might go two weeks without using SQL, if you are focusing on just the model itself, and other times you can use SQL hourly.
Experimentation Oriented

You can certainly perform experiments as a data engineer, regarding time consumption, memory, cost, etc., but the experiments I am discussing here are experiments in the traditional sense.
Here are some of the experiments you can expect to perform as a data scientist:
- Traditional A/B testing with statistical significance
- Comparison of feature importance
- Comparison of models
- Comparison of accuracy or error metrics
- Comparison of business metrics (KPIs — Key Performance Indicators)
- Graphically/visually compare all of the above
- Comparison lends itself well to discussion with stakeholders and non-data scientist users
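To make the first item concrete, here is a rough sketch of a significance check for an A/B test on conversion rates. It assumes a two-proportion z-test, which is one common choice rather than the only one, and the function name and the visitor and conversion counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for a difference in conversion rates."""
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical counts: 200 of 2000 visitors convert on A, 260 of 2000 on B.
z, p_value = two_proportion_z_test(200, 2000, 260, 2000)
print(f"z = {z:.2f}, p = {p_value:.4f}")
print("significant at 5%" if p_value < 0.05 else "not significant at 5%")
```

In practice you would also decide the sample size and the significance threshold before running the test, but the comparison itself can be this small.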
These experiments are at the heart of data science work. Experiments and comparisons can be applied to pretty much any job, but for algorithms and statistics, experiments are key.
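The model and error-metric comparisons from the list above can be sketched in a few lines. This is only an illustration: the two "models" are hard-coded prediction lists, the numbers are invented, and MAE and RMSE are assumed as the error metrics.

```python
from math import sqrt


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Hypothetical ground truth and predictions from two candidate models.
actual = [10.0, 12.0, 15.0, 11.0]
predictions = {
    "model_a": [11.0, 12.0, 14.0, 13.0],
    "model_b": [10.0, 13.0, 15.0, 11.0],
}

# Score every model on both metrics, then pick the lowest RMSE.
scores = {
    name: {"mae": mae(actual, preds), "rmse": rmse(actual, preds)}
    for name, preds in predictions.items()
}
best_model = min(scores, key=lambda name: scores[name]["rmse"])

for name, metrics in scores.items():
    print(f"{name}: MAE = {metrics['mae']:.3f}, RMSE = {metrics['rmse']:.3f}")
print(f"lowest RMSE: {best_model}")
```

A table of scores like this is usually the starting point for the stakeholder discussion, alongside the business metrics.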
Summary
If you prefer to be more on the product side than the strictly engineering side, like Python more than SQL, and enjoy running experiments, then data science might be better for you than data engineering.
To summarize, here are some of the reasons why I prefer data science over data engineering, and maybe these coincide with your thoughts, or might be new to you:
- Product Stakeholder Collaboration
- Preference for Python over SQL
- Experimentation Oriented
I hope you found my article both interesting and useful. Please feel free to comment down below if you agree or disagree with these reasons to choose data science. Why or why not? What other reasons or situations do you think we could discuss that are important? These can certainly be clarified even further, but I hope I was able to shed some light on some of the more unique and specific reasons why I chose data science over data engineering. Thank you for reading!
I am not affiliated with any of these companies.
Please feel free to check out my profile, Matt Przybyla, and other articles, as well as subscribe to receive email notifications for my blogs by following the link below, or by clicking on the subscribe icon on the top of the screen by the follow icon, and reach out to me on LinkedIn if you have any questions or comments.
Subscribe link: https://datascience2.medium.com/subscribe
References
[1] Photo by Nick Fewings on Unsplash, (2018)
[2] Photo by Jason Goodman on Unsplash, (2019)
[3] Photo by David Clode on Unsplash, (2018)
[4] Photo by Girl with red hat on Unsplash, (2021)
