The Future of Data Professionals According to Stack Overflow
Platforms, databases, and languages you will most likely see in the future according to data developers

Introduction
Stack Overflow is an awesome developer community. There is no denying it. And the best part is that we all use it one way or another. I even found some of the Python syntax used to transform the data and create the plots for today’s article there. And all of that data is open to the public.
How can we use this data to figure out what to learn? Are there some databases, platforms, or languages that should be avoided? I will be analyzing both the survey data from 2017 to 2021 and the questions asked on Stack Overflow. While looking at the survey data and the questions data today, I want to draw some hypotheses about where we are going in the data realm. What languages will we be using? What databases? How about platforms?
Interested in seeing how I got the data and these plots? Take a look at my Kaggle notebook here.
Datasets
Surveys
The primary dataset was Stack Overflow Datasets 2011 to Present²; I combined it with the results of this year’s Stack Overflow survey, found here.¹ Much of my initial code was based on work that Kasaraneni had already done.³
Since 2017, Stack Overflow’s surveys have had 338,489 respondents in total. I removed those whose developer type was empty or null, leaving 283,272 respondents. These respondents are spread across the years as shown below, with 2019 having the most.

I kept only the respondents whose developer type contained either ‘Data’ or ‘data’; the remaining developer types were database administrator, data scientist or machine learning specialist, data engineer, and data or business analyst. I was actually surprised to see that data professionals made up at most 24.98% of respondents, in 2019.
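A minimal sketch of that filter, assuming each respondent's developer types arrive as one semicolon-separated string (the function names here are hypothetical, not taken from my notebook):

```python
def is_data_role(dev_type):
    """True when a single developer type mentions 'data' in any casing."""
    return "data" in dev_type.lower()


def data_roles(dev_type_answer):
    """Split a multi-select answer and keep only the data-related roles."""
    if not dev_type_answer:  # empty or null answers carry no roles
        return []
    roles = [role.strip() for role in dev_type_answer.split(";")]
    return [role for role in roles if is_data_role(role)]


# Example answer, made up for illustration.
answer = (
    "Developer, back-end;"
    "Data scientist or machine learning specialist;"
    "Database administrator"
)
print(data_roles(answer))
```

A respondent counts as a data professional when `data_roles` returns at least one role. Lowercasing first covers both ‘Data’ and ‘data’ in a single check.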

In addition to filtering out non-data professionals, I removed respondents who did not have any professional experience, leaving me with 64,703 respondents. These respondents are spread out over the 5 years, with the largest share, 30.53%, in 2019.
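The per-year percentages are simply each year's respondent count divided by the total. Here is a small sketch of that arithmetic; the counts in it are placeholders for illustration, not the survey's real numbers:

```python
def yearly_share(counts):
    """Return each year's share of the total, as a percentage rounded to 2 places."""
    total = sum(counts.values())
    if total == 0:
        return {year: 0.0 for year in counts}
    return {year: round(100 * n / total, 2) for year, n in counts.items()}


# Placeholder counts, NOT the real survey numbers.
example_counts = {2017: 10, 2018: 20, 2019: 40, 2020: 20, 2021: 10}
shares = yearly_share(example_counts)
largest_year = max(shares, key=shares.get)
print(largest_year, shares[largest_year])
```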

Questions
The other dataset I want to incorporate is the number of questions that have been asked on the Stack Overflow website. These questions have tags, which I will filter using the results from the survey. Stack Overflow’s trends page covers the last 12 years and cannot be filtered by date, but I want to focus only on the last 7 years.
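Once the question counts are exported, narrowing them down is a two-condition filter. The sketch below assumes each record is a dictionary with hypothetical `year`, `tag`, and `count` keys; the real export may be shaped differently:

```python
def filter_tag_counts(records, tags, last_year, window=7):
    """Keep records for the wanted tags within the last `window` years."""
    first_year = last_year - window + 1
    wanted = {tag.lower() for tag in tags}  # compare tags case-insensitively
    return [
        record
        for record in records
        if record["tag"].lower() in wanted
        and first_year <= record["year"] <= last_year
    ]


# Made-up records for illustration only.
records = [
    {"year": 2012, "tag": "python", "count": 1},
    {"year": 2016, "tag": "python", "count": 2},
    {"year": 2021, "tag": "Python", "count": 3},
    {"year": 2021, "tag": "cobol", "count": 4},
]
kept = filter_tag_counts(records, ["Python"], last_year=2021)
print(kept)
```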
Analysis
Languages
The languages data professionals most wanted to learn were, in order of popularity, Python, SQL, JavaScript, HTML/CSS, TypeScript, Bash/Shell, C#, Go, Rust, and Java.
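A ranking like this comes from counting how many respondents picked each option. As a rough sketch, assuming every answer is a semicolon-separated string of choices (`top_choices` is a hypothetical helper, not the notebook's code):

```python
from collections import Counter


def top_choices(answers, n=10):
    """Count how many respondents picked each option and return the n most common."""
    counts = Counter()
    for answer in answers:
        if not answer:  # skip empty or null answers
            continue
        # A set makes sure each respondent counts once per option.
        counts.update({item.strip() for item in answer.split(";") if item.strip()})
    return counts.most_common(n)


# Made-up answers for illustration only.
answers = ["Python;SQL", "Python;Rust;SQL", "Python", None]
print(top_choices(answers, n=2))
```

The same helper works for the database and platform questions, since those answers would follow the same multi-select shape under this assumption.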

I compared these languages with the number of times they are tagged in questions on Stack Overflow.

Questions tagged Python and TypeScript have increased, meaning those languages are being asked about more frequently, while questions tagged C# and Java have decreased. The other tags could still go either way.
Based on this data, we should stick with Python, and if you are interested in TypeScript, 2022 is the year to learn it.
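I read these trends off the plots, but the same call could be made numerically. The sketch below fits a least-squares line through a tag's yearly values and labels the direction; the 5% tolerance is an assumption I picked for illustration, not a threshold used in the analysis:

```python
def slope(values):
    """Least-squares slope of values against their positions 0, 1, 2, ..."""
    n = len(values)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    denominator = sum((x - mean_x) ** 2 for x in range(n))
    return numerator / denominator


def classify_trend(values, tolerance=0.05):
    """Label a series 'up', 'down', or 'either way' by slope relative to its mean."""
    mean_y = sum(values) / len(values) if values else 0
    if mean_y == 0:
        return "either way"  # no data or all zeros: nothing to compare against
    relative = slope(values) / mean_y
    if relative > tolerance:
        return "up"
    if relative < -tolerance:
        return "down"
    return "either way"


# Made-up yearly values for illustration only.
print(classify_trend([1, 2, 3, 4, 5, 6, 7]))
print(classify_trend([7, 6, 5, 4, 3, 2, 1]))
```

Dividing the slope by the mean keeps the tolerance comparable between a heavily used tag and a niche one.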
Databases
Similarly, the survey from data professionals showed that the top 10 databases were PostgreSQL, MySQL, MongoDB, Redis, SQLite, Microsoft SQL Server, Elasticsearch, MariaDB, Firebase, and lastly DynamoDB.

I compared these databases with the number of times they are tagged in questions on Stack Overflow.

Microsoft SQL Server, MySQL, and SQLite are trending downwards. On the other hand, MongoDB and PostgreSQL are trending upwards. The others could go either way.
In 2022, we should learn how to pull data and understand the architecture of MongoDB and PostgreSQL.
Platforms
Similarly, the survey from data professionals showed that the top platforms are AWS, Google Cloud Platform, Microsoft Azure, Heroku, and IBM Cloud or Watson.

I compared these platforms with the number of times they are tagged in questions on Stack Overflow.

Questions tagged Amazon Web Services, Microsoft Azure, and Google Cloud Platform have increased over the past 7 years. The other cloud providers are unfortunately not as desirable, but they could still go either way.
Learning any of the top 3 cloud providers will be useful in 2022.
There you have it, folks: the suggested languages, platforms, and databases to learn, according to the data professionals who answered the Stack Overflow survey and the questions asked on Stack Overflow.
Interested in seeing how I got the data and these plots? Take a look at my Kaggle notebook here.
Resources
1. https://insights.stackoverflow.com/survey/2021#developer-profile-developer-roles
2. https://www.kaggle.com/chaitanyakck/stackoverflow-datasets-2011-to-present
3. https://www.kaggle.com/chaitanyakck/eda-on-stack-overflow-survey-results-2017-2020
4. https://stackoverflow.design/brand/logo/
5. https://insights.stackoverflow.com/trends





