Machine Learning Engineer: the new “sexiest” job in the AI and Machine Learning domain?

Summary

Machine Learning Engineers have higher reported salaries than Data Scientists, with a difference of about 14,000 USD/year, or 10% of the average yearly compensation.

Abstract

According to a Machine Learning model trained to predict reported salaries from various features, Machine Learning Engineers have higher reported salaries than Data Scientists. The average difference between the model predictions for these two job titles is about 14,000 USD/year, or 10% of the average yearly compensation. The largest absolute difference between SHAP values for predicted Machine Learning Engineer and Data Scientist yearly compensations is for Senior level (Expert) positions, while the largest relative difference is for Entry level (Junior) positions. Both the absolute and the relative gap are largest for non-remote work and for medium-sized companies (50 to 250 employees). By location, the largest difference in absolute and relative terms is for employees working and residing in leading EU markets such as Germany and France.

Opinions

The author believes that the difference between SHAP values for predicted Machine Learning Engineer and Data Scientist yearly compensations grew between 2022 and 2023, both in absolute and relative terms, so the gap between these job titles is widening over time.
The author suggests that the largest relative difference between these job titles is for Entry level (Junior) positions.
The author indicates that both the absolute and the relative gap are largest for non-remote work and for medium-sized companies (50 to 250 employees).
The author states that the largest difference in absolute and relative terms is for employees working and residing in leading EU markets such as Germany and France.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# X_test, shap_values and expected_values are expected to be defined earlier in the notebook.

def plot_gap(col, main_col="job_title", value1="Machine Learning Engineer", value2="Data Scientist"):
    df_infl = X_test.copy()
    # SHAP contribution of the job title for every test row
    df_infl['shap_gd'] = shap_values[:, list(X_test.columns).index(main_col)]

    # Mean and standard deviation of that contribution per (col, job title) pair
    df1_mean = pd.pivot_table(df_infl, values=['shap_gd'], index=[col, main_col], aggfunc='mean')
    df1_std = pd.pivot_table(df_infl, values=['shap_gd'], index=[col, main_col], aggfunc='std')

    # Absolute gap: mean SHAP value for value1 minus mean SHAP value for value2
    df2_mean = pd.pivot(df1_mean.reset_index(), index=col, columns=main_col, values='shap_gd')[[value1, value2]].dropna(axis=0)
    df2_mean['gap'] = df2_mean[value1] - df2_mean[value2]

    # Spread of the gap: the two standard deviations added in quadrature
    df2_std = pd.pivot(df1_std.reset_index(), index=col, columns=main_col, values='shap_gd')[[value1, value2]]
    df2_std['std'] = np.sqrt(df2_std[value1]**2 + df2_std[value2]**2)

    df2 = df2_mean[['gap']].join(df2_std[['std']], how='inner')
    df2 = df2.dropna(axis=0).sort_values('gap', ascending=False)

    plt.figure(figsize=(12, 8))
    plt.bar(x=df2.index, height=df2['gap'])
    plt.errorbar(df2.index, df2['gap'], yerr=df2['std'], fmt="o", color="r")
    plt.title(f'SHAP value of gap per {col}, yearly compensation')
    plt.ylabel('kUSD/year')
    plt.tick_params(axis="x", rotation=90)
    plt.show()
    print()
    print()

    # Relative gap: absolute gap divided by the average predicted pay of each category of col
    df_infl['shap_'] = shap_values[:, list(X_test.columns).index(col)]
    df2['avg_pay'] = expected_values + df_infl.groupby(col)['shap_'].mean()
    df2['avg_pp'] = 100 * df2['gap'] / df2['avg_pay']
    df2 = df2.sort_values('avg_pp', ascending=False)

    plt.figure(figsize=(12, 8))
    plt.bar(x=df2.index, height=df2['avg_pp'])
    plt.errorbar(df2.index, df2['avg_pp'], yerr=100 * df2['std'] / df2['avg_pay'], fmt="o", color="r")
    plt.title(f'Gap per {col} relative to average pay')
    plt.ylabel('Percentage points')
    plt.tick_params(axis="x", rotation=90)
    plt.show()

# One pair of plots (absolute and relative gap) for every feature except the job title itself
for col in X_test.columns:
    if col != 'job_title':
        print(col)
        plot_gap(col)
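To make the gap computation easier to follow without the plotting code, here is a minimal sketch on a tiny made-up table. The name `gap_table`, the frame `toy`, and all numbers are hypothetical and are not taken from the salary dataset; the sketch only mirrors the two steps of the function above (difference of mean SHAP values, standard deviations combined in quadrature) and treats the two job-title groups as independent.

```python
import numpy as np
import pandas as pd

# Toy stand-in for X_test plus the job-title SHAP column (all values made up, kUSD/year).
toy = pd.DataFrame({
    "company_size": ["M", "M", "M", "M", "L", "L", "L", "L"],
    "job_title": ["Machine Learning Engineer", "Machine Learning Engineer",
                  "Data Scientist", "Data Scientist"] * 2,
    "shap_gd": [12.0, 8.0, -3.0, -1.0, 6.0, 4.0, 1.0, 3.0],
})

def gap_table(df, col, main_col="job_title",
              value1="Machine Learning Engineer", value2="Data Scientist"):
    """Mean SHAP gap (value1 minus value2) per category of `col`, with a combined spread."""
    stats = df.groupby([col, main_col])["shap_gd"].agg(["mean", "std"])
    mean = stats["mean"].unstack(main_col)
    std = stats["std"].unstack(main_col)
    out = pd.DataFrame({
        "gap": mean[value1] - mean[value2],
        # Standard deviations of the two groups are added in quadrature.
        "std": np.sqrt(std[value1] ** 2 + std[value2] ** 2),
    })
    return out.dropna().sort_values("gap", ascending=False)

result = gap_table(toy, "company_size")
print(result)
```

In this toy table the gap is 12 for "M" and 3 for "L", so "M" is listed first, in the same way the bar charts are sorted by descending gap.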

Work year

The difference between SHAP values for predicted Machine Learning Engineer and Data Scientist yearly compensations grows between 2022 and 2023, both in absolute terms:

and in relative terms:

In other words, the gap between these job titles becomes larger over time.
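The "relative terms" here follow the formula used in the plotting function: the absolute gap is divided by the average predicted pay of the category, which is the explainer's base value plus the mean SHAP contribution of that category. The sketch below illustrates this with invented numbers; `expected_value`, `toy`, and `gap` are hypothetical stand-ins, not results from the dataset.

```python
import pandas as pd

# Hypothetical values in kUSD/year; expected_value stands in for the explainer's base value.
expected_value = 140.0
toy = pd.DataFrame({
    "work_year": [2022, 2022, 2023, 2023],
    "shap_work_year": [-6.0, -4.0, 1.0, 3.0],  # made-up SHAP contribution of work_year per row
})
gap = pd.Series({2022: 10.0, 2023: 15.0}, name="gap")  # made-up absolute gaps

# Average predicted pay per year: base value plus mean SHAP contribution of that year.
avg_pay = expected_value + toy.groupby("work_year")["shap_work_year"].mean()

# Relative gap in percent of the average pay for that year.
relative_gap = 100 * gap / avg_pay
print(relative_gap.round(2))
```

With these invented inputs both the absolute gap and the relative gap are larger for 2023 than for 2022, which is the pattern described above.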

Experience level

While the largest absolute difference between SHAP values for predicted Machine Learning Engineer and Data Scientist yearly compensations is for Senior level (Expert) positions:

the largest relative difference (about 12%) is for Entry level (Junior) positions:
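The absolute and relative rankings can disagree because the relative gap divides by the average pay of each level, and that average is much lower for junior positions. A small illustration with invented numbers (they are chosen only to reproduce the pattern, not taken from the dataset):

```python
# Illustrative numbers only (kUSD/year); they are not taken from the dataset.
levels = {
    "Entry level": {"gap": 9.0, "avg_pay": 75.0},
    "Senior level": {"gap": 15.0, "avg_pay": 150.0},
}

# Relative gap in percent of the average pay at each level.
relative = {name: 100 * v["gap"] / v["avg_pay"] for name, v in levels.items()}

largest_absolute = max(levels, key=lambda name: levels[name]["gap"])
largest_relative = max(relative, key=relative.get)

print(largest_absolute, largest_relative, relative)
```

Here the senior gap is larger in kUSD/year, but the entry-level gap is the larger share of pay (12% versus 10%).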

Company size

Both the absolute and the relative gap are largest for medium-sized companies (50 to 250 employees):

Residence location

Finally, the largest difference in absolute and relative (about 20%!) terms is for employees working and residing in leading EU markets such as Germany and France:

I hope these results are useful to you. If you have questions or comments, do not hesitate to write them below or reach me directly through LinkedIn or Twitter.

