The Harsh Reality of Being an ML Researcher

What you need to expect when entering the field of ML research.

We all know that machine learning is the hottest thing to work on right now. And if you are a researcher, especially at a famous lab, start-up, or big tech company, you are at the frontier of developing what is arguably the biggest technology in human history, while earning a lot of money.

So, with this post, I definitely don’t want to talk down the ML researcher career, but I want to shed some light on what the harsh reality of being an ML researcher can look like and whether it is something for you.

For context, I have been an ML student researcher for the past 3 years, am collaborating with a researcher at Google DeepMind, have published a paper at a top conference, and have had 2 papers rejected.

Sigh… The last rejection was so stupid… I just have to explain it to you in a second.

Jobs and Requirements

Okay, let’s directly address the elephant in the room. How likely is it that you will become an ML researcher, or even an ML engineer, who earns several hundred thousand a year at a top company like OpenAI, Google, or Meta?

The short answer and harsh reality is: it is difficult. Very difficult. But not impossible.

Of course, for some, it is more difficult than for others, mainly because of one reason that I’ll get to in a second.

But the long answer is that it depends on many factors, such as your experience, passion, network, and of course, luck.

One of the most common questions that aspiring ML researchers or engineers have is: do I need a Ph.D. to get a job at a top company?

The answer is: no, but it helps. A lot. Otherwise, you need equivalent experience. Explaining how to get that is perhaps for another post, so if you are interested, let me know in the comments.

Anyway, a Ph.D. is not a requirement for most ML engineering roles, but it is often a prerequisite for ML research roles. Which makes sense, right? You need research experience.

Definitely. But that is even more the case because ML research roles and some ML engineering roles require very domain-specific expertise and experience, which are usually acquired through years of studying and working on one particular problem.

A Ph.D. gives you exactly this freedom: to spend 3–5 years focusing on one research topic and becoming an expert in that field.

A machine learning researcher, also called a research scientist, usually sits above an ML engineer in the hierarchy and earns a bit more money. So it makes sense that the role demands somewhat deeper ML problem-solving expertise. The ML engineer then helps realize the researcher's ideas. That is, if the company you are working at is large enough to differentiate between the two roles.

But okay. That said, all that is, of course, not enough! A Ph.D. is not a guarantee of success either.

I mean, I know a few PhDs, and the spectrum of skills among them is very large. Some are brilliant, some are average, and some are below average. You still need to prove yourself, for example, by publishing papers, collaborating with others, and communicating your ideas clearly.

All this is a harsh reality, but some people struggle more with that than others.

I, for example, also know PhDs who are not tryhards at all. And those are the ones who are actually brilliant. They simply love what they are doing, are super friendly and uncomplicated, and now work as researchers at Google or other amazing labs.

They just enjoyed the process of doing research and solving problems, and enjoyed those 3–5 years of their Ph.D. You really need to love what you are doing. Only then can you continue for so long without getting any crazy compensation for a few more years.

Or you are lucky to have found this passion early on in your master's or even bachelor's and have managed to produce amazing work, e.g., relevant personal projects, open-source contributions, or top published papers.

The point is: It takes time.

This is definitely not something you can achieve in 6 or probably even 12 months. As mentioned, even when only considering the bachelor's and master's and ignoring the Ph.D., people study and work for several years to get to that point.

All of this still requires a certain amount of luck, and I will get to that in a second. I can barely hold myself back from sharing the story of my recently rejected paper…

I could genuinely go on for much longer. There is so much more to uncover, but let’s look at the next thing no one tells you when getting into research.

Coding Skills

Let’s briefly have a peek at the world of physics research.

In the earlier days, a lot of physics research meant a person or small team sitting at a desk with pen and paper and coming up with theories that could be supported through little experiments on a single table.

You did need equipment, e.g., fairly expensive lasers, but nothing that a university could not afford.

Fast forward a few decades, and to make even the slightest dent in the modern Standard Model of physics, you need to work at a place like CERN, with its 27 km circular tunnel that smashes together the same kinds of particles you could study in your small lab. But with much more power.

Now, think of an ML researcher from a few years ago. Research meant you developed ideas and tried them out with simple code. Or at least simple by today's standards.

One of the greatest breakthroughs in deep learning research came back in 2015 with the ResNet and its skip connections. I don’t want to go into the details here, but essentially, the output of a neural network block became the block's output plus its input.

Do you know what that meant in terms of code?

You change this line of code.

output = NN_block(input)

To this one.

output = NN_block(input) + input

That’s it.
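To make that concrete, here is a minimal, runnable sketch in plain Python. It is a toy under stated assumptions, not the actual ResNet code: nn_block, plain_forward, and residual_forward are hypothetical names, and the "block" is just a scale-and-ReLU on a list of numbers standing in for real learned layers.

```python
# Toy illustration only: a "block" here is a plain function on a list of
# floats, standing in for a real neural network layer with learned weights.

def nn_block(x):
    """Hypothetical stand-in for a learned block: scale each value, then ReLU."""
    return [max(0.0, 0.5 * v) for v in x]


def plain_forward(x):
    # Without a skip connection: the output is just the block's output.
    return nn_block(x)


def residual_forward(x):
    # With a skip connection: add the input back onto the block's output.
    return [out + inp for out, inp in zip(nn_block(x), x)]


x = [1.0, -2.0, 3.0]
print(plain_forward(x))     # block output only
print(residual_forward(x))  # block output + input
```

The only difference between the two forward functions is the addition of the input, which is the whole point: the idea was big, but the code change was tiny.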

But now, if you work with large models, before you can actually try out something new, you need to know distributed computing, hardware optimization, and how to manage large computing infrastructure. Not to mention actually getting access to that hardware.

Before, you could look at lasers in your lab, but now you need to work at CERN.

That is why the big companies split the roles of research scientists and ML engineers!

Luckily, there are a lot of different ML domains to research that don’t require LLMs or other super-large models. Discussing them all would be outside the scope of this post, but I have a list of those ML domains that you can download completely for free here.

Now, depending on the person you are, you might be excited about the amount of software knowledge you need to work on ML research, and I, for example, also quite enjoy it. But if you are a person who wants to be a researcher because you do not want to be an engineer, then this reality might mean you need to explore a bit more and might not want to work on very large models.

What Does Doing Research Really Involve?

So okay, you have decided you want to take on the challenge and spend the time on a master's or, even better, a PhD.

What does doing research really involve? What is it like, and is it for you? At least in my opinion and experience.

There is one property of research that is exactly what draws people in and makes it exciting, but also very frustrating. In research, you are working on something new, at the state of the art! Something that has ideally never been done before.

Now, just imagine you are starting your Ph.D. The beginning is always tough. You have no idea what to research. You might have a rough direction, but first need to explore your area of interest.

In my case, for example, that is video-language modeling, which is still a very broad direction, with many different open problems.

But, okay, you are excited to get started and read a lot of papers on the different open problems, what the current state-of-the-art models are doing, and what the previous ones did.

This alone took me 2–3 months of just reading papers, thinking I had an actual idea, and then scrapping it again until I finally came up with one that I thought might work.

So, at this point, I was so excited to finally get started with coding and share my results! I felt so confident and motivated and thought that this would be the work that would get me my dream job at Google DeepMind or OpenAI or whatever!

I mean, that’s probably something that you want as well, right?

But then, of course, reality hits, and a bit of anxiety starts to build up.

Imagine you spend months working on your idea, but you never achieve good results. You encounter technical difficulties and theoretical challenges, and your method itself seems incomplete.

You realize that your hypothesis might be wrong or, worse, irrelevant. You wonder if your research is good enough or if you are wasting your time. You feel frustrated, discouraged, and, again, anxious.

This is the issue with working on something new that I mentioned. It means you don’t know if your idea will work or be any good… Can you imagine how stressful that can be?

This is the reality of most researchers, especially PhD students, who have to publish a certain number of papers to complete their PhD.

Nevertheless, you trust in the process and decide to look forward.

You remind yourself that negative results are still results and that they can help you refine your research question and design. You then look for feedback from your supervisor and peers, and you learn from their suggestions and criticisms. You read even more literature and find new perspectives and approaches, so you update your hypothesis and try again.

And guess what! You finally get some positive results. You find evidence that supports your hypothesis or that reveals something new and interesting. You eventually write a paper and want to submit it to a prestigious conference, which itself gets more and more stressful as the submission deadline approaches, but you feel proud and excited.

Only to then again have to face the harsh reality of the review process, which is often unpredictable, random, and sometimes just unfair.

I, for example, recently submitted a new paper to a conference and got two reviews. One of the reviewers gave me and my co-authors a decent rating.

Now, I have to admit I wrote the paper a few months ago, and it wasn’t the best or most relevant, but in my opinion, not too bad. So, I agreed with the first review, which still gave us an acceptance vote.

But the other one, the famous reviewer #2, gave us the lowest scores in all categories and did not even provide any detailed feedback…

I guess the reviewer had a bad day… But that obviously hurt us and is simply not acceptable. Our paper was rejected, and I have to say, I was upset and disappointed. There was not that much that I could do.

But, you know, there was this amazing experiment run in 2021 at NeurIPS, the largest AI conference, which pretty much showed that about 50% of the papers accepted by one committee would have been rejected by a second, different one… In other words, simply swapping out the reviewers would have flipped the decision for about half of the accepted papers.

In 2014, 49.5% of the papers accepted by the first committee were rejected by the second (with a fairly wide confidence interval as the experiment included only 116 papers). This year, this number was 50.6%. ~ NeurIPS study
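To get a feeling for what a "fairly wide confidence interval" means here, this is a rough back-of-the-envelope sketch. It assumes, purely for illustration, that the 49.5% was estimated over all 116 papers and uses the textbook normal approximation for a proportion. The function name normal_approx_interval is made up for this post, and the actual analysis in the study may well differ (if the percentage was computed over fewer papers, the interval would be even wider).

```python
import math


def normal_approx_interval(p, n, z=1.96):
    """Approximate 95% interval for a proportion p observed over n items.

    Uses the normal approximation p +/- z * sqrt(p * (1 - p) / n).
    Assumption for illustration: n is the number of papers behind the estimate.
    """
    se = math.sqrt(p * (1.0 - p) / n)
    return p - z * se, p + z * se


# Numbers taken from the quote above; n = 116 is an assumed denominator.
low, high = normal_approx_interval(0.495, 116)
print(f"roughly {low:.1%} to {high:.1%}")
```

Under these assumptions, the plausible range spans many percentage points on either side of 49.5%, which is why a single number like "50%" should be read as a rough estimate.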

Well, anyway, after realizing that the review process is not perfect and that it does not always reflect the true value of your work, we looked at the positive aspects of the feedback and used it to improve our paper.

We then resubmitted it to another conference and are currently waiting for the reviews. So I guess we can still find some comfort in knowing that we are not alone.

All of this is the harsh reality of being or wanting to be a top-earning machine learning researcher. At least, that is what I have experienced until now. I am sure I missed a lot of other details.

But before you can even get to this point, you have to avoid a plethora of common beginner mistakes, so you might want to read this post next, where I share 7 mistakes you might be making already!

7 Mistakes Beginner ML Students Make Every Year

Don’t study LLMs! You’re making a mistake!

pub.towardsai.net

Thanks for reading. Ba-Bye!