Daniel García

Summary

The article outlines a structured approach to learning machine learning (ML) in 2024, emphasizing foundational knowledge in Python and math, familiarity with the ML developer stack, and engagement with practical courses and real-world projects.

Abstract

The author, a PhD candidate and head of an ML department, presents a comprehensive guide for aspiring ML practitioners to learn machine learning from scratch. The guide is divided into six key steps: mastering Python basics, understanding essential math concepts, getting acquainted with the ML developer stack, completing foundational ML courses, engaging in real-world projects, and participating in competitions like Kaggle. The article stresses the importance of a solid foundation in Python and math, recommends specific libraries and tools such as Jupyter notebooks, pandas, numpy, and matplotlib, and suggests pivotal courses like Andrew Ng's Machine Learning Specialization and Andrej Karpathy's Neural Network series on YouTube. The author also encourages learners to replicate research papers to enhance their ML portfolio and suggests using platforms like PapersWithCode for reference.

Opinions

  • The author believes that a strong grasp of Python is crucial for anyone starting in ML, as it is the primary language used in the field.
  • Mathematics, particularly at the high school and early college level, is deemed necessary for understanding the underlying principles of ML.
  • Jupyter notebooks and libraries such as pandas, numpy, and matplotlib are considered essential tools for ML development.
  • Andrew Ng's Machine Learning Specialization on Coursera is highly recommended for its comprehensive coverage of ML concepts and frameworks.
  • Andrej Karpathy's YouTube series on Neural Networks is praised for its in-depth explanation of NLP models and the mathematics behind backpropagation.
  • Engaging in real-world projects and competitions is seen as a critical step in solidifying one's ML skills and building a competitive portfolio.
  • Reimplementing research papers is suggested as a way to challenge oneself and stand out in the ML community.
  • The author expresses a personal preference for PyTorch over other ML frameworks like TensorFlow.

How I’d Learn ML in 2024 (If I Could Start Over)

I am a PhD candidate and the head of the ML department at an automotive company, but it took me over 4 years to get to this point.

So, today, I will share how I would learn ML if I could start over by revealing the 6 key steps you need to take.

Let’s get going!

Python

Typically, there’s no strict sequence for tackling these steps, but I’d advise against jumping straight to the final and most crucial phase first.

Instead, I highly suggest beginning with the fundamentals of Python.

Python is the go-to programming language for virtually everyone in the machine learning field, and it serves as the foundation for every subsequent step mentioned here.

This is particularly relevant for novices unfamiliar with concepts like lists or dictionaries, or those who haven’t yet mastered basic programming constructs like if-else statements or for loops. It’s essential, in my opinion, to also grasp concepts such as list comprehensions and class inheritance.
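To make that checklist concrete, here is a minimal sketch that touches each of the constructs mentioned above. The names (`scores`, `Animal`, `Dog`) are made up purely for illustration; any Python tutorial will cover the same ground with its own examples.

```python
# A list and a dictionary, two of the most common built-in containers.
scores = [72, 88, 95, 61]
student = {"name": "Ada", "scores": scores}

# A for loop combined with an if-else statement.
labels = []
for score in scores:
    if score >= 70:
        labels.append("pass")
    else:
        labels.append("fail")

# The same logic written as a list comprehension.
labels_short = ["pass" if score >= 70 else "fail" for score in scores]


# Class inheritance: Dog reuses Animal's __init__ and overrides speak().
class Animal:
    def __init__(self, name):
        self.name = name

    def speak(self):
        return f"{self.name} makes a sound"


class Dog(Animal):
    def speak(self):
        return f"{self.name} barks"


print(student["name"], labels)
print(labels_short == labels)
print(Dog("Rex").speak())
```

If you can read this snippet and predict what each `print` shows before running it, you are probably ready for the next steps.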

Frankly, if you’re unsure where to start, simply search for a “Python tutorial” or course on YouTube or Google and dive in. The wealth of exceptional, free resources available is astounding, but remember, it’s crucial to practice coding alongside the tutorials.

Maths

Jump into ML with Python to start things off on a fun note, but don’t get too bogged down in the details yet. Sure, the math part will come into play eventually, but that’s no reason to sweat it now.

You might think, “Hey, I don’t need math when I’ve got all these fancy Python libraries doing the heavy lifting!” And yeah, that’s partly true. But, to really get what’s going on in most ML stuff, you’ll need a handle on some basic math concepts like calculus, linear algebra, and probability.

Don’t worry, though — it’s not like you need to be a math whiz. We’re talking about the kind of math you’d run into in high school or the early days of college. Understanding what a derivative is or getting the gist of matrices and dot products is pretty much what you need.
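For a feel of how small these ideas are in practice, here is a rough sketch in plain Python (no libraries). The helper names `numerical_derivative` and `dot` are my own, chosen for illustration; the derivative is only approximated with a central difference, which is one common way to estimate a slope numerically.

```python
def numerical_derivative(f, x, h=1e-5):
    """Approximate the slope of f at x using a central difference."""
    return (f(x + h) - f(x - h)) / (2 * h)


def dot(u, v):
    """Dot product: multiply matching entries and add them up."""
    return sum(a * b for a, b in zip(u, v))


def square(x):
    return x ** 2


# The exact derivative of x^2 is 2x, so the slope at x = 3 should be close to 6.
slope = numerical_derivative(square, 3.0)

# 1*4 + 2*5 + 3*6 = 32
product = dot([1, 2, 3], [4, 5, 6])

print(slope, product)
```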

And guess what? There are tons of awesome, free resources out there. I’m talking about courses [1, 2, 3] or cool sites like Khan Academy. And hey, Brilliant.org is a goldmine for learning the basics (totally not sponsored, just saying 🥲).

Or, you know, you could always just hit up college and knock out those basic math classes if you’re going that route.

I’ll spill the beans on my absolute favorite math resource for neural networks later. But for now, just remember, you don’t have to rush through every math course out there. Take it easy, or it’ll just bum you out and suck the fun out of everything.

Stuck on some math problem later on? Just Google it.

Nail down the basics, and then you’re all set to move on to the next cool part: getting to know the ML developer stack.

ML Developer Stack

Alright, you’ve got the Python basics down, so what’s next? If you haven’t dabbled in them yet, it’s time to play around with some cool tools like Jupyter notebooks and libraries like pandas, numpy, and matplotlib.

Numpy is your go-to for crunching numbers with matrices or arrays. It’s a solid place to start because you can apply the math skills you’ve just picked up and see how easy it is to do something like multiplying two matrices together.
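As a quick illustration of that matrix multiplication, here is a minimal sketch with two small 2x2 matrices (the values are arbitrary). It also shows the difference between the matrix product (`@`) and the elementwise product (`*`), which is an easy thing to mix up at first.

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[5, 6],
              [7, 8]])

# Matrix product: each entry is a row of A dotted with a column of B.
C = A @ B

# Elementwise product: multiplies matching entries, NOT a matrix product.
elementwise = A * B

print(C)
print(elementwise)
```

Working one entry out by hand (for example, the top-left entry is 1*5 + 2*7 = 19) and comparing it with the printed result is a nice way to connect the math to the code.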

Then there’s matplotlib, which is all about making your data look pretty. It lets you visualize data and graphs, making the math you’re working with a lot more tangible. And honestly, who doesn’t find making graphs a bit fun and super handy?
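Here is a small sketch of what a first plot might look like: a simple parabola, chosen only as an example. I am assuming you may run this as a plain script, so it selects the non-interactive `Agg` backend and saves the figure to an in-memory buffer; inside a Jupyter notebook you would normally skip those parts and just let the figure display inline.

```python
import io

import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

# Points on the curve y = x^2 for x between -3 and 3.
xs = [x / 10 for x in range(-30, 31)]
ys = [x ** 2 for x in xs]

fig, ax = plt.subplots()
ax.plot(xs, ys, label="y = x^2")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_title("A parabola")
ax.legend()

# Render the figure as PNG bytes (swap the buffer for a filename to save to disk).
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
plt.close(fig)
```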

Now, let’s talk about pandas. It’s a fantastic library for handling data in tables, which is something you’ll encounter a lot in ML. Pandas makes it a breeze to tweak and visualize this kind of data.
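To give a taste of what "tweaking" tabular data means, here is a minimal sketch on a tiny made-up table (the cities and sales figures are invented for illustration). It filters rows, computes a per-group average, and adds a derived column.

```python
import pandas as pd

# A small, invented table of yearly sales per city.
df = pd.DataFrame({
    "city": ["Madrid", "Madrid", "Berlin", "Berlin"],
    "year": [2022, 2023, 2022, 2023],
    "sales": [100, 120, 80, 95],
})

# Filter rows with a boolean mask.
recent = df[df["year"] == 2023]

# Average sales per city.
mean_sales = df.groupby("city")["sales"].mean()

# Add a new column derived from an existing one.
df["sales_k"] = df["sales"] / 1000

print(recent)
print(mean_sales)
```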

These libraries are the bread and butter of everyday ML work, especially inside Jupyter notebooks, and they’re crucial for your ML journey.

Diving into these tools will also sharpen your Python and ML skills on the fly. But remember, stick to the basics for now with a couple of tutorials. You’ll get a deeper understanding of these libraries when you start tackling actual projects.

Alright, enough with the tools! Time to circle back to some theory and dive into the real deal: ML and Deep Learning!

Learn Machine Learning

So far, so good, right? If you’ve been keeping pace, you’ve probably spent a few weeks on this, depending on your starting point and the time you’re investing.

Now, let’s shift gears to the ML courses that require a bit more commitment.

First up, you’ve got to check out the Machine Learning Specialization by Andrew Ng. It’s like the holy grail of ML courses and a real game-changer for many.

In this series, you’ll dive into some nifty machine learning frameworks like scikit-learn and TensorFlow. Now, I’m a bit biased towards PyTorch myself; it’s just my personal favorite. But the cool thing is, once you get the hang of one framework, picking up another isn’t too daunting. Trust me, this course is pure gold, and guess what? It’s free!
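If you have never seen a framework like this in action, the following is a rough sketch of a typical scikit-learn workflow: split the data, fit a model, score it. It is not taken from the course; the iris dataset and logistic regression are simply my picks for a small, self-contained example.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Load a small dataset that ships with scikit-learn.
X, y = load_iris(return_X_y=True)

# Hold out a quarter of the data to check how well the model generalizes.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Fit a simple classifier and measure accuracy on the held-out data.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)

print(f"Test accuracy: {accuracy:.2f}")
```

Most scikit-learn models follow this same `fit` / `predict` / `score` pattern, which is part of why moving between models (and later, between frameworks) feels manageable.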

This isn’t just any beginner’s course. It’s packed with essential ML concepts — the kind you’ll need to breeze through ML job interviews.

Oh, and about that math resource for Neural Networks I mentioned earlier? After you’ve tackled Andrew’s course, give Andrej Karpathy’s Neural Network series on YouTube a whirl. He breaks down building an NLP model from scratch and covers everything up to transformers, not skimping on the math behind backpropagation. Seriously, it’s a must-watch!
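To hint at what "the math behind backpropagation" looks like, here is a tiny sketch of my own (not code from the series): a single neuron with a tanh activation, with gradients worked out by hand using the chain rule and then checked against a numerical estimate. The starting values for `w`, `b`, `x`, and `target` are arbitrary.

```python
import math


def forward(w, b, x, target):
    """One neuron: z = w*x + b, y = tanh(z), loss = (y - target)^2."""
    z = w * x + b
    y = math.tanh(z)
    loss = (y - target) ** 2
    return z, y, loss


def backward(w, b, x, target):
    """Gradients of the loss with respect to w and b via the chain rule."""
    _, y, _ = forward(w, b, x, target)
    dloss_dy = 2 * (y - target)  # derivative of the squared error
    dy_dz = 1 - y ** 2           # derivative of tanh(z)
    grad_w = dloss_dy * dy_dz * x    # dz/dw = x
    grad_b = dloss_dy * dy_dz * 1.0  # dz/db = 1
    return grad_w, grad_b


w, b, x, target = 0.5, -0.1, 2.0, 1.0

# Compare the hand-derived gradient with a finite-difference estimate.
grad_w0, grad_b0 = backward(w, b, x, target)
h = 1e-6
numeric_grad_w = (
    forward(w + h, b, x, target)[2] - forward(w - h, b, x, target)[2]
) / (2 * h)

# A few steps of gradient descent should reduce the loss.
initial_loss = forward(w, b, x, target)[2]
learning_rate = 0.1
for _ in range(50):
    grad_w, grad_b = backward(w, b, x, target)
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b
final_loss = forward(w, b, x, target)[2]

print(grad_w0, numeric_grad_w)
print(initial_loss, final_loss)
```

Real networks repeat exactly this idea across many neurons and layers, which is what frameworks like PyTorch automate for you.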

Once you’ve got some practical know-how from Andrew’s and Andrej’s courses, level up with the Deep Learning Specialization. This one’s more hands-on with neural nets, and you’ll even get to play with Hugging Face 🤗. It’s a standout library in the field, so you’ll want to get familiar with it.

And if you’re itching for more on Hugging Face, especially in the realm of NLP, their own NLP course is the way to go. It’s perfect for digging deeper into advanced NLP topics.

So, those are my top picks — two, maybe three courses that are totally worth your time.

How To Really Get Good

You’ve made some solid progress, tackling a bunch of smaller projects or tutorials. But now, it’s time to roll up your sleeves and dive into some real-world projects. This is where the learning curve shoots up, and trust me, it’s where the fun begins.

First off, hit up Kaggle. It’s like a playground for data scientists, with challenges of all shapes and sizes. Start with the easier ones to keep your spirits up. Jumping straight into the more intense challenges, especially those with prize money, might be tempting, but keep your expectations in check. Winning isn’t easy and usually requires hefty computational resources.

Once you’ve cut your teeth on Kaggle, brace yourself for the ultimate challenge: reimplementing a research paper and replicating its results. Yep, it’s tough, but the learning payoff is huge. Plus, pulling this off can really make your ML portfolio shine. Use PapersWithCode as a reference to see state-of-the-art models and their implementations.

But hey, there are other, less daunting ways to stand out in the ML world. Curious? Stay tuned for my next post, where I’ll spill the beans on some simpler strategies to boost your visibility in the field.

