The website content discusses the challenges and implications of advanced machine learning techniques, specifically focusing on a paper that presents a method for matrix multiplication that could significantly improve machine learning efficiency.
Abstract
The article delves into the complexities of a groundbreaking paper on matrix multiplication that promises to enhance machine learning processes by reducing computation time. The paper introduces a method that can perform matrix multiplications up to 100 times faster than traditional methods and 10 times faster than current approximate methods, which is particularly advantageous when one matrix is known in advance. The author emphasizes that while the paper is technically complex, traversing multiple technical fields, this complexity is indicative of the multidisciplinary nature of meaningful machine learning. The article also suggests that the difficulty of understanding such advanced papers is a barrier that, once overcome, can lead to significant opportunities in the field, allowing practitioners to contribute uniquely and find joy in their machine learning journeys.
Opinions
The author acknowledges the paper's technical complexity, calling it the most technically complex paper they have ever read.
Understanding the paper requires a solid foundation in coding, AI/ML, mathematics, and related fields due to its use of advanced concepts like Product Quantization, prototype learning, hashing, and aggregation.
The paper's significance is highlighted by the fact that matrix multiplication is a core operation in machine learning, and improvements in this area can have a broad impact.
The author believes that the complexity of machine learning is a positive aspect as it opens up new avenues for innovation and specialization.
There is an emphasis on the importance of learning the basics to understand the assumptions and experiment setups in complex machine learning papers.
The author encourages engagement with their content on various platforms and offers resources for those interested in deepening their knowledge in machine learning, including tutorials and coding interview preparation.
The author invites readers to reach out for potential tutoring and suggests using their Robinhood referral link for mutual benefit.
Paper shows why you will struggle at Machine Learning
This is easily the most technically complex paper I’ve ever read.
I’ve been going through Multiplying Matrices Without Multiplying (link: https://arxiv.org/abs/2106.10860). It’s a paper I have spent a lot of time on. How could I not? The abstract claims, “Experiments using hundreds of matrices from diverse domains show that it often runs 100× faster than exact matrix products and 10× faster than current approximate methods. In the common case that one matrix is known ahead of time, our method also has the interesting property that it requires zero multiply-adds.” If you understand machine learning, you can see that this has huge implications for the learning process.
This sentiment is not uncommon. If you feel this way, you are not alone
At the same time, I came across the above tweet on my timeline. And I can definitely see where this is coming from. Meaningful ML is by its nature multi-disciplinary. While the code for an LSTM or a Random Forest stays the same, the context around the problem changes. Depending on what you’re working on, the way you get, prepare, clean, and evaluate your data changes. Thus you will end up needing to become proficient at multiple things. This process involves a lot of Googling and can be very frustrating and disheartening.
The paper is a rather extreme example of that. I’m a double major in Math and Computer Science, and I selected my courses to get good at coding and AI/ML in particular. So I’m well suited to understanding the details. But even after a month, a lot of this paper is very challenging.
Me trying to understand the paper.
In this article, I will use the paper as an example of why good Machine Learning is difficult. I will explain why that’s a good thing for you, and what you can do to benefit from this. If nothing else, I hope that by the end of this article you understand what it takes to get to a high level at ML.
Understanding the Implications of this paper
A quick word on why this paper is great. In machine learning, data points are represented as multi-dimensional matrices. Multiplying matrices is at the core of a lot of functions. It is also notoriously difficult to do quickly. For those interested, this article by Quanta is a pretty good explainer.
Don’t underestimate pre-processing.
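To make the cost concrete, here is a minimal sketch that counts the multiply-adds in the schoolbook product of an n × d matrix with a d × m matrix. The function names and the tiny matrix sizes are my own choices for illustration and do not come from the paper.

```python
import numpy as np

def count_multiply_adds(n: int, d: int, m: int) -> int:
    """Multiply-adds in the schoolbook product of an (n x d) and a (d x m) matrix."""
    return n * d * m

def naive_matmul(a: np.ndarray, b: np.ndarray):
    """Schoolbook matrix product that also counts every multiply-add it performs."""
    n, d = a.shape
    d2, m = b.shape
    assert d == d2, "inner dimensions must match"
    out = np.zeros((n, m))
    ops = 0
    for i in range(n):
        for j in range(m):
            for k in range(d):
                out[i, j] += a[i, k] * b[k, j]  # one multiply-add
                ops += 1
    return out, ops

rng = np.random.default_rng(0)
a = rng.normal(size=(4, 6))
b = rng.normal(size=(6, 3))
product, ops = naive_matmul(a, b)

print(ops)                          # 72, i.e. 4 * 6 * 3
print(np.allclose(product, a @ b))  # True
```

The count grows with the product of all three dimensions, which is why even a constant-factor speedup matters at the sizes used in machine learning.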
This is where the paper gets insane. “In the common case that one matrix is known ahead of time, our method also has the interesting property that it requires zero multiply-adds.” When might we see such cases? Imagine our model already has its weights and just needs to compute predictions from the input. The weights form a matrix we already know, and it gets multiplied with the input matrix. Given how often this happens, your savings will really add up.
This is one example of a great application of matrix multiplication.
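To see how a known weight matrix can be exploited, here is a rough sketch of classic product quantization, which is not the paper's exact method. Everything expensive that involves the weights happens once, offline, and is stored in lookup tables. At prediction time each input row is mapped to prototype indices, and the output is a sum of table rows. All names (`fit_offline`, `approx_matmul`, the synthetic clustered data) are hypothetical, and the k-means prototypes and nearest-prototype encoding are stand-ins I chose for simplicity. Note that this nearest-prototype encoding still multiplies when it computes distances; the article notes that the paper relies on hashing for that step.

```python
import numpy as np

def nearest(x, centers):
    """Index of the closest prototype for every row of x."""
    dists = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)

def kmeans(x, k, iters, rng):
    """Run a few Lloyd iterations and return k prototypes for the rows of x."""
    centers = x[rng.choice(len(x), size=k, replace=False)].copy()
    for _ in range(iters):
        labels = nearest(x, centers)
        for c in range(k):
            members = x[labels == c]
            if len(members):  # keep the old prototype if its cluster is empty
                centers[c] = members.mean(axis=0)
    return centers

def fit_offline(train_inputs, weights, n_subspaces=4, k=16, seed=0):
    """Offline stage: learn prototypes per column block and precompute their products."""
    rng = np.random.default_rng(seed)
    d = train_inputs.shape[1]
    bounds = np.linspace(0, d, n_subspaces + 1).astype(int)
    slices = [slice(bounds[i], bounds[i + 1]) for i in range(n_subspaces)]
    prototypes = [kmeans(train_inputs[:, s], k, 10, rng) for s in slices]
    # tables[i][c] = (prototype c of block i) times the matching rows of the weights
    tables = [p @ weights[s, :] for p, s in zip(prototypes, slices)]
    return slices, prototypes, tables

def approx_matmul(inputs, slices, prototypes, tables):
    """Online stage: encode each block, look up its table row, and add the rows up."""
    out = np.zeros((inputs.shape[0], tables[0].shape[1]))
    for s, p, t in zip(slices, prototypes, tables):
        codes = nearest(inputs[:, s], p)
        out += t[codes]
    return out

# Synthetic, clustered inputs (an assumption made so that a few prototypes fit well).
rng = np.random.default_rng(0)
d, m = 32, 5
cluster_centers = rng.normal(scale=3.0, size=(8, d))

def sample_inputs(n):
    return cluster_centers[rng.integers(8, size=n)] + 0.1 * rng.normal(size=(n, d))

weights = rng.normal(size=(d, m))  # the matrix known ahead of time
slices, prototypes, tables = fit_offline(sample_inputs(2000), weights)

new_inputs = sample_inputs(200)
approx = approx_matmul(new_inputs, slices, prototypes, tables)
exact = new_inputs @ weights
rel_error = np.linalg.norm(approx - exact) / np.linalg.norm(exact)
print(f"relative error: {rel_error:.3f}")
```

The design point to notice is the split: `fit_offline` touches the weights, while `approx_matmul` never does. How well this works depends heavily on how well a handful of prototypes describes the inputs.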
Why this paper is a nightmare to understand
So now that we have some idea of why this concept is important, let’s talk about why this paper is challenging. Simply put, it traverses a lot of technical fields. Here’s a depiction of the Product Quantization they use:
Not only does it use vectors, but it also relies on prototype learning, hashing, and aggregation. Following it requires very good coding and mathematical skills. Even the hashing is far from basic. The authors rely on hashing trees, which can be terrifying. Check out section 4.1 for more details. The complexity and wide-ranging nature of the paper was best articulated by the authors themselves: “our work draws on a number of different fields but does not fit cleanly into any of them”. Developing your understanding of the basics will help you at least understand the assumptions and experiment setups.
For a detailed look at some of the assumptions in the paper, check out this video. I go over the assumptions and a concrete example of the matrix multiplication approximation. Make sure to pause the video and read the snippets I’ve taken from the paper. I found them particularly insightful.
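To give a feel for what a hashing tree does, here is a heavily simplified sketch. It is not the procedure from section 4.1 of the paper. It builds a balanced binary tree of depth 4, so every vector lands in one of 16 buckets, and each level only compares a single coordinate against a threshold. The split rule (pick the coordinate with the largest within-node variance, split at the median) is my own assumption for illustration, and the names `fit_hash_tree` and `hash_encode` are hypothetical.

```python
import numpy as np

def fit_hash_tree(x, depth=4):
    """Learn one split coordinate per level and one threshold per tree node."""
    buckets = np.zeros(len(x), dtype=int)
    split_dims, thresholds = [], []
    for level in range(depth):
        n_nodes = 2 ** level
        # Assumed heuristic: score each coordinate by its summed within-node variance.
        score = np.zeros(x.shape[1])
        for node in range(n_nodes):
            members = x[buckets == node]
            if len(members) > 1:
                score += members.var(axis=0) * len(members)
        dim = int(score.argmax())
        # Assumed heuristic: each node splits its members at their median.
        level_thresholds = np.zeros(n_nodes)
        for node in range(n_nodes):
            members = x[buckets == node]
            if len(members):
                level_thresholds[node] = np.median(members[:, dim])
        split_dims.append(dim)
        thresholds.append(level_thresholds)
        go_right = x[:, dim] > level_thresholds[buckets]
        buckets = (buckets << 1) + go_right  # descend to the left or right child
    return split_dims, thresholds

def hash_encode(x, split_dims, thresholds):
    """Map each row to a bucket index using only comparisons, shifts, and adds."""
    buckets = np.zeros(len(x), dtype=int)
    for dim, level_thresholds in zip(split_dims, thresholds):
        go_right = x[:, dim] > level_thresholds[buckets]
        buckets = (buckets << 1) + go_right
    return buckets

rng = np.random.default_rng(0)
train = rng.normal(size=(1000, 8))
split_dims, thresholds = fit_hash_tree(train)

codes = hash_encode(rng.normal(size=(5, 8)), split_dims, thresholds)
print(codes)  # five bucket indices between 0 and 15
```

The appeal is that encoding a vector costs four comparisons instead of a distance computation against every prototype, which is the kind of saving that lets an approach like this avoid multiplications during encoding.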
Why this complexity is a Good Thing for you
Obviously not every Machine Learning/AI venture is as complex as this paper. However, real-life ML will be complex. Following is an exchange I had with someone who read and enjoyed my article, 5 Unsexy Truths About Working in Machine Learning.
The complexity of Machine Learning opens a lot of doors. It means that there are always new ways to try things, new knowledge to discover, and new protocols/ensembles to invent. It will allow you to specialize in the fields you’re most interested in. If you’re willing to put in the work and struggle, you will soon be able to develop your own value-adds. And that’s when it gets fun. How to become a Machine Learning Expert is an article to help you speed up the process. As long as you’re willing to find areas you’re interested in and dive into them, you will be able to get great results in your Machine Learning Journeys.
If you liked this article, check out my other content. I post regularly on Medium, YouTube, Twitter, and Substack (all linked below). I focus on Artificial Intelligence, Machine Learning, Technology, and Software Development. If you’re preparing for coding interviews check out: Coding Interviews Made Simple.
For one-time support of my work, following are my Venmo and PayPal. Any amount is appreciated and helps a lot:
If this article got you interested in reaching out to me, then this section is for you. You can reach out to me on any of the platforms, or check out any of my other content. If you’d like to discuss tutoring, text me on LinkedIn, IG, or Twitter. If you’d like to support my work, use my free Robinhood referral link. We both get a free stock, and there is no risk to you. So not using it is just losing free money.