15 Greatest AI/ML Research Papers Of All Time

You must read…

TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems

By Martín Abadi, Ashish Agarwal, Paul Barham, Eugene Brevdo, Zhifeng Chen, et al. (Google Research)

Abstract —

TensorFlow is an interface for expressing machine learning algorithms, and an implementation for executing such algorithms. A computation expressed using TensorFlow can be executed with little or no change on a wide variety of heterogeneous systems, ranging from mobile devices such as phones and tablets up to large-scale distributed systems of hundreds of machines and thousands of computational devices such as GPU cards. The system is flexible and can be used to express a wide variety of algorithms, including training and inference algorithms for deep neural network models, and it has been used for conducting research and for deploying machine learning systems into production across more than a dozen areas of computer science and other fields, including speech recognition, computer vision, robotics, information retrieval, natural language processing, geographic information extraction, and computational drug discovery. This paper describes the TensorFlow interface and an implementation of that interface that we have built at Google.

Paper can be found here :

https://arxiv.org/pdf/1603.04467v2.pdf

Code can be found here :

tensorflow/models

The TensorFlow Model Garden is a repository with a number of different implementations of state-of-the-art (SOTA)…

github.com

tensorflow/tensorflow

TensorFlow is an end-to-end open source platform for machine learning. It has a comprehensive, flexible…

github.com

Scikit-learn: Machine Learning in Python

By Fabian Pedregosa, Gaël Varoquaux, Alexandre Gramfort, Vincent Michel, Bertrand Thirion, Olivier Grisel, Mathieu Blondel, Andreas Müller, Joel Nothman, Gilles Louppe, Peter Prettenhofer, Ron Weiss, Vincent Dubourg, Jake Vanderplas, Alexandre Passos, David Cournapeau, Matthieu Brucher, Matthieu Perrot, Édouard Duchesnay

Abstract —

Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems. This package focuses on bringing machine learning to non-specialists using a general-purpose high-level language. Emphasis is put on ease of use, performance, documentation, and API consistency. It has minimal dependencies and is distributed under the simplified BSD license, encouraging its use in both academic and commercial settings.

Paper can be found here :

https://arxiv.org/pdf/1201.0490v4.pdf

Code can be found here :

scikit-learn/scikit-learn

scikit-learn is a Python module for machine learning built on top of SciPy and is distributed under the 3-Clause BSD…

github.com

PyTorch: An Imperative Style, High-Performance Deep Learning Library

By Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, Alban Desmaison, Andreas Köpf, Edward Yang, Zachary DeVito, Martin Raison, Alykhan Tejani, Sasank Chilamkurthy, Benoit Steiner, Lu Fang, Junjie Bai, Soumith Chintala

Abstract —

Deep learning frameworks have often focused on either usability or speed, but not both. PyTorch is a machine learning library that shows that these two goals are in fact compatible: it was designed from first principles to support an imperative and Pythonic programming style that supports code as a model, makes debugging easy and is consistent with other popular scientific computing libraries, while remaining efficient and supporting hardware accelerators such as GPUs. In this paper, we detail the principles that drove the implementation of PyTorch and how they are reflected in its architecture. We emphasize that every aspect of PyTorch is a regular Python program under the full control of its user. We also explain how the careful and pragmatic implementation of the key components of its runtime enables them to work together to achieve compelling performance. We demonstrate the efficiency of individual subsystems, as well as the overall speed of PyTorch on several commonly used benchmarks.

Paper can be found here :

http://papers.nips.cc/paper/9015-pytorch-an-imperative-style-high-performance-deep-learning-library.pdf

Code can be found here :

pytorch/pytorch

PyTorch is a Python package that provides two high-level features: Tensor computation (like NumPy) with strong GPU…

github.com

Automatic Differentiation in PyTorch

By Adam Paszke, Sam Gross, Soumith Chintala, Gregory Chanan, Edward Yang, Zachary DeVito, Zeming Lin, Alban Desmaison, Luca Antiga, Adam Lerer

Abstract —

In this article, we describe an automatic differentiation module of PyTorch — a library designed to enable rapid research on machine learning models. It builds upon a few projects, most notably Lua Torch, Chainer, and HIPS Autograd, and provides a high-performance environment with easy access to automatic differentiation of models executed on different devices (CPU and GPU). To make prototyping easier, PyTorch does not follow the symbolic approach used in many other deep learning frameworks, but focuses on differentiation of purely imperative programs, with a focus on extensibility and low overhead.
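
To make "differentiation of purely imperative programs" concrete, here is a minimal sketch of reverse-mode automatic differentiation in plain Python. It is not PyTorch's implementation: the `Value` class and its methods are hypothetical names made up for illustration, and it only handles scalar addition and multiplication. The point is that the graph is recorded while ordinary Python code runs, loops included.

```python
class Value:
    """A scalar that records the operations applied to it (hypothetical toy class)."""

    def __init__(self, data, parents=(), local_grads=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents          # Values this one was computed from
        self._local_grads = local_grads  # d(self)/d(parent), one per parent

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.data + other.data, (self, other), (1.0, 1.0))

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.data * other.data, (self, other), (other.data, self.data))

    def backward(self):
        # Topologically order the recorded graph, then apply the chain rule
        # from the output back towards the inputs.
        order, seen = [], set()

        def visit(node):
            if id(node) not in seen:
                seen.add(id(node))
                for parent in node._parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            for parent, local in zip(node._parents, node._local_grads):
                parent.grad += local * node.grad


def f(x, y):
    # Ordinary Python control flow; the graph is whatever actually ran.
    out = x * y
    for _ in range(2):
        out = out + x
    return out


x, y = Value(3.0), Value(4.0)
z = f(x, y)          # z = x*y + 2*x
z.backward()
print(z.data, x.grad, y.grad)  # 18.0 6.0 3.0
```

Because nothing is compiled ahead of time, changing the loop count or adding an `if` changes the recorded graph on the next call, which is the property the abstract contrasts with the symbolic approach.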

Paper can be found here :

https://openreview.net/pdf?id=BJJsrmfCZ

Code can be found here :

pytorch/pytorch

PyTorch is a Python package that provides two high-level features: Tensor computation (like NumPy) with strong GPU…

github.com

Adapting the Tesseract Open Source OCR Engine for Multilingual OCR

By Ray Smith, Daria Antonova, Dar-Shyang Lee

Abstract —

We describe efforts to adapt the Tesseract open source OCR engine for multiple scripts and languages. Effort has been concentrated on enabling generic multi-lingual operation such that negligible customization is required for a new language beyond providing a corpus of text. Although change was required to various modules, including physical layout analysis, and linguistic post-processing, no change was required to the character classifier beyond changing a few limits. The Tesseract classifier has adapted easily to Simplified Chinese. Test results on English, a mixture of European languages, and Russian, taken from a random sample of books, show a reasonably consistent word error rate between 3.72% and 5.78%, and Simplified Chinese has a character error rate of only 3.77%.

Paper can be found here :

https://ai.google/research/pubs/pub35248.pdf

Code can be found here :

tesseract-ocr/tesseract

This package contains an OCR engine — libtesseract and a command line program — tesseract. Tesseract 4 adds a new…

github.com

Caffe: Convolutional Architecture for Fast Feature Embedding

By Yangqing Jia, Evan Shelhamer, Jeff Donahue, Sergey Karayev, Jonathan Long, Ross Girshick, Sergio Guadarrama, Trevor Darrell

Abstract —

Caffe provides multimedia scientists and practitioners with a clean and modifiable framework for state-of-the-art deep learning algorithms and a collection of reference models. The framework is a BSD-licensed C++ library with Python and MATLAB bindings for training and deploying general-purpose convolutional neural networks and other deep models efficiently on commodity architectures. Caffe fits industry and internet-scale media needs by CUDA GPU computation, processing over 40 million images a day on a single K40 or Titan GPU (≈ 2.5 ms per image). By separating model representation from actual implementation, Caffe allows experimentation and seamless switching among platforms for ease of development and deployment from prototyping machines to cloud environments. Caffe is maintained and developed by the Berkeley Vision and Learning Center (BVLC) with the help of an active community of contributors on GitHub. It powers ongoing research projects, large-scale industrial applications, and startup prototypes in vision, speech, and multimedia.

Paper can be found here :

https://arxiv.org/pdf/1408.5093v1.pdf

Code can be found here :

BVLC/caffe

Caffe is a deep learning framework made with expression, speed, and modularity in mind. It is developed by Berkeley AI…

github.com

Well-Read Students Learn Better: On the Importance of Pre-training Compact Models

By Iulia Turc, Ming-Wei Chang, Kenton Lee, Kristina Toutanova

Abstract —

Recent developments in natural language representations have been accompanied by large and expensive models that leverage vast amounts of general-domain text through self-supervised pre-training. Due to the cost of applying such models to down-stream tasks, several model compression techniques on pre-trained language representations have been proposed (Sun et al., 2019; Sanh, 2019). However, surprisingly, the simple baseline of just pre-training and fine-tuning compact models has been overlooked. In this paper, we first show that pre-training remains important in the context of smaller architectures, and fine-tuning pre-trained compact models can be competitive to more elaborate methods proposed in concurrent work. Starting with pre-trained compact models, we then explore transferring task knowledge from large fine-tuned models through standard knowledge distillation. The resulting simple, yet effective and general algorithm, Pre-trained Distillation, brings further improvements. Through extensive experiments, we more generally explore the interaction between pre-training and distillation under two variables that have been under-studied: model size and properties of unlabeled task data. One surprising observation is that they have a compound effect even when sequentially applied to the same data. To accelerate future research, we will make our 24 pre-trained miniature BERT models publicly available.
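
The "standard knowledge distillation" step mentioned in the abstract can be sketched as a loss that pushes a small student model towards the output distribution of a large teacher. The snippet below is a rough illustration in plain Python, not the authors' code: the function names, the made-up logits, and the `temperature` parameter are all assumptions added for the example.

```python
import math


def softmax(logits, temperature=1.0):
    # Subtract the max for numerical stability before exponentiating.
    scaled = [value / temperature for value in logits]
    peak = max(scaled)
    exps = [math.exp(value - peak) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]


def distillation_loss(student_logits, teacher_logits, temperature=1.0):
    """Cross-entropy of the student's distribution against the teacher's soft labels."""
    teacher_probs = softmax(teacher_logits, temperature)
    student_probs = softmax(student_logits, temperature)
    return -sum(t * math.log(s) for t, s in zip(teacher_probs, student_probs))


teacher = [2.0, 0.5, -1.0]        # made-up logits from a large fine-tuned model
close_student = [1.8, 0.6, -0.9]  # student that roughly agrees with the teacher
far_student = [-1.0, 0.5, 2.0]    # student that disagrees

print(distillation_loss(close_student, teacher))
print(distillation_loss(far_student, teacher))
```

The first loss comes out smaller than the second, since the loss shrinks as the student's predicted distribution approaches the teacher's. Training on unlabeled task data then only needs the teacher's predictions, not gold labels.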

Paper can be found here :

https://arxiv.org/pdf/1908.08962v2.pdf

Code can be found here :

google-research/bert

This is a release of 24 smaller BERT models (English only, uncased, trained with WordPiece masking) referenced in…

github.com

FastText.zip: Compressing text classification models

By Armand Joulin, Edouard Grave, Piotr Bojanowski, Matthijs Douze, Hervé Jégou, Tomas Mikolov

Abstract —

We consider the problem of producing compact architectures for text classification, such that the full model fits in a limited amount of memory. After considering different solutions inspired by the hashing literature, we propose a method built upon product quantization to store word embeddings. While the original technique leads to a loss in accuracy, we adapt this method to circumvent quantization artefacts. Our experiments carried out on several benchmarks show that our approach typically requires two orders of magnitude less memory than fastText while being only slightly inferior with respect to accuracy. As a result, it outperforms the state of the art by a good margin in terms of the compromise between memory usage and accuracy.
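
Product quantization, the building block named in the abstract, can be illustrated with a few lines of plain Python. This is only a sketch under simplifying assumptions: the codebooks are hand-written toy values (real ones would be learned from the embedding matrix, for example with k-means), the function names are hypothetical, and none of the paper's adaptations for avoiding quantization artefacts are included.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def pq_encode(vector, codebooks):
    """Split `vector` into len(codebooks) sub-vectors and keep, for each one,
    only the index of the nearest centroid in that sub-space's codebook."""
    sub_dim = len(vector) // len(codebooks)
    codes = []
    for i, codebook in enumerate(codebooks):
        sub = vector[i * sub_dim:(i + 1) * sub_dim]
        nearest = min(range(len(codebook)),
                      key=lambda k: squared_distance(sub, codebook[k]))
        codes.append(nearest)
    return codes


def pq_decode(codes, codebooks):
    """Rebuild an approximate vector by concatenating the chosen centroids."""
    approx = []
    for code, codebook in zip(codes, codebooks):
        approx.extend(codebook[code])
    return approx


# Two sub-spaces of dimension 2 with four centroids each (toy values).
codebooks = [
    [[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0], [1.0, -1.0]],
    [[0.5, 0.5], [-0.5, -0.5], [2.0, 0.0], [0.0, 2.0]],
]

embedding = [0.9, 1.2, -0.4, -0.7]
codes = pq_encode(embedding, codebooks)
print(codes)                        # [1, 1]
print(pq_decode(codes, codebooks))  # [1.0, 1.0, -0.5, -0.5]
```

Here four floats are replaced by two small integers plus codebooks shared across the whole vocabulary, which is where the memory saving comes from; the reconstruction is approximate, which is the accuracy cost the abstract refers to.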

Paper can be found here :

https://arxiv.org/pdf/1612.03651v1.pdf

Code can be found here :

facebookresearch/fastText

fastText is a library for efficient learning of word representations and sentence classification. You can find answers…

github.com

Bag of Tricks for Efficient Text Classification

By Armand Joulin, Edouard Grave, Piotr Bojanowski, Tomas Mikolov

Abstract —

This paper explores a simple and efficient baseline for text classification. Our experiments show that our fast text classifier fastText is often on par with deep learning classifiers in terms of accuracy, and many orders of magnitude faster for training and evaluation. We can train fastText on more than one billion words in less than ten minutes using a standard multicore CPU and classify half a million sentences, among 312K classes, in less than a minute.

Paper can be found here :

https://arxiv.org/pdf/1607.01759v3.pdf

Code can be found here :

facebookresearch/fastText

fastText is a library for efficient learning of word representations and sentence classification. You can find answers…

github.com

brightmart/text_classification

The purpose of this repository is to explore text classification methods in NLP with deep learning. Language…

github.com

graykode/nlp-tutorial

nlp-tutorial is a tutorial for who is studying NLP(Natural Language Processing) using TensorFlow and Pytorch. Most of…

github.com

Enriching Word Vectors with Subword Information

By Piotr Bojanowski, Edouard Grave, Armand Joulin, Tomas Mikolov

Abstract —

Continuous word representations, trained on large unlabeled corpora are useful for many natural language processing tasks. Popular models that learn such representations ignore the morphology of words, by assigning a distinct vector to each word. This is a limitation, especially for languages with large vocabularies and many rare words. In this paper, we propose a new approach based on the skipgram model, where each word is represented as a bag of character n-grams. A vector representation is associated to each character n-gram; words being represented as the sum of these representations. Our method is fast, allowing to train models on large corpora quickly and allows us to compute word representations for words that did not appear in the training data. We evaluate our word representations on nine different languages, both on word similarity and analogy tasks. By comparing to recently proposed morphological word representations, we show that our vectors achieve state-of-the-art performance on these tasks.
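
The "bag of character n-grams" idea is easy to sketch in plain Python. The snippet below is an illustration, not the fastText code: the `<` and `>` word-boundary markers and the 3-to-6 n-gram range are not stated in the abstract and should be read as assumptions, the helper names are hypothetical, and the vectors are random stand-ins for values a real model would learn.

```python
import random


def char_ngrams(word, min_n=3, max_n=6):
    """Character n-grams of `word` wrapped in boundary markers '<' and '>'."""
    wrapped = f"<{word}>"
    return [wrapped[i:i + n]
            for n in range(min_n, max_n + 1)
            for i in range(len(wrapped) - n + 1)]


def build_table(vocabulary, dim=4, seed=0):
    """Assign one vector per n-gram seen in the training vocabulary."""
    rng = random.Random(seed)
    table = {}
    for word in vocabulary:
        for gram in char_ngrams(word):
            if gram not in table:
                table[gram] = [rng.uniform(-1, 1) for _ in range(dim)]
    return table


def word_vector(word, table, dim=4):
    """A word is the sum of the vectors of its known n-grams."""
    total = [0.0] * dim
    for gram in char_ngrams(word):
        vector = table.get(gram)
        if vector is not None:
            total = [t + v for t, v in zip(total, vector)]
    return total


print(char_ngrams("where", 3, 3))  # ['<wh', 'whe', 'her', 'ere', 're>']

table = build_table(["where", "there"])
vec = word_vector("here", table)   # "here" never appeared as a training word
print(vec)
```

Even though "here" was not in the toy vocabulary, it shares n-grams such as `her` and `ere` with "where" and "there", so it still gets a non-zero vector. That is the mechanism behind the abstract's claim about words that did not appear in the training data.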

Paper can be found here :

https://arxiv.org/pdf/1607.04606v2.pdf

Code can be found here :

facebookresearch/fastText

fastText is a library for efficient learning of word representations and sentence classification. You can find answers…

github.com

brightmart/text_classification

The purpose of this repository is to explore text classification methods in NLP with deep learning. Language…

github.com

plasticityai/magnitude

A feature-packed Python package and vector storage file format for utilizing vector embeddings in machine learning…

github.com

OpenAI Gym

By Greg Brockman, Vicki Cheung, Ludwig Pettersson, Jonas Schneider, John Schulman, Jie Tang, Wojciech Zaremba

Abstract —

OpenAI Gym is a toolkit for reinforcement learning research. It includes a growing collection of benchmark problems that expose a common interface, and a website where people can share their results and compare the performance of algorithms. This whitepaper discusses the components of OpenAI Gym and the design decisions that went into the software.
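
The "common interface" is what lets one agent loop run against many benchmark problems. The sketch below mimics the classic `reset()` / `step(action)` convention in plain Python without importing Gym itself; `CoinGuessEnv` and `run_episode` are hypothetical names invented for this example, and newer Gym releases return slightly different tuples, so treat the exact signature as an assumption.

```python
import random


class CoinGuessEnv:
    """Toy environment (hypothetical): guess the outcome of a biased coin."""

    def __init__(self, episode_length=10, seed=0):
        self.episode_length = episode_length
        self._rng = random.Random(seed)
        self._steps = 0

    def reset(self):
        self._steps = 0
        return 0  # dummy initial observation

    def step(self, action):
        coin = 1 if self._rng.random() < 0.7 else 0
        reward = 1.0 if action == coin else 0.0
        self._steps += 1
        done = self._steps >= self.episode_length
        # Classic convention: observation, reward, done flag, info dict.
        return coin, reward, done, {}


def run_episode(env, policy):
    """Generic agent loop: works with any object exposing reset() and step()."""
    observation = env.reset()
    total_reward, done = 0.0, False
    while not done:
        observation, reward, done, info = env.step(policy(observation))
        total_reward += reward
    return total_reward


print(run_episode(CoinGuessEnv(), policy=lambda obs: 1))
```

Because `run_episode` only relies on the two methods, swapping in a different environment requires no change to the agent code, which is what makes results comparable across algorithms.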

Paper can be found here :

https://arxiv.org/pdf/1606.01540v1.pdf

Code can be found here :

openai/gym

Status: Maintenance (expect bug fixes and minor updates) OpenAI Gym is a toolkit for developing and comparing…

github.com

XGBoost: A Scalable Tree Boosting System

By Tianqi Chen, Carlos Guestrin

Abstract —

Tree boosting is a highly effective and widely used machine learning method. In this paper, we describe a scalable end-to-end tree boosting system called XGBoost, which is used widely by data scientists to achieve state-of-the-art results on many machine learning challenges. We propose a novel sparsity-aware algorithm for sparse data and weighted quantile sketch for approximate tree learning. More importantly, we provide insights on cache access patterns, data compression and sharding to build a scalable tree boosting system. By combining these insights, XGBoost scales beyond billions of examples using far fewer resources than existing systems.
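
To give a feel for what "sparsity-aware" means, here is a simplified sketch in plain Python of choosing a default direction for missing values at a single candidate split. It is not XGBoost's code: the function names are hypothetical, it evaluates one threshold rather than scanning all of them, and it leaves out the per-leaf complexity penalty. The gain uses the usual second-order form built from gradient sums G and Hessian sums H with L2 regularization `lam`.

```python
def leaf_score(g_sum, h_sum, lam):
    # Structure score of a leaf: G^2 / (H + lambda).
    return g_sum * g_sum / (h_sum + lam)


def best_default_direction(values, grads, hess, threshold, lam=1.0):
    """Try sending missing values (None) left, then right, for one split,
    and return (direction, gain) for whichever gives the larger gain."""
    g_left = h_left = g_right = h_right = g_miss = h_miss = 0.0
    for value, g, h in zip(values, grads, hess):
        if value is None:
            g_miss += g
            h_miss += h
        elif value < threshold:
            g_left += g
            h_left += h
        else:
            g_right += g
            h_right += h

    parent = leaf_score(g_left + g_right + g_miss,
                        h_left + h_right + h_miss, lam)
    gain_left = 0.5 * (leaf_score(g_left + g_miss, h_left + h_miss, lam)
                       + leaf_score(g_right, h_right, lam) - parent)
    gain_right = 0.5 * (leaf_score(g_left, h_left, lam)
                        + leaf_score(g_right + g_miss, h_right + h_miss, lam) - parent)
    if gain_left >= gain_right:
        return "left", gain_left
    return "right", gain_right


# Toy data: the two rows with a missing feature have gradients that look
# like the rows on the left of the threshold.
values = [1.0, 2.0, None, 5.0, None, 6.0]
grads = [-1.0, -1.0, -1.0, 1.0, -1.0, 1.0]
hess = [1.0] * 6

direction, gain = best_default_direction(values, grads, hess, threshold=3.0)
print(direction, round(gain, 4))  # left 1.981
```

The missing rows are never compared against the threshold; they simply follow the learned default branch, so the work per split depends on the number of present entries rather than on the full row count.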

Paper can be found here :

https://arxiv.org/pdf/1603.02754v3.pdf

Code can be found here :

dmlc/xgboost

XGBoost is an optimized distributed gradient…

github.com

MXNet: A Flexible and Efficient Machine Learning Library for Heterogeneous Distributed Systems

By Tianqi Chen, Mu Li, Yutian Li, Min Lin, Naiyan Wang, Minjie Wang, Tianjun Xiao, Bing Xu, Chiyuan Zhang, Zheng Zhang

Abstract —

MXNet is a multi-language machine learning (ML) library to ease the development of ML algorithms, especially for deep neural networks. Embedded in the host language, it blends declarative symbolic expression with imperative tensor computation. It offers auto differentiation to derive gradients. MXNet is computation and memory efficient and runs on various heterogeneous systems, ranging from mobile devices to distributed GPU clusters. This paper describes both the API design and the system implementation of MXNet, and explains how embedding of both symbolic expression and tensor operation is handled in a unified fashion. Our preliminary experiments reveal promising results on large scale deep neural network applications using multiple GPU machines.

Paper can be found here :

https://arxiv.org/pdf/1512.01274v1.pdf

Code can be found here :

apache/incubator-mxnet

Apache MXNet (incubating) is a deep learning framework designed for both efficiency and flexibility. It allows you…

github.com

Image Super-Resolution Using Deep Convolutional Networks

By Chao Dong, Chen Change Loy, Kaiming He, Xiaoou Tang

Abstract —

We propose a deep learning method for single image super-resolution (SR). Our method directly learns an end-to-end mapping between the low/high-resolution images. The mapping is represented as a deep convolutional neural network (CNN) that takes the low-resolution image as the input and outputs the high-resolution one. We further show that traditional sparse-coding-based SR methods can also be viewed as a deep convolutional network. But unlike traditional methods that handle each component separately, our method jointly optimizes all layers. Our deep CNN has a lightweight structure, yet demonstrates state-of-the-art restoration quality, and achieves fast speed for practical on-line usage. We explore different network structures and parameter settings to achieve trade-offs between performance and speed. Moreover, we extend our network to cope with three color channels simultaneously and show better overall reconstruction quality.

Paper can be found here :

https://arxiv.org/pdf/1501.00092v3.pdf

Code can be found here :

nagadomi/waifu2x

Image Super-Resolution for Anime-style art using Deep Convolutional Neural Networks. And it supports photo. The demo…

github.com

YOLOv4: Optimal Speed and Accuracy of Object Detection

By Alexey Bochkovskiy, Chien-Yao Wang, Hong-Yuan Mark Liao

Abstract —

There are a huge number of features which are said to improve Convolutional Neural Network (CNN) accuracy. Practical testing of combinations of such features on large datasets, and theoretical justification of the result, is required. Some features operate on certain models exclusively and for certain problems exclusively, or only for small-scale datasets; while some features, such as batch-normalization and residual-connections, are applicable to the majority of models, tasks, and datasets. We assume that such universal features include Weighted-Residual-Connections (WRC), Cross-Stage-Partial-connections (CSP), Cross mini-Batch Normalization (CmBN), Self-adversarial-training (SAT) and Mish-activation. We use new features: WRC, CSP, CmBN, SAT, Mish activation, Mosaic data augmentation, CmBN, DropBlock regularization, and CIoU loss, and combine some of them to achieve state-of-the-art results: 43.5% AP (65.7% AP50) for the MS COCO dataset at a realtime speed of ~65 FPS on Tesla V100.
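
Of the features listed, the Mish activation is the simplest to show in code. It is commonly defined as x · tanh(softplus(x)); the plain-Python sketch below follows that definition and is only an illustration, not the Darknet implementation. The function names are chosen for this example.

```python
import math


def softplus(x):
    # log(1 + e^x), rearranged so that large |x| does not overflow.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish(x):
    """Mish activation: x * tanh(softplus(x))."""
    return x * math.tanh(softplus(x))


for x in (-5.0, -1.0, 0.0, 1.0, 5.0):
    print(x, round(mish(x), 4))
```

Unlike ReLU, the function is smooth and lets small negative values through instead of clamping them to zero, while behaving almost like the identity for large positive inputs.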

Paper can be found here :

https://arxiv.org/pdf/2004.10934v1.pdf

Code can be found here :

pjreddie/darknet

Darknet is an open source neural network framework written in C and CUDA. It is fast, easy to install, and supports CPU…

github.com

Tianxiaomo/pytorch-YOLOv4

A minimal PyTorch implementation of YOLOv4.

github.com

AlexeyAB/darknet

Paper Yolo v4: https://arxiv.org/abs/2004.10934 More details: medium link About Darknet framework…

github.com

References and credits —