Artificial Intelligence Essentials
The Brief History of Convolutional Neural Networks
Discover the history of one of the most popular deep learning models, used for almost every computer vision task
Convolutional Neural Networks (CNNs) are the most important artificial neural network architecture today for almost any computer vision or image processing task. In this post, we will briefly trace the origins of CNNs from the biological experiments of the 1950s to today’s complex pre-trained computer vision models.
From Simple and Complex to Grandmother Cells
In 1959, David Hubel and Torsten Wiesel discovered simple and complex cells. According to their study, we use two kinds of cells for visual pattern recognition. A simple cell can recognize edges and bars of a particular orientation at a particular location in the image.

A complex cell, on the other hand, also responds to edges and bars of a particular orientation. In contrast to simple cells, however, complex cells can respond to these edges and bars at any location in the scene.
For instance, while a simple cell can only respond to a vertical bar located in the upper section of a scene, a complex cell can respond to vertical bars located anywhere in the scene.
Complex cells can achieve this location-agnostic recognition capability by summing information from multiple simple cells. In the visual cortex, simple and complex cells work together as the building blocks of our visual system.
This hierarchy leads to the hypothetical "grandmother cell": a single high-level neuron that obtains its information from other complex neurons (each necessary to detect a specific item) and only activates when we see our grandma.
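The difference between simple and complex cells described above can be sketched in a few lines of plain Python. This is only a toy illustration, not a biological model: the 3x3 vertical-bar filter, the 5x5 binary images, and all function names (`simple_cell`, `complex_cell`, `make_bar_image`) are assumptions made up for this example.

```python
# Toy model of simple and complex cells (illustrative assumptions only).

# Hypothetical 3x3 filter that prefers a vertical bar in its center column.
VERTICAL_BAR = [[-1, 2, -1],
                [-1, 2, -1],
                [-1, 2, -1]]


def simple_cell(image, top, left, kernel=VERTICAL_BAR):
    """Respond to a vertical bar at ONE fixed location (top, left)."""
    total = 0
    for i, kernel_row in enumerate(kernel):
        for j, weight in enumerate(kernel_row):
            total += weight * image[top + i][left + j]
    return max(0, total)  # a cell cannot fire "negatively"


def complex_cell(image, kernel=VERTICAL_BAR):
    """Sum the responses of simple cells placed at every location."""
    rows = len(image) - len(kernel) + 1
    cols = len(image[0]) - len(kernel[0]) + 1
    return sum(simple_cell(image, r, c, kernel)
               for r in range(rows) for c in range(cols))


def make_bar_image(size, top, col, length=3):
    """Create a size x size image of zeros with one vertical bar of ones."""
    image = [[0] * size for _ in range(size)]
    for r in range(top, top + length):
        image[r][col] = 1
    return image


upper_left = make_bar_image(5, top=0, col=1)   # bar in the upper-left area
lower_right = make_bar_image(5, top=2, col=3)  # same bar, moved elsewhere

# The simple cell watches only the upper-left patch, so it misses the moved bar.
print(simple_cell(upper_left, 0, 0), simple_cell(lower_right, 0, 0))
# The complex cell responds to the bar wherever it appears.
print(complex_cell(upper_left), complex_cell(lower_right))
```

Summing over positions is used here because the text describes complex cells as summing information from simple cells; taking the maximum instead would be an equally reasonable choice for this sketch.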
The Neocognitron by Kunihiko Fukushima
Inspired by Hubel and Wiesel’s work, in the 1980s, Dr. Kunihiko Fukushima designed an artificial neural network that mimics the functioning of simple and complex cells. S-cells operate as artificial simple cells, while C-cells operate as artificial complex cells. They are artificial because they are not biological neurons; instead, they mimic the algorithmic structure of simple and complex cells. The main idea of Fukushima’s Neocognitron was simple: capture complex patterns (e.g., a dog) using complex cells that gather their information from other lower-level complex cells or from simple cells that detect simpler patterns (e.g., a tail).

Check out the Neocognitron paper: Neocognitron: A Self-organizing Neural Network Model for a Mechanism of Pattern Recognition Unaffected by Shift in Position
The LeNet by Yann LeCun
Although Fukushima’s work was very influential in the newly developing field of artificial intelligence, the first modern application of convolutional neural networks was implemented in the 90s by Yann LeCun et al. in their paper Gradient-Based Learning Applied to Document Recognition, which is probably the most popular AI paper from the 90s by far (cited by 34378 papers).

In the paper, LeCun trained a convolutional neural network with the MNIST dataset of handwritten digits.
The MNIST database contains 60,000 training images and 10,000 testing images taken from American Census Bureau employees and American high school students [Wikipedia]. The MNIST dataset contains the greyscale pixel values of handwritten digits (from 0 to 9), along with labels that indicate which digit each image actually shows.
The idea was a follow-up to Fukushima’s Neocognitron: aggregating simpler features into more complicated features using complex artificial cells. The LeNet was trained on MNIST by following these steps:
- Provide the model with an example image;
- Ask the model to predict the label;
- Update the model settings by comparing the outcome of the prediction with the real label value;
- Repeat this process until reaching the optimal model settings where the loss is minimized.
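The four steps above can be sketched as a small training loop. To keep the example self-contained, it does not use LeNet or MNIST: it assumes a tiny logistic-regression model, four made-up 4-pixel "images" with binary labels, and a fixed number of passes instead of a real stopping criterion. The names `predict`, `train`, `mean_loss`, and `TOY_EXAMPLES` are hypothetical and exist only for this illustration.

```python
import math
import random

# Made-up data: flattened 2x2 "images". Label 1 = left column bright,
# label 0 = right column bright. This stands in for MNIST only as a toy.
TOY_EXAMPLES = [
    ([1.0, 0.0, 1.0, 0.0], 1),
    ([0.9, 0.1, 0.8, 0.0], 1),
    ([0.0, 1.0, 0.0, 1.0], 0),
    ([0.1, 0.9, 0.0, 0.8], 0),
]


def predict(weights, bias, pixels):
    """Return the predicted probability that the image has label 1."""
    score = bias + sum(w * p for w, p in zip(weights, pixels))
    return 1.0 / (1.0 + math.exp(-score))


def mean_loss(weights, bias, examples):
    """Average cross-entropy loss over all examples."""
    eps = 1e-12  # avoids log(0)
    total = 0.0
    for pixels, label in examples:
        prob = predict(weights, bias, pixels)
        total -= (label * math.log(prob + eps)
                  + (1 - label) * math.log(1 - prob + eps))
    return total / len(examples)


def train(examples, epochs=200, learning_rate=0.5, seed=0):
    """Run the predict / compare / update cycle for a fixed number of passes."""
    rng = random.Random(seed)
    examples = list(examples)  # copy so the caller's list is not shuffled
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for _ in range(epochs):                      # step 4: repeat
        rng.shuffle(examples)
        for pixels, label in examples:           # step 1: provide an example
            prob = predict(weights, bias, pixels)  # step 2: predict the label
            error = prob - label                 # step 3: compare with the truth
            weights = [w - learning_rate * error * p
                       for w, p in zip(weights, pixels)]
            bias -= learning_rate * error
    return weights, bias


loss_before = mean_loss([0.0] * 4, 0.0, TOY_EXAMPLES)
weights, bias = train(TOY_EXAMPLES)
loss_after = mean_loss(weights, bias, TOY_EXAMPLES)
print(f"loss before: {loss_before:.3f}, loss after: {loss_after:.3f}")
```

A real CNN replaces the single weighted sum with stacked convolution and pooling layers, but the cycle of predicting, comparing, and updating stays the same.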
LeCun’s implementation set the standards for today’s computer vision and image processing applications.
From the 1990s Onwards
The 90s, 00s, and 10s were the years in which the streamlined process of building convolutional neural networks was applied to more and more complex models trained on larger and larger datasets.
In 2005, the PASCAL VOC challenge, in which participants compete to achieve the lowest loss and highest accuracy with their models, started with approximately 20,000 images and 20 object classes. However, with the advancements in the field, these numbers were dwarfed by other private studies. Starting in 2010, Fei-Fei Li collaborated with the PASCAL VOC team to make a very large image dataset available under the name ImageNet. Every year, researchers were invited to the ImageNet Large Scale Visual Recognition Challenge (ILSVRC). Currently, the ImageNet dataset contains 14,197,122 images, and the ILSVRC uses a subset of it covering 1,000 object classes.
In 2012, a deep convolutional neural network architecture called AlexNet achieved a top-5 error rate of about 16% (more than 10 percentage points lower than the runner-up) by utilizing GPUs. The use of GPUs for computer vision tasks became standard after AlexNet’s achievement, which was incredible for its time.
Check out the AlexNet paper: ImageNet Classification with Deep Convolutional Neural Networks
In 2017, 29 of 38 competing teams at the ILSVRC achieved less than 5% error. Since complex 2D classification problems were close to being solved, the organizers of the ILSVRC announced that the format would move to 3D object classification in the near future.
Final Notes
From the discovery of simple and complex cells in our brain to 3D object detection challenges, convolutional neural networks have come a long way, and they will only get more advanced from here. It is exciting to think about how many new developments we will see in the near future. If you are interested in being part of this process and learning deep learning, check out the Guide to my Content.
More specifically, if you are looking to build your own convolutional neural networks using TensorFlow and classify handwritten digits using the MNIST dataset, check out this article:
If you are reading this article, I am sure that we share similar interests and are or will be in similar industries. So let’s connect via LinkedIn! Please do not hesitate to send a contact request: Orhan G. Yalçın on LinkedIn




