Detect Objects in Images using C# and ML.NET Machine Learning
There’s an old saying in AI that computers are great at things that humans find hard (like doing complex math) but really struggle with things that humans find easy (like catching a ball or recognizing objects).
Let’s take recognizing objects as an example. Check out the following collection of images:

These 20 images depict a broccoli, a canoe, a coffee pot, a pizza, a teddy bear, and a toaster. How hard would it be to build an app that can recognize the object in every image?
Really hard, actually.
In fact, it’s so difficult that there’s an annual challenge called the ImageNet Large Scale Visual Recognition Challenge. The challenge requires apps to classify a collection of 1.2 million images into 1,000 unique categories.
Here are the competition results up to 2016:

The red line depicts the 5% human error rate on the image classification challenge. Only in 2015 did a team finally develop an app that could beat human performance levels.
That was 4 years ago. Can I build a C# app today with ML.NET and .NET Core that can do the same?
ML.NET is Microsoft’s new machine learning library. It can run linear regression, logistic regression, clustering, deep learning, and many other machine learning algorithms.
And .NET Core is Microsoft’s cross-platform version of .NET that runs on Windows, macOS, and Linux. It’s the future of cross-platform .NET development.
My first thought was to build a convolutional neural network in ML.NET, train it on the 1.2 million images in the ImageNet set, and then use the trained network to predict the 20 images in my test set.
But there’s no need to go through all that trouble. Fully-trained object-detection networks are readily available, and ML.NET can easily host and run a neural network that has already been trained.
So my best course of action is to grab a TensorFlow neural network that has been trained on the ImageNet data, and just drop it into ML.NET for immediate use.
I’ll use the Google Inception network in my app. What makes the Inception model unique is its use of stacked ‘Inception Modules’: special neural submodules that run convolutions with different kernel sizes in parallel, like this:

This is a single inception module shown in Netron, a popular neural network viewer. The three convolution kernels (1x1, 3x3, and 5x5) are highlighted in red and run in parallel.
This trick of running several different convolutions in parallel gives Inception excellent predictive ability on a wide range of images.
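To make the idea concrete, here is a minimal C# sketch of a heavily simplified inception-style block. It is an illustration only, not the actual Inception architecture: the names `InceptionSketch`, `Convolve`, `AveragingKernel`, and `Module` are hypothetical, the image has a single channel, and fixed averaging kernels stand in for the weights a real network learns during training. Each branch reads the same input, and the three feature maps are stacked as separate output channels.

```csharp
public static class InceptionSketch
{
    // Slide a square, odd-sized kernel over a single-channel image.
    // Zero padding keeps the output the same size as the input.
    public static float[,] Convolve(float[,] image, float[,] kernel)
    {
        int height = image.GetLength(0), width = image.GetLength(1);
        int radius = kernel.GetLength(0) / 2;
        var output = new float[height, width];

        for (int y = 0; y < height; y++)
        for (int x = 0; x < width; x++)
        {
            float sum = 0f;
            for (int ky = -radius; ky <= radius; ky++)
            for (int kx = -radius; kx <= radius; kx++)
            {
                int sy = y + ky, sx = x + kx;
                if (sy < 0 || sy >= height || sx < 0 || sx >= width)
                    continue; // pixels outside the image count as zero
                sum += image[sy, sx] * kernel[ky + radius, kx + radius];
            }
            output[y, x] = sum;
        }
        return output;
    }

    // Hypothetical stand-in for learned weights: a kernel that averages its window.
    public static float[,] AveragingKernel(int size)
    {
        var kernel = new float[size, size];
        for (int y = 0; y < size; y++)
        for (int x = 0; x < size; x++)
            kernel[y, x] = 1f / (size * size);
        return kernel;
    }

    // Feed the same input to the 1x1, 3x3 and 5x5 branches and stack
    // the resulting feature maps, one output channel per branch.
    public static float[][,] Module(float[,] image)
    {
        int[] kernelSizes = { 1, 3, 5 };
        var channels = new float[kernelSizes.Length][,];
        for (int i = 0; i < kernelSizes.Length; i++)
            channels[i] = Convolve(image, AveragingKernel(kernelSizes[i]));
        return channels;
    }
}
```

The loop evaluates the branches one after another. "Parallel" here describes the layout of the network: all three branches consume the same input instead of feeding into each other, so small and large image features are picked up side by side.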
You can download the Inception model from here.
I’ll also use a folder with test images and corresponding labels. I’ll use this small 20-image set from a Microsoft ML.NET code sample.
The set includes a TSV file which looks like this:

It’s a tab-separated file with only 2 columns of data:
- The filename of the image to test
- The type of object in the image
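Reading a file in that shape takes only a few lines of plain C#. The sketch below assumes one record per line with no header row; `TagsFile`, `Parse`, and `Load` are hypothetical names chosen for illustration.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class TagsFile
{
    // Turn tab-separated lines into (file name, label) pairs,
    // skipping blank lines and lines with fewer than two columns.
    public static List<(string FileName, string Label)> Parse(IEnumerable<string> lines)
    {
        return lines
            .Where(line => !string.IsNullOrWhiteSpace(line))
            .Select(line => line.Split('\t'))
            .Where(parts => parts.Length >= 2)
            .Select(parts => (FileName: parts[0].Trim(), Label: parts[1].Trim()))
            .ToList();
    }

    // Read the whole TSV file from disk and parse it.
    public static List<(string FileName, string Label)> Load(string path)
        => Parse(File.ReadAllLines(path));
}
```

Calling `TagsFile.Load` with the path to the TSV file returns one tuple per image, which is handy for a quick sanity check of the test set before any machine learning code gets involved.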
Let’s get started. Here’s how to set up a new console project in .NET Core:
$ dotnet new console -o ImageDetector
$ cd ImageDetector
Next, I need to install the ML.NET packages:
$ dotnet add package Microsoft.ML
$ dotnet add package Microsoft.ML.ImageAnalytics
$ dotnet add package Microsoft.ML.TensorFlow
The ImageAnalytics package contains libraries that help ML.NET deal with image data. And the TensorFlow package adds support for running pretrained TensorFlow models.
Now I’m ready to add some classes. I’ll need one to hold an image record, and one to hold my model’s predictions.
I will modify the Program.cs file like this:
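A minimal sketch of what these two classes could look like, assuming the usual ML.NET conventions: `LoadColumn` maps a field to a column index in the TSV file, and `ColumnName` maps a field to a named model output. The class names and the output name `softmax2` are assumptions based on Microsoft’s ML.NET Inception sample, so check the output node of the downloaded model (for example in Netron) before relying on it.

```csharp
using Microsoft.ML.Data;

// One row of the TSV file: an image file name and the expected label.
public class ImageNetData
{
    [LoadColumn(0)] public string ImagePath;
    [LoadColumn(1)] public string Label;
}

// One model prediction: a score for every category the network knows.
// "softmax2" is an assumed output node name, see the note above.
public class ImageNetPrediction
{
    [ColumnName("softmax2")]
    public float[] PredictedLabels;
}
```

The index of the highest score in `PredictedLabels` identifies the category the model considers most likely for that image.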
