Hands-on Tutorial
Introduction to Fuzzy c-means for Clustering Algorithm
Basic introduction and implementation of Fuzzy c-means clustering algorithm using Python
Many clustering algorithms exist for numerical data. k-means is one of the most basic and is commonly used by researchers and analysts. But have you ever heard of Fuzzy c-means clustering? If you haven’t, this article is for you.
In this short article, you will explore Fuzzy c-means, starting from the basic structure of a fuzzy partition, moving on to the formulas and manual calculation behind Fuzzy c-means, and ending with an implementation in Python using dummy data.
Okay, without further ado, let’s jump in!
Hard partition vs. fuzzy partition
Before discussing the basic theory of Fuzzy c-means, let’s first look at how data points are allocated to clusters in theory. There are two approaches: hard partition and fuzzy partition.
Hard partition: each data point is strictly allocated to exactly one cluster and is not a member of any other cluster, assuming that the number of clusters is known. k-means is one of the algorithms that use a hard partition.
For instance, suppose we have X = {x1, x2, …, x10}, to be assigned to two clusters, say cluster 1 and cluster 2. However, x6 and x7 unfortunately lie in a grey area between the two clusters.

Let’s say U is the partition matrix for X. Thus, the elements of matrix U will be as follows. The columns represent the data points while the rows are the clusters.

Remember that in a hard partition the entries of U are binary, either 0 or 1, so every data point must be assigned to exactly one cluster. In this case, x6 is in cluster 1 while x7 is in cluster 2.
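As a rough illustration, the hard partition matrix described above can be built with NumPy. This is only a sketch: the text fixes x6 in cluster 1 and x7 in cluster 2, while the assignments of the other points (and the names `labels` and `U_hard`) are assumptions made for the example.

```python
import numpy as np

# Hypothetical assignment of x1..x10 (0 = cluster 1, 1 = cluster 2).
# Only x6 -> cluster 1 and x7 -> cluster 2 come from the text.
labels = np.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1])
n_clusters = 2

# Rows are clusters, columns are data points.
U_hard = np.zeros((n_clusters, labels.size), dtype=int)
U_hard[labels, np.arange(labels.size)] = 1

print(U_hard)
# Every column sums to 1: each point belongs to exactly one cluster.
print(U_hard.sum(axis=0))
```

Each column contains a single 1, which is exactly what makes the partition "hard".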

Fuzzy partition: every data point is given a membership degree (a probability-like measure of closeness) in the range [0, 1] for each existing cluster, assuming that the number of clusters is known. For each data point, these degrees sum to 1 across the clusters. One of the algorithms that use a fuzzy partition is Fuzzy c-means, which we will discuss in depth.
For instance, using the previous case, we have X = {x1, x2, …, x10}.

Look at x6 and x7, which are in the grey area between the two clusters. In a fuzzy partition, each of them has a membership degree of 0.5 in both cluster 1 and cluster 2, which is fair to both of them.
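The same idea can be sketched as a fuzzy partition matrix in NumPy. Only the 0.5 memberships of x6 and x7 come from the text; every other value below is a made-up illustration, and `U_fuzzy` is a hypothetical name.

```python
import numpy as np

# Hypothetical membership degrees for x1..x10.
# Only the 0.5 values for x6 and x7 are taken from the text.
U_fuzzy = np.array([
    [0.9, 0.9, 0.8, 0.8, 0.7, 0.5, 0.5, 0.2, 0.1, 0.1],  # cluster 1
    [0.1, 0.1, 0.2, 0.2, 0.3, 0.5, 0.5, 0.8, 0.9, 0.9],  # cluster 2
])

# For every data point, the memberships across clusters sum to 1.
column_sums = U_fuzzy.sum(axis=0)
print(column_sums)

# "Hardening": pick the cluster with the largest membership per point.
# np.argmax breaks ties (such as x6 and x7) by taking the first cluster.
hard_labels = U_fuzzy.argmax(axis=0)
print(hard_labels)
```

Note that turning the fuzzy matrix back into hard labels forces an arbitrary choice for x6 and x7, which is the information a fuzzy partition keeps.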

The basic theory of Fuzzy c-means
Fuzzy c-means (FCM) was first developed by J. C. Dunn in 1973 and improved by Jim Bezdek in 1981. The method extends k-means with the fuzzy principle. Unlike in k-means, a data point clustered with FCM becomes a member of every existing cluster to some degree. The dominant cluster for each data point is determined by its membership degree, which lies in the range 0 to 1.

In k-means, data points are reallocated to clusters based on the distance between each data point and the centroid of each existing cluster: every point is strictly reallocated to the cluster whose centroid is closest. FCM, in contrast, allocates the data points to the clusters using fuzzy theory, which generalizes the hard partition used in k-means. FCM uses a membership function, μ(x), which expresses how strongly a data point belongs to a certain cluster.
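To make the membership function more concrete, here is a minimal sketch of the standard FCM membership update, in which the membership of point j in cluster i is 1 / Σ_k (d_ij / d_kj)^(2/(m−1)), with d the Euclidean distance to each centroid. The function name `update_memberships`, the fuzzifier default `m=2.0`, and the small `eps` guard against zero distances are assumptions for this example, not something taken from the article.

```python
import numpy as np

def update_memberships(X, centroids, m=2.0, eps=1e-12):
    """Return the (n_clusters, n_points) membership matrix.

    X has shape (n_points, n_features); centroids has shape
    (n_clusters, n_features); m > 1 is the fuzzifier.
    """
    # Euclidean distance from every centroid to every point.
    dist = np.linalg.norm(X[None, :, :] - centroids[:, None, :], axis=2)
    dist = np.maximum(dist, eps)  # avoid division by zero
    # Equivalent to u_ij = 1 / sum_k (d_ij / d_kj) ** (2 / (m - 1)).
    inv = dist ** (-2.0 / (m - 1.0))
    return inv / inv.sum(axis=0, keepdims=True)

# Three 1-D points and two fixed centroids at 0 and 2.
X = np.array([[0.0], [1.0], [2.0]])
centroids = np.array([[0.0], [2.0]])
U = update_memberships(X, centroids)
print(U.round(3))
```

The middle point is equally far from both centroids, so it receives a membership of 0.5 in each cluster, while the two outer points belong almost entirely to the centroid they sit on.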

Furthermore, in theory, both k-means and Fuzzy c-means can fail to converge. In FCM, this problem is rare because each data point has a membership function that makes it a member of every cluster.
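In practice, convergence is usually checked by alternating the centroid and membership updates until the memberships stop changing. The sketch below is one common way to do that, not a definitive implementation: the function name `fuzzy_c_means`, the tolerance `tol`, the iteration cap `max_iter`, the random initialization, and the dummy data are all assumptions for this example.

```python
import numpy as np

def fuzzy_c_means(X, n_clusters, m=2.0, tol=1e-5, max_iter=100, seed=0):
    """Alternate centroid and membership updates until U stabilizes."""
    rng = np.random.default_rng(seed)
    # Random initial memberships, normalized so each column sums to 1.
    U = rng.random((n_clusters, X.shape[0]))
    U /= U.sum(axis=0, keepdims=True)
    for iteration in range(1, max_iter + 1):
        # Centroids: means of the data weighted by U ** m.
        W = U ** m
        centroids = (W @ X) / W.sum(axis=1, keepdims=True)
        # Memberships from the distances to the new centroids.
        dist = np.linalg.norm(X[None, :, :] - centroids[:, None, :], axis=2)
        dist = np.maximum(dist, 1e-12)  # avoid division by zero
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=0, keepdims=True)
        # Stop when the largest membership change is below tol.
        change = np.abs(U_new - U).max()
        U = U_new
        if change < tol:
            break
    return centroids, U, iteration

# Dummy data: two well-separated 2-D blobs.
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(0.0, 0.5, (20, 2)),
               rng.normal(5.0, 0.5, (20, 2))])
centroids, U, n_iter = fuzzy_c_means(X, n_clusters=2)
print(centroids.round(2), n_iter)
```

The `max_iter` cap is the safety net: if the memberships never settle within `tol`, the loop still ends and returns the last estimate.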
To understand the mathematical calculation of FCM, look at the following document.









