Bisecting K-Means Algorithm — Clustering in Machine Learning
Understanding Bisecting K-Means Clustering Algorithm (Visuals and Code)
The Bisecting K-Means clustering technique is a small modification of the regular K-Means algorithm, in which we change the procedure of dividing the data into clusters. We start with all of the data in a single cluster and apply regular K-Means with K=2 (that’s why the word bisecting); as in K-Means, the two centroids for a split can be initialized randomly or from some prior. We keep repeating this bisection step until the desired number of clusters is reached. After the first bisection (when we have 2 clusters), one can think of multiple strategies for selecting one of the clusters and re-iterating the entire process of bisection and assignment within that cluster. For example: choose the cluster with the maximum variance (the most spread-out cluster), or choose the cluster with the maximum number of data points, etc.
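To make the loop concrete, here is a minimal sketch (not this article's original code) that uses scikit-learn's KMeans with K=2 as the splitter and picks the cluster with the largest within-cluster SSE, i.e. the most spread-out one, to bisect next. The function name and the selection rule are illustrative choices:

import numpy as np
from sklearn.cluster import KMeans

def bisecting_kmeans(X, n_clusters, random_state=0):
    # Start with every point in a single cluster (clusters hold index arrays)
    clusters = [np.arange(len(X))]
    while len(clusters) < n_clusters:
        # Selection strategy: bisect the cluster with the largest SSE
        sse = [((X[idx] - X[idx].mean(axis=0)) ** 2).sum() for idx in clusters]
        target = clusters.pop(int(np.argmax(sse)))
        # Bisection step: regular K-Means with K=2 on just that cluster
        labels = KMeans(n_clusters=2, n_init=10,
                        random_state=random_state).fit_predict(X[target])
        clusters += [target[labels == 0], target[labels == 1]]
    return clusters  # list of index arrays, one per final cluster

Note that every call to KMeans after the first one only sees the points of the cluster being split; that observation is exactly the efficiency argument made further below.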
Don’t have the time to read the entire blog? Then watch this quick, under-60-second YouTube Shorts video covering the same:
You can visualize this entire flow as shown below—
Steps to Bisecting K-Means | Image by Author
As you can see in the figure above, we start by placing all of the data inside a single cluster (1st fig.). After the first bisection step we get 2 clusters, and we then check whether we have reached the desired number of clusters. If not, we select one (the red-coloured one) of the two clusters from the previous step, apply K-Means with K=2 within it again, and repeat the “check” and “bisection” steps.
As you might have already guessed by now, this looks like a mixture of hierarchical and K-Means clustering: you are building a tree, which is a hierarchical structure, where a node gets divided into two child nodes based on the K-Means strategy and assignment.
Improvement over K-Means
In regular K-Means, we compute the distance between every data point and every centroid at each iteration until the convergence criteria are met. Here, instead, we do that over the full dataset only once (the first step); after that, we only use the data points within the selected cluster for distance calculation and further subdivision, which makes it more efficient than regular K-Means. It can also recognize clusters of varied shapes and sizes, unlike K-Means, which assumes spherical clusters. I also found one interesting research article that compares the performance of K-Means and Bisecting K-Means when analysing web-log data: Read here.
Limitations
Since the base clustering technique here is still K-Means, this algorithm is also likely to suffer from a flawed estimate of the cluster centre in the presence of outliers.
Tip: the “Bisecting K-Medoid” clustering technique can help you resolve the above limitation.
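As a rough illustration of why medoids help (a sketch, using the third-party scikit-learn-extra package rather than core scikit-learn, with made-up data), a 2-medoid split keeps both centres anchored to actual data points even when an outlier is present:

import numpy as np
from sklearn_extra.cluster import KMedoids  # separate scikit-learn-extra package

rng = np.random.default_rng(0)
points = np.vstack([
    rng.normal(0, 1, size=(50, 2)),    # blob A around (0, 0)
    rng.normal(10, 1, size=(50, 2)),   # blob B around (10, 10)
    [[30.0, 30.0]],                    # one extreme outlier, nearest to blob B
])

# A 2-medoid split: each centre must be a real data point, so the outlier
# joins blob B's cluster without moving the centre, whereas a 2-means
# centroid for that cluster would get pulled towards (30, 30)
split = KMedoids(n_clusters=2, random_state=0).fit(points)
print(split.cluster_centers_)  # both medoids sit inside the blobs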
Code
I will be using Python’s Sklearn library for implementation purposes —
from sklearn.cluster import BisectingKMeans  # built into scikit-learn from version 1.1
import numpy as np
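# A minimal end-to-end example continuing from the imports above
# (assumes scikit-learn >= 1.1, which ships the BisectingKMeans
# estimator; the data below is made up for illustration):
X = np.array([[1, 2], [1, 4], [1, 0],
              [10, 2], [10, 4], [10, 0],
              [10, 6], [10, 8], [10, 10]])

bisect_means = BisectingKMeans(
    n_clusters=3,                          # keep bisecting until 3 clusters exist
    bisecting_strategy="biggest_inertia",  # split the most spread-out cluster next
    random_state=0,
).fit(X)

print(bisect_means.labels_)           # cluster index assigned to each point
print(bisect_means.cluster_centers_)  # the 3 final centroids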
The figure below shows a pictorial walkthrough of the above example code —
Walkthrough of the above example | Image by Author
In short, when considering K-Means as your algorithm of choice, it’s also a good idea to try out Bisecting K-Means. For the reasons discussed above, there’s a good chance you will get better results with it.
Hope you enjoyed reading this blog. Thank you for your time!