K-Medoid Clustering (PAM)Algorithm in Python
A step-by-step tutorial—with a solved example

1. Introduction
1.1 K-Means clustering and challenges
Clustering of large-scale data is key to implementing segmentation-based algorithms. Segmentation can include identifying customer groups to facilitate targeted marketing, identifying prescriber groups to allow health care players to reach out to them with the right messaging, and identifying patterns or abnormal values in the data. K-Means is the most popular clustering algorithm adopted across different problem areas, mostly owing to its computational efficiency and ease of understanding the algorithm. K-Means relies on identifying cluster centers from the data. It alternates between assigning points to these cluster centers using the Euclidean distance metric and recomputes the cluster centers till a convergence criterion is achieved. K-Means clustering, however, suffers from a series of drawbacks:
- K-Means clustering does not use any other distance metric other than Euclidean distance. Different distance metrics like cosine distance is often utilized in a recommendation system that uses sparse data.
- In K-Means clustering, the final centers are not necessarily data points from the data and act as a bottleneck to interpretability for different application areas like images in computer vision.
2. Partition Around Medoids (PAM)
PAM stands for “Partition Around Medoids.” PAM converts each step of PAM from a deterministic computational to a statistical estimation problem and reduces the complexity of a sample size n to O(n log n). Medoids are data points chosen as cluster centers. K-Means clustering aims at minimizing the intra-cluster distance (often referred to as the total squared error). In contrast, K-Medoid minimizes dissimilarities between points in a cluster and points considered as centers of that cluster.
Any point in a dataset can be considered as a medoid if and only if its dissimilarity to all other data points in the cluster is minimum.
The fundamental concept of PAM includes:
- Find a set of k Medoids (k refers to the number of clusters, and M is a collection of medoids) from the data points of size n (n being the number of records).
- Using any distance metric (say d(.), could be euclidean, manhattan, etc.), try and locate Medoids that minimize the overall distance of data points to their closest Medoid.
- Finally, swap Medoid and non-Medoid pairs that reduce the loss function L among all possible k(n-k) pairs. The loss function is defined as:

Update centroids: In the case of K-Means, we were computing the mean of all points present in the cluster. But for the PAM algorithm, the updation of the centroid is different. If there are m-point in a cluster, swap the previous centroid with all other (m-1) points and finalize the point as a new centroid with a minimum loss. Minimum loss is computed by the cost function shown in Figure 1. A working example of the same is outlined below (GeeksforGeeks, 2019).
2.1 Algorithm — Dataset
Let us consider a set of data points and features.

2.2 Initialization Step
- Let us randomly select two medoids, so select 𝐾= 2 and let M1 =(4, 5), and M2 =(9, 10) be the two medoids. Note clustering algorithm chooses data points from the data as medoids
- Calculate the dissimilarity of all points to Medoids M1 and M2. Note dissimilarity is calculated as the summation of the absolute difference between medoids and data points. E.g., D=|Mx — Feature X| + |My — Feature Y|

2.3 Assignment Step
Assign data points to Medoids with lower Dissimilarity (Cost). Data Points 0 and 2 go to M2, and Data points 3, 4, and 5 go to M1. Total Cost = Cost/Dissimilarity of points assigned to M1 + Cost/Dissimilarity of points assigned to M2.
Total Cost=(4+3+4)+(4+7)=22; Cost for M1 = Dissimilarity of point 3 with M1 (4) + Dissimilarity of point 4 with M1 (3) +Dissimilarity of point 5 with M1 (4). Repeat the same calculation for M2.
2.4 Centroid Recompilation Step
Randomly select one non-medoid point and recalculate the cost. Let’s select Data Point 5 M1 as (2,3) as the Medoid now and recompute the cost.

Now only points 4 and 6 go to M1, and the rest of the points go to M2. Total Cost=(7+4)+(4+7+6)=28 and Swap Cost = New Cost — Previous Cost=28–22=6.
We will undo the swap since the Swap Cost is greater than 0. This process is continued till a convergence criterion is met.
3. Implementation using Python
We will simulate the above example using K-Medoid algorithm from scikit-learn-extra. Few of the codes and concepts and inspired from package documentation (scikit-learn-extra, 2019)
# — — — — — — -Importing Packages — — — — — — — — — — — -import matplotlib.pyplot as plt
import numpy as npfrom sklearn_extra.cluster import KMedoids# — — — — — — -Assigning Initial Centers — — — — — — — — — — — -
centers = [[4, 5], [9, 10]]# — — — — — — -Assigning Data: Dummy Data used in example above — — — — — — — — — — — — — — — — — —
df=np.array([[7,8], [9,10], [11,5], [4,9], [7,5], [2,3], [4,5]])# — — — — — — -Fit KMedoids clustering — — — — — — — — — — — -
KMobj = KMedoids(n_clusters=2).fit(df)# — — — — — — -Assigning Cluster Labels — — — — — — — — — — — -
labels = KMobj.labels_Visualizing the Cluster.
# — — — — — — -Extracting Unique Labels — — — — — — — — — — — -unq_lab = set(labels)# — — — — — — -Setting Up Color Codes — — — — — — — — — — — -
colors_plot = [
plt.cm.Spectral(each) for each in np.linspace(0, 1, len(unq_lab))
]for k, col in zip(unq_lab, colors_plot):class_member_mask = labels == k
# — — — — — — -Setting datapoint Feature X and Feature Y — — — — — — — — — — — -xy = df[class_member_mask]
# — — — — — — -Plotting Feature X and Feature Y for each cluster labels — — — — — — — — — — — -
plt.plot(
xy[:, 0],
xy[:, 1],
“o”,
markerfacecolor=tuple(col),
markeredgecolor=”white”,
markersize=10,
);# — — — — — — -Annotate Centroids — — — — — — — — — — — -plt.plot(
KMobj.cluster_centers_[:, 0],
KMobj.cluster_centers_[:, 1],
“o”,
markerfacecolor=”orange”,
markeredgecolor=”k”,
markersize=10,
);# — — — — — — -Add title to the plot — — — — — — — — — — — -plt.title(“KMedoids clustering on Dummy Data- Medoids are represented in Orange.”, fontsize=14);
Reference
- scikit-learn-extra. (2019). KMedoids Demo — scikit-learn-extra 0.2.0 documentation. Scikit-Learn-Extra.readthedocs.io. https://scikit-learn-extra.readthedocs.io/en/stable/auto_examples/plot_kmedoids.html#sphx-glr-auto-examples-plot-kmedoids-py
- GeeksforGeeks. (2019, May 17). ML | K-Medoids clustering with solved example. GeeksforGeeks. https://www.geeksforgeeks.org/ml-k-medoids-clustering-with-example/
About the Author: Advanced analytics professional and management consultant helping companies find solutions for various problems through a mix of business, technology, and math on organizational data. A Data Science enthusiast, here to share, learn and contribute; You can connect with me on Linked and Twitter;





