Summer He

Summary

This article provides a comprehensive tutorial on creating a customer journey visualization using Python's Plotly library and Sankey diagrams.

Abstract

The article begins by introducing Sankey diagrams, a type of flow diagram that emphasizes the most important contributions within a system. The author explains the benefits of using Sankey diagrams in customer journey analysis, such as identifying areas with the most significant opportunities and areas that need improvement. The article then demonstrates how to use Python's Plotly library to create a Sankey diagram that visualizes the customer journey for a hypothetical e-commerce business. The author provides a step-by-step guide, including data transformation, defining labels and colors, and plotting the diagram. The article concludes with a list of references and a note about the author's other visualization projects.

Visualizing the Customer Journey with Python’s Sankey Diagram: A Plotly Example

Learn How to Create a Stunning Customer Journey Visualization with this Comprehensive Tutorial

Image by Author via Python

Sankey diagrams are an excellent tool for understanding customer behavior. By visualizing the connections between customers, products, and transactions, we can gain insights into the patterns and relationships that drive that behavior. In this article, I will show you how to use Python to build a Sankey diagram from multi-dimensional customer behavior data, so that you can visualize the customer journey for your products.

This article includes the following:

  • An introduction to the Sankey diagram
  • Why use a Sankey diagram in customer journey analysis
  • An example of generating a customer journey diagram as a Sankey diagram with Python and Plotly

What is a Sankey Diagram?

Sankey diagrams are a type of flow diagram in which the width of the arrows is proportional to the flow rate. They emphasize the major transfers or flows within a system, which helps locate the most important contributions. They often show conserved quantities within defined system boundaries. The things being connected are called nodes, and the connections are called links. Sankey diagrams are best used to show a many-to-many mapping between two domains (e.g., universities and majors) or multiple paths through a set of stages (for instance, a customer journey that follows buyers' movements across all touchpoints of your brand).
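
To make the node and link vocabulary concrete, here is a minimal sketch of a many-to-many mapping between two domains, written as plain Python data. The university and major names and all of the counts are made up for illustration. Each link is a (source node, target node, value) triple, and the check at the end shows what "conserved quantity" means here: everything that leaves the universities arrives at the majors.

```python
# Hypothetical nodes: two universities and two majors.
nodes = ["University A", "University B", "Math", "Physics"]

# Hypothetical links as (source index, target index, number of students).
links = [
    (0, 2, 120),  # University A -> Math
    (0, 3, 80),   # University A -> Physics
    (1, 2, 50),   # University B -> Math
    (1, 3, 150),  # University B -> Physics
]

# Sum the flow leaving and entering each node.
outflow = [0] * len(nodes)
inflow = [0] * len(nodes)
for source, target, value in links:
    outflow[source] += value
    inflow[target] += value

# The quantity is conserved: total outflow equals total inflow.
total_out = sum(outflow)
total_in = sum(inflow)
print(total_out, total_in)
```

In a Sankey diagram, each value would set the width of the corresponding link.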

Why Use a Sankey Diagram for the Customer Journey?

Applied to the customer journey, Sankey diagrams help a business see its products from its customers' viewpoint. They help to identify the following:

  • The areas with the most significant opportunities. What's the happy path for customers to complete an order?
  • The areas that need more improvements. What happens before customers abandon their shopping carts?

How can Python Plotly visualize the customer journey in a Sankey diagram?

Let's imagine we are a small business owner selling products on an e-commerce website, and we would like to understand our customers' journey from discovery to purchase. In a customer journey diagram, the nodes convey which events occurred and their chronological order, and the width of the links displays the proportion of users who moved from one specific event to another.

Starting Point — A made-up Dataset on customer behavior

We start with a made-up DataFrame in pandas named df, with the following columns:

  • user_id: distinct user ID
  • event_name: event name ['Home', 'Cart', 'Product', 'Cancel', 'Purchase', etc.]
  • platform: platform associated with the event ['Android', 'iOS', 'PC']
  • time: timestamp at which the event took place

The syntax to generate the DataFrame:
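
A minimal sketch of how such a DataFrame could be generated with NumPy's random generator. The seed, row count, number of users, and date range are illustrative assumptions, not the values behind the figures in this article.

```python
import numpy as np
import pandas as pd

# Assumed settings for the made-up data (all hypothetical).
rng = np.random.default_rng(42)
n_rows = 1000
events = ["Home", "Product", "Cart", "Purchase", "Cancel"]
platforms = ["Android", "iOS", "PC"]

# Random offsets (in seconds) within a 30-day window.
offsets = rng.integers(0, 30 * 24 * 3600, size=n_rows)

df = pd.DataFrame({
    "user_id": rng.integers(1, 101, size=n_rows),   # user ids 1..100
    "event_name": rng.choice(events, size=n_rows),
    "platform": rng.choice(platforms, size=n_rows),
    "time": pd.Timestamp("2023-01-01") + pd.to_timedelta(offsets, unit="s"),
})

# Order the events chronologically within each user.
df = df.sort_values(["user_id", "time"]).reset_index(drop=True)
print(df.head())
```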

The head of this data frame looks like this:

Image by Author via Python

Data Transformation — Reference/Code Logic

I will use some of Andreas's code here as the backbone for converting the DataFrame into a form that Plotly can take as input to draw the Sankey diagram.

The logic is listed as follows.

  1. Sort all customer events by timestamp and define the earliest 'Home' event as the starting point for each customer.
  2. Aggregate the first n steps for each customer; if a journey has fewer than n steps, mark its end with 'End'.
  3. Transform the DataFrame into source and target data with pairwise event counts.

Wrapping everything up into a function:
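
One possible implementation of the three steps above is sketched below. It is not Andreas's original code: the function name journey_pairs, its parameters, and the tiny demo DataFrame are hypothetical and chosen only for illustration. The sketch assumes that users without any 'Home' event are dropped and that 'End' is appended once to journeys shorter than n steps.

```python
import pandas as pd

def journey_pairs(df, n_steps=4, start_event="Home"):
    """Return pairwise event counts with columns step, source, target, count."""
    # Step 1: sort by time and keep events from each user's earliest start event on.
    events = df.sort_values(["user_id", "time"])
    first_start = (
        events.loc[events["event_name"] == start_event]
        .groupby("user_id")["time"]
        .min()
    )
    # Users with no start event map to NaT, so the comparison drops them.
    events = events[events["time"] >= events["user_id"].map(first_start)]

    # Step 2: keep the first n_steps events per user; mark short journeys with 'End'.
    journeys = events.groupby("user_id")["event_name"].agg(list)
    rows = []
    for steps in journeys:
        steps = steps[:n_steps]
        if len(steps) < n_steps:
            steps = steps + ["End"]
        # Step 3: emit consecutive (source, target) pairs with their step number.
        for i, (source, target) in enumerate(zip(steps, steps[1:]), start=1):
            rows.append((i, source, target))

    pairs = pd.DataFrame(rows, columns=["step", "source", "target"])
    return pairs.groupby(["step", "source", "target"]).size().reset_index(name="count")

# Hypothetical demo data: user 3 browses a product before reaching 'Home'.
demo = pd.DataFrame({
    "user_id": [1, 1, 1, 2, 2, 3, 3, 3],
    "event_name": ["Home", "Product", "Purchase", "Home", "Cancel",
                   "Product", "Home", "Cart"],
    "time": pd.to_datetime([
        "2023-01-01 10:00", "2023-01-01 10:05", "2023-01-01 10:09",
        "2023-01-02 09:00", "2023-01-02 09:01",
        "2023-01-03 08:00", "2023-01-03 08:02", "2023-01-03 08:03",
    ]),
})
pairs = journey_pairs(demo, n_steps=3)
print(pairs)
```

Keeping the step number in the output lets the same event appear as a separate node at each stage of the journey.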

Parameters — Define labels, colors, and source/target ids

Then, before the final step, let's define some parameters that make the Sankey diagram easier to read. We will create the following:

  • A list of labels for annotation on the plot
  • A list of node colors and the corresponding list of link colors
  • The unique integer ids for both source and target
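
The three items above could be derived along the following lines. This is a sketch under stated assumptions: the pairs table, the color palette, and the helper to_rgba are all hypothetical, and each node is treated as a (step, event) pair so that the same event at different steps becomes a separate node.

```python
import pandas as pd

# Hypothetical pairwise counts in the shape: step, source, target, count.
pairs = pd.DataFrame({
    "step": [1, 1, 2, 2],
    "source": ["Home", "Home", "Product", "Cart"],
    "target": ["Product", "Cart", "Purchase", "End"],
    "count": [60, 40, 35, 25],
})

# Assumed palette: one hex color per event name.
palette = {
    "Home": "#1f77b4", "Product": "#ff7f0e", "Cart": "#2ca02c",
    "Purchase": "#9467bd", "End": "#7f7f7f",
}

def to_rgba(hex_color, alpha=0.4):
    """Convert '#rrggbb' to an 'rgba(r, g, b, a)' string."""
    r, g, b = (int(hex_color[i:i + 2], 16) for i in (1, 3, 5))
    return f"rgba({r}, {g}, {b}, {alpha})"

# A target at step k sits in column k + 1 of the diagram.
steps = pairs["step"].tolist()
source_keys = list(zip(steps, pairs["source"]))
target_keys = list(zip([s + 1 for s in steps], pairs["target"]))

# Assign each distinct (step, event) node a unique integer id.
nodes = sorted(set(source_keys) | set(target_keys))
node_id = {key: i for i, key in enumerate(nodes)}

labels = [event for _, event in nodes]
node_colors = [palette[event] for _, event in nodes]
source_ids = [node_id[key] for key in source_keys]
target_ids = [node_id[key] for key in target_keys]
# Each link takes a semi-transparent version of its source node's color.
link_colors = [to_rgba(palette[event]) for _, event in source_keys]
```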

Visualization — Plotly Sankey

Everything is now ready to plot the Sankey diagram. We will follow the Plotly tutorial to pass all the data and parameters for the plot.

I hope you found what you were looking for in this article. The full notebook can be found here.

P.S.

I have compiled a collection of my past visualizations which follow a consistent format that includes a randomly generated dataset and corresponding syntax to create the different charts. Please feel free to suggest any visualization topics that you'd like me to prioritize on my list. If you found this article interesting, please follow me on Medium. Enjoy reading and coding!

Reference

[1] https://plotly.com/python/sankey-diagram/

[2] https://readmedium.com/user-journey-sankey-diagram-25bb1aa42484

[3] https://towardsdatascience.com/visualizing-in-app-user-journey-using-sankey-diagrams-in-python-8373a7bb2d22
