Hands-on Guide to Create Beautiful Sankey Charts in d3js with Python
The Sankey chart is a great way to discover the most prominent contributions just by looking at how individual items flow across states.

The Sankey chart is useful when you need to understand the flows in a system and reveal potentially inefficient states in a process. The input data is similar to that of a network chart, with source, target, and weight, but the Sankey chart makes the most prominent contributions easier to see. It applies to many use cases, such as improving the customer journey in marketing, cost analysis, and energy flows. The Sankey chart is part of the D3Blocks library and can be created using Python. The output is entirely encapsulated in a single HTML file, so you only need a web browser to show the graph, which makes sharing and publishing easy. In this blog, I will introduce the Sankey chart and demonstrate with hands-on examples how to use it.
The Sankey chart is part of D3Blocks.
D3Blocks is a library of charts whose visualization part is built on d3 JavaScript but is configurable using Python. In this manner, D3Blocks combines the advantages of d3, such as speed, scalability, flexibility, and unlimited creativity, with Python for fast and easy access to a broad community such as the data science field. Especially for this field, it is key that charts scale easily to very large data sets. Each chart in D3Blocks, such as the Sankey chart, is entirely encapsulated in a single HTML file, so it needs no technology other than a browser and is very easy to share or publish on websites. More information about the D3Blocks library can be found in this blog [1].
The Sankey chart.
Sankey charts can be created in Python without worrying about any of the d3 JavaScript modules. After importing the D3Blocks library, you can set the user-defined parameters and create the chart based on your input dataset. Behind the scenes, the Sankey module creates the colors, positions, ordering, and labels for the states and flows in the dataset. It also includes the user-defined parameters, connects all d3 parts, and finally transforms everything into a single HTML file that is stored on disk.
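As a rough sketch of that workflow, the helper below wraps the two calls involved. It assumes the d3blocks package is installed and that `D3Blocks().sankey()` accepts the `filepath` and `showfig` arguments as in the library's documentation; the helper name `create_sankey` is hypothetical and only serves the illustration.

```python
def create_sankey(df, filepath="sankey.html", show=False):
    """Render a source/target/weight DataFrame as a Sankey chart in one HTML file.

    Hypothetical helper: a thin wrapper around the d3blocks calls.
    """
    # Imported inside the function so this module still loads on a machine
    # where d3blocks has not been installed yet.
    from d3blocks import D3Blocks

    d3 = D3Blocks()
    # Assumption: sankey() writes the HTML to `filepath` and only opens a
    # browser window when showfig is True.
    d3.sankey(df, filepath=filepath, showfig=show)
    return filepath
```

Calling `create_sankey(df)` with a DataFrame in the format described further down should leave a `sankey.html` file in the working directory that can be opened in any browser.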
Reasons to use the Sankey chart.
The Sankey graph is insightful when one action follows another across time or states. It can help to reveal potentially inefficient states, such as a bottleneck in a process. Although the source-target-weight input is similar to that of network analysis, a network graph can be hard to interpret when the flows themselves need to be analyzed. Each flow in the Sankey chart differs in height depending on its quantity, which makes it more straightforward to determine the most prominent or problematic states and to draw conclusions from the data.
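To make the link between quantity and prominence concrete, here is a small plain-Python sketch that sums the weights entering and leaving each state. The customer-journey numbers are invented for illustration and do not come from any dataset in this article; the function names are hypothetical as well.

```python
from collections import defaultdict

# Hypothetical customer-journey flows: (source, target, weight).
flows = [
    ("Visit", "Product page", 1000),
    ("Product page", "Cart", 400),
    ("Product page", "Exit", 600),
    ("Cart", "Checkout", 100),
    ("Cart", "Exit", 300),
]


def node_throughput(flows):
    """Return (incoming, outgoing) dicts with the total weight per state."""
    incoming = defaultdict(float)
    outgoing = defaultdict(float)
    for source, target, weight in flows:
        outgoing[source] += weight
        incoming[target] += weight
    return dict(incoming), dict(outgoing)


def largest_flow(flows):
    """Return the single flow with the highest weight."""
    return max(flows, key=lambda flow: flow[2])


incoming, outgoing = node_throughput(flows)
print(incoming["Exit"])      # 900.0
print(largest_flow(flows))   # ('Visit', 'Product page', 1000)
```

In this toy example 900 of the 1000 visitors end in the Exit state, which is the kind of problematic state that would stand out in the chart as the tallest incoming block.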
Installation.
Before we go through the functionalities of Sankey, we first need to install the D3Blocks library:
pip install d3blocks

Input Data Frame.
The input data is a DataFrame containing the following three columns:
- source: describes the source state.
- target: describes the target state.
- weight: describes the relative importance of flow between the source and target state.
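A minimal sketch of such a DataFrame, built by hand with pandas from three rows of the energy example, is shown below. The `check_sankey_input` helper is hypothetical and not part of D3Blocks; it assumes the column names must match exactly and that the weights must be numeric.

```python
import pandas as pd

REQUIRED_COLUMNS = ("source", "target", "weight")


def check_sankey_input(df):
    """Hypothetical helper: verify the three expected columns are present."""
    missing = [col for col in REQUIRED_COLUMNS if col not in df.columns]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    if not pd.api.types.is_numeric_dtype(df["weight"]):
        raise ValueError("the weight column must be numeric")
    # Return only the relevant columns, in a fixed order.
    return df[list(REQUIRED_COLUMNS)]


df = pd.DataFrame({
    "source": ["Agricultural 'waste'", "Bio-conversion", "Bio-conversion"],
    "target": ["Bio-conversion", "Liquid", "Losses"],
    "weight": [124.729, 0.597, 26.862],
})
df = check_sankey_input(df)
```

A check like this is optional, but it catches a misspelled column name before the chart is built.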
Let’s load an example of the input DataFrame from the energy dataset [2]. In this example, there are 68 rows with 3 columns. The weight can be interpreted as the strength of the relationship between source and target, and it determines the width of the flow.
print(df)

                     source            target   weight
0      Agricultural 'waste'    Bio-conversion  124.729
1            Bio-conversion            Liquid    0.597
2            Bio-conversion            Losses   26.862
3            Bio-conversion             Solid  280.322
4            Bio-conversion               Gas   81.144
..                      ...               ...      ...
63       Thermal generation  District heating   79.329
64                    Tidal  Electricity grid    9.452
65  UK land based bioenergy    Bio-conversion  182.010
66                     Wave  Electricity grid   19.013
67                     Wind  Electricity grid  289.366

[68 rows x 3 columns]

The input parameters.
The Sankey block contains various input parameters that are described in code section 1.







