This article introduces seven lesser-known Python visualization packages, namely Bokeh, Seaborn, ggplot, Plotly, Altair, Geoplotlib, and Missingno, that can be used for more efficient and visually appealing data visualization.
Abstract
The article titled "7 Unexplored Python Visualisation Packages You Must Know" discusses seven Python visualization libraries that are not as commonly used as Matplotlib but offer unique features and advantages. These libraries are Bokeh, Seaborn, ggplot, Plotly, Altair, Geoplotlib, and Missingno. Bokeh is a native Python library that can generate interactive graphs, while Seaborn is built on top of Matplotlib and provides more aesthetically pleasing graphs. ggplot is based on The Grammar of Graphics and allows for the addition of multiple component layers to create visualizations. Plotly is a web-based toolkit that can be accessed from a Python notebook and offers interactive visualizations. Altair is a declarative statistical visualization library that is based on Vega and Vega-Lite visualization grammar. Geoplotlib is a dedicated Python toolbox for visualizing geographical data, while Missingno is a Python library that helps visualize missing data in a pandas data frame.
Opinions
The author believes that each of these libraries has its own advantages and can provide better results than standard libraries if used properly.
The author recommends having a good understanding of Matplotlib to get the most out of Seaborn.
The author suggests keeping data in data frames to get the most out of ggplot.
The author highlights the interactive nature of visualizations created using Bokeh, Plotly, and Altair.
The author emphasizes the importance of handling missing data properly for better analysis of data and recommends using Missingno for visualizing missing data in a pandas data frame.
The author encourages readers to refer to the official documentation of these libraries for a detailed overview.
The author invites readers to subscribe to their newsletter for more exciting articles on data science and technology.
7 Unexplored Python Visualisation Packages You Must Know
Missingno, Bokeh, Altair, Geoplotlib, and much more
An important aspect of data science is visualising the data. When a data set is too large to understand just by reading through it, we need to plot it in different formats to understand it better. Visualisation packages make this job a lot easier for data scientists.
This article will talk about seven Python visualisation packages that you must try to plot your data. We have excluded Matplotlib from the list as it is already widely used by data science enthusiasts.
1. Bokeh
Bokeh is a native Python library that is based on The Grammar of Graphics (Statistics and Computing). This library can be used to generate graphs that can easily be exported as JSON objects or HTML documents, so they can easily be embedded in a webpage.
Also, the visualisations are interactive in nature, i.e., you can get specific information about specific parts of the graph easily.
Run the following command to install Bokeh.
pip install bokeh
Let’s see some interesting plots using Bokeh.
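The original code embed is not available here, so the following is only a minimal sketch of what such a plot could look like. The data values, the output file name, and the labels are all made up for illustration.

```python
from bokeh.plotting import figure, output_file, show

# Illustrative data (an assumption, not taken from the article)
x = [1, 2, 3, 4, 5]
y = [6, 7, 2, 4, 5]

# Write the plot to a standalone HTML file (hypothetical file name)
output_file("bokeh_line.html")

# tooltips adds a hover tool; "@x" and "@y" refer to the data columns
p = figure(
    title="Simple line example",
    x_axis_label="x",
    y_axis_label="y",
    tooltips=[("x", "@x"), ("y", "@y")],
)

# Two glyph layers on the same figure: a line and its data points
p.line(x, y, legend_label="Series A", line_width=2)
p.scatter(x, y, size=8)

# Saves the HTML file and tries to open it in the browser
show(p)
```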
After you run the above code, a new tab will open with a plot and some clickable buttons that let you play with it.
2. Seaborn
Seaborn is a visualisation library built on top of Matplotlib. That is why you can access almost all the functionality of Matplotlib, but with fewer lines of code and more aesthetically pleasing graphs.
As Seaborn is based on Matplotlib, a good understanding of Matplotlib will be very useful to get more out of it.
Run the following command to install Seaborn.
pip install seaborn
To get a scatter plot using a dataset, we can use the following:
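The original snippet is missing, so here is a small sketch under stated assumptions: the data frame below is invented for illustration (the column names `total_bill`, `tip`, and `time` are not from the article), and it is built in code so that the example runs without downloading anything.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Small made-up dataset, purely for illustration
df = pd.DataFrame(
    {
        "total_bill": [16.99, 10.34, 21.01, 23.68, 24.59, 25.29, 8.77, 26.88],
        "tip": [1.01, 1.66, 3.50, 3.31, 3.61, 4.71, 2.00, 3.12],
        "time": ["Dinner", "Lunch", "Dinner", "Dinner", "Lunch", "Dinner", "Lunch", "Dinner"],
    }
)

# Apply Seaborn's default styling on top of Matplotlib
sns.set_theme()

# One call draws the points, colours them by category, and adds a legend
ax = sns.scatterplot(data=df, x="total_bill", y="tip", hue="time")
ax.set_title("Tip versus total bill")

plt.show()
```

Because `scatterplot` returns a regular Matplotlib `Axes`, anything you already know from Matplotlib (titles, limits, annotations) still applies.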
3. ggplot
Just like Bokeh, this library is based on The Grammar of Graphics, but the operation is slightly different from that of the other libraries. Using the ggplot library you can add multiple component layers to get a final version of your visualisation.
To get the most out of it, keep your data in data frames: according to the official documentation, ggplot has a symbiotic relationship with pandas.
4. Plotly
Plotly is a bit different from the other packages, as it is a web-based toolkit. But it can also be accessed from a Python notebook using its API.
This package has some great visualisation features, like box plots, contour plots, and multiple-axis plots, and the plots are also interactive in nature.
You can run the below command to install it.
pip install plotly==5.1.0
For plotting a bar chart using Plotly:
5. Altair
This Python visualisation library is based on the powerful Vega and Vega-Lite visualisation grammars. It is a declarative statistical visualisation library, i.e., we need to declare only the links between the data columns and the encoding channels (x-axis, y-axis, colour, etc.) and the rest of the work will be done automatically.
This makes the plotting process easier and also takes fewer lines of code. Also, the plots can be made interactive.
You can run the below command to install it.
pip install altair vega_datasets
Let’s use the iris dataset to visualise some stuff.
An interesting thing about this package is that it provides some cool features. Click on the three dots at the top right.
You can save this graph for future use. Also, you can use the Vega editor to explore more cool stuff.
6. Geoplotlib
Geoplotlib is a dedicated Python toolbox for the visualisation of geographical data. This toolbox can be used for dealing with maps and region-based data, like population, heatmaps, climate, etc. To use this toolkit, one needs to install NumPy and pyglet.
You can run the below command to install it.
pip install geoplotlib
7. Missingno
One of the biggest headaches of data scientists while working with data is data missing from data sets. These missing values should be handled properly for a better analysis of the data. missingno is a Python library that helps you visualise the missing data in a pandas data frame.
It helps us to determine how often values are missing and where they sit in a data set, which can be really helpful for data scientists.
You can run the below command to install it.
pip install missingno
To plot missing data as a bar chart:
Conclusion
In this article, we have described seven Python visualisation libraries or toolkits that we can use for better and faster visualisation of data.
The question “why do we need so many plotting libraries when we have the powerful Matplotlib?” arises. The answer is the diversity that these libraries offer.
Each library has its own advantages which, if properly used, can give better results than the standard libraries. For a detailed overview, you can refer to the official documentation of these libraries.
If you want to get more exciting articles on data science and technology and are interested in knowing about my favourite book collections, here is my free newsletter: Pranjal’s Newsletter.