Roman Orac

Summary

Mito is a free JupyterLab extension that simplifies data manipulation and visualization in a spreadsheet-like interface, offering features comparable to Excel and generating Python code automatically.

Abstract

The article introduces Mito, a JupyterLab extension designed to streamline Exploratory Data Analysis (EDA) by providing a user-friendly, Excel-like interface for data manipulation and visualization. Mito allows users to perform CRUD operations, create pivot tables, and utilize dynamic formulas, making it accessible for those less familiar with programming. It also supports data visualization without the need for coding, generating bar charts, box plots, histograms, and scatter plots. A notable feature is its ability to automatically convert user actions into pandas code, facilitating learning and code reuse. The installation process is straightforward, and the extension is compatible with Python 3.6 and above. The author endorses Mito for initial EDA and commends its performance and well-documented resources.

Opinions

  • The author expresses that Mito is a significant advancement for data scientists, making EDA more enjoyable by reducing tedious work.
  • Mito is seen as a game-changer, bringing Excel-like functionality into the JupyterLab environment, which is particularly beneficial for those who prefer a graphical interface over coding.
  • The author singles out Mito's dynamic formulas and automatic code generation as its most impressive features.
  • The author believes that Mito can be a valuable learning tool for less experienced data scientists by demonstrating "the pandas way" of data analysis.
  • The author recommends Mito for data scientists to add to their toolbox, especially for initial data exploration, and suggests that the extension's maturity and reliability make it a worthwhile addition to JupyterLab's ecosystem.

Another JupyterLab Extension You Should Know About

Mito is a JupyterLab extension that enables exploring and transforming datasets with the ease of Excel… and it’s FREE.

Photo by Benjamin Davies on Unsplash

It’s an exciting time to be part of the Data Science community, with all the new JupyterLab extensions coming out. They make Data Science much more enjoyable by minimizing tedious work.

I remember the old days when we had to rely on numpy and matplotlib as our main tools for Exploratory Data Analysis in Python. Luckily for us, those days are long gone.

You’ll see what I mean by “long gone”, with the JupyterLab extension that is the main topic of this article.

Meet Mito

Iris dataset loaded with Mito (image made by author)

Mito is a free JupyterLab extension that enables exploring and transforming datasets with the ease of Excel.

Mito is the missing pandas extension that we have been waiting years for.

When you start Mito, it shows a spreadsheet view of a pandas DataFrame. With a few clicks, you can perform any CRUD operation.

CRUD stands for Create, Read, Update, Delete
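
For comparison, here is a minimal sketch of what the four CRUD operations look like when written by hand in pandas. The tiny DataFrame and the `petal_area` column are made up for illustration; this is not code produced by Mito.

```python
import pandas as pd

# A tiny, made-up sample shaped like the iris dataset (illustrative values only).
df = pd.DataFrame({
    "species": ["setosa", "versicolor", "virginica"],
    "petal_length": [1.4, 4.7, 6.0],
    "petal_width": [0.2, 1.4, 2.5],
})

# Create: add a new column derived from existing ones.
df["petal_area"] = df["petal_length"] * df["petal_width"]

# Read: select the rows that match a condition.
wide_petals = df[df["petal_width"] > 1.0]

# Update: change a single cell by row label and column name.
df.loc[0, "petal_width"] = 0.3

# Delete: drop a column that is no longer needed.
df = df.drop(columns=["petal_area"])
```

In the spreadsheet view, each of these steps is a click or a cell edit instead of a line of code.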

How to start Mito?

Photo by Laura Gariglio on Unsplash

Loading your data with Mito and showing the spreadsheet view is as simple as:

import mitosheet
import pandas as pd
url = 'https://raw.githubusercontent.com/mwaskom/seaborn-data/master/iris.csv'
iris = pd.read_csv(url)
mitosheet.sheet(iris)

Mito opens a powerful spreadsheet viewer, which enables filtering, sorting and editing the data.

Sorting values by petal_width (image made by author)
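
Sorting and filtering in the spreadsheet view correspond roughly to the following hand-written pandas. This is a sketch on a small made-up DataFrame; the descending sort order and the `virginica` filter are arbitrary choices for the example.

```python
import pandas as pd

# Small, made-up sample with the same column names as the iris dataset.
df = pd.DataFrame({
    "species": ["setosa", "virginica", "versicolor", "virginica"],
    "petal_width": [0.2, 2.5, 1.4, 1.8],
})

# Sort rows by petal_width, largest first.
sorted_df = df.sort_values(by="petal_width", ascending=False)

# Filter: keep only the rows of one species.
virginica = df[df["species"] == "virginica"]
```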

And it doesn’t just stop with basic editing capabilities…

Pivot tables

With just a few clicks, Mito can create a pivot table. It supports many common aggregations, like sum, median, mean, count, unique, etc.

What’s a pivot table? (from Wikipedia)

A pivot table is a table of grouped values that aggregates the individual items of a more extensive table within one or more discrete categories.
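
To make the definition concrete, here is a minimal pandas sketch of a pivot table that groups rows by species and averages one column. The data values are made up, and the choice of `mean` is just one of the aggregations mentioned above.

```python
import pandas as pd

# Made-up measurements, two rows per species.
df = pd.DataFrame({
    "species": ["setosa", "setosa", "virginica", "virginica"],
    "sepal_width": [3.5, 3.0, 3.3, 2.7],
})

# One row per species, with the mean sepal_width of that group.
pivot = pd.pivot_table(df, index="species", values="sepal_width", aggfunc="mean")
```

Swapping `aggfunc` for `"sum"`, `"median"` or `"count"` gives the other common aggregations.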

What are the most impressive Mito features?

If pivoting tables didn’t impress you enough to give Mito a try, I’m quite confident that the following features will.

Spreadsheet Formulas

Calculating formulas with Mito (Image made by author)

Dynamic formulas are Excel's killer feature. Excel makes it easy to create complex spreadsheets for those who are not familiar with programming.

What if I told you that Mito supports dynamic formulas the “Excel way”? This feature really surprised me; the team behind Mito must have spent a lot of development time implementing it.

Formula Reference shows supported formulas (image made by author)
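
As a rough pandas counterpart to a spreadsheet sum formula, the sketch below adds two columns row by row and stores the result in a new column. The column name `sepal_sum` and the sample values are hypothetical, and the code is hand-written rather than Mito's output.

```python
import pandas as pd

# Made-up sample with two numeric columns.
df = pd.DataFrame({
    "sepal_length": [5.1, 4.9],
    "sepal_width": [3.5, 3.0],
})

# Row-wise sum of the two columns, stored in a new column.
df["sepal_sum"] = df[["sepal_length", "sepal_width"]].sum(axis=1)
```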

Take a look at the GIF below to see Mito’s sum formula in action:

Visualizing Data

Photo by Campaign Creators on Unsplash

We Data Scientists appreciate tools that simplify data visualization.

First, pandas was a huge leap from using bare-bones matplotlib, a powerful Python package for data visualization.

Then came seaborn and plotly, which can make stunning visualizations in Python with just a few commands… a giant leap again.

… and then came Mito, which can visualize your data without writing a line of code.

Mito supports bar charts, box plots, histograms, and scatter plots.

In the GIF below, I make a bar plot with sepal width on the x-axis and species on the y-axis.

Automatic Code Generation

Photo by Joshua Aragon on Unsplash

Mito transforms each operation that you make into pandas code, which you can then share with your colleagues.

The main intention of this feature is to repeat the analysis on another dataset. It’s like a pandas macro.

This is also a great feature for less experienced Data Scientists as they can learn “the pandas way” of doing Data Analysis.

I did some clicking and Mito produced the following code snippet:

Mito autogenerates the code on the fly (Image made by author)
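
The “pandas macro” idea can be sketched by wrapping a sequence of pandas steps in a function and calling it on a second dataset. The function name `widest_petals`, its `min_width` parameter and the sample data are hypothetical stand-ins; this is not the code Mito generated.

```python
import pandas as pd


def widest_petals(df: pd.DataFrame, min_width: float) -> pd.DataFrame:
    """Hypothetical recorded analysis: filter by petal_width, then sort."""
    filtered = df[df["petal_width"] >= min_width]
    return filtered.sort_values(by="petal_width", ascending=False).reset_index(drop=True)


# Two made-up datasets with the same columns.
first = pd.DataFrame({
    "species": ["setosa", "virginica"],
    "petal_width": [0.2, 2.5],
})
second = pd.DataFrame({
    "species": ["versicolor", "virginica", "setosa"],
    "petal_width": [1.4, 1.8, 0.3],
})

# The same steps replayed on both datasets.
first_result = widest_petals(first, min_width=1.0)
second_result = widest_petals(second, min_width=1.0)
```

Keeping the steps in a function is one way to reuse generated code: only the input DataFrame changes between runs.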

How to install Mito?

Mito requires Python 3.6 or above.

First, you need to download Mito's installer with:

python -m pip install mitoinstaller

Then to install it, simply run:

python -m mitoinstaller install

If you run into installation errors, take a look at Mito's Common Installation Issues.

Conclusion

Photo by David Mullins on Unsplash

It amazes me how far JupyterLab’s extension ecosystem has come. The initial extensions were clunky, error-prone and hard to install.

The times have changed and JupyterLab’s extensions are maturing. Mito is a great example of this trend.

I took Mito for a test drive and, after a couple of hours, I didn’t see any degraded performance or strange errors.

I will add Mito to my Data Science toolbox. I plan to use it for initial Exploratory Data Analysis, to get a feel for the data. Typing the same set of commands over and over gets tedious.

In case you’d like to learn more about Mito, it has well-written documentation (and many tutorials), which is always a good sign with such extensions.
