avatarThiago Carvalho

Summarize

Create Power BI reports in Jupyter Notebooks

Exploring the Synergy Between Power BI and Jupyter Notebooks

In today’s data-driven world, where information flows incessantly from every corner, extracting meaningful insights and making informed decisions has become paramount.

Recognizing the significance of seamless data visualization and analysis, Power BI has taken a leap forward with its latest update. This update introduces an exciting integration between Power BI and Jupyter Notebooks, allowing users to create PBI reports within their existing workflow.

In this article, we’ll explore the new functionality of the powerbiclient package, how it integrates with Jupyter, and the challenges to productionize the outputs.

Setting up

First, we have to install the package and configure our environment.

This should be as simple as:

pip install powerbiclient

#For Juyter Lab
pip install powerbiclient
jupyter labextension install @jupyter-widgets/jupyterlab-manager

In my experience, it wasn’t really that simple. Everything seemed to work until I tried to generate the report, then I kept getting errors such as:

Error displaying widget: model not found

I tried updating the packages, creating a new env, disabling and enabling the widgets in Jupyter Lab, and praying to the data gods. Nothing worked, and I couldn’t make the Power BI report appear in Jupyter Lab.

Then I tried Jupyter Notebook, and surprisingly, it worked just fine.

There’s not much information online about the requirements, how to set up the environment, how to get started, or troubleshooting in general. At first look, the library seems unreliable and not robust enough to handle the integration with constantly evolving libraries in Python.

Creating a Report

Once we’ve installed the package, we can import the classes needed for creating a report.

from powerbiclient import QuickVisualize, get_dataset_config, Report
from powerbiclient.authentication import DeviceCodeLoginAuthentication

The idea is to load up the data to a dataframe and make all the transformations we need with Pandas, such as creating new fields, cleaning the data, and aggregating. Once the dataset is ready to be visualized, with one function, we can create a Power BI report automatically.

import pandas as pd

# Data generated in www.mockaroo.com
df = pd.read_csv('MOCK_DATA.csv')
df.head()

Now that our dataframe is ready, we must authenticate to Power Bi service.

device_auth = DeviceCodeLoginAuthentication()

After running this block, we’ll get a code and be directed to the MS device log in page, there we need to follow the instructions to authenticate our machine.

This is usually just a simple login/ password; if you have two-step authentication, it might ask for a code in the authenticator app.

Once we finish, it’ll let us know in the Jupyter Notebook.

Now that everything is set up, we can finally generate the Power BI report.

PBI_visualize = QuickVisualize(get_dataset_config(df), auth=device_auth)
PBI_visualize

Power BI will start building our report.

The final result will look like the one below, with summarized stats at the top, a more prominent visual with some text summaries, and more charts.

When the cursor is over a chart, the edit button appears; it has some options for making quick changes to the charts. We can save the report to a workspace and edit it if we need more options.

With the report saved to a workspace, we can quickly share it, make edits in the service, or download the .pbix file to edit with PBI Desktop.

Data Sources and Conclusions

It doesn’t capture the true source of your data frame nor all the transformation you might’ve done in Jupyter, so if you’re in a more mature analytics environment, this might be useful for presentations, drafting dashboards, or citizen development.

If you’re reporting from a data warehouse, you want your data to be discoverable, refreshable, understandable, and so on. And that will still require more work.

If your data is static, you’re only using it for one presentation, or you want a client to have a quick way to explore a dataset, this might fit flawlessly.

What I want to see is a package where I can define the same connections as I can in Power BI Desktop, transform the data with Pandas or any other library, and then quickly move this small pipeline to Azure with a data source connection, transformations, and presentation layer all distinguishable and editable.

Overall, this update is a step forward and can accelerate data visualization. And hopefully, it will improve as MS Fabric matures.

Thanks for reading :)

Data Analysis
Python
Power Bi
Jupyter Notebook
Data Visualization
Recommended from ReadMedium