avatarJohn Loewen, PhD

Summary

The author has conducted a comparative analysis of Streamlit and Plotly Dash frameworks for creating interactive data visualization dashboards using GPT-4, focusing on their ability to handle complex tasks with a single prompt.

Abstract

The author, an expert in Python data visualization, has spent 8 months optimizing GPT-4 prompt engineering for interactive dashboards. The article presents a side-by-side comparison of two popular Python frameworks, Streamlit and Plotly Dash, by prompting GPT-4 with a single prompt to create a dashboard with a dropdown menu and two types of maps using the UN food security dataset. The author details the preprocessing required for the dataset, the specific instructions given to GPT-4, and the resulting code for both frameworks. Despite some issues with deprecated code and errors in the Streamlit example, the author concludes that both frameworks can achieve the task with one prompt, with Plotly Dash slightly outperforming Streamlit in this particular test due to fewer re-occurring issues.

Opinions

  • The author believes that GPT-4 has become adept at handling Python code creation for multi-visual dashboards but notes that it operates at the edge of its capabilities for complex tasks.
  • GPT-4 is prone to forgetting certain instructions or using deprecated code when dealing with high task complexity, necessitating multiple iterations to achieve the desired outcome.
  • Streamlit encountered issues with deprecated code and errors related to non-numeric values, which required additional prompts to resolve, thus failing the one-prompt criteria in the author's test.
  • Plotly Dash, while also presenting challenges, was ultimately more successful in generating the desired dashboard with fewer issues, leading the author to declare it the winner of the one-prompt test.
  • The author acknowledges that GPT-4's performance in creating complex dashboards is impressive and continuously improving, but also notes that it is not yet consistently reliable for such tasks.
  • The author suggests that readers subscribe to their Substack for exclusive content on data analysis and visualization, indicating a commitment to sharing knowledge and insights in this field.

GPT-4 One-Prompt Dashboard Showdown: Plotly Dash vs Streamlit

Which framework reigns supreme in GPT-4, prompting simplicity?

Dall-E generated image: impresionist painting of two similar dashboards on two computer screens

For the past 8 months, I have been incessant on optimizing GPT-4 prompt engineering for Python data visualization code.

I now have a solid grasp on what works and what doesn’t when prompting for interactive data visualizations.

Two of the most used Python frameworks for multi-visual dashboards are Streamlit and Plotly dash. In my experience, GPT-4 has become quite good at handling Python code creation for multi-visual dashboards.

How good is each framework? Let’s put them side-by-side and prompt them with the same data set to create an interactive data-visual dashboard that has:

  1. A dropdown menu (allowing the user to interact with the dataset)
  2. Two different types of maps (a choropleth map and a bubble map) that are updated based on the user’s choice.

And for each framework, let’s put GPT-4 to the test using a single prompt!

The Dataset

For this test, let’s use the UN food security data as it is up to date and provides a yearly snapshot of the global food security situation at the country level.

The UN food security dataset can be found HERE.

There are a few options to consider when downloading the dataset:

For the CSV file to download for this article, I have chosen all of the countries for all of the years available (2000–2022), along with 2 important indicators:

  1. Prevalence of undernourishment (percent)
  2. Number of people undernourished (million)

After downloading, I saved the file as UN_food_insecurity.csv.

On inspection, the relevant fields in the dataset are:

  1. Area: The name of the geographical area, such as a country or region.
  2. Item: The name or description of the item or indicator.
  3. Year: The actual year or range of years as a string.
  4. Unit: The unit of measurement for the data values.
  5. Value: The data value.

And for the actual data, there are a few things to consider:

  1. For the Item — this is the indicator field that we specified during the download. We know that there are 2 main indicators.
  2. For the Year — This one is a bit tricky. It is a 3-year rolling average for the year, so the actual year is the middle year (i.e., 2002 = 2001–2003).
  3. For the Value — some of the data using a comparison operator (i.e., >2.5). We need to have a way to handle this data. The simplest way is to just remove the operator.

Now that we know what needs to be “massaged” we can upload our data file to GPT-4 and prompt it to do the rest.

Click on the attachment icon and upload the CSV file that you downloaded in the previous section (I called my file UN_food_security.csv)

Setting Up A Test Prompt for Both Frameworks

To put it all together, our GPT-4 prompt needs to include the following:

  1. Reading the dataset and preprocessing it to extract the necessary information.
  2. Creating a dropdown menu to allow the user to select the year.
  3. Displaying a choropleth map for the selected year to show severe food insecurity percentages by country.
  4. Displaying a bubble map for the selected year to show severe food insecurity numbers (millions) by country

Putting all of these pieces together, here is the prompt to use for both tests:

Our One-Stop Prompt to GPT-4:

Using the dataset provided, create the Python [framework] code for a dashboard to display two maps (created with Plotly express).

For the Year, the data is in the format ‘YYYY-YYYY’ where the actual year is the middle year of the range. For the value, some values have “<” or “>” in front. Remove this value. Make sure all values in the Value field are numeric.

On the dashboard, allow the user to select the year from a dropdown. Using the COUNTRY NAME from the Area field, using the “RdYlGn” color scheme (where red is the highest value) display a choropleth map that uses the data from the Item field “Prevalence of undernourishment (percent)” for that year. Below the first map, also using the country name from the Area field, create a second map as a bubble map that uses the data from the Item field “Number of people undernourished (million)”. Each bubble must be red. Do not use deprecated code.

The only thing to change for each prompt is the name of the framework we will be using (insert into the [framework] placeholder).

Let’s give it a go!

Streamlit Prompt and Results

After prompting GPT-4 with our ready-made detailed one-stop prompt, the resulting Streamlit code generated:

import streamlit as st
import pandas as pd
import plotly.express as px

# Load and preprocess the dataset
@st.cache
def load_data():
    df = pd.read_csv('UN_food_insecurity.csv')  # Adjust the path as necessary
    df['Middle_Year'] = df['Year'].apply(lambda x: str(round((int(x.split('-')[0]) + int(x.split('-')[1])) / 2)))
    df['Value_Clean'] = df['Value'].str.replace('<', '').str.replace('>', '').astype(float)
    # Ensure that rows with NaN values in 'Value_Clean' are not included
    df = df.dropna(subset=['Value_Clean'])
    return df

df = load_data()

# Streamlit app layout
st.title('Global Food Insecurity Analysis')

# Dropdown for year selection
selected_year = st.selectbox('Select Year', options=df['Middle_Year'].unique())

# Filter data based on the selected year and item
df_filtered_percent = df[(df['Middle_Year'] == selected_year) & (df['Item'].str.contains('Prevalence of undernourishment \(percent\)'))]
df_filtered_million = df[(df['Middle_Year'] == selected_year) & (df['Item'].str.contains('Number of people undernourished \(million\)'))]

# Choropleth map for Prevalence of undernourishment (percent)
fig_percent = px.choropleth(df_filtered_percent,
                            locations='Area',
                            locationmode='country names',
                            color='Value_Clean',
                            color_continuous_scale='RdYlGn_r',
                            title=f'Prevalence of Undernourishment (%) in {selected_year}')
st.plotly_chart(fig_percent)

# Bubble map for Number of people undernourished (million)
# Ensure there are no NaN values for 'Value_Clean' before plotting
df_filtered_million = df_filtered_million.dropna(subset=['Value_Clean'])
fig_million = px.scatter_geo(df_filtered_million,
                             locations='Area',
                             locationmode='country names',
                             size='Value_Clean',
                             color_discrete_sequence=['red'],
                             title=f'Number of People Undernourished (Millions) in {selected_year}')
st.plotly_chart(fig_million)

NOTE: GPT-4 may give you deprecated code for the @st.cache statement — if it does, this needs to be replaced with the newer @st_cache.data statement. AND there may be an issue with the bubble map and non-numeric values (more on this below)

To run the Streamlit code, you need to make sure you have the Streamlit and Plotly express libraries installed.

Once installed, you can run the Streamlit application from your Terminal (command prompt):

To run your streamlit code, open a terminal prompt and type in “streamlit run [file_name]”

Streamlit will find an open port and then display the dashboard in your default browser:

Interactive Streamlit dashboard running from the GPT-4 generated code

In the above example, the Streamlit app is running in my default browser at port 8511.

So far so good. Let’s move on to our second test with Plotly dash.

Plotly Dash Prompt and Results

With the same one-stop prompt used for the Streamlit example (make sure you replace [framework] with Plotly dash), the code generated by GPT-4:

# Import necessary libraries
import dash
from dash import dcc, html
from dash.dependencies import Input, Output
import plotly.express as px
import pandas as pd

# Load the dataset
data_path = 'UN_food_insecurity.csv'
df = pd.read_csv(data_path)

# Data preprocessing
# Extract the middle year
df['Middle_Year'] = df['Year'].apply(lambda x: str(round((int(x.split('-')[0]) + int(x.split('-')[1])) / 2)))
# Clean the 'Value' column
df['Value_Clean'] = df['Value'].str.replace('<', '').str.replace('>', '').astype(float)
# Filter for the specific items
df_undernourishment_percent = df[df['Item'].str.contains('Prevalence of undernourishment \(percent\)')]
df_undernourished_million = df[df['Item'].str.contains('Number of people undernourished \(million\)')]

# Initialize the Dash app
app = dash.Dash(__name__)

# Unique years for dropdown
years = df['Middle_Year'].unique()

app.layout = html.Div([
    dcc.Dropdown(
        id='year-dropdown',
        options=[{'label': year, 'value': year} for year in years],
        value=years[0]  # Default value
    ),
    dcc.Graph(id='map-undernourishment-percent'),
    dcc.Graph(id='map-undernourished-million')
])

@app.callback(
    Output('map-undernourishment-percent', 'figure'),
    Output('map-undernourished-million', 'figure'),
    [Input('year-dropdown', 'value')]
)
def update_maps(selected_year):
    # Filter data based on selected year
    df_filtered_percent = df_undernourishment_percent[df_undernourishment_percent['Middle_Year'] == selected_year]
    df_filtered_million = df_undernourished_million[(df_undernourished_million['Middle_Year'] == selected_year) & (df_undernourished_million['Value_Clean'].notna())]

    # Create the choropleth map for Prevalence of undernourishment (percent)
    fig_percent = px.choropleth(df_filtered_percent,
                                locations='Area',
                                locationmode='country names',
                                color='Value_Clean',
                                color_continuous_scale='RdYlGn_r',
                                title='Prevalence of Undernourishment (%)')

    # Ensure there are no NaN values for the size parameter in the bubble map
    # Filter out NaN values from df_filtered_million['Value_Clean']

    # Create the bubble map for Number of people undernourished (million)
    fig_million = px.scatter_geo(df_filtered_million,
                                 locations='Area',
                                 locationmode='country names',
                                 size='Value_Clean',
                                 color_discrete_sequence=['red'],
                                 title='Number of People Undernourished (Millions)')

    return fig_percent, fig_million

# Run the app
if __name__ == '__main__':
    app.run_server(debug=True)

To run this code, you need to ensure that you have the Plotly dash and plotly.express libraries installed on your system. Also, make sure the data_path variable is set to the correct path (and name) to your CSV file.

The resulting dashboard using Plotly Dash

By default, Plotly dash displays the dashboard in your default browser at the location of localhost:8050 (port 8050).

Success! From one careful prompt.

And The Winner Is…

Whew, this is a tough question to answer. They both are visually similar and both have a few issues with this level of task complexity.

I had to go through a number of iterations to get both of them to work properly.

When prompting GPT-4 to create 2 different maps from 2 different fields in the dataset, the LLM is truly at the outer edge of its capabilities. This manifests itself in a few different ways.

  1. When running these tests, GPT-4 may “lose its place”. One thing I notice time and time again is that the higher the complexity of the task, the more likely GPT-4 will forget something important. For example, it may forget that it’s supposed to be using only the data from the dataset.
  2. Because there are two different types of maps, there is a higher chance that GPT-4 will mess one of them up. GPT-4 gets confused some times about needing a “ISO-3” file to map country names with 3-letter country codes. This is why it is necessary to explicitly state that the maps be created with Plotly express — it doesn’t need the ISO-3 code.
  3. GPT-4 may default to trying to use a geopandas file. This is not necessary if you use Plotly express to generate the map as this library can use the Country name (from the Area field) as an indicator.

NOTE: If I have a data set with country names, I usually include an ISO3-to-country-name file that I have in my CSV folder. This way, I can add it to my GPT-4 prompt to simplify my prompting.

So which framework is the winner? I am not going to sit on the fence!

Whereas GPT-4 was able to create the Plotly dash dashboard without issue, there were two (2) re-occurring issues with Streamlit:

  1. The GPT-4 generated Python Streamlit code kept coming up with a deprecated code message for @st.cache — even when I prompted it to NOT use deprecated code (the current usage is @st.cache.data)
  2. A re-occurring error that GPT-4 could not seem to figure out in the one-prompt test (even when explicitly stated):
Streamlit error with the Value field for the bubble map.

This error was easily fixed with a second prompt to GPT-4 (by copying and pasting the error into the prompt window):

Prompting GPT-4 to correct the NaN error in the Streamlit Code.

GPT-4 successfully re-generated the code with the correction to ensure all values for the bubble map are numeric.

But… this means that for this particular test, Streamlit failed the one-prompt criteria.

That leaves Python Plotly dash framework as the one-prompt test winner!

In Summary…

Testing GPT-4 to create a one-prompt interactive Streamlit and Plotly dash dashboard with 2 different maps is truly at the edge of this LLM’s current capabilities .

With both frameworks, GPT-4 gave me some issues with how to create the map using the data set provided.

For example, it lost its place on a few occasions — forgetting that it was supposed to use the dataset, and that it should not use deprecated code. From my experience, this happens when it is on the edge of its capabilities.

Was I ultimately able to generate the dashboards for both libraries with one prompt? Yes.

Am I confident that GPT-4 will be successful every time I use the same prompt? No — at least not yet.

However, GPT-4 keeps on getting better and better at this kind of task.

Stay tuned for the next update!

Thank you for reading.

If this type of story is right up your alley, and you want to support me as a writer, subscribe to my Substack.

On Substack, I publish a newsletter and articles that you won’t find on the other platforms where I create content.

ChatGPT
Prompt Engineering
Data Analysis
Data Visualization
Python
Recommended from ReadMedium