TrainDataHub

How To Construct Different Types Of Correlation Heatmap With Seaborn In Python

A correlation heatmap is a graph that shows the relationships between the numerical variables in a data set. Correlation values range from -1 to 1: 1 is a perfect positive relationship, -1 is a perfect negative relationship, and values near 0 indicate little or no linear relationship.
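To make the range concrete, here is a minimal sketch using a small made-up DataFrame (the name demo and its columns are hypothetical, not part of the original post). It shows a coefficient of 1, a coefficient of -1, and a coefficient of roughly 0 for a relationship that is not linear.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: ten evenly spaced values and three derived columns
x = np.arange(10, dtype=float)
demo = pd.DataFrame({
    "x": x,
    "same_direction": 2 * x + 1,        # rises exactly in step with x
    "opposite_direction": -3 * x + 5,   # falls exactly as x rises
    "no_linear_link": (x - 4.5) ** 2,   # symmetric around the mean of x
})

# Pearson correlation matrix (the pandas default)
demo_corr = demo.corr()
print(demo_corr.round(2))
```

Note that the last column depends entirely on x, yet its coefficient is about 0, because the coefficient only measures linear relationships.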

In this post, we will focus on how to generate different types of correlation heatmaps using the Seaborn visualization package in Python.

Here are the definitions of the names and arguments used to create the correlation heatmaps.

  • df: name of the data frame
  • fmt: format of the text in each cell (in this example, we set fmt = ".1F" so that the correlation coefficients are displayed in fixed-point notation with one decimal place)
  • cmap: name of the colormap
  • alpha = 0.5: opacity of the cell colors, from 0 (transparent) to 1 (opaque); the higher the value, the more intense the colors
  • annot = True: to add the coefficient value to each cell
  • annot_kws = {"size": 10}: set the size of the text in annotated cells
  • linewidths: thickness of the lines between cells
  • linecolor: the color of the lines between cells
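The fmt value is an ordinary Python format specification, so its effect can be checked without drawing anything. The sketch below uses a made-up coefficient (value is a hypothetical name) to compare a few specifications.

```python
# Hypothetical correlation coefficient used only for illustration
value = 0.8567

formatted = {
    ".1f": format(value, ".1f"),  # fixed-point, one decimal place
    ".1F": format(value, ".1F"),  # same as ".1f" for ordinary numbers
    ".2f": format(value, ".2f"),  # fixed-point, two decimal places
    ".1e": format(value, ".1e"),  # scientific notation, one decimal place
}

for spec, text in formatted.items():
    print(spec, "->", text)
```

The ".1f" and ".1F" specifications give the same text here; they differ only in how special values such as nan and inf are capitalized.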

Basic Correlation Heatmap

# Import required Python packages
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
# Add title and assign size of heatmap
fig, ax = plt.subplots()
fig.set_size_inches(12,11)
plt.title('HeatMap Correlation Matrix', size = 20, color = 'Black', alpha = 0.9)
# Correlation (df is assumed to be an already loaded pandas DataFrame with numeric columns)
corr = df.corr()
# Heatmap, drawn on the axes created above
sns.heatmap(corr, cmap="BuGn_r", ax=ax)
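The snippets in this post use a DataFrame named df without showing how it is built. If no data set is at hand, a small synthetic one can stand in for it. The sketch below is one way to do that; the column names and the numbers are invented for illustration and do not come from the original data.

```python
import numpy as np
import pandas as pd

# Seeded generator so the synthetic data is reproducible
rng = np.random.default_rng(seed=0)
n_rows = 200

# Hypothetical columns: two depend on height, one is pure noise
height = rng.normal(170, 10, n_rows)
df = pd.DataFrame({
    "height": height,
    "weight": 0.9 * height + rng.normal(0, 5, n_rows),
    "shoe_size": 0.25 * height + rng.normal(0, 1, n_rows),
    "random_noise": rng.normal(0, 1, n_rows),
})

# Same call as in the post: pairwise Pearson correlation of the columns
corr = df.corr()
print(corr.round(2))
```

The result is a square, symmetric matrix with 1 on the diagonal, which is the input that sns.heatmap expects in the examples.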

This produces the basic correlation heatmap.

According to the color bar, the lighter colors indicate higher correlation values and the darker colors indicate lower ones.

Now let’s try out different styles of correlation heatmap.

Annotated Correlation Heatmap

We add annot=True to display the correlation coefficients on the heatmap.

# Add figure title and size
fig, ax = plt.subplots()
fig.set_size_inches(12,11)
plt.title('HeatMap Correlation Matrix', size = 20, color = 'Black', alpha = 0.9)
# Correlation
corr = df.corr()
sns.heatmap(corr, annot=True, fmt=".1f", cmap="ocean", center=0, ax=ax, alpha = 0.5)

An annotated heatmap is preferable to the basic one because the correlation coefficients can be read directly from the cells.

Now let’s customize the heatmap a bit by setting the size of the annotation text, the line width, and the line color.

# Add figure title and size
fig, ax = plt.subplots()
fig.set_size_inches(12,10)
plt.title('HeatMap Correlation Matrix', size = 20, color = 'Black', alpha = 0.9)
# Correlation
corr = df.corr()
sns.heatmap(corr, annot=True, fmt=".1F", cmap="plasma", alpha = 0.8, annot_kws={"size":12}, linewidths = 2.5, linecolor = 'yellow')
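For readers who want to run the annotated version end to end, here is a self-contained sketch. It assumes synthetic data (sample_df and its columns are hypothetical) and a non-interactive matplotlib backend, and it collects the annotation strings so the effect of annot and fmt can be inspected.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical data: b follows a, c moves against a, d is unrelated noise
rng = np.random.default_rng(seed=1)
base = rng.normal(size=100)
sample_df = pd.DataFrame({
    "a": base,
    "b": base + rng.normal(scale=0.5, size=100),
    "c": -base + rng.normal(scale=0.5, size=100),
    "d": rng.normal(size=100),
})
sample_corr = sample_df.corr()

# Annotated heatmap with custom text size and cell borders
fig, ax = plt.subplots(figsize=(8, 7))
sns.heatmap(sample_corr, annot=True, fmt=".1f", cmap="plasma",
            annot_kws={"size": 12}, linewidths=2.5, linecolor="yellow", ax=ax)
ax.set_title("HeatMap Correlation Matrix", size=20)

# One text object is created per annotated cell
annotation_texts = [t.get_text() for t in ax.texts]
print(annotation_texts)
```

In an interactive session the backend line can be dropped and plt.show() called instead; fig.savefig can be used to write the figure to a file.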

Annotated Correlation Heatmap with Specific Condition

Let’s say we only want to see the variable pairs whose correlation coefficient is at least 0.5.

Notice the use of corr >= 0.5 to select those pairs. Every cell below 0.5 becomes a missing value and is left blank on the heatmap.

fig, ax = plt.subplots()
fig.set_size_inches(12,11)
plt.title('HeatMap Correlation Matrix with Correlation > 0.5', size = 20, color = 'Black', alpha = 0.9)
# Correlation
corr = df.corr()
corr_modified = corr[corr>=0.5]
sns.heatmap(corr_modified, annot=True, fmt=".1f", cmap="Pastel1_r", center=0, ax=ax)
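To see what the boolean filter does to the matrix itself, here is a minimal sketch on a hand-written 3 x 3 correlation matrix (toy_corr and its values are made up). It also shows a variation that is not in the original post: filtering on the absolute value, which keeps strong negative correlations as well.

```python
import pandas as pd

# Hypothetical correlation matrix with one strong negative pair (a, c)
labels = ["a", "b", "c"]
toy_corr = pd.DataFrame(
    [[1.0, 0.7, -0.6],
     [0.7, 1.0, 0.2],
     [-0.6, 0.2, 1.0]],
    index=labels, columns=labels,
)

# Filter used in the post: cells below 0.5 become NaN
filtered = toy_corr[toy_corr >= 0.5]

# Variation (an assumption, not the author's code): keep strong
# correlations of either sign by comparing absolute values
filtered_abs = toy_corr[toy_corr.abs() >= 0.5]

print(filtered)
print(filtered_abs)
```

With corr >= 0.5 the pair (a, c) disappears even though -0.6 is a fairly strong relationship; the absolute-value version keeps it.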

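The heatmap now shows only the variable pairs with a correlation coefficient of at least 0.5. The same filtered heatmap can also be customized with a different colormap, larger annotation text, and lines between the cells.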

fig, ax = plt.subplots()
fig.set_size_inches(12,10)
plt.title('HeatMap Correlation Matrix with Correlation > 0.5', size = 20, color = 'Black', alpha = 0.9)
# Correlation
corr = df.corr()
corr_modified = corr[corr>=0.5]
sns.heatmap(corr_modified, annot=True, fmt=".1f", cmap="rainbow_r", annot_kws = {'size':16}, linewidths = 1.5, linecolor = 'pink')