The PyCoach


loc vs iloc in Pandas. Here’s The Difference.

How to select rows and columns in Pandas with loc and iloc

Photo by Neil and Zulma Scott on Unsplash

When it comes to selecting data in Pandas, there are different alternatives. One of the most popular is using loc and iloc, but what are the differences between them?

I had the same question when I started learning Pandas. In this article, I’ll show you the main differences between selecting data with loc and iloc, with examples to make them clear.

By the end of this article, you’ll know how to select single values, multiple rows, and columns using both loc and iloc.

Differences between loc and iloc

The main difference between loc and iloc is that loc is label-based (you specify rows and columns by their labels), while iloc is integer-position-based (you specify rows and columns by their integer positions, which start at 0).
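As a quick illustration of this distinction, here is a minimal sketch using a made-up dataframe. The name `scores`, its column, and its values are assumptions for illustration, not data from this article. The row labels are deliberately integers that differ from the row positions, so the two indexers cannot be confused.

```python
import pandas as pd

# Hypothetical data: the row labels 10, 20, 30 differ from the positions 0, 1, 2.
scores = pd.DataFrame({"points": [5, 7, 9]}, index=[10, 20, 30])

by_label = scores.loc[20, "points"]   # row labelled 20, column labelled "points"
by_position = scores.iloc[1, 0]       # second row, first column

print(by_label, by_position)
```

Here `loc[20, "points"]` looks up the row whose label is 20, while `iloc[1, 0]` looks up the second row and the first column. Both reach the same cell.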

The practical examples below make this difference easier to see.

Consider the dataframe with the following row labels and row positions:

Image by author

Now let’s see how we’d select a random element using loc and iloc:

Image by author

In the examples above, loc and iloc return the same output except when slicing: loc includes the last element of the slice, while iloc excludes it.

Selecting elements from a Dataframe using loc and iloc

To see the differences between loc and iloc in detail, let’s create a dataframe with basic information about top football players.

Here’s the code to create this little dataframe:
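A minimal sketch of such a dataframe follows. The column names, the row for L. Messi, and Cristiano Ronaldo's height match the outputs shown later in this article. The third player and all remaining values are illustrative assumptions.

```python
import pandas as pd

# Columns in the order used throughout the article: age, height_cm, nationality, club.
# Values that do not appear elsewhere in the article are illustrative placeholders.
data = {
    "age": [32, 34, 29],
    "height_cm": [170, 187, 175],
    "nationality": ["Argentina", "Portugal", "Brazil"],
    "club": ["Paris Saint-Germain", "Manchester United", "Paris Saint-Germain"],
}

# The player names become the row labels (the index) that loc relies on.
df = pd.DataFrame(data, index=["L. Messi", "Cristiano Ronaldo", "Neymar Jr"])

print(df)
```

Passing the names through `index=` is what lets loc address rows by player name. Without it, the row labels would default to 0, 1, 2 and coincide with the positions used by iloc.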

And here’s the dataframe you should obtain after running the code:

Image by author

Now let’s see how we can select elements from a dataframe with a single value, list, and slicing.

Selecting data with a single value

We can locate elements with loc and iloc by adding a single value as input. Here’s the syntax to follow:

  • loc[row_label, column_label]
  • iloc[row_position, column_position]

Say we want to get the height of Lionel Messi.

# get the height of L.Messi
# loc
>>> df.loc['L. Messi', 'height_cm']
# iloc
>>> df.iloc[0, 1]
170

As you can see, loc needs “L. Messi” and “height_cm” because they are the row and column labels, while iloc needs 0 and 1 because they are the row and column positions. Both return the same output, 170, which is the height of Lionel Messi.

Now let’s get the height of Cristiano Ronaldo.

# get the height of Cristiano Ronaldo
# loc
>>> df.loc['Cristiano Ronaldo', 'height_cm']
# iloc
>>> df.iloc[1, 1]
187

To get all the data in a specific row or column, use : in both loc and iloc.

# get all the data about L.Messi
# loc
>>> df.loc['L. Messi', :]
# iloc
>>> df.iloc[0, :]
age                             32
height_cm                      170
nationality              Argentina
club           Paris Saint-Germain
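The same colon works in the row slot to pull out a whole column. The sketch below assumes a small stand-in dataframe (only L. Messi's row and Cristiano Ronaldo's height come from this article, the rest is made up) and selects the height column both ways.

```python
import pandas as pd

# Stand-in data; values not shown in the article are assumptions.
df = pd.DataFrame(
    {
        "age": [32, 34, 29],
        "height_cm": [170, 187, 175],
        "nationality": ["Argentina", "Portugal", "Brazil"],
        "club": ["Paris Saint-Germain", "Manchester United", "Paris Saint-Germain"],
    },
    index=["L. Messi", "Cristiano Ronaldo", "Neymar Jr"],
)

# ":" in the row slot means "every row".
heights_by_label = df.loc[:, "height_cm"]   # column chosen by label
heights_by_position = df.iloc[:, 1]         # column chosen by position

print(heights_by_label.equals(heights_by_position))
```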

Selecting multiple rows/columns with a list

We can use a list of labels/positions to select multiple rows and columns with loc and iloc respectively.

# get all data about L.Messi and Cristiano Ronaldo
# loc
>>> df.loc[['L. Messi', 'Cristiano Ronaldo']]
# iloc
>>> df.iloc[[0, 1]]

Here’s the output of our selection:

Image by author

Now let’s get only the height of these two players by adding the ‘height_cm’ column label to loc and the column position 1 to iloc.

# get the height of L.Messi and Cristiano Ronaldo
# loc
>>> df.loc[['L. Messi', 'Cristiano Ronaldo'], 'height_cm']
# iloc
>>> df.iloc[[0, 1], 1]
L. Messi             170
Cristiano Ronaldo    187
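One detail worth checking for yourself: a list, even a one-element list, keeps that axis in the result, whereas a single value drops it. The sketch below assumes a stand-in dataframe in which only L. Messi's row and Cristiano Ronaldo's height come from this article. Every other value is a placeholder.

```python
import pandas as pd

# Stand-in data; values not shown in the article are assumptions.
df = pd.DataFrame(
    {
        "age": [32, 34, 29],
        "height_cm": [170, 187, 175],
        "nationality": ["Argentina", "Portugal", "Brazil"],
        "club": ["Paris Saint-Germain", "Manchester United", "Paris Saint-Germain"],
    },
    index=["L. Messi", "Cristiano Ronaldo", "Neymar Jr"],
)

row_as_series = df.loc["L. Messi"]     # single label: the row comes back as a Series
row_as_frame = df.loc[["L. Messi"]]    # one-element list: a DataFrame with one row
heights = df.iloc[[0, 1], 1]           # list of rows, single column: a Series

print(type(row_as_series).__name__, type(row_as_frame).__name__, type(heights).__name__)
```

Wrapping a label or position in a list is therefore a simple way to keep a DataFrame when later code expects one.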

Selecting multiple rows/columns with a slice

We can also select multiple rows and columns by passing a slice to loc or iloc.

Let’s build a slice that gets the columns ‘age’, ‘height_cm’, and ‘nationality’.

# slice column labels: from age to nationality
# loc
>>> players = ['L. Messi', 'Cristiano Ronaldo']
>>> df.loc[players, 'age':'nationality']
# iloc
>>> players = [0, 1]
>>> df.iloc[players, 0:3] # positions 0 to 2; the stop position 3 (nationality + 1) is excluded

Unlike regular Python slicing, a loc slice includes its end label, so “nationality” is part of the result. An iloc slice follows the usual Python convention and excludes its stop position, so 0:3 selects positions 0, 1, and 2.

Here’s the output we get for both loc and iloc:

Image by author
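The inclusive and exclusive endpoints can be verified directly. This sketch assumes a stand-in dataframe (values other than L. Messi's row and Cristiano Ronaldo's height are made up) and slices both rows and columns, once by label and once by position.

```python
import pandas as pd

# Stand-in data; values not shown in the article are assumptions.
df = pd.DataFrame(
    {
        "age": [32, 34, 29],
        "height_cm": [170, 187, 175],
        "nationality": ["Argentina", "Portugal", "Brazil"],
        "club": ["Paris Saint-Germain", "Manchester United", "Paris Saint-Germain"],
    },
    index=["L. Messi", "Cristiano Ronaldo", "Neymar Jr"],
)

# loc: both ends of each slice are included.
by_label = df.loc["L. Messi":"Cristiano Ronaldo", "age":"nationality"]

# iloc: the stop position is excluded, so 0:2 is rows 0-1 and 0:3 is columns 0-2.
by_position = df.iloc[0:2, 0:3]

print(by_label.equals(by_position))
print(list(by_label.columns))
```

To translate a loc slice into an iloc slice, add one to the position of the end label.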

Selecting with conditions

Whenever you want to select elements based on a condition, keep in mind that loc accepts a boolean Series directly, while iloc does not: it needs a plain boolean list or array. We therefore use the list() function to convert the boolean Series before passing it to iloc.

# one condition: select player with height above 180cm
# loc
columns = ['age', 'height_cm', 'club']
df.loc[df['height_cm']>180, columns]
# iloc
columns = [0,1,3]
df.iloc[list(df['height_cm']>180), columns]

Here’s the output we get for both loc and iloc:

Image by author
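As a sketch of the conversion step, the example below builds the mask once and passes it to iloc in two forms: as a list, as in the code above, and as a NumPy array via `Series.to_numpy()`, which is an alternative not used in the original code. The dataframe is a stand-in, and values other than L. Messi's row and Cristiano Ronaldo's height are assumptions.

```python
import pandas as pd

# Stand-in data; values not shown in the article are assumptions.
df = pd.DataFrame(
    {
        "age": [32, 34, 29],
        "height_cm": [170, 187, 175],
        "nationality": ["Argentina", "Portugal", "Brazil"],
        "club": ["Paris Saint-Germain", "Manchester United", "Paris Saint-Germain"],
    },
    index=["L. Messi", "Cristiano Ronaldo", "Neymar Jr"],
)

mask = df["height_cm"] > 180   # boolean Series indexed by player name

tall_loc = df.loc[mask, ["age", "height_cm", "club"]]    # loc takes the Series as is
tall_iloc_list = df.iloc[list(mask), [0, 1, 3]]          # plain list of booleans
tall_iloc_array = df.iloc[mask.to_numpy(), [0, 1, 3]]    # NumPy boolean array

print(tall_loc.equals(tall_iloc_list), tall_loc.equals(tall_iloc_array))
```

Both conversions strip the index from the mask, leaving the purely positional sequence of booleans that iloc expects.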

The same rule applies when you want to combine multiple conditions. Say we want to obtain players taller than 170cm who play for PSG.

# multiple conditions: select players with height above 170cm who play for PSG
# loc
df.loc[(df['height_cm']>170) & (df['club']=='Paris Saint-Germain'), :]
# iloc
df.iloc[list((df['height_cm']>170) & (df['club']=='Paris Saint-Germain')), :]

Here’s the output we get for both loc and iloc:

Image by author
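Combined conditions are often easier to read when each one gets a name. The sketch below mirrors the code above with its 170cm threshold, using a stand-in dataframe in which everything except L. Messi's row and Cristiano Ronaldo's height is an assumption. The variable names `is_tall` and `plays_for_psg` are hypothetical.

```python
import pandas as pd

# Stand-in data; values not shown in the article are assumptions.
df = pd.DataFrame(
    {
        "age": [32, 34, 29],
        "height_cm": [170, 187, 175],
        "nationality": ["Argentina", "Portugal", "Brazil"],
        "club": ["Paris Saint-Germain", "Manchester United", "Paris Saint-Germain"],
    },
    index=["L. Messi", "Cristiano Ronaldo", "Neymar Jr"],
)

is_tall = df["height_cm"] > 170
plays_for_psg = df["club"] == "Paris Saint-Germain"
mask = is_tall & plays_for_psg   # element-wise AND of the two boolean Series

selected_loc = df.loc[mask, :]
selected_iloc = df.iloc[list(mask), :]   # iloc needs the mask as a plain list

print(selected_loc.index.tolist())
```

When the conditions are written inline instead, each comparison needs its own parentheses, because `&` binds more tightly than `>` and `==` in Python.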

That’s it! Now you’re ready to select elements from dataframes using loc and iloc. You can find all the code written in this article on my GitHub.
