Masked Arrays and Boolean Indexing in NumPy
Learn how to selectively access and manipulate data in NumPy arrays
Masked arrays and Boolean indexing are two NumPy features for working with only part of an array. Masked arrays hide missing or invalid values from computations without deleting them, and Boolean indexing selects the elements that satisfy a condition. In this tutorial, we will explore both features and how they can be used to manipulate arrays in NumPy.
Masked Arrays
A masked array is an array in which some elements have been masked, or hidden, from view. This allows you to work with arrays that have missing or invalid data without having to remove them from the array entirely. Masked arrays are created using the numpy.ma module.
Creating a Masked Array
To create a masked array, you can use masked_array() from the numpy.ma module. In its simplest form it takes two arguments: the original data and a mask. The mask is a Boolean array of the same shape as the data, where True marks an element as masked.
import numpy as np
import numpy.ma as ma
# Create a sample array
x = np.array([1, 2, 3, 4, 5])
# Create a mask for the array
mask = np.array([False, False, True, False, True])
# Create a masked array
mx = ma.masked_array(x, mask)
print(mx)
Output:
[1 2 -- 4 --]
In the above example, we created a masked array mx from the original array x and the mask mask. The elements with True values in the mask are masked and replaced with --.
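In practice the mask is often derived from the data rather than typed by hand. The following is a minimal sketch of three numpy.ma helpers that build the mask for you: masked_equal(), masked_invalid(), and masked_where(). The readings array and the -999 sentinel are made-up values for illustration only.

```python
import numpy as np
import numpy.ma as ma

# Hypothetical sensor readings; -999 is an assumed "missing" sentinel.
readings = np.array([12.5, -999.0, 13.1, np.nan, 12.9])

# Mask every element equal to the sentinel value.
no_sentinel = ma.masked_equal(readings, -999.0)

# Mask NaN and infinite values.
no_invalid = ma.masked_invalid(readings)

# Mask wherever an arbitrary Boolean condition is True
# (here: the sentinel or NaN).
cleaned = ma.masked_where((readings == -999.0) | np.isnan(readings), readings)

print(cleaned)
print(cleaned.count())  # number of unmasked elements
print(cleaned.mean())   # mean of the unmasked elements only
```

masked_where() is the most general of the three: any Boolean array with the same shape as the data can serve as the condition.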
Manipulating a Masked Array
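Once you have created a masked array, you can manipulate it much like a regular array. You can perform mathematical operations, use indexing and slicing, and access properties like the shape and size of the array. The key difference is that reductions such as mean() and sum() skip the masked elements, while shape and size still describe the full array, masked elements included.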
# Perform mathematical operations on a masked array
print(mx.mean())
print(mx.sum())
# Use indexing and slicing
print(mx[0])
print(mx[1:3])
# Access properties
print(mx.shape)
print(mx.size)
Output:
2.3333333333333335
7
1
[2 --]
(5,)
5
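When you need a plain ndarray back, or want to know how many values are actually valid, a few MaskedArray methods help. This is a small sketch that rebuilds the same mx as above and uses count(), compressed(), and filled(); the fill value 0 is an arbitrary choice for the example.

```python
import numpy as np
import numpy.ma as ma

mx = ma.masked_array([1, 2, 3, 4, 5],
                     mask=[False, False, True, False, True])

print(mx.count())       # number of unmasked elements: 3
print(mx.compressed())  # 1-D ndarray of the unmasked values: [1 2 4]
print(mx.filled(0))     # ndarray with masked slots replaced by 0: [1 2 0 4 0]

# Element-wise arithmetic keeps the mask in place.
doubled = mx * 2
print(doubled.mask)
```

Note that size (5) and count() (3) differ: the first is the length of the underlying array, the second is the number of values that take part in computations.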
Applying Functions to a Masked Array
You can also apply a function to a masked array using the apply_along_axis() function from the numpy.ma module. This function passes each 1-D slice along the given axis to your function. Because mx is one-dimensional, axis 0 is its only axis, so the function is called once with the whole array.
# Define a function to apply to the array
def myfunc(x):
    # count() is the number of unmasked elements; size would include masked ones
    return x.sum() / x.count()
# Apply the function along axis 0 of the masked array
result = ma.apply_along_axis(myfunc, 0, mx)
print(result)
Output:
2.3333333333333335
In the above example, we defined a function myfunc() that calculates the mean of the unmasked elements of an array. We then applied it along axis 0 of the masked array mx using the apply_along_axis() function. Since mx has a single axis, the result is one value, 7 / 3, which matches mx.mean(). Dividing by x.size instead would count the two masked elements and give 7 / 5 = 1.4.
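apply_along_axis() becomes more useful on a two-dimensional masked array, where the function is called once per row or per column. The following is a minimal sketch under that assumption; grid and valid_mean are illustrative names, not part of the original example.

```python
import numpy as np
import numpy.ma as ma

data = np.array([[1.0, 2.0, 3.0],
                 [4.0, 5.0, 6.0]])
mask = np.array([[False, True, False],
                 [False, False, True]])
grid = ma.masked_array(data, mask=mask)

def valid_mean(row):
    # Mean over the unmasked elements of one 1-D slice.
    return row.sum() / row.count()

# axis=1 hands each row to valid_mean in turn.
row_means = ma.apply_along_axis(valid_mean, 1, grid)

# The built-in reduction gives the same numbers without a Python-level loop.
builtin_means = grid.mean(axis=1)

print(row_means)
print(builtin_means)
```

For simple reductions like this one, the built-in methods with an axis argument are usually the better choice; apply_along_axis() is mainly worth reaching for when no vectorized equivalent exists.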
Boolean Indexing
Boolean indexing is a powerful feature in NumPy that allows you to select elements from an array based on a Boolean condition. This allows you to extract only the elements of an array that meet a certain condition, making it easy to perform operations on specific subsets of data.
Creating a Boolean Mask
To create a Boolean mask, you can use a comparison operator to compare an array to a scalar value or another array. The result of the comparison is a Boolean array with the same shape as the original array.
import numpy as np
# Create a sample array
x = np.array([1, 2, 3, 4, 5])
# Create a Boolean mask
mask = x > 2
print(mask)
Output:
[False False True True True]
In the above example, we created a Boolean mask by comparing the array x to the scalar value 2. The resulting mask is a Boolean array with True values where the original array is greater than 2, and False values otherwise.
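The comparison can also be made element by element against another array of the same shape, and the resulting mask can be summarized before it is used for indexing. A short sketch, with a and b as made-up sample arrays:

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5])
b = np.array([5, 4, 3, 2, 1])

# Element-wise comparison of two arrays of the same shape.
mask = a >= b
print(mask)
print(mask.dtype)

# Summaries of the mask.
print(np.count_nonzero(mask))  # how many elements satisfy the condition
print(mask.any())              # True if at least one element matches
print(mask.all())              # True only if every element matches
```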
Indexing with a Boolean Mask
Once you have created a Boolean mask, you can use it to index into an array and select only the elements that meet the condition. To do this, simply pass the Boolean mask as an index to the array.
# Index the array with the Boolean mask
result = x[mask]
print(result)
Output:
[3 4 5]
In the above example, we used the Boolean mask to index into the array x and select only the elements that are greater than 2. The resulting array contains only the elements that meet the condition.
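Two properties of Boolean indexing are easy to miss: the selection is a copy rather than a view, and on a multi-dimensional array the result is always one-dimensional. The sketch below illustrates both; selected and grid are names introduced here for the example.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5])

# Boolean indexing returns a new array, so editing it leaves x alone.
selected = x[x > 2]
selected[0] = 99
print(selected)
print(x)

# On a 2-D array the matching elements come back as a flat 1-D array.
grid = np.array([[1, 6],
                 [7, 2]])
print(grid[grid > 5])
```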
Combining Boolean Masks
You can also combine Boolean masks using logical operators to create more complex conditions. For example, you can use the & operator to create a mask that selects only elements that are both greater than 2 and less than 5.
# Create a complex Boolean mask
mask = (x > 2) & (x < 5)
print(mask)
Output:
[False False True True False]
In the above example, we used the & operator to create a complex Boolean mask that selects only elements that are greater than 2 and less than 5. The resulting mask is a Boolean array with True values where the original array meets both conditions, and False values otherwise.
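The other element-wise logical operators work the same way: | for "or" and ~ for "not". The parentheses around each comparison are required, because & and | bind more tightly than comparison operators in Python, and the keywords and / or raise a ValueError on arrays with more than one element. A brief sketch with illustrative variable names:

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5])

# Either condition may hold.
outside = (x < 2) | (x > 4)

# Invert a condition.
not_three = ~(x == 3)

# Function form, equivalent to (x > 2) & (x < 5).
inside = np.logical_and(x > 2, x < 5)

print(outside)
print(not_three)
print(inside)
```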
Updating Elements with a Boolean Mask
You can also use Boolean indexing to update elements of an array that meet a certain condition. To do this, simply use the Boolean mask as an index and assign a new value to the selected elements.
# Update elements that meet the condition
x[mask] = 0
print(x)
Output:
[1 2 0 0 5]
In the above example, we used the Boolean mask to select only the elements of the array x that meet the condition. We then assigned a new value of 0 to these elements, effectively updating them in the original array.
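Assignment through a mask modifies the array in place. When the original should be preserved, np.where() builds a new array instead, and augmented assignment on a copy allows updates that depend on the current values. A minimal sketch, with replaced and scaled as names introduced for the example:

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5])
mask = (x > 2) & (x < 5)

# Non-mutating alternative: take 0 where mask is True, x elsewhere.
replaced = np.where(mask, 0, x)

# Update the selected elements based on their current values,
# working on a copy so that x stays unchanged.
scaled = x.copy()
scaled[mask] *= 10

print(replaced)
print(scaled)
print(x)
```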
