40 Most Insanely Usable Methods in Python

Data cleaning and wrangling in data science and machine learning


This article walks you through the most useful Python string methods for data wrangling in data analysis, data science, and machine learning. The goal is to make you comfortable with these methods so that you can rely on them throughout your career.

Topics to be covered:

1. String Methods    15. isdecimal()       29. maketrans()
2. capitalize()      16. isdigit()         30. partition()
3. casefold()        17. isidentifier()    31. replace()
4. center()          18. islower()         32. rfind()
5. count()           19. isnumeric()       33. rindex()
6. encode()          20. isprintable()     34. rjust()
7. endswith()        21. zfill()           35. rpartition()
8. expandtabs()      22. isspace()         36. rsplit()
9. find()            23. istitle()         37. rstrip()
10. format()         24. isupper()         38. splitlines()
11. format_map()     25. join()            39. startswith()
12. index()          26. ljust()           40. upper()
13. isalnum()        27. lower()           
14. isalpha()        28. lstrip()

String Methods

String methods are among the most useful tools for data wrangling in any application that handles text data.

2. capitalize()

This method capitalizes the first character of a word or sentence and converts the rest of the letters to lower case.

#Example of word and sentence

word = "aMIT"
sentence = "hEllo WOrLD"

word.capitalize()

#output:
'Amit'

#print() shows the string without the quotes that the interactive prompt displays

w = word.capitalize()
print(w)

#output:
Amit

#for the sentence

s = sentence.capitalize()
print(s)

#output:
Hello world

3. casefold()

This method converts all the letters in a word or sentence to lower case. It is more aggressive than lower(): it also maps special letters from other languages to a common lower-case form, which makes it suitable for comparing strings regardless of case.

word = "aMIT"

word.casefold()

#output:
'amit'
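
As a small sketch of the difference from lower(), the snippet below uses the German word "Straße" (an example chosen here, not from the original data). The letter 'ß' stays as it is under lower(), while casefold() maps it to 'ss'.

```python
# Compare lower() and casefold() on the German sharp s (illustrative word).
word = "Straße"

lowered = word.lower()      # 'ß' has no separate lower-case form, so it is kept
folded = word.casefold()    # casefold() maps 'ß' to 'ss'

print(folded)

# Caseless comparison of two spellings of the same word
same = "STRASSE".casefold() == word.casefold()
print(same)

#output:
#strasse
#True
```

This is why casefold() is usually the safer choice when matching user-entered text against reference values.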

4. center()

This method centers the word or sentence in a string of the given width by padding both sides with spaces or another fill character.

word = "aMIT"
space_padded = word.center(15)

print(space_padded)

#output:
      aMIT

#With a custom fill character
word = "aMIT"
with_padding = word.center(15, '*')

print(with_padding)

#output:
******aMIT*****

5. count()

It counts the non-overlapping occurrences of a letter, word, or phrase in a string.

sentence = "Happy Happy Happy Happy"
letter_count = sentence.count("a")

print("Letter 'a' occurred:", letter_count, "Times")

#output:
Letter 'a' occurred: 4 Times

#count occurrences between a start and an end position

sentence = "Happy Happy Happy Happy"
letter_count = sentence.count("a", 0, 15)

print("Letter 'a' occurred between positions 0 and 15:", letter_count, "Times")

#output:
Letter 'a' occurred between positions 0 and 15: 3 Times

6. encode()

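It encodes a string into bytes using the given encoding (UTF-8 by default). Encoding is needed to store or transmit text; it does not make the message secure.

word = " Άmit "
encoded_word = word.encode()
print(encoded_word)

#output (assuming Ά is the character U+0386, which UTF-8 stores as two bytes):
b' \xce\x86mit '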
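
A minimal sketch of a round trip, using the illustrative word "café": the bytes produced by encode() can be turned back into text with decode(), and the errors parameter controls what happens when a character does not fit the target encoding.

```python
text = "café"

# Encode to UTF-8 bytes and decode back to the same string
utf8_bytes = text.encode("utf-8")
round_trip = utf8_bytes.decode("utf-8")

# 'é' does not exist in ASCII, so we must say how to handle it
ascii_ignore = text.encode("ascii", errors="ignore")    # drop the character
ascii_replace = text.encode("ascii", errors="replace")  # substitute '?'

print(utf8_bytes)
print(ascii_ignore)
print(ascii_replace)

#output:
#b'caf\xc3\xa9'
#b'caf'
#b'caf?'
```

Without the errors argument, text.encode("ascii") raises a UnicodeEncodeError for this word.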

7. endswith()

It returns True if the string ends with the given suffix (a letter or a word), otherwise False.

sentence = "Happy Happy Happy Happy"
result = sentence.endswith("py")
print(result)

#output:
True

#when the suffix does not match
sentence = "Happy Happy Happy Happy"
result = sentence.endswith("it")
print(result)

#output:
False
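
endswith() also accepts a tuple of suffixes, which is handy for filtering file names. The file names below are made up for illustration.

```python
# Hypothetical file names, used only to demonstrate the tuple form
files = ["sales.csv", "notes.txt", "users.tsv"]

# True if the name ends with any suffix in the tuple
tabular = [name for name in files if name.endswith((".csv", ".tsv"))]
print(tabular)

#output:
#['sales.csv', 'users.tsv']
```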

8. expandtabs()

This method replaces each ‘\t’ character with spaces up to the next tab stop. By default the tab stops are 8 columns apart.

sentence = 'happy\thappy\thappy'
r = sentence.expandtabs()

print(r)

#output:
happy   happy   happy
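
The tab size can be changed with the optional tabsize argument. A short sketch with an illustrative string and a tab stop every 4 columns:

```python
row = "a\tbc\td"

# Each tab is filled with spaces up to the next multiple of 4
narrow = row.expandtabs(4)
print(narrow)

#output:
#a   bc  d
```

Note that the number of spaces differs per tab: 'a' needs 3 spaces to reach column 4, while 'bc' needs only 2 to reach column 8.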

9. find()

It finds the first occurrence of a letter, word, or sub-string in the sentence and returns its starting index. If the sub-string is not found, it returns -1.

sentence = "Happy Happy Happy Happy"
a = sentence.find("pp")
print(a)

#output:
2

10. format()

This method is very useful for building report and log messages. Values can be passed by position or by keyword. With empty curly brackets, the first value goes into the first pair of brackets, the second into the second pair, and so on.

print("My name is {}, I live in {}.".format("Amit", "Delhi"))

#output:
My name is Amit, I live in Delhi.
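
Beyond empty brackets, placeholders can refer to arguments by index or by name, and can carry a format specification. A small sketch with made-up values:

```python
# Refer to positional arguments by index (order in the text can differ)
by_index = "{1} lives in {0}.".format("Delhi", "Amit")

# Refer to keyword arguments by name, rounding the score to 1 decimal place
by_name = "{name} scored {score:.1f}".format(name="Amit", score=91.256)

# Use a thousands separator for large numbers
with_commas = "{:,}".format(1234567)

print(by_index)
print(by_name)
print(with_commas)

#output:
#Amit lives in Delhi.
#Amit scored 91.3
#1,234,567
```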

11. format_map()

It formats a string with the values of a dictionary (or any other mapping). The dictionary is passed directly instead of being unpacked with **.

#normal format method
dict1 = {'a':'Amit','b':'Delhi'}
print("Hello {a}, I live in {b}.".format(**dict1))

#output:
Hello Amit, I live in Delhi.

#With format_map() method
dict1 = {'a':'Amit','b':'Delhi'}
print("Hello {a}, I live in {b}.".format_map(dict1))

#output:
Hello Amit, I live in Delhi.
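
Because format_map() uses the mapping as it is, a dict subclass can decide what to do with missing keys. The sketch below uses a hypothetical helper class, KeepMissing, that leaves unknown placeholders in place instead of raising a KeyError.

```python
class KeepMissing(dict):
    """Hypothetical helper: returns the placeholder itself for unknown keys."""

    def __missing__(self, key):
        return "{" + key + "}"


template = "Hello {a}, I live in {b}."

# Only 'a' is known; 'b' is left untouched for a later formatting step
partial = template.format_map(KeepMissing(a="Amit"))
print(partial)

#output:
#Hello Amit, I live in {b}.
```

The same template with format(**{'a': 'Amit'}) would raise a KeyError for 'b'.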

12. index()

It returns the index of the first occurrence of a word or letter in the string. Unlike find(), it raises a ValueError if the sub-string is not found.

sentence = "Happy Happy Happy Happy"
print(sentence.index('pp'))

#output:
2
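
The practical difference between find() and index() shows up when the sub-string is missing. A minimal sketch:

```python
sentence = "Happy Happy Happy Happy"

# find() signals "not found" with -1
find_result = sentence.find("sad")

# index() signals "not found" with an exception
try:
    sentence.index("sad")
    index_raised = False
except ValueError:
    index_raised = True

print(find_result)
print(index_raised)

#output:
#-1
#True
```

find() is convenient for quick checks, while index() is useful when a missing value should stop the program.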

13. isalnum()

This method returns a boolean value. If all the characters in the word are either letters or numbers, it returns True, otherwise False.

word = "aMIT235"
print(word.isalnum())

#output:
True

#If there is any space it will return False.
word = "aMIT 235"
print(word.isalnum())

#output:
False

14. isalpha()

This method returns True if all the characters are alphabetic, otherwise False.

word = "aMIT"
print(word.isalpha())

#output:
True

word = "aMIT235"
print(word.isalpha())

#output:
False

15. isdecimal()

This method returns True if all the characters are decimal characters (the digits 0 to 9 and their equivalents in other scripts), otherwise False.

word = "235"
print(word.isdecimal())

#output:
True

#Not all the characters are decimal
word = "Amit235"
print(word.isdecimal())

#output:
False

16. isdigit()

This method returns True if all the characters are digits, otherwise False. Besides decimal characters, digits also include characters such as superscript digits.

word = "235"
print(word.isdigit())

#output:
True

#When not all the characters are digits
word = "Amit235"
print(word.isdigit())

#output:
False

17. isidentifier()

This method returns True if the word is a valid Python identifier (letters, digits, and underscores, not starting with a digit), otherwise False.

word = "Amit235"
print(word.isidentifier())

#output:
True

word = "Amit 235"
print(word.isidentifier())

#output:
False

18. islower()

This method checks whether all the letters in the sentence or word are in lower case. Digits and spaces are ignored.

word = "amit 235"
print(word.islower())

#output:
True

word = "Amit 235"
print(word.islower())

#output:
False

19. isnumeric()

This method returns True if all the characters are numeric, otherwise False. Numeric characters include digits as well as characters such as fractions.

word = "235"
print(word.isnumeric())

#output:
True

word = "A235"
print(word.isnumeric())

#output:
False
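
isdecimal(), isdigit(), and isnumeric() agree on plain digits, so a short comparison on special characters may help. The sketch below checks a superscript two and a one-half fraction, two characters picked here for illustration.

```python
# Label -> sample string (the labels are only for printing)
samples = {
    "plain digits": "235",
    "superscript two": "\u00b2",   # the character ²
    "one half": "\u00bd",          # the character ½
}

results = {}
for label, text in samples.items():
    results[label] = (text.isdecimal(), text.isdigit(), text.isnumeric())
    print(label, results[label])

#output (isdecimal, isdigit, isnumeric):
#plain digits (True, True, True)
#superscript two (False, True, True)
#one half (False, False, True)
```

For validating input before calling int(), isdecimal() is the strictest of the three.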

20. isprintable()

This method checks whether all the characters in the string are printable.

sentence = "Happy Happy Happy Happy"
print(sentence.isprintable())

#output:
True

#The newline character at the end is not printable
sentence = "Happy Happy Happy Happy\n"
print(sentence.isprintable())

#output:
False

21. zfill()

It pads the string on the left side with zeros i.e. ‘0’ until it reaches the given width. In the example below the word has 4 characters and the width is 15, so 11 zeros are added on the left side.

word = "A235"
print(word.zfill(15))

#output:
00000000000A235
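
zfill() is often used to give ID columns a fixed width. A short sketch with made-up values, which also shows that a leading sign stays in front of the zeros and that longer strings are returned unchanged:

```python
# Hypothetical ID values read as strings
ids = ["7", "42", "-42", "12345"]

padded = [value.zfill(4) for value in ids]
print(padded)

#output:
#['0007', '0042', '-042', '12345']
```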

22. isspace()

This method returns True if all the characters in the word are white spaces, otherwise False. An empty string also returns False.

word = "A235"
print(word.isspace())

#output:
False

word = ""
print(word.isspace())

#output:
False

word = "       "
print(word.isspace())

#output:
True

23. istitle()

This method returns True if every word in the sentence starts with an upper case letter and the rest of the letters in the word are lower case, otherwise False.

sentence = "Happy Happy Happy Happy"
print(sentence.istitle())

#output:
True

sentence = "Happy HAppy Happy Happy"
print(sentence.istitle())

#output:
False

24. isupper()

This method returns True if all the letters are in upper case and False if any letter is in lower case.

sentence = "HAPPY HAPPY"
print(sentence.isupper())

#output:
True

sentence = "Happy Happy"
print(sentence.isupper())

#output:
False

25. join()

This method uses the string as a separator to join the items of an iterable into one string.

items = ['Happy', 'Happy', 'Happy', 'Happy']
print(' '.join(items))

#output:
Happy Happy Happy Happy

#Using a different separator
items = ['Happy', 'Happy', 'Happy', 'Happy']
print('--'.join(items))

#output:
Happy--Happy--Happy--Happy
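
join() only accepts strings, so mixed data has to be converted first. A minimal sketch, assuming a made-up row of values:

```python
# Hypothetical record with mixed types
row = ["Amit", 35, 72.5, None]

# ",".join(row) would raise a TypeError, so convert each item to str first
line = ",".join(str(item) for item in row)
print(line)

#output:
#Amit,35,72.5,None
```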

26. ljust()

It left-aligns the word and pads the right side with spaces (or a given fill character) up to the given width.

word = 'Happy'
width = 10

print(word.ljust(width, '@'))

#output:
Happy@@@@@

27. lower()

This method is used to convert all the letters to lower case.

sentence = "HAPPY HAPPY"
print(sentence.lower())

#output:
happy happy

28. lstrip()

It strips spaces, or any of the given characters, from the left side and stops at the first character that is not in the given set.

#Removing the spaces on left side
word = '    Happy'
print(word.lstrip())

#output:
Happy

#Removing characters on left side
word = '...,,,bgrrrr..,,Happy'
a = word.lstrip(",.gbr")
print(a)

#output:
Happy
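
The argument of lstrip() is a set of characters, not a prefix, which can remove more than expected. The sketch below uses a made-up URL to show the pitfall and one possible way around it with startswith() and slicing.

```python
# Hypothetical URL chosen so the pitfall is visible
url = "www.wonder.com"

# Strips every leading 'w' and '.', including the 'w' of 'wonder'
stripped = url.lstrip("w.")

# Remove the exact prefix only, if it is present
prefix = "www."
without_prefix = url[len(prefix):] if url.startswith(prefix) else url

print(stripped)
print(without_prefix)

#output:
#onder.com
#wonder.com
```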

29. maketrans()

It builds a translation table that maps characters to their replacements, for use with the translate() method. When two string arguments are given, they must have equal length.

#With one argument, it has to be a dictionary
dict1 = {"A": "15", "M": "46", "I": "79", "T": "84"}
word = "AMIT"
print(word.maketrans(dict1))

#output:
{65: '15', 77: '46', 73: '79', 84: '84'}
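
The table becomes useful once it is passed to translate(). A short sketch: the first table reuses the dictionary from above, and the second uses the two-string form with an optional third argument listing characters to delete (the sample string "A-M-I-T" is made up).

```python
# Dictionary form: each character is replaced by its mapped string
dict1 = {"A": "15", "M": "46", "I": "79", "T": "84"}
coded = "AMIT".translate(str.maketrans(dict1))

# Two-string form: 'A'->'a', 'M'->'m', ...; the third argument deletes '-'
table = str.maketrans("AMIT", "amit", "-")
cleaned = "A-M-I-T".translate(table)

print(coded)
print(cleaned)

#output:
#15467984
#amit
```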

30. partition()

This method splits the sentence into a tuple of three parts around the first occurrence of the sub-string: the words before the sub-string, the sub-string itself, and the words after the sub-string.

sentence = "Happy sad Everyone is sad Happy"
print(sentence.partition('is'))

#output:
('Happy sad Everyone ', 'is', ' sad Happy')

31. replace()

It replaces every occurrence of an existing word or letter in the string with the new word or letter and returns the result as a new string.

sentence = "Happy sad Everyone is sad Happy"
print(sentence.replace('Happy', 'Sad'))

#output:
Sad sad Everyone is sad Sad
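
replace() takes an optional third argument that limits how many occurrences are replaced, counted from the left. A minimal sketch on the same sentence:

```python
sentence = "Happy sad Everyone is sad Happy"

# Replace only the first occurrence of 'sad'
first_only = sentence.replace("sad", "glad", 1)
print(first_only)

# The original string is unchanged because strings are immutable
print(sentence)

#output:
#Happy glad Everyone is sad Happy
#Happy sad Everyone is sad Happy
```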

32. rfind()

It finds the last occurrence of the sub-string in the sentence, i.e. the one at the maximum index position. If the sub-string is not found, it returns -1.

sentence = "Happy sad Everyone is sad Happy"
result = sentence.rfind('Ha')
print("Sub-string 'Ha':", result)

#output:
Sub-string 'Ha': 26

The ‘Ha’ sub-string occurs 2 times in the sentence, so rfind() returns the index of the last occurrence.

33. rindex()

It is almost the same as the rfind() method, except that it raises a ValueError when the sub-string is not found.

sentence = "Happy sad Everyone is sad Happy"
result = sentence.rindex('Ha')
print("Sub-string 'Ha':", result)

#output:
Sub-string 'Ha': 26

34. rjust()

It right-aligns the word and pads the left side with spaces (or a given fill character) up to the given width.

word = 'Happy'
width = 10

print(word.rjust(width, '@'))

#output:
@@@@@Happy

35. rpartition()

This method splits the sentence into a tuple of three parts around the last occurrence of the sub-string: the part before it, the sub-string itself, and the part after it.

sentence = "Happy sad Everyone is sad Happy"
print(sentence.rpartition('is'))

#output:
('Happy sad Everyone ', 'is', ' sad Happy')
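
In the example above 'is' occurs only once, so partition() and rpartition() give the same result. The sketch below uses a made-up record with two separators to show where they differ, and what each returns when the separator is missing.

```python
# Hypothetical record with the separator appearing twice
record = "name=Amit=Delhi"

first = record.partition("=")    # splits at the first '='
last = record.rpartition("=")    # splits at the last '='

# When the separator is missing, the whole string lands at opposite ends
missing_first = "Amit".partition("=")
missing_last = "Amit".rpartition("=")

print(first)
print(last)
print(missing_first)
print(missing_last)

#output:
#('name', '=', 'Amit=Delhi')
#('name=Amit', '=', 'Delhi')
#('Amit', '', '')
#('', '', 'Amit')
```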

36. rsplit()

It splits the sentence into a list of words. If we pass the number of splits, at most that many splits are made, counted from the right side of the sentence.

sentence = "Happy Happy Everyone is Happy"

# splits at space
print(sentence.rsplit())

#output:
['Happy', 'Happy', 'Everyone', 'is', 'Happy']

#With number of splits
sentence = "Happy Happy Everyone is Happy"
print(sentence.rsplit(' ', 3))

#output:
['Happy Happy', 'Everyone', 'is', 'Happy']
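
The direction only matters when the number of splits is limited. A minimal sketch comparing split() and rsplit() on a made-up file name:

```python
# Hypothetical file name with a date prefix
name = "2024-01-15-report"

from_left = name.split("-", 1)     # one split, counted from the left
from_right = name.rsplit("-", 1)   # one split, counted from the right

print(from_left)
print(from_right)

#output:
#['2024', '01-15-report']
#['2024-01-15', 'report']
```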

37. rstrip()

It strips spaces, or any of the given characters, from the right side and stops at the first character that is not in the given set.

word = 'Happy     '
print(word.rstrip())

#output:
Happy

word = 'Happy.....'
print(word.rstrip("."))

#output:
Happy

38. splitlines()

This method splits the text into a list of lines at line boundary characters i.e. \n, \r, \f, \v, and a few more.

sentence = "Happy\nHappy\rEveryone\fis\vHappy"
print(sentence.splitlines())

#output:
['Happy', 'Happy', 'Everyone', 'is', 'Happy']
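
Two details are worth a quick sketch: splitlines() does not produce an empty item for a trailing newline the way split("\n") does, and the optional keepends argument keeps the line endings. The sample text is made up.

```python
text = "Happy\nEveryone\n"

lines = text.splitlines()                   # no empty item at the end
with_split = text.split("\n")               # trailing '' because of the last \n
with_ends = text.splitlines(keepends=True)  # line endings are kept

print(lines)
print(with_split)
print(with_ends)

#output:
#['Happy', 'Everyone']
#['Happy', 'Everyone', '']
#['Happy\n', 'Everyone\n']
```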

39. startswith()

This method returns True if the sentence or word starts with the given sub-string, otherwise False.

sentence = "Happy sad Everyone is sad Happy"
result = sentence.startswith("Happy")
print(result)

#output:
True

40. upper()

It converts all the lower case letters to upper case letters.

sentence = "Happy sad Everyone is sad Happy"
print(sentence.upper())

#output:
HAPPY SAD EVERYONE IS SAD HAPPY

I hope you liked the article. Reach me on LinkedIn and Twitter.