Liu Zuo Lin

17 Regex Questions That Are Harder Than They Seem

# Are you up for the challenge?

Try to solve these ONLY using regex.

1) Replacing Smileys

You are given a string containing some HTML text. In the string, each icon is represented by <i class="icon-name"></i>. Write a regex that replaces each icon with a simplified version ::icon-name.

text = 'hello <i class="smile"></i> my name is tim <i class="laugh"></i>'

import re
new = re.sub('TO DO', 'TO DO', text)

# new = 'hello ::smile my name is tim ::laugh'
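
One possible approach (spoiler ahead), offered as a sketch rather than the official answer: capture the class name in a group and refer back to it in the replacement. It assumes every icon looks exactly like <i class="..."></i>, with double quotes and no extra attributes.

```python
import re

text = 'hello <i class="smile"></i> my name is tim <i class="laugh"></i>'

# ([^"]+) captures the class name; \1 re-inserts it after the '::' prefix
new = re.sub(r'<i class="([^"]+)"></i>', r'::\1', text)

print(new)
```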

2) Words containing repeating letters

Write a regex to extract all words containing at least 1 repeat letter.

words = 'apple orange pear pineapple durian'

import re
output = re.findall('TO DO', words)

# output = ['apple', 'pineapple']
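
A possible solution sketch (spoiler ahead), not necessarily the intended answer: capture one letter and require the same letter again later in the word via a backreference. Note that re.findall returns only the captured group when the pattern contains one, so this sketch swaps in re.finditer and reads the full match instead.

```python
import re

words = 'apple orange pear pineapple durian'

# (\w) captures a letter, \1 demands the same letter again further along the word
pattern = r'\b\w*(\w)\w*\1\w*\b'

# finditer is used because findall would return only the captured letter
output = [m.group(0) for m in re.finditer(pattern, words)]

print(output)
```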

3) Words containing consecutive repeat letters

Write a regex to extract all words that contain consecutive repeat letters, i.e. repeat letters that appear one after another.

words = 'apple orange pear pineapple durian banana'

import re
output = re.findall('TO DO', words)

# output = ['apple', 'pineapple']

# Note: banana has repeat letters, but they are not consecutive
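
One way this could be done (spoiler ahead), as a sketch under the same assumption as before that words are made of \w characters: place the backreference immediately after the captured letter so the repeat must be adjacent. re.finditer is used here in place of re.findall, since the pattern contains a capturing group.

```python
import re

words = 'apple orange pear pineapple durian banana'

# (\w)\1 matches a letter immediately followed by the same letter
pattern = r'\b\w*(\w)\1\w*\b'

output = [m.group(0) for m in re.finditer(pattern, words)]

print(output)
```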

4) Words containing 2+ unique vowels

Write a regex to extract words containing 2 or more unique vowels. Words like poop contain 2 vowels, but only 1 unique vowel o, and don’t count.

words = 'dog banana tree poop apple pear'

import re
output = re.findall('TO DO', words)

# output = ['apple', 'pear']
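
A hedged sketch of one possible answer (spoiler ahead): capture a first vowel, then use a negative lookahead so that a later vowel must differ from it. This assumes lowercase input and treats only a, e, i, o, u as vowels; re.finditer stands in for re.findall because of the capturing group.

```python
import re

words = 'dog banana tree poop apple pear'

# ([aeiou]) captures one vowel; (?!\1)[aeiou] requires a later vowel that is different
pattern = r'\b\w*([aeiou])\w*(?!\1)[aeiou]\w*\b'

output = [m.group(0) for m in re.finditer(pattern, words)]

print(output)
```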

5) Words not containing 2 consecutive vowels

Write a regex to extract words that do not contain 2 consecutive vowels (doesn’t matter if they are the same vowel).

words = 'alien guide dog grapes tree'

import re
output = re.findall('TO DO', words)

# output = ['dog', 'grapes']
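
One possible approach (spoiler ahead), sketched under the assumption of lowercase input: a negative lookahead at the start of each word rejects the word if two vowels appear back to back anywhere in it. There is no capturing group, so re.findall works as-is.

```python
import re

words = 'alien guide dog grapes tree'

# (?!\w*[aeiou]{2}) fails the match if the word contains two vowels in a row
output = re.findall(r'\b(?!\w*[aeiou]{2})\w+\b', words)

print(output)
```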

6) Words not containing ‘apple’

Write a regex to extract words that do not contain the string apple.

words = 'apple pineapple snapple dog cat orange'

import re
output = re.findall('TO DO', words)

# output = ['dog', 'cat', 'orange']
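
A minimal sketch of one possible answer (spoiler ahead): the same negative-lookahead idea, this time rejecting any word in which apple appears at any position.

```python
import re

words = 'apple pineapple snapple dog cat orange'

# (?!\w*apple) rejects the word if 'apple' occurs anywhere inside it
output = re.findall(r'\b(?!\w*apple)\w+\b', words)

print(output)
```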

7) Words containing 3 or more unique vowels

Write a regex to extract words that contain 3 or more unique vowels. For instance, appletree contains only 2 unique vowels a and e (even though e appears 3 times), so appletree doesn’t match. Conversely, guide contains 3 unique vowels u, i & e, so it matches.

words = 'appletree aeaeaeaeaeaea dog guide alien'

import re
output = re.findall('TO DO', words)

# output = ['guide', 'alien']
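
One way to extend the 2-vowel idea (spoiler ahead), as a sketch that assumes lowercase input: capture two different vowels, then require a third vowel that matches neither. re.finditer replaces re.findall because the pattern has capturing groups.

```python
import re

words = 'appletree aeaeaeaeaeaea dog guide alien'

pattern = (
    r'\b\w*([aeiou])'         # first vowel
    r'\w*(?!\1)([aeiou])'     # second vowel, different from the first
    r'\w*(?!\1|\2)[aeiou]'    # third vowel, different from both
    r'\w*\b'
)

output = [m.group(0) for m in re.finditer(pattern, words)]

print(output)
```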

8) Words not containing repeat letters

Write a regex to extract words that do not contain repeat letters. In other words, words that contain only unique letters. For instance, apple contains 2 p's, so it is not matched. Conversely, orange contains no repeat letters, so it is matched.

words = 'apple orange pear tree'

import re
output = re.findall('TO DO', words)

# output = ['orange', 'pear']
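
A possible sketch (spoiler ahead): wrap the "repeat letter" check in a negative lookahead, so a word is accepted only when no letter can be found twice. Since the lookahead contains a capturing group, re.finditer is used instead of re.findall.

```python
import re

words = 'apple orange pear tree'

# (?!\w*(\w)\w*\1) rejects the word if any letter appears again later in it
pattern = r'\b(?!\w*(\w)\w*\1)\w+\b'

output = [m.group(0) for m in re.finditer(pattern, words)]

print(output)
```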

9) Words containing 4 or more unique letters

Write a regex to extract words containing 4 or more unique letters. For instance, banana contains only 3 unique letters b, a & n (even though some of them appear multiple times) — so it is not matched. Conversely, apple contains 4 unique letters a, p, l & e — so it is matched.

words = 'banana tree poop apple pear'

import re
output = re.findall('TO DO', words)

# output = ['apple', 'pear']
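
One possible approach (spoiler ahead), offered as a sketch and not as the definitive answer: capture three letters that each differ from the earlier ones, then require a fourth that differs from all three. Note that \w also matches digits and underscores, which is fine for the sample but is an assumption about the input.

```python
import re

words = 'banana tree poop apple pear'

pattern = (
    r'\b\w*(\w)'              # first letter
    r'\w*(?!\1)(\w)'          # second, different from the first
    r'\w*(?!\1|\2)(\w)'       # third, different from the first two
    r'\w*(?!\1|\2|\3)\w'      # fourth, different from the first three
    r'\w*\b'
)

output = [m.group(0) for m in re.finditer(pattern, words)]

print(output)
```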

10) Words containing an XYXY pattern

Write a regex to extract words containing an XYXY pattern. For instance, in banana, the substring anan is an XYXY pattern, so it is matched.

words = 'banana hahaha hehehe bob boba poop'

import re 
output = re.findall('TO DO', words)

# output = ['banana', 'hahaha', 'hehehe']
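
A sketch of one possible answer (spoiler ahead): capture two adjacent characters and require both again, in the same order, right after. As written, X and Y are allowed to be the same character, so a word like aaaa would also match; that is an assumption, since the question does not say either way.

```python
import re

words = 'banana hahaha hehehe bob boba poop'

# (\w)(\w) captures X and Y; \1\2 requires XY to repeat immediately
pattern = r'\b\w*(\w)(\w)\1\2\w*\b'

output = [m.group(0) for m in re.finditer(pattern, words)]

print(output)
```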

11) Switching HTML opening/closing tags

Write a regex to switch the opening and closing HTML tags in some text.

words = '<h1>hello</h1> my <strong>name</strong> is <u>lala</u>'

import re
output = re.sub('TO DO', 'TO DO', words)

# output = '</h1>hello<h1> my </strong>name<strong> is </u>lala<u>'
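
One possible approach (spoiler ahead), as a sketch that assumes simple, non-nested tags without attributes: capture the tag name and the inner text, then rebuild the string with the closing tag first.

```python
import re

words = '<h1>hello</h1> my <strong>name</strong> is <u>lala</u>'

# \1 in the pattern makes sure the closing tag matches the opening tag;
# in the replacement, \1 is the tag name and \2 is the enclosed text
output = re.sub(r'<(\w+)>(.*?)</\1>', r'</\1>\2<\1>', words)

print(output)
```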

12) PascalCase to snake_case

Write a regex replacement to convert PascalCase text to snake_case. Assume all words will be valid PascalCase words.

words = 'HelloWorld GiantApplePie YummyOrangeJuice'

import re

def replacement(match_object):
    pass  # TO DO

output = re.sub('TO DO', replacement, words)

# output = 'hello_world giant_apple_pie yummy_orange_juice'
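
A hedged sketch of one way to fill this in (spoiler ahead): match every uppercase letter, and use two alternatives so the replacement function can tell a word-initial capital from a capital inside a word. The two-group layout is just one design choice among several.

```python
import re

words = 'HelloWorld GiantApplePie YummyOrangeJuice'

def replacement(match_object):
    # group(1) is set only when the capital sits at the start of a word
    if match_object.group(1) is not None:
        return match_object.group(1).lower()
    # otherwise the capital is inside a word, so prefix it with an underscore
    return '_' + match_object.group(2).lower()

output = re.sub(r'\b([A-Z])|([A-Z])', replacement, words)

print(output)
```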

13) YELLING_SNAKE_CASE to PascalCase

Write a regex replacement to convert YELLING_SNAKE_CASE to PascalCase. Assume all words will be valid YELLING_SNAKE_CASE.

words = 'HELLO_WORLD GIANT_APPLE_PIE YUMMY_ORANGE_JUICE'

import re

def replacement(match_object):
    pass  # TO DO

output = re.sub('TO DO', replacement, words)

# output = 'HelloWorld GiantApplePie YummyOrangeJuice'
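
One possible sketch (spoiler ahead): match each underscore-separated chunk, swallowing the leading underscore if there is one, then keep the first letter and lowercase the rest.

```python
import re

words = 'HELLO_WORLD GIANT_APPLE_PIE YUMMY_ORANGE_JUICE'

def replacement(match_object):
    # group(1) is the first letter (kept uppercase), group(2) is the rest of the chunk
    return match_object.group(1) + match_object.group(2).lower()

# _? consumes the separator so it disappears from the result
output = re.sub(r'_?([A-Z])([A-Z]*)', replacement, words)

print(output)
```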

14) Reversing words inside HTML text

Write a regex replacement to reverse the text that is enclosed within opening and closing HTML tags.

words = '<h1>hello</h1> world my name is <u>lala</u>'

import re

def replacement(match_object):
    pass  # TO DO

output = re.sub('TO DO', replacement, words)

# output = '<h1>olleh</h1> world my name is <u>alal</u>'
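
A possible answer sketch (spoiler ahead), again assuming simple non-nested tags without attributes: capture the tag name and the inner text, and let the replacement function reverse the text with a slice.

```python
import re

words = '<h1>hello</h1> world my name is <u>lala</u>'

def replacement(match_object):
    tag = match_object.group(1)
    inner = match_object.group(2)
    # [::-1] reverses the enclosed text; the tags are rebuilt unchanged
    return f'<{tag}>{inner[::-1]}</{tag}>'

output = re.sub(r'<(\w+)>(.*?)</\1>', replacement, words)

print(output)
```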

15) Correcting HTML tags

You are given some HTML tags where the closing tags are wrong and different from the opening tag. Write a regex replacement to correct the closing HTML tag, and make it the same as the opening HTML tag.

words = '<h1>hello</h2> my name is <strong>lala</weak>'

import re
output = re.sub('TO DO', 'TO DO', words)

# output = '<h1>hello</h1> my name is <strong>lala</strong>'
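
One way this could be approached (spoiler ahead), as a sketch assuming non-nested tags without attributes: match any closing tag name, throw it away, and rebuild the closing tag from the captured opening tag name.

```python
import re

words = '<h1>hello</h2> my name is <strong>lala</weak>'

# </\w+> matches whatever the wrong closing tag is; \1 replaces it with the opening tag name
output = re.sub(r'<(\w+)>(.*?)</\w+>', r'<\1>\2</\1>', words)

print(output)
```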

16) Valid Passwords

You are given a bunch of passwords. For a password to be valid:

  1. There must be at least one uppercase letter
  2. There must be at least one lowercase letter
  3. There must be at least one number
  4. There must be at least one special character — either ! @ # $ or %
  5. The password must have at least 8 characters

words = 'Testing123! testing123! TESTING123! Testing!!! Testing123 Test1!'

import re
output = re.findall('TO DO', words)

# output = ['Testing123!']
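
A sketch of one possible answer (spoiler ahead): stack one positive lookahead per rule at the start of each password, then consume at least 8 non-space characters. It assumes passwords are separated by whitespace and may contain any non-space character; (?<!\S) is used instead of \b because characters like ! are not word characters.

```python
import re

words = 'Testing123! testing123! TESTING123! Testing!!! Testing123 Test1!'

pattern = (
    r'(?<!\S)'           # start of a password (not preceded by a non-space character)
    r'(?=\S*[A-Z])'      # at least one uppercase letter
    r'(?=\S*[a-z])'      # at least one lowercase letter
    r'(?=\S*\d)'         # at least one number
    r'(?=\S*[!@#$%])'    # at least one special character
    r'\S{8,}'            # at least 8 characters in total
)

output = re.findall(pattern, words)

print(output)
```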

17) Passwords not containing password & variations

Usually, passwords are not allowed to contain the word password. But some people get around this with variations like p4ssword, passw0rd and so on.

Write a regex to extract the passwords (separated by space) that do not contain:

  • password
  • p4ssword
  • passw0rd
  • p4ssw0rd

Or any different-casing permutation of the above 4 blacklisted words, e.g.

  • paSSword
  • P4ssw0RD
  • pAsSwOrD
  • And so on

words = 'failword Password123! p4sSw0rD!! pppppassword paSSW0RD'

import re
output = re.findall('TO DO', words)

# output = ['failword']
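
One possible sketch (spoiler ahead): fold the four blacklisted spellings into a single pattern with character classes, make it case-insensitive with the inline (?i) flag, and reject any password that contains it via a negative lookahead. Passwords are assumed to be separated by whitespace.

```python
import re

words = 'failword Password123! p4sSw0rD!! pppppassword paSSW0RD'

# p[a4]ssw[o0]rd covers password, p4ssword, passw0rd and p4ssw0rd;
# (?i) makes the whole pattern ignore letter case
pattern = r'(?i)(?<!\S)(?!\S*p[a4]ssw[o0]rd)\S+'

output = re.findall(pattern, words)

print(output)
```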

Conclusion

I hope these were challenging enough, and that your regex skills improved after attempting them!

I actually have the answers (colour-coded breakdowns + detailed explanation for every question) to the above questions in my latest book [40 Regex Practice Questions], which I’ve spent quite a lot of hours on! Hopefully this adds value to your programming journey.

Link: https://payhip.com/b/nb2Cg

Some Final Words

If this story provided value and you wish to show a little support, you could:

  1. Clap 50 times for this story (this really, really helps me out)
  2. Sign up for a Medium membership using my link ($5/month to read unlimited Medium stories)

Get my free Ebooks: https://zlliu.co/books

  1. 40 Python Practice Questions For Beginners
  2. 20 Recursion Practice Questions
  3. Python From Zero To One

My Home Office Setup: https://zlliu.co/workspace
