17 Regex Questions That Are Harder Than They Seem
# Are you up for the challenge?

Try to solve these ONLY using regex.
1) Replacing Smileys
You are given a string containing some HTML text. In the string, each icon is represented by <i class="icon-name"></i>. Write a regex that replaces each icon with a simplified version ::icon-name.
text = 'hello <i class="smile"></i> my name is tim <i class="laugh"></i>'
import re
new = re.sub('TO DO', 'TO DO', text)
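Spoiler: one possible way to fill in both TO DOs. This is only a sketch, and it assumes every icon has exactly the form `<i class="name"></i>`, with no extra attributes or whitespace. The class name is captured in a group and reused in the replacement as `\1`.

```python
import re

text = 'hello <i class="smile"></i> my name is tim <i class="laugh"></i>'

# ([^"]+) captures the class name; \1 re-inserts it after the '::' prefix.
new = re.sub(r'<i class="([^"]+)"></i>', r'::\1', text)
print(new)
```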
# new = 'hello ::smile my name is tim ::laugh'

2) Words containing repeating letters
Write a regex to extract all words containing at least one repeated letter.
words = 'apple orange pear pineapple durian'
import re
output = re.findall('TO DO', words)
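Spoiler: a possible pattern, sketched under the assumption that words consist of `\w` characters separated by spaces. It captures one letter and demands the same letter again later via a backreference. Because `re.findall` returns the captured group rather than the whole match when a pattern contains a group, this sketch swaps in `re.finditer`.

```python
import re

words = 'apple orange pear pineapple durian'

# (\w) captures a letter; \1 requires that same letter again later in the word.
pattern = r'\b\w*(\w)\w*\1\w*\b'
output = [m.group() for m in re.finditer(pattern, words)]
print(output)
```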
# output = ['apple', 'pineapple']

3) Words containing consecutive repeat letters
Write a regex to extract all words that contain consecutive repeat letters, i.e. repeat letters that come one right after another.
words = 'apple orange pear pineapple durian banana'
import re
output = re.findall('TO DO', words)
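Spoiler: one way this could be done is to place the backreference directly after the captured letter, so the two copies must be adjacent. A sketch only, assuming space-separated words. It uses `re.finditer` because `re.findall` would return just the captured letter.

```python
import re

words = 'apple orange pear pineapple durian banana'

# (\w)\1 matches a letter immediately followed by itself, e.g. 'pp'.
pattern = r'\b\w*(\w)\1\w*\b'
output = [m.group() for m in re.finditer(pattern, words)]
print(output)
```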
# output = ['apple', 'pineapple']
# Note: banana has repeat letters, but they are not consecutive

4) Words containing 2+ unique vowels
Write a regex to extract words containing 2 or more unique vowels. A word like poop contains 2 vowels but only 1 unique vowel (o), so it doesn’t count.
words = 'dog banana tree poop apple pear'
import re
output = re.findall('TO DO', words)
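Spoiler: a possible approach is a lookahead that finds one vowel and then a later vowel that differs from it. This sketch assumes lowercase input and treats only a, e, i, o, u as vowels. It uses `re.finditer` to get whole words, since the pattern contains a capturing group.

```python
import re

words = 'dog banana tree poop apple pear'

# ([aeiou]) captures a vowel; (?!\1)[aeiou] needs a later vowel that is different.
pattern = r'\b(?=\w*([aeiou])\w*(?!\1)[aeiou])\w+'
output = [m.group() for m in re.finditer(pattern, words)]
print(output)
```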
# output = ['apple', 'pear']

5) Words not containing 2 consecutive vowels
Write a regex to extract words that do not contain 2 consecutive vowels (it doesn’t matter whether they are the same vowel).
words = 'alien guide dog grapes tree'
import re
output = re.findall('TO DO', words)
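Spoiler: "does not contain" questions are often handled with a negative lookahead at the start of each word. The sketch below assumes lowercase, space-separated words. With no capturing group in the pattern, `re.findall` returns whole words directly.

```python
import re

words = 'alien guide dog grapes tree'

# (?!\w*[aeiou]{2}) rejects the word if two vowels sit side by side anywhere in it.
output = re.findall(r'\b(?!\w*[aeiou]{2})\w+', words)
print(output)
```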
# output = ['dog', 'grapes']

6) Words not containing ‘apple’
Write a regex to extract words that do not contain the string apple.
words = 'apple pineapple snapple dog cat orange'
import re
output = re.findall('TO DO', words)
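Spoiler: one possible pattern, assuming space-separated words and a case-sensitive match on the literal string apple. A negative lookahead placed at the word boundary scans the rest of the word before any characters are consumed.

```python
import re

words = 'apple pineapple snapple dog cat orange'

# (?!\w*apple) fails the match if 'apple' appears anywhere in the current word.
output = re.findall(r'\b(?!\w*apple)\w+', words)
print(output)
```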
# output = ['dog', 'cat', 'orange']

7) Words containing 3 or more unique vowels
Write a regex to extract words that contain 3 or more unique vowels. For instance, appletree contains only 2 unique vowels a and e (even though e appears 3 times), so appletree doesn’t match. Conversely, guide contains 3 unique vowels u, i & e, so it matches.
words = 'appletree aeaeaeaeaeaea dog guide alien'
import re
output = re.findall('TO DO', words)
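Spoiler: a possible sketch that chains two captured vowels and then requires a third vowel different from both. It assumes lowercase input with a, e, i, o, u as the only vowels, and uses `re.finditer` because the capturing groups would change what `re.findall` returns.

```python
import re

words = 'appletree aeaeaeaeaeaea dog guide alien'

# Vowel 1 is captured as \1, vowel 2 (not \1) as \2, vowel 3 must be neither.
pattern = r'\b(?=\w*([aeiou])\w*(?!\1)([aeiou])\w*(?!\1|\2)[aeiou])\w+'
output = [m.group() for m in re.finditer(pattern, words)]
print(output)
```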
# output = ['guide', 'alien']

8) Words not containing repeat letters
Write a regex to extract words that do not contain repeat letters. In other words, words that contain only unique letters. For instance, apple contains 2 p's, so it is not matched. Conversely, orange contains no repeat letters, so it is matched.
words = 'apple orange pear tree'
import re
output = re.findall('TO DO', words)
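Spoiler: this could be sketched as the "repeat letter" check wrapped in a negative lookahead. Assumptions: space-separated words made of `\w` characters. The group lives inside the lookahead, so `re.findall` would return empty strings instead of words; `re.finditer` is used here instead.

```python
import re

words = 'apple orange pear tree'

# (?!\w*(\w)\w*\1) rejects any word in which some letter shows up twice.
pattern = r'\b(?!\w*(\w)\w*\1)\w+'
output = [m.group() for m in re.finditer(pattern, words)]
print(output)
```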
# output = ['orange', 'pear']

9) Words containing 4 or more unique letters
Write a regex to extract words containing 4 or more unique letters. For instance, banana contains only 3 unique letters b, a & n (even though some of them appear multiple times) — so it is not matched. Conversely, apple contains 4 unique letters a, p, l & e — so it is matched.
words = 'banana tree poop apple pear'
import re
output = re.findall('TO DO', words)
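Spoiler: a possible (and admittedly long) pattern that captures three letters, each different from the ones before it, and then requires a fourth that differs from all three. Note the assumption that `\w` stands in for "letter"; it also matches digits and underscores, which the sample input does not contain.

```python
import re

words = 'banana tree poop apple pear'

# Each (?!...) forbids the next captured character from equalling an earlier one.
pattern = r'\b(?=\w*(\w)\w*(?!\1)(\w)\w*(?!\1|\2)(\w)\w*(?!\1|\2|\3)\w)\w+'
output = [m.group() for m in re.finditer(pattern, words)]
print(output)
```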
# output = ['apple', 'pear']

10) Words containing an XYXY pattern
Write a regex to extract words containing an XYXY pattern. For instance, in banana, the substring anan is an XYXY pattern, so it is matched.
words = 'banana hahaha hehehe bob boba poop'
import re
output = re.findall('TO DO', words)
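Spoiler: one possible sketch uses two capturing groups followed by two backreferences. As written, it does not force X and Y to differ, so a run like aaaa would also count; whether that is intended is an assumption on my part.

```python
import re

words = 'banana hahaha hehehe bob boba poop'

# (\w)(\w) captures X and Y; \1\2 requires the same pair to repeat immediately.
pattern = r'\b\w*(\w)(\w)\1\2\w*\b'
output = [m.group() for m in re.finditer(pattern, words)]
print(output)
```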
# output = ['banana', 'hahaha', 'hehehe']

11) Switching HTML opening/closing tags
Write a regex to switch the opening and closing HTML tags in some text.
words = '<h1>hello</h1> my <strong>name</strong> is <u>lala</u>'
import re
output = re.sub('TO DO', 'TO DO', words)
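Spoiler: a possible solution sketch, assuming simple tags with no attributes and no nesting. The tag name and the inner text are captured separately, and the replacement string writes them back with the closing tag first.

```python
import re

words = '<h1>hello</h1> my <strong>name</strong> is <u>lala</u>'

# \1 is the tag name, \2 the text in between; .*? keeps each match to one element.
output = re.sub(r'<(\w+)>(.*?)</\1>', r'</\1>\2<\1>', words)
print(output)
```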
# output = '</h1>hello<h1> my </strong>name<strong> is </u>lala<u>'

12) PascalCase to snake_case
Write a regex replacement to convert PascalCase text to snake_case. Assume all words will be valid PascalCase words.
words = 'HelloWorld GiantApplePie YummyOrangeJuice'
import re
def replacement(match_object):
    pass  # TO DO: return the replacement string
output = re.sub('TO DO', replacement, words)
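Spoiler: one way the replacement function could look. The name `to_snake` is hypothetical, and the sketch assumes that a capital letter inside a word always follows a lowercase letter. The pattern has two alternatives so the function can tell a word-internal capital from a word-initial one.

```python
import re

words = 'HelloWorld GiantApplePie YummyOrangeJuice'

def to_snake(match_object):
    # Group 1 is set only when the capital follows a lowercase letter.
    if match_object.group(1):
        return '_' + match_object.group(1).lower()
    # Otherwise the capital starts a word, so no underscore is added.
    return match_object.group(2).lower()

output = re.sub(r'(?<=[a-z])([A-Z])|([A-Z])', to_snake, words)
print(output)
```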
# output = 'hello_world giant_apple_pie yummy_orange_juice'

13) YELLING_SNAKE_CASE to PascalCase
Write a regex replacement to convert YELLING_SNAKE_CASE to PascalCase. Assume all words will be valid YELLING_SNAKE_CASE.
words = 'HELLO_WORLD GIANT_APPLE_PIE YUMMY_ORANGE_JUICE'
import re
def replacement(match_object):
    pass  # TO DO: return the replacement string
output = re.sub('TO DO', replacement, words)
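Spoiler: a possible sketch that matches each underscore-separated chunk, drops the underscore, and lowercases everything after the first letter. The function name `to_pascal` is hypothetical, and the input is assumed to contain only uppercase letters, underscores, and spaces.

```python
import re

words = 'HELLO_WORLD GIANT_APPLE_PIE YUMMY_ORANGE_JUICE'

def to_pascal(match_object):
    # Keep the first letter as-is and lowercase the rest of the chunk.
    return match_object.group(1) + match_object.group(2).lower()

# _? swallows the separator; it is not part of either group, so it disappears.
output = re.sub(r'_?([A-Z])([A-Z]*)', to_pascal, words)
print(output)
```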
# output = 'HelloWorld GiantApplePie YummyOrangeJuice'

14) Reversing words inside HTML text
Write a regex replacement to reverse the text that is enclosed within opening and closing HTML tags.
words = '<h1>hello</h1> world my name is <u>lala</u>'
import re
def replacement(match_object):
    pass  # TO DO: return the replacement string
output = re.sub('TO DO', replacement, words)
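Spoiler: this could be approached by capturing the opening tag, the inner text, and the closing tag as separate groups, then reversing only the middle one in the function. `reverse_inner` is a hypothetical name, and the sketch assumes attribute-free, non-nested tags.

```python
import re

words = '<h1>hello</h1> world my name is <u>lala</u>'

def reverse_inner(match_object):
    opening, inner, closing = match_object.group(1, 3, 4)
    # Slicing with [::-1] reverses the text between the tags.
    return opening + inner[::-1] + closing

# Group 2 is the tag name, reused as \2 so the closing tag must match.
output = re.sub(r'(<(\w+)>)(.*?)(</\2>)', reverse_inner, words)
print(output)
```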
# output = '<h1>olleh</h1> world my name is <u>alal</u>'

15) Correcting HTML tags
You are given some HTML tags where the closing tags are wrong and different from the opening tag. Write a regex replacement to correct the closing HTML tag, and make it the same as the opening HTML tag.
words = '<h1>hello</h2> my name is <strong>lala</weak>'
import re
output = re.sub('TO DO', 'TO DO', words)
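Spoiler: a possible sketch that ignores whatever name the closing tag has and rewrites it from the captured opening tag name. It assumes attribute-free tags that are not nested.

```python
import re

words = '<h1>hello</h2> my name is <strong>lala</weak>'

# </\w+> matches any closing tag; the replacement rebuilds it from \1.
output = re.sub(r'<(\w+)>(.*?)</\w+>', r'<\1>\2</\1>', words)
print(output)
```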
# output = '<h1>hello</h1> my name is <strong>lala</strong>'

16) Valid Passwords
You are given a bunch of passwords. For a password to be valid:
- There must be at least one uppercase letter
- There must be at least one lowercase letter
- There must be at least one number
- There must be at least one special character — either ! @ # $ or %
- The password must have at least 8 characters
words = 'Testing123! testing123! TESTING123! Testing!!! Testing123 Test1!'
import re
output = re.findall('TO DO', words)
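Spoiler: the rules above can be sketched as one lookahead per requirement, followed by a length check. Two assumptions: passwords are separated by whitespace, and any non-whitespace character is allowed to count toward the 8-character minimum.

```python
import re

words = 'Testing123! testing123! TESTING123! Testing!!! Testing123 Test1!'

pattern = (
    r'(?<!\S)'          # start at the beginning of a password
    r'(?=\S*[A-Z])'     # at least one uppercase letter
    r'(?=\S*[a-z])'     # at least one lowercase letter
    r'(?=\S*\d)'        # at least one number
    r'(?=\S*[!@#$%])'   # at least one of the listed special characters
    r'\S{8,}'           # at least 8 characters in total
)
output = re.findall(pattern, words)
print(output)
```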
# output = ['Testing123!']

17) Passwords not containing password & variations
Usually, our passwords cannot contain password. But some people use variations like p4ssword, passw0rd and so on.
Write a regex to extract the passwords (separated by space) that do not contain:
- password
- p4ssword
- passw0rd
- p4ssw0rd
Or any different-casing permutation of the above 4 blacklisted words, e.g.
- paSSword
- P4ssw0RD
- pAsSwOrD
- And so on
words = 'failword Password123! p4sSw0rD!! pppppassword paSSW0RD'
import re
output = re.findall('TO DO', words)
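Spoiler: a possible sketch that folds the four blacklisted spellings into one sub-pattern, `p[a4]ssw[o0]rd`, and handles casing with the `re.IGNORECASE` flag. Passing a flag to `re.findall` goes slightly beyond the stub above; an inline `(?i)` at the start of the pattern would be an alternative.

```python
import re

words = 'failword Password123! p4sSw0rD!! pppppassword paSSW0RD'

# [a4] and [o0] cover the letter/digit swaps; the lookahead scans the whole password.
pattern = r'(?<!\S)(?!\S*p[a4]ssw[o0]rd)\S+'
output = re.findall(pattern, words, flags=re.IGNORECASE)
print(output)
```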
# output = ['failword']

Conclusion
I hope these were challenging enough, and that your regex skills improved after attempting these questions!
I actually have the answers (colour-coded breakdowns + detailed explanation for every question) to the above questions in my latest book [40 Regex Practice Questions], which I’ve spent quite a lot of hours on! Hopefully this adds value to your programming journey.
Link: https://payhip.com/b/nb2Cg
