Matt Croak Code

Summary

The article discusses how to construct a regular expression (regex) to match multiple conditions simultaneously, effectively implementing a logical AND operation within regex.

Abstract

The article delves into the intricacies of regex, focusing on the challenge of matching multiple patterns within a string. It addresses the need for a logical AND operation in regex, which is not natively supported like the logical OR operator. The author explains the concept of grouping using parentheses to create regex expressions that must all be satisfied for a match to occur. The article provides examples and explanations on how to use groups to ensure that a string contains all specified conditions, such as uppercase words and numbers, or specific substrings like "love," "Matthew," and "Mets." The author also discusses the use of word boundaries, the importance of allowing any characters between grouped patterns, and the multiline flag for matching conditions across multiple lines.

Opinions

  • The author emphasizes the importance of matching all conditions in a regex pattern, not just one or some, to achieve precise results.
  • The use of groups in regex is highlighted as a robust feature necessary for implementing logical AND operations.
  • The article suggests that the lack of a native logical AND operator in regex can be overcome with creative use of grouping and quantifiers.
  • The author acknowledges the complexity of regex and provides guidance for readers to achieve more nuanced pattern matching.
  • The article encourages readers to engage with the content by asking for their input and alternative methods for achieving the same results in regex.

How to Match Multiple Conditions in Regex (at once)

The “Logical AND” Operator


In a previous post, I wrote about how to effectively match multiple conditions in regex and showed you how to write a regular expression that can contain multiple conditions, allowing you to match multiple things.

See the code from the post below.

const line = 'My name is Matthew Croak. I love the NY Mets.';
const regex = /love|Mets|Matthew/g;
const found = line.match(regex);

console.log(found)

> ['Matthew', 'love', 'Mets']

By using the logical OR operator (|) we can have a regular expression that can match not just one pattern, but multiple, within the same string. There is, however, one scenario that this doesn’t consider. Sims Mike left a comment on that post saying that they needed a solution that matched a line only if all conditions were satisfied.

The above code will still return a match even if only one of the patterns appears in the string.

For example, if the string were to read ‘My name is Mathew Croak. I love the NY mets.’, it would still return the match [‘love’] and be considered truthy.
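To see the problem concretely, here is a small check you can paste into your console. The misspelled string comes from the paragraph above; the variable names are just ones I picked for this sketch.

```javascript
// Same OR regex as before, run against the misspelled line.
const orRegex = /love|Mets|Matthew/g;
const misspelled = 'My name is Mathew Croak. I love the NY mets.';
const orFound = misspelled.match(orRegex);

console.log(orFound);          // ['love']
console.log(Boolean(orFound)); // true, even though two of the three patterns are missing
```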

So how could I rewrite the above regex so that it would find matches only if all conditions in the regex were satisfied?

What this sounds like is a logical AND operation.

The only problem is, there is no logical AND operator in regex that works like the pipe (|) for logical OR. So how can we achieve this for logical AND?

In order to do this, we’ll need to make use of a robust regex feature called grouping.

Grouping

In my previous post, we used the logical OR (|) to lay out multiple conditions to be met — they didn’t all need to be met at once. This is because, logically, the pattern will return matches if one OR more of the patterns are matched.

For our current case, instead of using the pipe, we can create groups using parentheses.

Inside these parentheses, we can include separate regular expressions, i.e. grouping them, within a larger regular expression. We can see this grouping exemplified by using this tutorial on capturing groups.

First, we have a string.

var str = "The price of PINEAPPLE ice cream is 20"

We want to create a regex that matches two things.

  • Uppercase letters
  • Numbers

I won’t go too in depth about the syntax here, but in short, you should know that [A-Z] will match any uppercase letter and \d will match any digit 0–9. + means one or more instances of the preceding expression, and .+ means one or more instances of any character (except a line break). Finally, \b anchors a pattern to a word boundary.
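If you want to see each of those pieces on its own before we combine them, here is a quick sketch using the pineapple string. The variable name sample is mine, not part of the original example.

```javascript
const sample = 'The price of PINEAPPLE ice cream is 20';

// [A-Z]+ grabs every run of uppercase letters, including the T in "The".
console.log(sample.match(/[A-Z]+/g));     // ['T', 'PINEAPPLE']

// Adding \b on both sides keeps only runs that are whole words.
console.log(sample.match(/\b[A-Z]+\b/g)); // ['PINEAPPLE']

// \d matches one digit at a time, \d+ matches the whole number.
console.log(sample.match(/\d/g));         // ['2', '0']
console.log(sample.match(/\d+/g));        // ['20']
```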


A word boundary, simply put, is a position where a word character (a letter, digit, or underscore) sits next to a non-word character or the start or end of the string. Say you want to replace all instances of a with $ if it is not part of another word.

var str = 'I have a cat'
var regex = /\ba\b/g // \b on both sides, so words that merely start with "a" are left alone
console.log(str.replace(regex, "$"))

> "I have $ cat"
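A slightly longer sentence makes it easier to see what each \b is doing. This is only a sketch; the sentence and variable names are made up for illustration.

```javascript
const sentence = 'I have a cat and an apple';

// \b only on the left: every "a" that starts a word gets replaced.
const leftOnly = sentence.replace(/\ba/g, '$');
console.log(leftOnly);  // 'I have $ cat $nd $n $pple'

// \b on both sides: only the standalone word "a" gets replaced.
const bothSides = sentence.replace(/\ba\b/g, '$');
console.log(bothSides); // 'I have $ cat and an apple'
```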

Now, back to our pineapple example! We can use the below regex to find instances of uppercase words AND numbers.

var str = "The price of PINEAPPLE ice cream is 20"
var regex = /(\b[A-Z]+\b).+(\b\d+)/g
console.log(str.match(regex))

> ['PINEAPPLE ice cream is 20']

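Perfect! Thanks to the use of groups placed one after another, this regex will only find matches if BOTH grouped regexes are satisfied. Keep in mind that the order matters: the uppercase word has to come before the number. If we used the string “The price of pineapple ice cream is 20”, the logged result would be null.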
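If you're curious what each group actually captured, you can drop the g flag, because match() then returns the groups alongside the full match. A minimal sketch (the names groupRegex, hit, miss, and reversed are mine):

```javascript
// Same pattern as above, without the g flag so the capture groups are returned.
const groupRegex = /(\b[A-Z]+\b).+(\b\d+)/;

const hit = 'The price of PINEAPPLE ice cream is 20'.match(groupRegex);
console.log(hit[0]); // 'PINEAPPLE ice cream is 20' (the full match)
console.log(hit[1]); // 'PINEAPPLE' (first group)
console.log(hit[2]); // '20' (second group)

// No uppercase word: the first group fails, so the whole regex fails.
const miss = 'The price of pineapple ice cream is 20'.match(groupRegex);
console.log(miss); // null

// Both conditions are present, but the number comes first, so there is no match.
const reversed = '20 is the price of PINEAPPLE ice cream'.match(groupRegex);
console.log(reversed); // null
```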

If we were to use the logical OR, however, it would find a match. See below.

var str = "The price of pineapple ice cream is 20"
var regex = /\b[A-Z]+\b|.+\b\d+/g // notice the lack of parentheses and the presence of |
console.log(str.match(regex))

> ['The price of pineapple ice cream is 20']

Awesome! You now understand the value of groups and how to use them in regex. But how can they work for the example I used in my previous blog post? How can we make sure that love, Matthew, AND Mets are present in a string in order for it to be a match?

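Instead of using word boundaries and looking for uppercase letters and numbers like the above example, we need to find exact strings. Because this approach matches the strings in the order they are written, I’ll use name, Matthew, and Mets, which appear in that order in our line. We can start by simply grouping our exact strings! See below.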

var newRegex = /^(name)(Matthew)(Mets)/g

So far so good, but we’re not done.

Soon, my friend, soon.

If you tried the above regex with our original string, you’d still get null. This is because the regex expects the three strings to appear back to back at the very start of the string (that’s what ^ does). In order to return a match, your string would have to start with something like "nameMatthewMets".
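You can check this quickly in your console. The sketch below uses the same pattern without the g flag so that test() and the capture groups are easy to inspect; the variable names are mine.

```javascript
const backToBack = /^(name)(Matthew)(Mets)/;

// Only matches when the three strings sit side by side at the start.
const strictHit = 'nameMatthewMets'.match(backToBack);
console.log(strictHit.slice(1)); // ['name', 'Matthew', 'Mets']

// Our real sentence has other text before and between them, so it fails.
const strictMiss = backToBack.test('My name is Matthew Croak. I love the NY Mets.');
console.log(strictMiss); // false
```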

So how can we fix this? Well, we need to allow our regex to match these patterns regardless of what comes between them. This can be done using .*.

  • . means "any character" (except a line break).
  • * means "zero or more of the preceding item".

Used together, .* matches any run of characters, including none at all. By adding this combination before each of our strings within each of the groups, we allow anything to appear in front of name, between name and Matthew, and between Matthew and Mets, which gives us our desired result.
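Here is a small sketch of how .* behaves between two exact strings. The test strings are made up for illustration.

```javascript
// "name", then anything (or nothing), then "Matthew".
const gap = /name.*Matthew/;

const results = [
  gap.test('My name is Matthew'), // true: " is " fills the gap
  gap.test('nameMatthew'),        // true: * also allows zero characters
  gap.test('Matthew is my name'), // false: the order is fixed
  gap.test('name\nMatthew'),      // false: . does not match a line break
];
console.log(results); // [true, true, false, false]
```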

Below is our working code.

const line = "My name is Matthew Croak. I love the NY Mets.";
var regex = /(.*name)(.*Matthew)(.*Mets).*/g // the trailing .* extends the match to the end of the line
const found = line.match(regex);

console.log(found)

> ["My name is Matthew Croak. I love the NY Mets."]

Here it matches the whole line because the line itself satisfies all of the conditions at once, not just one or some of them. The line contains name, Matthew AND Mets. If we were to, say, change Mets to mets, the returned result would be null because not all conditions were met.
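One thing to keep in mind is that the grouped pattern expects the strings in a fixed order. If the order shouldn't matter, a commonly used alternative is lookahead assertions, written (?=...). This is a different technique from the grouping approach in this post, so treat the following as a sketch rather than a drop-in replacement. It checks for love, Matthew, and Mets anywhere on a single line.

```javascript
// Each (?=.*word) looks ahead from the start of the string without consuming anything,
// so all three must be present, in any order.
const allOf = /^(?=.*love)(?=.*Matthew)(?=.*Mets).*$/;

const inOrder = allOf.test('My name is Matthew Croak. I love the NY Mets.');
const shuffled = allOf.test('I love the NY Mets. My name is Matthew Croak.');
const missingOne = allOf.test('My name is Matthew Croak. I love the NY mets.');

console.log(inOrder);    // true
console.log(shuffled);   // true
console.log(missingOne); // false, because "Mets" is missing
```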

There is still one tiny problem left to resolve though.

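So far we have only tried it on a single line.

In order for it to work on multiple lines, which was the case for Sims Mike, we need to add one more thing. Because . never matches a line break, a match can’t spill over from one line into the next. To make each match cover exactly one whole line, we can anchor the pattern with ^ and $ and then, at the end, where we have our /g, add an m, which is the flag for multiline. With m, ^ and $ match at the start and end of every line rather than only at the start and end of the whole string. That gives us /^(.*name)(.*Matthew)(.*Mets).*$/gm. You can confirm this works by copy/pasting the below code in your console.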

const lines = `My name is Matthew Croak. I love the NY Mets.

My name is Matthew Croak.
I love the NY Mets.

My name is Matthew Croak. I love the NY Mets.

My name is mathew Croak. I love the NY Mets.
`;
// ^ and $ together with the m flag tie each match to a single line
const multilineRegex = /^(.*name)(.*Matthew)(.*Mets).*$/gm;
const matches = lines.match(multilineRegex);

console.log(matches)

> ['My name is Matthew Croak. I love the NY Mets.', 'My name is Matthew Croak. I love the NY Mets.']

// The above returns two matches, line one and line six.
// Lines three and four are each missing a condition, and a match can't span two lines, so they are not a match.
// Line eight is not a match because while 'name' and 'Mets' are correct, `mathew` is not.
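If you need this for a different set of words, you could build the same kind of pattern from an array. The helpers escapeRegex and allInOrder below are hypothetical names I made up for this sketch, not built-in functions, and the words still have to appear in the order you list them.

```javascript
// Hypothetical helper: escape characters that have a special meaning in a regex.
function escapeRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: builds ^(.*word1)(.*word2)...(.*wordN).*$ with the g and m flags.
function allInOrder(words) {
  const groups = words.map((word) => '(.*' + escapeRegex(word) + ')').join('');
  return new RegExp('^' + groups + '.*$', 'gm');
}

const story = 'My name is Matthew Croak. I love the NY Mets.\n' +
  'My name is mathew Croak. I love the NY Mets.';
const builtRegex = allInOrder(['name', 'Matthew', 'Mets']);
const builtMatches = story.match(builtRegex);

console.log(builtRegex.source); // ^(.*name)(.*Matthew)(.*Mets).*$
console.log(builtMatches);      // ['My name is Matthew Croak. I love the NY Mets.']
```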

Well, Sims Mike, I hope this helps! And for anybody else trying to find a solution where you need to match all regex conditions (at once, not either or), I hope this helped you too.

Do you have another way of achieving this? Let me know in the comments!
