Vinod Sebastian – B.Tech, M.Com, PGCBM, PGCPM, PGDBIO

Hi, I'm a Web Architect by Profession and an Artist by nature. I love empowering People, aligning to Processes, and delivering Projects.


Regex or Regular Expression

What is RegEx?

A RegEx, short for Regular Expression, is a powerful tool used for pattern matching within strings. It consists of a sequence of characters that define a search pattern, allowing you to search for, manipulate, and validate text based on specific criteria.

Regular expressions are widely used in programming, text processing, and data validation tasks to efficiently handle complex search patterns.

Key Concepts to Remember

  • *: Matches the preceding character zero or more times.
  • +: Matches the preceding character one or more times.
  • ?: Matches the preceding character zero or one time.
  • .: Matches any single character except a newline (unless a "dot matches all" flag such as Python's re.DOTALL is set).
  • […]: Matches any single character inside the brackets.
  • [^…]: Matches any single character not in the brackets.
  • \d: Matches any digit character (equivalent to [0-9] for ASCII text).
  • \D: Matches any non-digit character.
  • \w: Matches any word character (equivalent to [a-zA-Z0-9_] for ASCII text).
  • \W: Matches any non-word character.
  • \b: Matches a word boundary.
  • \s: Matches any whitespace character.
  • \S: Matches any non-whitespace character.
  • ^: Matches the start of the string, or the start of each line in multiline mode.
  • $: Matches the end of the string, or the end of each line in multiline mode.
  • {M,N}: Matches the preceding element at least M and at most N times.
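
To make the list above more concrete, here is a minimal sketch that applies several of these constructs with Python's re module. The sample string and variable names are made up for illustration and are not part of the original article.

```python
import re

# Hypothetical sample text used only for this illustration
sample = "Order 66 shipped to room_12b on 2024-05-01."

# \d+ : one or more digits in a row
digit_runs = re.findall(r"\d+", sample)

# \b\d{2,4}\b : a run of 2 to 4 digits standing between word boundaries
# ("12" inside "room_12b" is skipped because "_" and "b" are word characters)
bounded_runs = re.findall(r"\b\d{2,4}\b", sample)

# [^\w\s] : any single character that is neither a word character nor whitespace
punctuation = re.findall(r"[^\w\s]", sample)

# colou?r : ? makes the preceding "u" optional
spellings = re.findall(r"colou?r", "color and colour")

# ^ and $ anchor the pattern to the start and end of the string
is_all_digits = bool(re.search(r"^\d+$", "12345"))

print(digit_runs)     # ['66', '12', '2024', '05', '01']
print(bounded_runs)   # ['66', '2024', '05', '01']
print(punctuation)    # ['-', '-', '.']
print(spellings)      # ['color', 'colour']
print(is_all_digits)  # True
```

Raw strings (the r"..." prefix) are used so that backslashes reach the regex engine unchanged.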

Example Using Python

Here are some basic examples of using regular expressions in Python:

# Python code snippets for regex examples
import re

print(re.findall(".", "Hello"))  # Output: ['H', 'e', 'l', 'l', 'o']
print(re.findall(".*", "Hello"))  # Output: ['Hello', '']
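
The trailing empty string in the second result can look surprising. The sketch below, which uses illustrative variable names, shows where it comes from and how the dot behaves around newlines.

```python
import re

# ".*" may match zero characters, so after consuming "Hello" the engine
# reports one more (empty) match at the very end of the string.
star_matches = re.findall(r".*", "Hello")

# ".+" needs at least one character, so no empty match is reported.
plus_matches = re.findall(r".+", "Hello")

# "." does not match a newline by default, so each line matches separately.
default_dot = re.findall(r".+", "one\ntwo")

# With re.DOTALL the dot also matches "\n" and the match spans both lines.
dotall_dot = re.findall(r".+", "one\ntwo", flags=re.DOTALL)

print(star_matches)  # ['Hello', '']
print(plus_matches)  # ['Hello']
print(default_dot)   # ['one', 'two']
print(dotall_dot)    # ['one\ntwo']
```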

Greedy vs. Reluctant vs. Possessive Quantifiers

In regular expressions, quantifiers determine how many times a character or group may repeat. Greedy, reluctant, and possessive quantifiers differ in how much text they consume and in whether they give any of it back through backtracking, which affects both what is matched and how fast.

  • Greedy Quantifier (*): Matches as many characters as possible, then backtracks as needed so the overall match can still succeed.
  • Reluctant Quantifier (*?): Matches as few characters as possible, expanding only as far as needed for the overall match to succeed.
  • Possessive Quantifier (*+): Matches as many characters as possible and never backtracks, even if that causes the overall match to fail. Supported in Java and, since version 3.11, in Python's re module.

Let’s see these quantifiers in action:

Enter your regex: .*test // Greedy quantifier
Enter input string to search: xtestxxxxxxtest
Match: "xtestxxxxxxtest" from index 0 to 15.

Enter your regex: .*?test // Reluctant quantifier
Enter input string to search: xtestxxxxxxtest
Matches: "xtest" from index 0 to 5, "xxxxxxtest" from index 5 to 15.

Enter your regex: .*+test // Possessive quantifier
Enter input string to search: xtestxxxxxxtest
No match found.
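
The session above resembles the output of a Java-style regex test harness. As a rough Python equivalent, the sketch below reproduces the same three cases with re.finditer. The helper name spans is hypothetical, and the possessive case is guarded because Python's re module only accepts possessive quantifiers from version 3.11 onward.

```python
import re
import sys

text = "xtestxxxxxxtest"

def spans(pattern, string):
    """Return (matched_text, start, end) for every match of pattern."""
    return [(m.group(), m.start(), m.end()) for m in re.finditer(pattern, string)]

# Greedy: .* grabs the whole string, then backs off just enough for "test".
greedy = spans(r".*test", text)

# Reluctant: .*? grows one character at a time until "test" can follow.
reluctant = spans(r".*?test", text)

# Possessive: .*+ grabs the whole string and never gives anything back,
# so "test" can never match. Older Python versions reject this pattern,
# hence the version check (None means "not evaluated").
possessive = spans(r".*+test", text) if sys.version_info >= (3, 11) else None

print(greedy)      # [('xtestxxxxxxtest', 0, 15)]
print(reluctant)   # [('xtest', 0, 5), ('xxxxxxtest', 5, 15)]
print(possessive)  # [] on Python 3.11+, None on older versions
```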

A Complete Example In Python

Let’s consider a more complex example where we want to extract specific patterns from a text using regular expressions in Python:

# Python code snippet for a complete example
import re

# Build a four-line sample; "\n" separates the lines.
text = "This matches given regular expression in PHP.\n"
text += "This matches given regular expression in Python.\n"
text += "This matches given regular expression in C.\n"
text += "This matches given regular expression in Perl."

# "P\w{3,5}" requires a "P" followed by 3 to 5 word characters, and "\." is a
# literal period. "." never crosses a newline, so each match stays on one line.
result = re.findall(r"This.* P\w{3,5}\.", text)

if result:
    print(result)
else:
    print("No match")

The output will be:

['This matches given regular expression in Python.', 'This matches given regular expression in Perl.']
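
If only the language name is of interest, a capturing group can narrow the result. The following is a sketch on a four-line sample modeled on the text above. The sample strings, the variable name languages, and the use of re.MULTILINE anchors are choices made for this illustration, not part of the original example.

```python
import re

# Sample modeled on the example above (assumed for this sketch)
text = (
    "This matches given regular expression in PHP.\n"
    "This matches given regular expression in Python.\n"
    "This matches given regular expression in C.\n"
    "This matches given regular expression in Perl."
)

# With exactly one capturing group, re.findall returns only the group's text.
# re.MULTILINE makes ^ and $ match at the start and end of every line.
languages = re.findall(r"^This.* (P\w{3,5})\.$", text, flags=re.MULTILINE)

print(languages)  # ['Python', 'Perl']
```

"PHP" is skipped because only two word characters follow the "P", and "C" is skipped because it does not start with "P".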