Python - Regular Expression 

 

A regular expression is a compact notation for describing a pattern in a string. It is most widely used to find substrings that match the pattern. You can easily find a lot of information on regular expressions online (I think the Wikipedia article on Regular Expression is a pretty good start), but it is almost impossible to understand the details without practicing on your own.

 

Regular expressions, also known as RegEx, are a powerful tool for handling text patterns and manipulating strings. A RegEx is a sequence of characters that defines a search pattern, mainly used for string pattern matching and manipulation.

 

Python's re module provides support for working with regular expressions. The module offers various functions and methods to perform operations like searching, matching, replacing, and splitting text based on specified patterns.

 

Here are some key concepts and elements of Python Regular Expressions:

  • Patterns: A pattern is a sequence of characters that define the structure of the text you want to match or manipulate. Patterns may include literals, metacharacters, and special sequences.
  • Literals: These are regular characters that match themselves, such as letters, numbers, and symbols.
  • Metacharacters: These are special characters that have a specific meaning in RegEx syntax, such as ^, $, *, +, ?, {}, [], (), |, \, and . (dot).
  • Special Sequences: These are sequences of characters that have a special meaning in the pattern, such as \d, \s, \w, \D, \S, and \W.
  • Flags: Flags are modifiers that affect the behavior of the RegEx pattern matching. Some common flags include re.IGNORECASE, re.MULTILINE, and re.DOTALL.
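
As a small illustration of the three flags listed above, here is a minimal sketch. The two-line sample text and the variable names are made up for this example.

```python
import re

text = "Hello\nhello world"   # made-up two-line sample

# Default matching is case-sensitive.
case_sensitive = re.findall(r'hello', text)
# re.IGNORECASE matches letters regardless of case.
ignore_case = re.findall(r'hello', text, re.IGNORECASE)

# Without re.MULTILINE, ^ matches only at the start of the string.
first_line_only = re.findall(r'^\w+', text)
# With re.MULTILINE, ^ also matches at the start of every line.
every_line = re.findall(r'^\w+', text, re.MULTILINE)

# By default . does not match a newline; re.DOTALL changes that.
dot_default = re.search(r'Hello.hello', text)
dot_all = re.search(r'Hello.hello', text, re.DOTALL)

print(case_sensitive)    # ['hello']
print(ignore_case)       # ['Hello', 'hello']
print(first_line_only)   # ['Hello']
print(every_line)        # ['Hello', 'hello']
print(dot_default)       # None
print(repr(dot_all.group()))   # 'Hello\nhello'
```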

 

The re module provides several functions for working with RegEx:

  • re.search(): Searches the entire string for a pattern and returns a match object if found.
  • re.match(): Determines if the pattern matches at the beginning of the string.
  • re.fullmatch(): Checks if the entire string matches the pattern.
  • re.findall(): Returns a list of all non-overlapping matches of the pattern in the string.
  • re.finditer(): Returns an iterator yielding match objects for all non-overlapping matches of the pattern in the string.
  • re.sub(): Replaces all occurrences of the pattern with a specified string or a function's result.
  • re.split(): Splits the string based on the occurrences of the pattern.
  • re.compile(): Compiles a RegEx pattern into a pattern object, which can then be used for pattern matching and manipulation.
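
The rest of this page mostly uses search(), match(), and findall(), so here is a quick sketch of the other functions in the list. The sample string 'a1b22c333' is made up for illustration.

```python
import re

text = 'a1b22c333'   # made-up sample string

# re.sub(): replace every run of digits with '#'.
replaced = re.sub(r'\d+', '#', text)

# re.split(): split wherever a run of digits occurs.
# The trailing '' appears because the string ends with a match.
pieces = re.split(r'\d+', text)

# re.fullmatch(): the whole string must match the pattern.
whole = re.fullmatch(r'[a-z0-9]+', text)
partial = re.fullmatch(r'[a-z]+', text)

# re.finditer(): iterate over match objects to get positions as well.
spans = [(m.group(), m.span()) for m in re.finditer(r'\d+', text)]

print(replaced)   # a#b#c#
print(pieces)     # ['a', 'b', 'c', '']
print(whole is not None, partial is not None)   # True False
print(spans)      # [('1', (1, 2)), ('22', (3, 5)), ('333', (6, 9))]
```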

I will just keep posting a bunch of examples as my own cheatsheet and practice. I hope this helps you as well.

 

 

 

Importing Regular Expression Package

 

import re

 

 

 

Finding Patterns

 

There are several ways to find a pattern with a regular expression. search(), match(), and findall() are functions provided by the re module in Python, and each serves a different purpose when matching patterns in strings.

 

The main differences between these functions are:

  • search() finds the first occurrence of the pattern anywhere in the string.
  • match() checks for a match only at the beginning of the string.
  • findall() returns a list of all non-overlapping matches of the pattern in the entire string.
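
The differences listed above show up clearly when the string does not start with a match. This is a small sketch using a shortened, made-up variant of the sample string.

```python
import re

text = 'little 10 little 1000'   # no digit at the start

first = re.search(r'\d+', text)        # first match anywhere in the string
at_start = re.match(r'\d+', text)      # only tries at position 0
everything = re.findall(r'\d+', text)  # all non-overlapping matches

print(first.group(), first.span())   # 10 (7, 9)
print(at_start)                      # None
print(everything)                    # ['10', '1000']
```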

 

In the following example, the re.search() function is being used to search for the first occurrence of a digit in the input string '1 little 10 little 1000 little indians'.

 

re.search(r'\d','1 little 10 little 1000 little indians')

<re.Match object; span=(0, 1), match='1'>

 

The pattern r'\d' is a regular expression that matches any single digit (0-9). The r prefix before the pattern indicates that it is a raw string, which means backslashes are treated as literal characters and not escape characters. This is a common practice when defining regular expression patterns, as it helps avoid potential issues with escape sequences.
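
To see what the r prefix changes, the following sketch compares normal and raw strings. The 'cat concat' sample is made up; \b is used because it is a valid escape in both Python strings (backspace) and regular expressions (word boundary), which makes the difference visible.

```python
import re

# In a normal string '\n' is one character (a newline);
# in a raw string r'\n' is two characters: a backslash and 'n'.
newline_len = len('\n')
raw_len = len(r'\n')

# '\\d' and r'\d' are the same two-character pattern.
same_pattern = ('\\d' == r'\d')

# '\b' in a normal string is a backspace character, so nothing matches.
backspace_result = re.findall('\bcat\b', 'cat concat')
# r'\b' reaches the regex engine as a word-boundary assertion.
boundary_result = re.findall(r'\bcat\b', 'cat concat')

print(newline_len, raw_len)   # 1 2
print(same_pattern)           # True
print(backspace_result)       # []
print(boundary_result)        # ['cat']
```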

 

The re.search() function takes the pattern and the input string as arguments and searches for the first occurrence of the pattern in the string. In this case, it will find the digit '1' at the beginning of the string. Since a match is found, the function returns a match object containing information about the match.

 

In the following example, the re.search() function is being used in conjunction with the group() method to extract the first occurrence of a digit from the input string '1 little 10 little 1000 little indians'.

 

re.search(r'\d','1 little 10 little 1000 little indians').group()

'1'

 

The group() method is then called on the match object returned by re.search(). The group() method returns the matched portion of the input string as a string. In this example, it will return the digit '1' as a string.
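
Besides group(), a match object also knows where the match was found. The sketch below shows that, and also how to guard against re.search() returning None when nothing matches (calling group() on None raises an AttributeError). The 'no digits here' string is a made-up example.

```python
import re

text = '1 little 10 little 1000 little indians'
m = re.search(r'\d\d', text)   # first place where two digits are adjacent

print(m.group())                     # 10
print(m.start(), m.end(), m.span())  # 9 11 (9, 11)

# re.search() returns None when there is no match, so check before using it.
missing = re.search(r'\d', 'no digits here')
result = missing.group() if missing else None
print(result)                        # None
```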

 

In the following example, a regular expression pattern is compiled using the re.compile() function, and the findall() method is called on the compiled pattern object to find all occurrences of digits in the input string '1 little 10 little 1000 little indians'.

 

p = re.compile(r'\d')

p.findall('1 little 10 little 1000 little indians')

['1', '1', '0', '1', '0', '0', '0']

 

p = re.compile(r'\d'): The re.compile() function is used to compile a regular expression pattern into a pattern object. In this case, the pattern r'\d' is used to match any single digit (0-9). The compiled pattern object is stored in the variable p. Compiling the pattern is useful when you plan to reuse the same pattern multiple times, as it can improve performance by avoiding recompilation of the pattern with each use.

 

p.findall('1 little 10 little 1000 little indians'): The findall() method is called on the compiled pattern object p. This method returns a list of all non-overlapping matches of the pattern in the input string. In this case, the input string contains several digits: '1 little 10 little 1000 little indians'. The findall() method will return the following list of matches: ['1', '1', '0', '1', '0', '0', '0'].
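
A typical case for a compiled pattern is applying it to many strings in a loop. The sketch below assumes a made-up list of lines. Note that the re module also keeps an internal cache of recently used patterns, so the speed gain is often modest; the compiled object is at least as much about readability.

```python
import re

digits = re.compile(r'\d+')   # compiled once, reused below

# Made-up input lines for illustration.
lines = ['1 little', '10 little', 'no number', '1000 little indians']

found = [digits.findall(line) for line in lines]
print(found)   # [['1'], ['10'], [], ['1000']]

# The compiled object offers the same operations as the module-level
# functions: search(), match(), fullmatch(), sub(), split(), finditer().
first = digits.search('1000 little indians')
print(first.group())   # 1000
```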

 

The following example is the same as the previous one, except that the pattern is not compiled first; it is passed directly to re.findall().

 

re.findall(r'\d','1 little 10 little 1000 little indians')

['1', '1', '0', '1', '0', '0', '0']

 

 

 

Find All Integers

 

In the following example, the re.findall() function is being used to find all occurrences of digits in the input string '1 little 10 little 1.234 little indians'

 

re.findall(r'\d','1 little 10 little 1.234 little indians')

['1', '1', '0', '1', '2', '3', '4']

 

The pattern r'\d' is a regular expression that matches any single digit (0-9). The r prefix before the pattern is used to indicate that it is a raw string, which ensures that backslashes are treated as literal characters and not escape characters.

 

The re.findall() function takes the pattern and the input string as arguments and returns a list of all non-overlapping matches of the pattern in the string. In this case, the input string contains several digits: '1 little 10 little 1.234 little indians'. The findall() function will return the following list of matches: ['1', '1', '0', '1', '2', '3', '4'].

 

 

 

In the following example, the re.findall() function is being used to find all occurrences of a digit followed by any character in the input string '1 little 10 little 1.234 little indians'.

 

re.findall(r'\d.','1 little 10 little 1.234 little indians')

['1 ', '10', '1.', '23', '4 ']

 

The pattern r'\d.' is a regular expression consisting of two parts:

  • \d: This matches any single digit (0-9).
  • .: This is a metacharacter that matches any character (except a newline) in the input string.
  • The r prefix before the pattern indicates that it is a raw string, which ensures that backslashes are treated as literal characters and not escape characters.

 

The re.findall() function returns a list of all non-overlapping matches of the pattern in the string, so each character belongs to at most one match. In the given input string, the pattern r'\d.' will match:

    '1 ' (digit '1' followed by a space)

    '10' (digit '1' followed by digit '0')

    '1.' (digit '1' followed by a period)

    '23' (digit '2' followed by digit '3')

    '4 ' (digit '4' followed by a space)

 

In the following example, the re.findall() function is being used to find all occurrences of zero or one digit in the input string '1 little 10 little 1.234 little indians'.

 

re.findall(r'\d?','1 little 10 little 1.234 little indians')

['1', '', '', '', '', '', '', '', '', '1', '0', '', '', '', '', '', '', '', '', '1', '', '2', '3', '4', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '']

 

The ? quantifier makes \d optional, so the pattern matches a single digit wherever there is one and an empty string at every other position, including the end of the string. That is why most of the entries in the result are empty strings.

 

In the following example, the re.findall() function is being used with the non-greedy (lazy) form of the same quantifier, r'\d??', on the input string '1 little 10 little 1.234 little indians'.

 

re.findall(r'\d??','1 little 10 little 1.234 little indians')

['', '1', '', '', '', '', '', '', '', '', '', '1', '', '0', '', '', '', '', '', '', '', '', '', '1', '', '', '2', '', '3', '', '4', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '']

 

The pattern r'\d??' is a regular expression consisting of two parts:

  • \d: This matches any single digit (0-9).
  • ?: This is a quantifier that matches zero or one occurrences of the preceding pattern element. In this case, it applies to \d. The second ? makes the quantifier non-greedy, meaning it will prefer to match the shortest possible string, which is zero occurrences of a digit in this case.
  • The r prefix before the pattern indicates that it is a raw string, which ensures that backslashes are treated as literal characters and not escape characters.

 

The re.findall() function takes the pattern and the input string as arguments and returns a list of all non-overlapping matches of the pattern in the string. In this case, the input string is '1 little 10 little 1.234 little indians'. Since the pattern r'\d??' matches zero or one digits, and the non-greedy quantifier prefers the shortest match (zero digits), the first match at every position is an empty string. In Python 3.7 and later, a non-empty match may start right where an empty match ended, so at a digit the engine then tries again and matches the digit itself. Consequently, the returned list contains an empty string for every position in the input, and each digit appears right after the empty string found at its position.
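
The difference between greedy and non-greedy quantifiers is easier to see with + than with ?. The following sketch uses a made-up HTML-like string; it is only meant to show the quantifier behavior, not a recommended way to parse HTML.

```python
import re

html = '<b>bold</b> and <i>italic</i>'   # made-up sample

# Greedy: .+ grabs as much as possible, up to the last '>'.
greedy = re.findall(r'<.+>', html)

# Non-greedy: .+? stops at the first '>' that allows a match.
lazy = re.findall(r'<.+?>', html)

print(greedy)   # ['<b>bold</b> and <i>italic</i>']
print(lazy)     # ['<b>', '</b>', '<i>', '</i>']
```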

 

 

In the following example, the re.findall() function is being used to find all occurrences of exactly two consecutive digits in the input string '1 little 10 little 1.234 little indians'.

 

re.findall(r'\d{2}','1 little 10 little 1.234 little indians')

['10', '23']

 

The pattern r'\d{2}' is a regular expression consisting of two parts:

  • \d: This matches any single digit (0-9).
  • {2}: This is a quantifier that specifies that exactly two occurrences of the preceding pattern element should be matched. In this case, it applies to \d.
  • The r prefix before the pattern indicates that it is a raw string, which ensures that backslashes are treated as literal characters and not escape characters.

 

The re.findall() function takes the pattern and the input string as arguments and returns a list of all non-overlapping matches of the pattern in the string. In this case, the input string is '1 little 10 little 1.234 little indians'. The findall() function will search for exactly two consecutive digits and return the following list of matches: ['10', '23'].

 

The pattern r'\d{2}' matches the following occurrences in the given input string:

    '10' (two consecutive digits in the second number)

    '23' (two consecutive digits in the third number)

 

In the following example, the re.findall() function is being used to find all occurrences of exactly three consecutive digits in the input string '1 little 10 little 1.234 little indians'.

 

re.findall(r'\d{3}','1 little 10 little 1.234 little indians')

['234']

 

The pattern r'\d{3}' is a regular expression consisting of two parts:

  • \d: This matches any single digit (0-9).
  • {3}: This is a quantifier that specifies that exactly three occurrences of the preceding pattern element should be matched. In this case, it applies to \d.
  • The r prefix before the pattern indicates that it is a raw string, which ensures that backslashes are treated as literal characters and not escape characters.

 

The re.findall() function takes the pattern and the input string as arguments and returns a list of all non-overlapping matches of the pattern in the string. In this case, the input string is '1 little 10 little 1.234 little indians'. The findall() function will search for exactly three consecutive digits and return the following list of matches: ['234'].

 

The pattern r'\d{3}' matches the following occurrence in the given input string:

    '234' (three consecutive digits in the third number)
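
Note that \d{2} and \d{3} also match inside longer runs of digits, as the '23' taken out of '1.234' shows. If only standalone two-digit numbers are wanted, one possible approach (an addition to the examples above, not part of the original cheatsheet) is to add \b word boundaries; a range such as {2,3} is also available.

```python
import re

text = '1 little 10 little 1.234 little indians'

# Exactly two digits, anywhere (may be part of a longer run).
anywhere = re.findall(r'\d{2}', text)

# Exactly two digits with a word boundary on both sides.
standalone = re.findall(r'\b\d{2}\b', text)

# Between two and three digits; greedy, so it takes three when it can.
two_or_three = re.findall(r'\d{2,3}', text)

print(anywhere)       # ['10', '23']
print(standalone)     # ['10']
print(two_or_three)   # ['10', '234']
```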

 

 

In the following example, the re.findall() function is being used to find all occurrences of one or more consecutive digits in the input string '1 little 10 little 1000 little indians'.

 

re.findall(r'\d+','1 little 10 little 1000 little indians')

['1', '10', '1000']

 

The pattern r'\d+' is a regular expression consisting of two parts:

  • \d: This matches any single digit (0-9).
  • +: This is a quantifier that matches one or more occurrences of the preceding pattern element. In this case, it applies to \d.
  • The r prefix before the pattern indicates that it is a raw string, which ensures that backslashes are treated as literal characters and not escape characters.

 

The re.findall() function takes the pattern and the input string as arguments and returns a list of all non-overlapping matches of the pattern in the string. In this case, the input string is '1 little 10 little 1000 little indians'. The findall() function will search for sequences of one or more consecutive digits and return the following list of matches: ['1', '10', '1000'].

 

The pattern r'\d+' matches the following occurrences in the given input string:

    '1' (a single digit in the first number)

    '10' (two consecutive digits in the second number)

    '1000' (four consecutive digits in the third number)
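
findall() always returns strings, so a common next step is converting the matches to numbers. A minimal sketch:

```python
import re

text = '1 little 10 little 1000 little indians'

# Each match is a string of digits, so int() converts it directly.
numbers = [int(s) for s in re.findall(r'\d+', text)]
total = sum(numbers)

print(numbers)   # [1, 10, 1000]
print(total)     # 1011
```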

 

 

 

Find All Floating Numbers

 

In the following example, the re.findall() function is being used to find floating-point numbers (an optional single digit, a decimal point, and any digits that follow) in the input string '1 little 10 little 1.234 little indians'. The regular expression pattern used is r'\d?\.[0-9]*'.

 

re.findall(r'\d?\.[0-9]*','1 little 10 little 1.234 little indians')

['1.234']

 

Let's break down the regular expression pattern: r'\d?\.[0-9]*'

  • \d? - This part of the pattern will match an optional digit (0-9). The ? means that the preceding character (in this case, a digit) can appear 0 or 1 times.
  • \. - This part of the pattern will match a literal dot/period (.) character.
  • [0-9]* - This part of the pattern will match zero or more digits (0-9). The * means that the preceding character class (in this case, a digit) can appear 0 or more times.

 

 

In this example, the re.findall() function is being used to find all occurrences of the pattern defined by the regular expression r'\d?\.[0-9]*' in the input string '1 little -1.234 little 1.234 little indians'

 

re.findall(r'\d?\.[0-9]*','1 little -1.234 little 1.234 little indians')

['1.234', '1.234']

 

Let's break down the regular expression pattern:

  • \d? - This part of the pattern will match an optional digit (0-9). The ? means that the preceding character (in this case, a digit) can appear 0 or 1 times.
  • \. - This part of the pattern will match a literal dot/period (.) character.
  • [0-9]* - This part of the pattern will match zero or more digits (0-9). The * means that the preceding character class (in this case, a digit) can appear 0 or more times.
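
This pattern works for the sample strings above, but it has limits worth knowing: \d? allows only one digit before the dot, and because both the digit and the fraction are optional, a bare period matches too. The sketch below shows this on made-up strings; the variable named stricter is one possible alternative, not the only one.

```python
import re

pattern = r'\d?\.[0-9]*'

ok = re.findall(pattern, 'pi is 3.14')
# Only one digit is allowed before the dot, so the leading '1' is lost.
truncated = re.findall(pattern, 'price 12.50')
# Sentence-ending periods match as well.
periods = re.findall(pattern, 'The end. Bye.')

# Requiring at least one digit on each side avoids both problems.
stricter = r'\d+\.\d+'
fixed = re.findall(stricter, 'price 12.50')
no_periods = re.findall(stricter, 'The end. Bye.')

print(ok)          # ['3.14']
print(truncated)   # ['2.50']
print(periods)     # ['.', '.']
print(fixed)       # ['12.50']
print(no_periods)  # []
```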

 

In this example, the re.findall() function is being used to find all occurrences of the pattern defined by the regular expression r'[-+]?[0-9]+.?[0-9]*' in the input string '1 little -1.234 little 1.234 little indians'.

 

re.findall(r'[-+]?[0-9]+.?[0-9]*','1 little -1.234 little 1.234 little indians')

['1 ', '-1.234', '1.234']

 

The re.findall() function searches for all non-overlapping matches of the given pattern in the input string and returns them as a list. Let's break down the regular expression pattern:

  • [-+]? - This part of the pattern will match an optional plus (+) or minus (-) sign. The ? means that the preceding character class (in this case, either a plus or minus sign) can appear 0 or 1 times.
  • [0-9]+ - This part of the pattern will match one or more digits (0-9). The + means that the preceding character class (in this case, a digit) can appear 1 or more times.
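  • .? - This part of the pattern is intended to match an optional dot/period (.) character, but because the dot is not escaped it matches any single character except a newline. The ? means that this character can appear 0 or 1 times. This is why the first match, '1 ', includes the trailing space.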
  • [0-9]* - This part of the pattern will match zero or more digits (0-9). The * means that the preceding character class (in this case, a digit) can appear 0 or more times.

 

 

In this example, the re.findall() function is being used to find all occurrences of the pattern defined by the regular expression r'[-+]?[0-9]+\.?[0-9]*' in the input string '1 little -1.234 little 1.234 little indians'.

 

re.findall(r'[-+]?[0-9]+\.?[0-9]*','1 little -1.234 little 1.234 little indians')

['1', '-1.234', '1.234']

 

The re.findall() function searches for all non-overlapping matches of the given pattern in the input string and returns them as a list. Let's break down the regular expression pattern:

  • [-+]? - This part of the pattern will match an optional plus (+) or minus (-) sign. The ? means that the preceding character class (in this case, either a plus or minus sign) can appear 0 or 1 times.
  • [0-9]+ - This part of the pattern will match one or more digits (0-9). The + means that the preceding character class (in this case, a digit) can appear 1 or more times.
  • \.? - This part of the pattern will match an optional dot/period (.) character. The ? means that the preceding character (in this case, a dot) can appear 0 or 1 times. Notice that we have escaped the dot with a backslash to match the literal dot character.
  • [0-9]* - This part of the pattern will match zero or more digits (0-9). The * means that the preceding character class (in this case, a digit) can appear 0 or more times.

 

 

In this example, the re.findall() function is being used to find all occurrences of the pattern defined by the regular expression r'[-+]?[0-9]+[\.]?[0-9]*' in the input string '1 little -1.234 little 1.234 little indians'.

 

re.findall(r'[-+]?[0-9]+[\.]?[0-9]*','1 little -1.234 little 1.234 little indians')

['1', '-1.234', '1.234']

 

The re.findall() function searches for all non-overlapping matches of the given pattern in the input string and returns them as a list. Let's break down the regular expression pattern:

  • [-+]? - This part of the pattern will match an optional plus (+) or minus (-) sign. The ? means that the preceding character class (in this case, either a plus or minus sign) can appear 0 or 1 times.
  • [0-9]+ - This part of the pattern will match one or more digits (0-9). The + means that the preceding character class (in this case, a digit) can appear 1 or more times.
  • [\.]? - This part of the pattern will match an optional dot/period (.) character. The ? means that the preceding character class (in this case, a dot enclosed in square brackets) can appear 0 or 1 times.
  • [0-9]* - This part of the pattern will match zero or more digits (0-9). The * means that the preceding character class (in this case, a digit) can appear 0 or more times.
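
Inside square brackets the dot is already a literal character, so \., [.], and [\.] are interchangeable here. The sketch below checks that on the same input and then converts the matches to float.

```python
import re

text = '1 little -1.234 little 1.234 little indians'

variants = [
    r'[-+]?[0-9]+\.?[0-9]*',     # escaped dot
    r'[-+]?[0-9]+[.]?[0-9]*',    # dot inside a character class
    r'[-+]?[0-9]+[\.]?[0-9]*',   # escaped dot inside a character class
]
results = [re.findall(p, text) for p in variants]

for p, r in zip(variants, results):
    print(p, r)   # each prints ['1', '-1.234', '1.234']

# The matches are strings; float() turns them into numbers.
values = [float(s) for s in results[0]]
print(values)     # [1.0, -1.234, 1.234]
```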

 

 

 

Find All Numbers (Integer and Floating numbers)

 

In this example, the re.findall() function is being used to find all occurrences of the pattern defined by the regular expression r'[-+]?[0-9]+.?[0-9]*' in the input string '1 little -1.234 little 1.234 little indians'.

 

re.findall(r'[-+]?[0-9]+.?[0-9]*','1 little -1.234 little 1.234 little indians')

['1 ', '-1.234', '1.234']

 

The re.findall() function searches for all non-overlapping matches of the given pattern in the input string and returns them as a list. Let's break down the regular expression pattern:

  • [-+]? - This part of the pattern will match an optional plus (+) or minus (-) sign. The ? means that the preceding character class (in this case, either a plus or minus sign) can appear 0 or 1 times.
  • [0-9]+ - This part of the pattern will match one or more digits (0-9). The + means that the preceding character class (in this case, a digit) can appear 1 or more times.
  • .? - This part of the pattern is meant to match an optional dot/period (.) character. The ? means that the preceding character can appear 0 or 1 times. Note that this dot is not escaped, so it actually represents any character except a newline, which is why the first match is '1 ' with a trailing space. Escaping the dot as \.? (as in the previous section) returns '1' instead.
  • [0-9]* - This part of the pattern will match zero or more digits (0-9). The * means that the preceding character class (in this case, a digit) can appear 0 or more times.
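
As a sketch of a stricter alternative (my own variation, not the only way to do it), the fractional part can be made a single optional unit with a non-capturing group (?:...). The sample sentence is made up. A plain capturing group (...) would change what findall() returns, which the last line demonstrates.

```python
import re

text = 'I have 3. You have -4.5.'   # made-up sample with sentence periods

# Escaped dot, but the dot and the fraction are independent options,
# so the period after '3' is included in the match.
loose = re.findall(r'[-+]?[0-9]+\.?[0-9]*', text)

# The dot is only accepted when at least one digit follows it.
strict = re.findall(r'[-+]?\d+(?:\.\d+)?', text)

# With a capturing group, findall() returns the group contents instead
# of the whole match (an empty string when the group did not take part).
capturing = re.findall(r'[-+]?\d+(\.\d+)?', text)

print(loose)       # ['3.', '-4.5']
print(strict)      # ['3', '-4.5']
print(capturing)   # ['', '.5']
```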

 

 

 

Finding a Time/Date Pattern

 

In this example, the re.findall() function is being used to find all occurrences of the pattern defined by the regular expression r'\d{2}\:\d{2}\:\d{2}\.\d{3}' in the input string 'now time is 11:23:45.123 or 11.23.45:123 or 11-23-45.123'.

 

re.findall(r'\d{2}\:\d{2}\:\d{2}\.\d{3}','now time is 11:23:45.123 or 11.23.45:123 or 11-23-45.123')

['11:23:45.123']

 

The re.findall() function searches for all non-overlapping matches of the given pattern in the input string and returns them as a list. Let's break down the regular expression pattern:

  • \d{2} - This part of the pattern will match exactly two digits (0-9). The {2} means that the preceding character class (in this case, a digit) must appear exactly 2 times.
  • \: - This part of the pattern will match a literal colon (:) character.
  • \d{2} - This part of the pattern, like the first one, will match exactly two digits (0-9).
  • \: - This part of the pattern will again match a literal colon (:) character.
  • \d{2} - This part of the pattern, like the first and second ones, will match exactly two digits (0-9).
  • \. - This part of the pattern will match a literal dot/period (.) character.
  • \d{3} - This part of the pattern will match exactly three digits (0-9). The {3} means that the preceding character class (in this case, a digit) must appear exactly 3 times.
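
Once the timestamp is found, capture groups can pull out its parts. This is a minimal sketch; the names time_re, hours, minutes, seconds, and millis are my own. The colon is written without a backslash here because : is not a metacharacter, so \: and : behave the same.

```python
import re

text = 'now time is 11:23:45.123 or 11.23.45:123 or 11-23-45.123'

# Each pair of parentheses captures one field of the timestamp.
time_re = re.compile(r'(\d{2}):(\d{2}):(\d{2})\.(\d{3})')

m = time_re.search(text)
hours, minutes, seconds, millis = (int(g) for g in m.groups())

print(m.group(0))                        # 11:23:45.123
print(hours, minutes, seconds, millis)   # 11 23 45 123
```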

 

In this example, the regular expression pattern is the same as in the previous example; only the input string is different.

 

re.findall(r'\d{2}\:\d{2}\:\d{2}\.\d{3}','now time is 11:23:45.1 or 11:23:45.12 or 11:23:45.123')

['11:23:45.123']

 

In this example, 11:23:45.1 and 11:23:45.12 are not found because the pattern ends with \.\d{3}, which requires exactly three digits after the dot (.).

 

 

In this example, the re.findall() function is being used to find all occurrences of the pattern defined by the regular expression r'\d{2}\:\d{2}\:\d{2}\.\d+' in the input string 'now time is 11:23:45.1 or 11:23:45.12 or 11:23:45.123'.

 

re.findall(r'\d{2}\:\d{2}\:\d{2}\.\d+','now time is 11:23:45.1 or 11:23:45.12 or 11:23:45.123')

['11:23:45.1', '11:23:45.12', '11:23:45.123']

 

The re.findall() function searches for all non-overlapping matches of the given pattern in the input string and returns them as a list. Let's break down the regular expression pattern:

  • \d{2} - This part of the pattern will match exactly two digits (0-9). The {2} means that the preceding character class (in this case, a digit) must appear exactly 2 times.
  • \: - This part of the pattern will match a literal colon (:) character.
  • \d{2} - This part of the pattern, like the first one, will match exactly two digits (0-9).
  • \: - This part of the pattern will again match a literal colon (:) character.
  • \d{2} - This part of the pattern, like the first and second ones, will match exactly two digits (0-9).
  • \. - This part of the pattern will match a literal dot/period (.) character.
  • \d+ - This part of the pattern will match one or more digits (0-9). The + means that the preceding character class (in this case, a digit) must appear at least 1 time.
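
With a variable-length fraction, named groups and finditer() make it easy to process every timestamp in turn. In this sketch the group names (hour, minute, second, frac) are assumptions chosen for the example.

```python
import re

text = 'now time is 11:23:45.1 or 11:23:45.12 or 11:23:45.123'

# (?P<name>...) gives each captured field a name.
time_re = re.compile(
    r'(?P<hour>\d{2}):(?P<minute>\d{2}):(?P<second>\d{2})\.(?P<frac>\d+)'
)

matches = list(time_re.finditer(text))
stamps = [m.group(0) for m in matches]
fractions = [m.group('frac') for m in matches]

# Rebuild the seconds field, including the fraction, as a float.
seconds = [float(m.group('second') + '.' + m.group('frac')) for m in matches]

print(stamps)      # ['11:23:45.1', '11:23:45.12', '11:23:45.123']
print(fractions)   # ['1', '12', '123']
print(seconds)     # [45.1, 45.12, 45.123]
```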

 

 

 

Match the pattern

 

In this section I focus on a few specific metacharacters, as shown below:

    # ^ : start

    # $ : end

    # [...] : character class

    # * : zero or more repetitions

^: This symbol represents the start of the string. When used in a regular expression pattern, it asserts that the following pattern should match at the beginning of the string (or, with the re.MULTILINE flag, at the beginning of each line).

$: This symbol represents the end of the string. When used in a regular expression pattern, it asserts that the preceding pattern should match at the end of the string or just before a newline at the very end of the string (or, with the re.MULTILINE flag, at the end of each line).

[...]: Square brackets are used to define a character class, which is a set of characters. A character class matches any one of the characters inside the square brackets. For example, [abc] will match any one of the characters 'a', 'b', or 'c'. You can also use a hyphen - to define a range of characters, such as [a-z] to match any lowercase letter, or [0-9] to match any digit.

*: This symbol is a quantifier that means "zero or more" of the preceding character, character class, or group. When used in a regular expression pattern, it allows the preceding element to repeat any number of times, including zero occurrences. For example, the pattern a* can match an empty string, 'a', 'aa', 'aaa', and so on.
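
The following sketch shows how ^ and $ behave with and without re.MULTILINE, and one detail that is easy to miss: $ also matches just before a trailing newline, whereas re.fullmatch() requires the pattern to cover the entire string. The sample strings are made up.

```python
import re

text = 'hello\nworld'   # made-up two-line sample

# Without re.MULTILINE, ^ and $ anchor to the whole string only.
no_flag = re.findall(r'^[a-z]+$', text)
# With re.MULTILINE, they anchor to each line.
multiline = re.findall(r'^[a-z]+$', text, re.MULTILINE)

# $ matches before a single trailing newline, so this still matches.
match_trailing = re.match(r'^[a-z]*$', 'hello\n') is not None
# fullmatch() must consume the newline too, so this does not match.
fullmatch_trailing = re.fullmatch(r'[a-z]*', 'hello\n') is not None

print(no_flag)              # []
print(multiline)            # ['hello', 'world']
print(match_trailing)       # True
print(fullmatch_trailing)   # False
```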

 

 

import re

 

# Raw strings (r'...') keep Python from treating \s as a string escape sequence.

p = re.compile(r'^[a-z]*$')

print(r"re.compile('^[a-z]*$');p.match('helloworld')", "\n\t" , p.match('helloworld'))

 

p = re.compile(r'^[a-z]*$')

print(r"re.compile('^[a-z]*$');p.match('hello world')", "\n\t"  , p.match('hello world'))

 

p = re.compile(r'^[a-z\s]*$')

print(r"re.compile('^[a-z\s]*$');p.match('hello world')", "\n\t" , p.match('hello world'))

 

p = re.compile(r'^[a-z]*$')

print(r"re.compile('^[a-z]*$');p.match('HelloWorld')", "\n\t"  , p.match('HelloWorld'))

 

p = re.compile(r'^[a-zA-Z]*$')

print(r"re.compile('^[a-zA-Z]*$');p.match('HelloWorld')", "\n\t"  , p.match('HelloWorld'))

 

p = re.compile(r'^[a-zA-Z\s]*$')

print(r"re.compile('^[a-zA-Z\s]*$');p.match('Hello World')", "\n\t"  , p.match('Hello World'))

 

p = re.compile(r'^[a-zA-Z\s]*$')

print(r"re.compile('^[a-zA-Z\s]*$');p.match('Hello1234 World')", "\n\t"  , p.match('Hello1234 World'))

 

p = re.compile(r'^[a-zA-Z0-9\s]*$')

print(r"re.compile('^[a-zA-Z0-9\s]*$');p.match('Hello1234 World')", "\n\t"  , p.match('Hello1234 World'))

 

p = re.compile(r'^[a-zA-Z0-9\s]*$')

print(r"re.compile('^[a-zA-Z0-9\s]*$');p.match('###Hello1234 World')", "\n\t"  , p.match('###Hello1234 World'))

 

p = re.compile(r'[a-zA-Z0-9#\s]*$')

print(r"re.compile('[a-zA-Z0-9#\s]*$');p.match('###Hello1234 World')", "\n\t"  , p.match('###Hello1234 World'))

 

p = re.compile(r'.*[a-zA-Z0-9\s]*$')

print(r"re.compile('.*[a-zA-Z0-9\s]*$');p.match('###Hello1234 World')", "\n\t"  , p.match('###Hello1234 World'))

Output (Python 3.7 or later), with an explanatory note above each result:

 

# ^[a-z]*$ <-- lowercase letters only, any number of characters, no space

# 'helloworld' matches this criteria

re.compile('^[a-z]*$');p.match('helloworld')

     <re.Match object; span=(0, 10), match='helloworld'>

 

# ^[a-z]*$ <-- lowercase letters only, any number of characters, no space

# 'hello world' doesn't match this criteria since there is a space in it

re.compile('^[a-z]*$');p.match('hello world')

     None

 

# ^[a-z\s]*$ <-- lowercase letters and whitespace, any number of characters

# 'hello world' matches this criteria

re.compile('^[a-z\s]*$');p.match('hello world')

     <re.Match object; span=(0, 11), match='hello world'>

 

# ^[a-z]*$ <-- lowercase letters only, any number of characters, no space

# 'HelloWorld' doesn't match this criteria since it has capital letters in it

re.compile('^[a-z]*$');p.match('HelloWorld')

     None

 

# ^[a-zA-Z]*$ <-- letters only (lower or upper case), any number of characters, no space

# 'HelloWorld' matches this criteria

re.compile('^[a-zA-Z]*$');p.match('HelloWorld')

     <re.Match object; span=(0, 10), match='HelloWorld'>

 

# ^[a-zA-Z\s]*$ <-- letters (lower or upper case) and whitespace, any number of characters

# 'Hello World' matches this criteria

re.compile('^[a-zA-Z\s]*$');p.match('Hello World')

     <re.Match object; span=(0, 11), match='Hello World'>

 

# ^[a-zA-Z\s]*$ <-- letters (lower or upper case) and whitespace, any number of characters

# 'Hello1234 World' doesn't match this criteria because it has numbers in it

re.compile('^[a-zA-Z\s]*$');p.match('Hello1234 World')

     None

 

# ^[a-zA-Z0-9\s]*$ <-- letters (lower or upper case), digits, and whitespace, any number of characters

# 'Hello1234 World' matches this criteria

re.compile('^[a-zA-Z0-9\s]*$');p.match('Hello1234 World')

     <re.Match object; span=(0, 15), match='Hello1234 World'>

 

# ^[a-zA-Z0-9\s]*$ <-- letters (lower or upper case), digits, and whitespace, any number of characters

# '###Hello1234 World' doesn't match this criteria because it has non-alphanumeric characters (#) in it

re.compile('^[a-zA-Z0-9\s]*$');p.match('###Hello1234 World')

     None

 

# [a-zA-Z0-9#\s]*$ <-- letters (lower or upper case), digits, #, and whitespace, any number of characters

# (no ^ is needed here because match() always starts at the beginning of the string)

# '###Hello1234 World' matches this criteria

re.compile('[a-zA-Z0-9#\s]*$');p.match('###Hello1234 World')

     <re.Match object; span=(0, 18), match='###Hello1234 World'>

 

# .*[a-zA-Z0-9\s]*$ <-- .* accepts any number of arbitrary characters (except newline) before the character class,

# so characters such as # are allowed even though they are not in the class

# '###Hello1234 World' matches this criteria

re.compile('.*[a-zA-Z0-9\s]*$');p.match('###Hello1234 World')

     <re.Match object; span=(0, 18), match='###Hello1234 World'>
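
 

The same checks can be written more compactly as a table of (pattern, text, expected) entries. This is just a sketch of one way to organize such experiments; the name cases and the loop are my own, while the patterns, inputs, and expected results are the ones from this section.

```python
import re

# (pattern, input string, whether match() is expected to succeed)
cases = [
    (r'^[a-z]*$',          'helloworld',         True),
    (r'^[a-z]*$',          'hello world',        False),
    (r'^[a-z\s]*$',        'hello world',        True),
    (r'^[a-z]*$',          'HelloWorld',         False),
    (r'^[a-zA-Z]*$',       'HelloWorld',         True),
    (r'^[a-zA-Z\s]*$',     'Hello World',        True),
    (r'^[a-zA-Z\s]*$',     'Hello1234 World',    False),
    (r'^[a-zA-Z0-9\s]*$',  'Hello1234 World',    True),
    (r'^[a-zA-Z0-9\s]*$',  '###Hello1234 World', False),
    (r'[a-zA-Z0-9#\s]*$',  '###Hello1234 World', True),
    (r'.*[a-zA-Z0-9\s]*$', '###Hello1234 World', True),
]

results = []
for pattern, text, expected in cases:
    matched = re.match(pattern, text) is not None
    results.append(matched)
    status = 'as expected' if matched == expected else 'UNEXPECTED'
    print(f'{pattern!r:24} {text!r:22} -> {matched} ({status})')
```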

 

 

 
