Missing part of special group in regular expression
ID: py/regex/incomplete-special-group
Kind: problem
Severity: warning
Precision: high
Tags:
- reliability
- correctness
Query suites:
- python-security-and-quality.qls
One of the problems with using regular expressions is that almost any sequence of characters is a valid pattern. This means that it is easy to omit a necessary character and still have a valid regular expression. Omitting a character in a named capturing group is a specific case which can dramatically change the meaning of a regular expression.
Recommendation
Examine the regular expression to find and correct any typos.
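One way to catch this kind of typo early is to check that a compiled pattern actually defines the group names the surrounding code expects. The following is a minimal sketch, not part of the query itself; the helper name missing_groups is hypothetical, while groupindex is the standard attribute of compiled patterns that maps each named group to its number.

```python
import re

def missing_groups(pattern, expected_names):
    """Return the expected group names that the pattern does not define.

    missing_groups is a hypothetical helper introduced for illustration.
    """
    compiled = re.compile(pattern)
    # groupindex only contains groups declared with the (?P<name>...) syntax.
    return sorted(set(expected_names) - set(compiled.groupindex))

print(missing_groups(r'(P<name>[\w]+)', ['name']))   # ['name'], the '?' is missing
print(missing_groups(r'(?P<name>[\w]+)', ['name']))  # [], the group is defined
```

A check like this could run in a unit test next to the code that calls m.group('name'), so that a missing "?" is reported before the pattern is used.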
Example
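In the following example, the regular expression for matcher, r"(P<name>[\w]+)", is missing a "?" and will match only strings that start with the literal text "P<name>" followed by one or more word characters (letters, digits, and underscores), instead of matching any sequence of word characters and placing the result in a named group. Because the broken pattern defines no group called name, the call m.group('name') raises an IndexError whenever the pattern does match. The fixed version, fixed_matcher, includes the "?" and will work as expected.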
import re

matcher = re.compile(r'(P<name>[\w]+)')

def only_letters(text):
    m = matcher.match(text)
    if m:
        print("Letters are: " + m.group('name'))

# Fix the pattern by adding the missing '?'
fixed_matcher = re.compile(r'(?P<name>[\w]+)')
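The difference between the two patterns can be observed directly. The sketch below is illustrative only; the variable names broken and fixed and the sample inputs are assumptions chosen for the demonstration.

```python
import re

broken = re.compile(r'(P<name>[\w]+)')
fixed = re.compile(r'(?P<name>[\w]+)')

# The broken pattern treats "P<name>" as literal text, so ordinary input fails.
print(broken.match('hello'))  # None

# It matches only input that begins with the literal characters "P<name>".
m = broken.match('P<name>hello')
print(m.group(1))  # P<name>hello

# No named group exists, so looking one up by name raises IndexError.
try:
    m.group('name')
except IndexError:
    print("broken pattern has no group called 'name'")

# The fixed pattern captures the word characters into the group "name".
print(fixed.match('hello').group('name'))  # hello
```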
References
Python Standard Library: Regular expression operations.
Regular-Expressions.info: Named Capturing Groups.