Python re.match, search Examples
Execute regular expressions with re: call match, search, split and findall.
Regular expressions. These are tiny programs that process text. We access regular expressions through the re module. We call methods like re.match().
With methods, such as match() and search(), we run these little programs. More advanced methods like groupdict can process groups. Findall handles multiple matches. It returns a list.
Match example. This program uses a regular expression in a loop. It applies a for-loop over the elements in a list. In the loop body, we call re.match().
Then: We test this call for success. If it was successful, groups() returns a tuple containing the text content that matches the pattern.
Pattern: This uses metacharacters to describe what strings can be matched. The "\w" means "word character." The plus means "one or more."
Tip: Much of the power of regular expressions comes from patterns. We cover Python methods (like re.match) and these metacharacters.
Python program that uses match
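import re

# Sample strings (named "items" to avoid shadowing the built-in list).
items = ["dog dot", "do don't", "dumb-dumb", "no match"]

# Loop.
for element in items:
    # Match if two words starting with letter d.
    # A raw string keeps the backslashes intact for the regex engine.
    m = re.match(r"(d\w+)\W(d\w+)", element)
    # See if success.
    if m:
        print(m.groups())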
Output
('dog', 'dot')
('do', 'don')
('dumb', 'dumb')
Pattern details
Pattern: (d\w+)\W(d\w+)
d Lowercase letter d.
\w+ One or more word characters.
\W A non-word character.
Search. This method is different from match. Both apply a pattern. But search attempts this at all possible starting points in the string. Match just tries the first starting point.
So: Search scans through the input string and tries to match at any location. In this example, search succeeds but match fails.
Python program that uses search
import re

# Input.
value = "voorheesville"

m = re.search("(vi.*)", value)
if m:
    # This is reached.
    print("search:", m.group(1))

m = re.match("(vi.*)", value)
if m:
    # This is not reached.
    print("match:", m.group(1))
Output
search: ville
Pattern details
Pattern: (vi.*)
vi The lowercase letters v and i together.
.* Zero or more characters of any type.
Split. The re.split() method accepts a pattern argument. This pattern specifies the delimiter. With it, we can use any text that matches a pattern as the delimiter to separate text data.
Here: We split the string on one or more non-digit characters. The regular expression is described after the script output.
Tip: A split() method is also available directly on a string. This method handles no regular expressions. It is simpler.
Python program that uses split
import re

# Input string.
value = "one 1 two 2 three 3"

# Separate on one or more non-digit characters.
# The input starts with a delimiter, so the first element is an empty string.
result = re.split(r"\D+", value)

# Print results.
for element in result:
    print(element)
Output

1
2
3
Pattern details
Pattern: \D+
\D+ One or more non-digit characters.
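The difference between the string split() method and re.split can be sketched as follows. This is a minimal illustration that reuses the sample string from above. The names words, parts and digits are chosen here for the example, and dropping empty strings is an assumption about what a caller would usually want.

```python
import re

value = "one 1 two 2 three 3"

# str.split with no argument separates on runs of whitespace only.
words = value.split()

# re.split separates on whatever the pattern matches (here, runs of non-digits).
# The input begins with a delimiter, so the first element is an empty string.
parts = re.split(r"\D+", value)

# Dropping empty strings is a common follow-up step.
digits = [p for p in parts if p]

print(words)   # ['one', '1', 'two', '2', 'three', '3']
print(parts)   # ['', '1', '2', '3']
print(digits)  # ['1', '2', '3']
```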
Findall. This is similar to split(). Findall accepts a pattern that indicates which strings to return in a list. It is like split() but we specify matching parts, not delimiters.
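Here: We scan a string for all words starting with the letter d or p, and with one or more following word characters. The pattern has no word-boundary anchor, so it relies on the sample words not containing a d or p in the middle.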
Python program that uses findall
import re

# Input.
value = "abc 123 def 456 dot map pat"

# Find all words starting with d or p.
# The result is named "result" to avoid shadowing the built-in list.
result = re.findall(r"[dp]\w+", value)

# Print result.
print(result)
Output
['def', 'dot', 'pat']
Pattern details
Pattern: [dp]\w+
[dp] A lowercase d, or a lowercase p.
\w+ One or more word characters.
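The pattern [dp]\w+ can also match in the middle of a word. A small sketch of this, with the word "adopt" added to the sample string as an assumption for the demonstration, compares it with a version anchored by the "\b" word-boundary metacharacter.

```python
import re

# "adopt" is added to the sample input to show a mid-word match.
value = "abc 123 def 456 dot map pat adopt"

# Without an anchor, the pattern can start matching inside a word.
loose = re.findall(r"[dp]\w+", value)

# \b requires a word boundary just before the d or p.
strict = re.findall(r"\b[dp]\w+", value)

print(loose)   # ['def', 'dot', 'pat', 'dopt']
print(strict)  # ['def', 'dot', 'pat']
```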
Finditer. Unlike re.findall, which returns strings, finditer returns matches. For each match, we call methods like start() or end(). And we can access the value of the match with group().
Python program that uses finditer
import re

value = "123 456 7890"

# Loop over all matches found.
for m in re.finditer(r"\d+", value):
    print(m.group(0))
    print("start index:", m.start())
Output
123
start index: 0
456
start index: 4
7890
start index: 8
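The program above calls start() only. As a short sketch, the end() method can be used in the same way. The list name found is just for this example.

```python
import re

value = "123 456 7890"

# Collect (text, start, end) for each run of digits.
found = [(m.group(0), m.start(), m.end()) for m in re.finditer(r"\d+", value)]

for text, start, end in found:
    # end() is one past the last matched character, so a slice recovers the text.
    print(text, start, end, value[start:end] == text)
```

Because end() points one past the match, value[start:end] gives back the matched text, which is the same convention Python slices use.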
Start, end. We can use special characters in an expression to match the start and end of a string. For the start, we use the character "^" and for the end, we use the "$" sign.
Here: We loop over a list of strings and call re.match. We detect all the strings that start or end with a digit character, which is written "\d".
Tip: The match method tests from the leftmost part of the string. So to test the end, we use ".*" to handle these initial characters.
Python program that tests starts, ends
import re

# Named "items" to avoid shadowing the built-in list.
items = ["123", "4cat", "dog5", "6mouse"]

for element in items:
    # See if string starts in digit.
    m = re.match(r"^\d", element)
    if m:
        print("START:", element)

    # See if string ends in digit.
    m = re.match(r".*\d$", element)
    if m:
        print(" END:", element)
Output
START: 123
 END: 123
START: 4cat
 END: dog5
START: 6mouse
Pattern details
^\d Match at the start, check for single digit.
.*\d$ Check for zero or more of any char, check for single digit, match at the end.
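An alternative to the ".*" prefix is re.search, which scans for a match at any position. The following is a sketch of that approach on the same sample strings. The list names are assumptions made for the example.

```python
import re

items = ["123", "4cat", "dog5", "6mouse"]

# re.search can find the final digit without a leading ".*".
ends_with_digit = [s for s in items if re.search(r"\d$", s)]

# re.match already begins at the start of the string, so "^" is optional here.
starts_with_digit = [s for s in items if re.match(r"\d", s)]

print(starts_with_digit)  # ['123', '4cat', '6mouse']
print(ends_with_digit)    # ['123', 'dog5']
```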
Or, repeats. Here we match strings with three word characters or three dashes at their starts. And the final three characters must be digits. We use non-capturing groups with the "?:" syntax.
And: We use the "{3}" codes to require three repetitions of word characters or hyphens.
Finally: We specify digit characters with the code "\d" and the metacharacter "$" to require the end of the string.
Python program that uses re, expressions, repeats, or
import re

values = ["cat100", "---200", "xxxyyy", "jjj", "box4000", "tent500"]

for v in values:
    # Require 3 word characters OR 3 dashes.
    # ... Also require 3 digits.
    m = re.match(r"(?:(?:\w{3})|(?:\-{3}))\d\d\d$", v)
    if m:
        print(" OK:", v)
    else:
        print("FAIL:", v)
Output
 OK: cat100
 OK: ---200
FAIL: xxxyyy
FAIL: jjj
FAIL: box4000
FAIL: tent500
Pattern details
(?: The start of a non-capturing group.
\w{3} Three word characters.
| Logical or: a group within the chain must match.
\- An escaped hyphen.
\d A digit.
$ The end of the string.
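Note that "\w" accepts digits and underscores as well as letters. If only letters should pass, a character class can be used instead. This sketch compares the two. The names loose and strict, and the extra test strings, are assumptions for the illustration.

```python
import re

# Same idea as the pattern above, written with {3} for the digits too.
loose = r"(?:\w{3}|-{3})\d{3}$"

# A stricter variant that only allows ASCII letters in the first part.
strict = r"(?:[A-Za-z]{3}|-{3})\d{3}$"

values = ["cat100", "---200", "123456", "a_b789"]
for v in values:
    print(v, bool(re.match(loose, v)), bool(re.match(strict, v)))
```

The last two strings pass the loose pattern but fail the strict one, because "123" and "a_b" are word characters but not letters.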
Named groups. A regular expression can have named groups. This makes it easier to retrieve those groups after calling match(). But it makes the pattern more complex.
Here: We can get the first name with the string "first" and the group() method. We use "last" for the last name.
Python program that uses named groups
import re

# A string.
name = "Clyde Griffiths"

# Match with named groups.
m = re.match(r"(?P<first>\w+)\W+(?P<last>\w+)", name)

# Print groups using names as id.
if m:
    print(m.group("first"))
    print(m.group("last"))
Output
Clyde
Griffiths
Pattern details
Pattern: (?P<first>\w+)\W+(?P<last>\w+)
(?P<first>\w+) First named group.
\W+ One or more non-word characters.
(?P<last>\w+) Second named group.
Groupdict. A regular expression with named groups can fill a dictionary. This is done with the groupdict() method. In the dictionary, each group name is a key.
And: Each value is the data matched by the regular expression. So we receive a key-value store based on groups.
Here: With groupdict, we eliminate all references to the original regular expression. We can change the data to dictionary format.
Python program that uses groupdict
import re

name = "Roberta Alden"

# Match names.
m = re.match(r"(?P<first>\w+)\W+(?P<last>\w+)", name)
if m:
    # Get dict.
    d = m.groupdict()

    # Loop over dictionary with for-loop.
    for t in d:
        print(" key:", t)
        print("value:", d[t])
Output
 key: first
value: Roberta
 key: last
value: Alden
Comment. Sometimes a regular expression is confusing. A comment can be used to explain a complex part. One problem is the comment syntax may be confusing too—this should be considered.
Here: We see that a Regex comment is written as "(?#...)": it starts with "(?#" and ends at the closing parenthesis. The "#" character is the same one that starts a comment in Python itself.
Python program that uses Regex comments
import re

data = "bird frog"

# Use comments inside a regular expression.
m = re.match(r"(?#Before part).+?(?#Separator)\W(?#End part)(.+)", data)
if m:
    print(m.group(1))
Output
frog
Pattern details
(?#Before part) Comment, ignored
.+? As few characters as possible
(?#Separator) Comment, ignored
\W Non-word character
(?#End part) Comment, ignored
(.+) One or more characters, captured
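Another way to comment a pattern is the re.VERBOSE flag, which ignores whitespace in the pattern and treats "#" as the start of a comment. This is a sketch of the same pattern written that way. The names pattern and result are chosen for the example.

```python
import re

data = "bird frog"

# With re.VERBOSE, whitespace in the pattern is ignored and "#" starts a comment.
pattern = re.compile(r"""
    .+?     # Before part: as few characters as possible.
    \W      # Separator: a single non-word character.
    (.+)    # End part: one or more characters, captured.
    """, re.VERBOSE)

m = pattern.match(data)
result = m.group(1) if m else None
print(result)  # frog
```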
Not-followed-by. We use a negative match pattern to ensure a value does not match. In this example, we match all the 3-digit strings except ones that are followed by a "dog" string.
Tip: This is called a "negative lookahead assertion." It may be clearer to filter out results in Python code after matching.
Python program that uses not-followed-by pattern
import re

data = "100cat 200cat 300dog 400cat 500car"

# Find all 3-digit strings except those followed by "dog" string.
# ... Dogs are not allowed.
m = re.findall(r"(?!\d\d\ddog)(\d\d\d)", data)
print(m)
Output
['100', '200', '400', '500']
Pattern details
(?!\d\d\ddog) Not followed by 3 digits and "dog"
(\d\d\d) 3 digit value
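The alternative mentioned in the tip, filtering in Python after matching, might look like the sketch below. It assumes each number is followed directly by lowercase letters, as in the sample data. The names pairs and allowed are chosen here.

```python
import re

data = "100cat 200cat 300dog 400cat 500car"

# Capture each 3-digit number together with the letters that follow it.
pairs = re.findall(r"(\d\d\d)([a-z]+)", data)

# Keep the numbers whose following text does not begin with "dog".
allowed = [number for number, word in pairs if not word.startswith("dog")]

print(allowed)  # ['100', '200', '400', '500']
```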
Benchmark, search. Regular expressions often hinder performance in programs. I tested the in-operator on a string against the re.search method.
Version 1: This version of the code uses the in-operator to find the letter "x" in the string.
Version 2: Here we use re.search (a regular expression method) to find the same letter.
Result: I found that the in-operator was much faster than the re.search method. For searching with no pattern, prefer the in-operator.
Python program that tests re.search
import time
import re

# Named "value" to avoid shadowing the built-in input function.
value = "max"

if "x" in value:
    print(1)
if re.search("x", value):
    print(2)

print(time.time())

# Version 1: in.
c = 0
i = 0
while i < 1000000:
    if "x" in value:
        c += 1
    i += 1

print(time.time())

# Version 2: re.search.
i = 0
while i < 1000000:
    if re.search("x", value):
        c += 1
    i += 1

print(time.time())
Output
1
2
1381081435.177
1381081435.615 [in = 0.438 s]
1381081437.224 [re.search = 1.609 s]
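The same comparison can be sketched with the timeit module, which handles the looping and timing. This version also tries a pattern compiled once with re.compile. The function names and the iteration count are assumptions for the example, and the timings depend on the machine, so none are shown.

```python
import re
import timeit

value = "max"
pattern = re.compile("x")

def use_in():
    # Version 1: the in-operator.
    return "x" in value

def use_search():
    # Version 2: re.search with a pattern string.
    return re.search("x", value) is not None

def use_compiled():
    # Version 3: a pattern object compiled once, outside the loop.
    return pattern.search(value) is not None

for name, func in [("in", use_in), ("re.search", use_search), ("compiled", use_compiled)]:
    seconds = timeit.timeit(func, number=100000)
    print(name, seconds)
```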
Sub method. The re.sub method can apply a function or lambda to each match found in a string. We specify a pattern and a function that receives a match object and returns the replacement text. And we can process matches in any way.
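A minimal sketch of re.sub with a function and with a lambda follows. The sample string and the helper name double are made up for this illustration. The related re.subn call returns the new string together with the number of replacements.

```python
import re

value = "10 apples and 20 pears"

def double(m):
    # m is a match object; the return value replaces the matched text.
    return str(int(m.group(0)) * 2)

# Replace each run of digits with its doubled value.
doubled = re.sub(r"\d+", double, value)

# A lambda works the same way: uppercase each run of lowercase letters.
upper = re.sub(r"[a-z]+", lambda m: m.group(0).upper(), value)

# subn also reports how many replacements were made.
text, count = re.subn(r"\d+", double, value)

print(doubled)      # 20 apples and 40 pears
print(upper)        # 10 APPLES AND 20 PEARS
print(text, count)  # 20 apples and 40 pears 2
```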
Re.match performance. In another test I rewrote a method that uses re.match to use if-statements and a for-loop. It became much faster.
Word count. We implement a simple word-counting routine. We use re.findall and count non-whitespace sequences in a string. And then we return the length of the resulting list.
Tip: Implementing small methods, like word counting ones, will help us learn to use Python in a versatile way.
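The word-counting routine described above can be sketched in a few lines. The function name word_count is an assumption. The pattern "\S+" matches one run of non-whitespace characters.

```python
import re

def word_count(text):
    # Each run of non-whitespace characters counts as one word.
    return len(re.findall(r"\S+", text))

print(word_count("To be, or not to be"))  # 6
print(word_count("  two   words "))       # 2
print(word_count(""))                     # 0
```

Punctuation attached to a word, like the comma in "be," above, stays part of that word and does not change the count.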
A summary. A regular expression is often hard to write correctly. But when finished, it is shorter and overall simpler to maintain. It describes a specific type of logic.
Text processing. The re module handles only text processing, in a concise way. We can search and match strings based on patterns. Performance suffers when regular expressions are used excessively.