TheDeveloperBlog.com


Python Re Sub, Subn Methods


Re.sub. In regular expressions, sub stands for substitution. The re.sub method replaces every match of a pattern in a string. The replacement can be a plain string, or a method (or lambda) that is called for each match and returns the new text.


This method can modify strings in complex ways. We can apply transformations, such as changing the numbers within a string. The syntax can be hard to follow at first.


This example introduces a method "multiply" that receives a match object. It accesses group(0), the full matched text, and converts it into an integer. It multiplies that number by two, converts the result back to a string, and returns it.

Sub: In the main program body, we call the re.sub method. The first argument is "\d+" which means one or more digit chars.

And: The second argument is the multiply method name. We also pass a sample string for processing.

Result: The re.sub method matched each group of digits (each number) and the multiply method doubled it.

Based on:

Python 3

Python program that uses re.sub

import re

def multiply(m):
    # Convert group 0 to an integer.
    v = int(m.group(0))
    # Multiply integer by 2.
    # ... Convert back into string and return it.
    return str(v * 2)

# Use pattern of 1 or more digits.
# ... Use multiply method as second argument.
result = re.sub(r"\d+", multiply, "10 20 30 40 50")
print(result)

Output

20 40 60 80 100

Re notes. Doubling each number would be hard to express in a regular expression alone, so the helper method does the arithmetic. But regular expressions are often not the fastest or clearest form of code.
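For comparison, the same doubling can be written without regular expressions. This is a minimal sketch that assumes the numbers are separated by whitespace and that every token is an integer. The name double_numbers is hypothetical.

```python
def double_numbers(text):
    # Split on whitespace, double each number, and rejoin with single spaces.
    # Assumes every token is an integer; int() raises ValueError otherwise.
    return " ".join(str(int(token) * 2) for token in text.split())

result = double_numbers("10 20 30 40 50")
print(result)
```

Unlike re.sub, this version does not preserve the original spacing and fails on mixed text, so it only fits simple inputs.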


String. Re.sub can replace a pattern match with a simple string. No method call or lambda is required. Here we replace each match with the string "ring".

Python program that uses string replacement

import re

# An example string.
v = "running eating reading"

# Replace each run from an "r" to the nearest "ing"
# ... with a new string.
v = re.sub(r"r.*?ing", "ring", v)
print(v)

Output

ring eating ring
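The pattern above is not limited to single words: it matches from an "r" to the nearest "ing", even across spaces. The sketch below contrasts it with a stricter pattern that uses word boundaries. The sample string and the names loose and strict are chosen here for illustration.

```python
import re

text = "rest of the king"

# The loose pattern starts at the first "r" and runs to the nearest "ing",
# so it spans several words here.
loose = re.sub(r"r.*?ing", "ring", text)

# \b marks a word boundary and \w matches word characters only,
# so this pattern can match only a single whole word.
strict = re.sub(r"\br\w*ing\b", "ring", text)

print(loose)
print(strict)
```

The loose pattern collapses the whole string to "ring", while the strict one leaves it unchanged because no single word starts with "r" and ends in "ing".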

Subn. Usually re.sub() is sufficient. But another option exists. The re.subn method has an extra feature. It returns a tuple: the new string is the first element, and the count of substitutions is the second.


Tip: If you must know the number of substitutions made by re.sub, using re.subn is an ideal choice.

However: If your program has no use of this information, using re.sub is probably best. It is simpler and more commonly used.

Python program that calls re.subn

import re

def add(m):
    # Convert.
    v = int(m.group(0))
    # Add 1.
    return str(v + 1)

# Call re.subn.
result = re.subn(r"\d+", add, "1 2 3 4 5")

print("Result string:", result[0])
print("Number of substitutions:", result[1])

Output

Result string: 2 3 4 5 6
Number of substitutions: 5
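The tuple returned by re.subn can also be unpacked into two names instead of being indexed. A small sketch, with a sample string and the names new_text and count chosen here for illustration:

```python
import re

# re.subn returns (new_string, number_of_substitutions).
new_text, count = re.subn(r"\d+", "#", "a1 b22 c333")

print(new_text)
print(count)
```

Unpacking can make later code easier to read than result[0] and result[1].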

Lambda. A method defined with def can be passed to re.sub by name. But a lambda offers a terser alternative. Here we specify a lambda expression directly within the re.sub argument list.

Here: We add the string "ing" to the end of all words within the input string. Additional logic could be used to make the results better.

Tip: A gerund form of a verb cannot be made this way all the time. Sometimes other spelling changes are needed.

Python program that uses re.sub, lambda

import re

# The input string (named "text" to avoid shadowing the built-in input).
text = "laugh eat sleep think"

# Use lambda to add "ing" to all words.
result = re.sub(r"\w+", lambda m: m.group(0) + "ing", text)

# Display result.
print(result)

Output

laughing eating sleeping thinking
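As one example of the additional logic mentioned above, a named method can handle a trailing "e". This is a rough sketch only: the rule is a simplifying assumption, not a complete spelling algorithm, and the name to_gerund is hypothetical.

```python
import re

def to_gerund(m):
    word = m.group(0)
    # Rough heuristic (an assumption, not a full spelling rule):
    # drop a single final "e" before adding "ing" (make -> making),
    # but keep a double "e" (see -> seeing).
    if word.endswith("e") and not word.endswith("ee"):
        return word[:-1] + "ing"
    return word + "ing"

result = re.sub(r"\w+", to_gerund, "make see laugh")
print(result)
```

Words such as "be" or "run" would still come out wrong, so real text would need more rules or a word list.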

Dictionary example. The re.sub method can be used with a dictionary. In the method provided to re.sub, we look up each match in a dictionary to decide what to return.

Here: We replace all known "plant" strings with the string PLANT. For other words, modify() returns the match unchanged.

Python program that uses re.sub with dictionary

import re

plants = {"flower": 1, "tree": 1, "grass": 1}

def modify(m):
    v = m.group(0)
    
    # If string is in dictionary, return different string.
    if v in plants:
        return "PLANT"
    
    # Do not change anything.
    return v

# Replace all strings found in the dictionary.
result = re.sub(r"\w+", modify,
    "bird flower dog fish tree")

print(result)

Output

bird PLANT dog fish PLANT
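A dictionary can also supply the replacement text itself, rather than just marking which words to change. The sketch below assumes a hypothetical mapping named replacements and a helper named swap; neither comes from the example above.

```python
import re

# Hypothetical mapping from a word to its replacement.
replacements = {"flower": "rose", "tree": "oak"}

def swap(m):
    word = m.group(0)
    # dict.get returns the original word when no replacement is listed.
    return replacements.get(word, word)

result = re.sub(r"\w+", swap, "bird flower dog fish tree")
print(result)
```

With this approach, adding a new substitution only requires a new dictionary entry.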

A summary. Re.sub, and its friend re.subn, can replace substrings in arbitrary ways. A method can test the contents of a match and change it using any algorithm.


And with a pattern, we can specify nearly any textual sequence to match. Given a suitable algorithm, we can change a string to any other string.