C-Sharp | Java | Python | Swift | GO | WPF | Ruby | Scala | F# | JavaScript | SQL | PHP | Angular | HTML
Often in parsing text, a string's position relative to other characters is important. For example, a parameter to a method is between two parenthesis.
For simple text languages, these surrounding characters are easy to handle. We can implement a between() method that returns a substring surrounded by two others. This simplifies logic.
Our methods. Let us start. This Python program includes three new methods: between, before and after. We internally call find() and rfind(). These methods return -1 when nothing is found.
And: We use the indexes returned by find() and rfind to get the indexes of the desired string slice.
Finally: We use the string slice syntax in Python to get substrings of the strings. We thus get substrings relative to other substrings.
Based on: Python 3 Python program that locates relative slices def between(value, a, b): # Find and validate before-part. pos_a = value.find(a) if pos_a == -1: return "" # Find and validate after part. pos_b = value.rfind(b) if pos_b == -1: return "" # Return middle part. adjusted_pos_a = pos_a + len(a) if adjusted_pos_a >= pos_b: return "" return value[adjusted_pos_a:pos_b] def before(value, a): # Find first part and return slice before it. pos_a = value.find(a) if pos_a == -1: return "" return value[0:pos_a] def after(value, a): # Find and validate first part. pos_a = value.rfind(a) if pos_a == -1: return "" # Returns chars after the found string. adjusted_pos_a = pos_a + len(a) if adjusted_pos_a >= len(value): return "" return value[adjusted_pos_a:] # Test the methods with this literal. test = "DEFINE:A=TWO" print(between(test, "DEFINE:", "=")) print(between(test, ":", "=")) print(before(test, ":")) print(before(test, "=")) print(after(test, ":")) print(after(test, "DEFINE:")) print(after(test, "=")) Output A A DEFINE DEFINE:A A=TWO A=TWO TWO
An issue. The between, before and after methods here return an empty string literal when no value is found. A None result is probably a better choice for these error cases.
So: If the None constant is desired, the return statements could be easily modified to return None instead.
Simple parsers. I have not needed this code for any Python programs. But often in code I have found that implementing small, domain-specific languages is a good strategy.
And: A program could implement a language, like the one with the DEFINE example, to change settings at runtime.
Tip: This could reduce changes to your Python code. If a variable needs changing, you could just alter the text file.
Finally: Reducing code churn, as with a small configuration file language, can reduce the possibility of new bugs being introduced.
Substring logic can be complex—it is a disaster to be off by one character. Your entire program might fail and ruin your day. With methods that handle this logic, things are easier.