TheDeveloperBlog.com

Python Split String Examples

This Python article uses the split method to separate strings. It uses rsplit, splitlines and partition.

Split. Strings often store many pieces of data.

In a comma-separated format, these parts are divided with commas. A space is another common delimiter.

With split, and its friends, we extract these parts. Often files must be read. Lines must be split. This is done with readlines() and split.

A program. Here we handle a string that contains city names separated by commas. We call split() with a single comma string argument. We loop over the resulting list.

Note: The split() method with a string argument separates strings based on the specified delimiter.

Note 2: With no arguments, split() separates strings on runs of whitespace (spaces, tabs and newlines). Leading and trailing whitespace is ignored.

Based on:

Python 3

Python program that uses split

# Input string.
s = "topeka,kansas city,wichita,olathe"

# Separate on comma.
cities = s.split(",")

# Loop and print each city name.
for city in cities:
    print(city)

Output

topeka
kansas city
wichita
olathe

No arguments. Split() can be called with no argument. In this case, split() uses whitespace as the delimiter. Please notice that a run of one or more whitespace characters is treated as a single separator.

Python program that uses split, no arguments

# Input string.
# ... Irregular number of spaces between words.
s = "One two   three"

# Call split with no arguments.
words = s.split()

# Display results.
for word in words:
    print(word)

Output

One
two
three
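
The difference between the default call and an explicit single-space delimiter can be seen in a short sketch. It reuses the input string from the example above; the second string, with a tab and a newline, is a made-up value for illustration only.

```python
# Same input as above: irregular number of spaces between words.
s = "One two   three"

# Default call: runs of whitespace count as one separator.
default_words = s.split()

# Explicit single space: each space is a separate delimiter,
# so empty strings appear between adjacent spaces.
space_words = s.split(" ")

print(default_words)  # ['One', 'two', 'three']
print(space_words)    # ['One', 'two', '', '', 'three']

# Hypothetical input with a tab and a newline as separators.
mixed_words = "a\tb\nc".split()
print(mixed_words)    # ['a', 'b', 'c']
```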

CSV file. This kind of file contains lines of text. It has values separated by commas. These files can be parsed with the split method.

Methods: We combine open(), readlines(), and strip(). The path passed to open should be changed to point to a file on your system.

Info: This CSV parser splits each line of text at the commas. It loops and displays the original data and the extracted values.

Input file: deves.txt

manhattan,the bronx
brooklyn,queens
staten island

Python program that parses CSV file

# Open this file.
# ... A raw string keeps the backslash in the path literal.
# ... The with-statement closes the file when the block ends.
with open(r"C:\deves.txt", "r") as f:

    # Loop over each line in the file.
    for line in f.readlines():

        # Strip the line to remove whitespace.
        line = line.strip()

        # Display the line.
        print(line)

        # Split the line.
        parts = line.split(",")

        # Display each part of the line, indented.
        for part in parts:
            print("   ", part)

Output

manhattan,the bronx
    manhattan
    the bronx
brooklyn,queens
    brooklyn
    queens
staten island
    staten island

CSV module. We do not need to use split() to manually parse CSV files. The csv module is available. It offers reader and writer objects, which also handle quoted fields. Dialects describe how a file is formatted.
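
As a minimal sketch of the csv module, the following reads two rows with csv.reader. The data is a made-up in-memory string (wrapped in io.StringIO so no file is needed), and the quoted field is an assumption added to show a case that a plain split(",") would break apart.

```python
import csv
import io

# Hypothetical data standing in for a file on disk.
# ... The second row has a quoted field that contains a comma.
data = io.StringIO('manhattan,the bronx\n"queens, ny",brooklyn\n')

# Read all rows. Each row is a list of strings.
rows = list(csv.reader(data))

for row in rows:
    print(row)
```

With a real file, the same reader can be passed a file object opened with newline="", as the csv documentation recommends.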

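Rsplit. Without a limit, rsplit() returns the same result as split(). The only difference occurs when the second argument (the maximum number of splits) is specified. This limits the number of times a string is separated, and rsplit() counts those splits from the right.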

So: When we specify 3, we split off only three times from the right. This is the maximum number of splits that occur.

Tip: The first element in the result list contains all the remaining, non-separated string values. This is unprocessed data.

Python program that uses rsplit

# Data.
s = "Buffalo;Rochester;Yonkers;Syracuse;Albany;Schenectady"

# Separate on semicolon.
# ... Split from the right, only split three.
cities = s.rsplit(";", 3)

# Loop and print.
for city in cities:
    print(city)

Output

Buffalo;Rochester;Yonkers
Syracuse
Albany
Schenectady
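
To make the direction of the limit concrete, this sketch runs split() and rsplit() on the same data with the same limit of 3. The variable names are chosen here for illustration.

```python
s = "Buffalo;Rochester;Yonkers;Syracuse;Albany;Schenectady"

# Split three times from the left: the remainder is the last element.
from_left = s.split(";", 3)

# Split three times from the right: the remainder is the first element.
from_right = s.rsplit(";", 3)

print(from_left)
print(from_right)

# With no limit, both methods return the same list.
same_without_limit = s.split(";") == s.rsplit(";")
print(same_without_limit)
```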

Splitlines. Lines of text can be separated with Windows, or UNIX, newline sequences. This makes splitting on lines complex. The splitlines() method helps here.

And: We split the three-line string literal into three separate strings with splitlines(). We print them in a for-loop.

Python program that calls splitlines

# Data.
s = """This string
has many
lines."""

# Split on line breaks.
lines = s.splitlines()

# Loop and display each line.
for line in lines:
    print("[" + line + "]")

Output

[This string]
[has many]
[lines.]
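
The benefit of splitlines() over split("\n") shows up with mixed line endings. This sketch uses a made-up string that combines Windows, UNIX and bare carriage-return endings.

```python
# Hypothetical data with three different line endings.
s = "first\r\nsecond\nthird\r"

# splitlines() recognizes all three endings and drops them.
lines = s.splitlines()

# keepends=True keeps the line ending on each element.
kept = s.splitlines(keepends=True)

# split("\n") only knows about "\n" and leaves "\r" behind.
by_newline = s.split("\n")

print(lines)       # ['first', 'second', 'third']
print(kept)        # ['first\r\n', 'second\n', 'third\r']
print(by_newline)  # ['first\r', 'second', 'third\r']
```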

Partition. This method is similar to split(). It separates a string only on the first (leftmost) delimiter. It then returns a tuple containing its result data.

Tuple: This has three parts. It has the left part, the delimiter character, and the remaining string data.

Also: The rpartition() method is available. It acts from the right of the string, rather than the left. There is no lpartition() method, because partition() already works from the left.

Python program that uses partition

# Input data.
s = "123 Oak Street, New York"

# Partition on first space.
t = s.partition(" ")

# Print tuple contents.
print(t)

# Print first element.
print("First element:", t[0])

Output

('123', ' ', 'Oak Street, New York')
First element: 123
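
For comparison, here is a short sketch of rpartition() on the same input, along with what both methods return when the delimiter is absent. The semicolon delimiter is an arbitrary choice that does not occur in the string.

```python
s = "123 Oak Street, New York"

# Partition on the last space instead of the first.
right = s.rpartition(" ")
print(right)  # ('123 Oak Street, New', ' ', 'York')

# Delimiter not found: partition() puts the whole string first.
missing = s.partition(";")
print(missing)  # ('123 Oak Street, New York', '', '')

# Delimiter not found: rpartition() puts the whole string last.
missing_right = s.rpartition(";")
print(missing_right)  # ('', '', '123 Oak Street, New York')
```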

Partition loop. The result tuple of partition() makes it easy to use in a loop. We can continually partition a string, shortening the source data as we go along.

Here: In this example, we continue to consume each word in a source string. We read in each word at a time.

While: We use the while-loop to continue as long as further data exists in the input string.

Python that uses partition, while-loop

# The input string.
s = "Dot Net Perls website"

# Continue while the string has data.
while len(s) > 0:

    # Partition on first space.
    t = s.partition(" ")

    # Display the partitioned part.
    print(t[0])
    print("    ", t)

    # Set string variable to non-partitioned part.
    s = t[2]

Output

Dot
     ('Dot', ' ', 'Net Perls website')
Net
     ('Net', ' ', 'Perls website')
Perls
     ('Perls', ' ', 'website')
website
     ('website', '', '')

Benchmark. In many cases, the split() default call returns the same result as a split(" ") call. The results differ if two or more spaces occur together, if the string has leading or trailing spaces, or if other whitespace such as tabs is present.

However: If either syntax gives the correct result, it is best to choose the faster or clearer one.

Tip: In this next benchmark, split() with no arguments takes about 12% less time than split with a space argument (0.653 s versus 0.746 s).

Thus: If your program can use split(), prefer this method. It is faster, it uses shorter syntax, and it is easier to read.

Python that times split

import time

# Input data.
s = "This is a split performance test"

print(s.split())
print(s.split(" "))

# Time 1.
print(time.time())

# Default version.
i = 0
while i < 1000000:
    words = s.split()
    i += 1

# Time 2.
print(time.time())

# Explicit space version.
i = 0
while i < 1000000:
    words = s.split(" ")
    i += 1

# Time 3.
print(time.time())

Results

['This', 'is', 'a', 'split', 'performance', 'test']
['This', 'is', 'a', 'split', 'performance', 'test']
1361813180.908
1361813181.561   split()    = 0.6530 s
1361813182.307   split(" ") = 0.7460 s
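
The same comparison can also be sketched with the standard timeit module, which handles the loop and the timer for us. This is an alternative to the time.time() approach above, not the method used to produce those results. The iteration count is an arbitrary choice, and the timings will vary by machine and Python version.

```python
import timeit

# Input data, as in the benchmark above.
s = "This is a split performance test"

# Arbitrary iteration count, chosen so the sketch runs quickly.
count = 100000

# Time the default version.
default_time = timeit.timeit(lambda: s.split(), number=count)

# Time the explicit space version.
space_time = timeit.timeit(lambda: s.split(" "), number=count)

print("split()    =", default_time)
print('split(" ") =', space_time)
```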

Handle numbers. A string contains numbers separated by a character. We can split the string, then convert each result to an integer with int.

Here: We sum the integers in a string. The float built-in handles numbers with decimal places.

Python that splits string with numbers

numbers = "100,200,50"

# Split apart the numbers.
values = numbers.split(",")

# Loop over strings and convert them to integers.
# ... Then sum them.
total = 0
for value in values:
    total += int(value)

print(total)

Output

350
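
As a variation, this sketch converts with float instead of int, and uses the built-in sum with a generator expression in place of the loop. The input string with a decimal value is made up for illustration.

```python
# Hypothetical input with one decimal value.
numbers = "100,200.5,50"

# Split apart the numbers, convert each to float, and sum them.
total = sum(float(value) for value in numbers.split(","))

print(total)  # 350.5
```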

Strip. Often strings must be processed in some way before splitting them. For leading and trailing whitespace, please try the strip method. The lstrip and rstrip methods are also useful.
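
One possible way to combine the two methods is to strip the whole line first, then strip each part after splitting. The input line here is a made-up example with stray spaces and a trailing newline.

```python
# Hypothetical line with extra whitespace around the values.
line = "  topeka, kansas city ,wichita \n"

# Strip the line, split on commas, then strip each part.
cities = [part.strip() for part in line.strip().split(",")]

print(cities)  # ['topeka', 'kansas city', 'wichita']
```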

A helpful method. Split() helps with processing many text files and input data, as from websites or databases. We benchmarked split. And we explored related methods like partition.

