And these lines have many parts, separated by delimiters. We use split() to break them apart.
Regex. Split in Java uses a Regex. This can be simple, even a single character like a comma, or more complex, involving character codes. This method is powerful.
A simple example. Let us begin with this example. We introduce a string that has two commas in it, separating three strings (cat, dog, bird). We split on a comma.
For: Split returns a String array. We then loop over that array's elements with a for-each loop. We display them.
Based on: Java 7

Java program that uses split

public class Program {
    public static void main(String[] args) {

        // This string has three words separated by commas.
        String value = "cat,dog,bird";

        // Split on a comma.
        String[] parts = value.split(",");

        // Display result parts.
        for (String part : parts) {
            System.out.println(part);
        }
    }
}

Output

cat
dog
bird
Split lines in file. Here we use BufferedReader and FileReader to read in a text file. Then, while looping over it, we split each line. In this way we parse a CSV file with split.
Println: Finally we use the System.out.println method to display each part from each line to the screen.
Contents: file.txt

carrot,squash,turnip
potato,spinach,kale

Java program that reads file, splits lines

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

public class Program {
    public static void main(String[] args) throws IOException {

        // Open this file.
        BufferedReader reader = new BufferedReader(new FileReader(
            "C:\\programs\\file.txt"));

        // Read lines from file.
        while (true) {
            String line = reader.readLine();
            if (line == null) {
                break;
            }
            // Split line on comma.
            String[] parts = line.split(",");
            for (String part : parts) {
                System.out.println(part);
            }
            System.out.println();
        }

        reader.close();
    }
}

Output

carrot
squash
turnip

potato
spinach
kale
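Closing the reader. The program above calls close() by hand, so the reader stays open if readLine throws. Here is a minimal sketch that uses try-with-resources (available since Java 7) instead. The class name SplitLinesDemo and the splitLines helper are made up for illustration, and a StringReader stands in for the file so the sketch runs without file.txt.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

class SplitLinesDemo {
    // Hypothetical helper: reads every line and splits it on commas.
    static List<String[]> splitLines(BufferedReader reader) throws IOException {
        List<String[]> rows = new ArrayList<>();
        String line;
        while ((line = reader.readLine()) != null) {
            rows.add(line.split(","));
        }
        return rows;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: a StringReader replaces the file here.
        // A FileReader could be wrapped in the same way.
        String contents = "carrot,squash,turnip\npotato,spinach,kale\n";

        // The reader is closed automatically, even if an exception occurs.
        try (BufferedReader reader = new BufferedReader(new StringReader(contents))) {
            for (String[] parts : splitLines(reader)) {
                for (String part : parts) {
                    System.out.println(part);
                }
                System.out.println();
            }
        }
    }
}
```

Returning the parts in a List keeps the splitting logic separate from the printing, which makes it easier to reuse.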
Either character. Often data is inconsistent. Sometimes we need to split on a range or set of characters. With split, this is possible. Here we split on a comma and a colon.
Tip: With square brackets, we specify the possible characters to split upon. So we split on all colons and commas, with one call.
Java program that splits on either character

public class Program {
    public static void main(String[] args) {

        String line = "carrot:orange,apple:red";

        // Split on comma or colon.
        String[] parts = line.split("[,:]");
        for (String part : parts) {
            System.out.println(part);
        }
    }
}

Output

carrot
orange
apple
red
Count, separate words. We can use more advanced character patterns in split. Here we separate a String based on non-word characters. We use "\W+" to mean this.
Pattern: The pattern means "one or more non-word characters." A plus means "one or more" and an uppercase \W means a non-word character: anything other than a letter, digit or underscore.
Note: The comma and its following space are treated as a single delimiter. So two characters are matched as one delimiter.
Java program that counts, splits words

public class Program {
    public static void main(String[] args) {

        String line = "hello, how are you?";

        // Split on 1+ non-word characters.
        String[] words = line.split("\\W+");

        // Count words.
        System.out.println(words.length);

        // Display words.
        for (String word : words) {
            System.out.println(word);
        }
    }
}

Output

4
hello
how
are
you
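Leading delimiters. The count above is correct because the line starts with a word. If a line begins with a non-word character, split returns an empty first element, and the length is one too high. A small sketch that skips empty parts follows. The class name WordCountDemo and the countWords helper are invented for this illustration.

```java
class WordCountDemo {
    // Hypothetical helper: counts the non-empty parts after the split.
    static int countWords(String line) {
        int count = 0;
        for (String word : line.split("\\W+")) {
            if (!word.isEmpty()) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String line = "...hello, how are you?";

        // The leading dots match the delimiter, so the
        // first element of the array is an empty string.
        System.out.println(line.split("\\W+").length);

        // Skipping empty parts gives the real word count.
        System.out.println(countWords(line));
    }
}
```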
Numbers. This example splits a string apart and then uses parseInt to convert those parts into ints. It splits on a two-char sequence. Then in a loop, it calls parseInt on each String.
Java program that uses split, parseInt

public class Program {
    public static void main(String[] args) {

        String line = "1, 2, 3";

        // Split on two-char sequence.
        String[] numbers = line.split(", ");

        // Display numbers.
        for (String number : numbers) {
            int value = Integer.parseInt(number);
            System.out.println(value + " * 20 = " + value * 20);
        }
    }
}

Output

1 * 20 = 20
2 * 20 = 40
3 * 20 = 60
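Uneven spacing. The two-char delimiter only works when every comma is followed by exactly one space. With input like "1,2 ,  3", parseInt would receive parts with stray spaces and throw a NumberFormatException. One possible approach is to let the pattern absorb optional whitespace around each comma. The class name ParseNumbersDemo and the parseInts helper are assumptions for this sketch, and it still expects every part to be a valid int.

```java
class ParseNumbersDemo {
    // Hypothetical helper: splits on a comma with optional whitespace
    // on either side, then converts each part with Integer.parseInt.
    static int[] parseInts(String line) {
        String[] parts = line.trim().split("\\s*,\\s*");
        int[] values = new int[parts.length];
        for (int i = 0; i < parts.length; i++) {
            values[i] = Integer.parseInt(parts[i]);
        }
        return values;
    }

    public static void main(String[] args) {
        // Spacing around the commas is inconsistent on purpose.
        for (int value : parseInts("1,2 ,  3")) {
            System.out.println(value + " * 20 = " + value * 20);
        }
    }
}
```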
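Limit. Split accepts an optional second parameter, a limit int. If we provide a positive limit, the result array has at most that many elements. Any extra parts remain part of the last element. A limit of zero behaves like no limit, and a negative limit also keeps trailing empty strings.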
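A short sketch of the limit parameter follows. The class name SplitLimitDemo and the splitToString helper are made up for this example. The sample string ends with two commas so that the handling of trailing empty strings is visible.

```java
import java.util.Arrays;

class SplitLimitDemo {
    // Hypothetical helper: splits on a comma with the given
    // limit and formats the resulting array for display.
    static String splitToString(String value, int limit) {
        return Arrays.toString(value.split(",", limit));
    }

    public static void main(String[] args) {
        String value = "a,b,c,,";

        // Limit 0 is the same as no limit: trailing empty strings are dropped.
        System.out.println(splitToString(value, 0));   // [a, b, c]

        // Limit 2: at most two elements, the rest stays in the last one.
        System.out.println(splitToString(value, 2));   // [a, b,c,,]

        // Negative limit: no cap, and trailing empty strings are kept.
        System.out.println(splitToString(value, -1));  // [a, b, c, , ]
    }
}
```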
Pattern.compile, split. A split method is available on the Pattern class, found in java.util.regex. We can compile a Pattern and reuse it many times. This can enhance performance.
Note: Compiling the Pattern once means the regular expression is not compiled again on each split() call. But this only helps if many splits are done.
Java program that uses Pattern.compile, split

import java.util.regex.Pattern;

public class Program {
    public static void main(String[] args) {

        // Separate based on number delimiters.
        Pattern p = Pattern.compile("\\d+");
        String value = "abc100defgh9ij";
        String[] elements = p.split(value);

        // Display our results.
        for (String element : elements) {
            System.out.println(element);
        }
    }
}

Output

abc
defgh
ij
Performance, Pattern split. We can improve the speed of splitting strings based on regular expressions by using Pattern.compile. We create a delimiter pattern. Then we call split() with it.
Result: When many Strings are split, calling Pattern.compile once and then using the Pattern's split method improves performance.
Java that times Pattern split

import java.util.regex.Pattern;

public class Program {
    public static void main(String[] args) {

        // ... Create a delimiter pattern.
        Pattern pattern = Pattern.compile("\\W+");
        String line = "cat; dog--ABC";

        long t1 = System.currentTimeMillis();

        // Version 1: use split method on Pattern.
        for (int i = 0; i < 1000000; i++) {
            String[] values = pattern.split(line);
            if (values.length != 3) {
                System.out.println(false);
            }
        }

        long t2 = System.currentTimeMillis();

        // Version 2: use String split method.
        for (int i = 0; i < 1000000; i++) {
            String[] values = line.split("\\W+");
            if (values.length != 3) {
                System.out.println(false);
            }
        }

        long t3 = System.currentTimeMillis();

        // ... Benchmark results.
        System.out.println(t2 - t1);
        System.out.println(t3 - t2);
    }
}

Results

471 ms, Pattern split
549 ms, String split
Join. This method combines Strings together, and we specify our desired delimiter String. Join is flexible. It can handle a String array or individual Strings.
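A minimal sketch of join follows. It assumes Java 8 or later, because String.join is not available in Java 7. The class name JoinDemo and the rejoin helper are invented for illustration.

```java
import java.util.Arrays;
import java.util.List;

class JoinDemo {
    // Hypothetical helper: splits on commas, then joins the
    // parts again with a different delimiter.
    static String rejoin(String csv, String delimiter) {
        return String.join(delimiter, csv.split(","));
    }

    public static void main(String[] args) {
        String[] parts = "cat,dog,bird".split(",");

        // Join a String array.
        System.out.println(String.join("-", parts));

        // Join individual Strings.
        System.out.println(String.join("-", "cat", "dog", "bird"));

        // Join a List (any Iterable of CharSequence works).
        List<String> list = Arrays.asList(parts);
        System.out.println(String.join("-", list));

        // Split and join in one step with the helper.
        System.out.println(rejoin("cat,dog,bird", " | "));
    }
}
```

The first three calls all print cat-dog-bird, so join reverses the earlier split when given the same parts.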
With split, we use a regular expression-based pattern. But for simple cases, we provide the delimiter itself as the pattern. This too works. Split is elegant and powerful.
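Literal delimiters. Passing the delimiter itself works only when it contains no regex metacharacters. A period or a pipe has a special meaning in a pattern, so it must be escaped. The sketch below shows the problem and two ways around it. The class name LiteralDelimiterDemo, the splitLiteral helper and the sample strings are assumptions for this example.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class LiteralDelimiterDemo {
    // Hypothetical helper: quotes the delimiter so that characters
    // such as '.' or '|' are matched literally.
    static String[] splitLiteral(String value, String delimiter) {
        return value.split(Pattern.quote(delimiter));
    }

    public static void main(String[] args) {
        String value = "www.example.com";

        // An unescaped period matches any character. Every character
        // becomes a delimiter, only empty strings remain, and split
        // discards them, so the array is empty.
        System.out.println(value.split(".").length);                    // 0

        // Escaping the period makes it a literal.
        System.out.println(Arrays.toString(value.split("\\.")));        // [www, example, com]

        // Pattern.quote does the escaping for us.
        System.out.println(Arrays.toString(splitLiteral("a|b|c", "|"))); // [a, b, c]
    }
}
```

Pattern.quote is a reasonable default when the delimiter comes from user input or a configuration file, since its contents are not known in advance.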