C-Sharp | Java | Python | Swift | GO | WPF | Ruby | Scala | F# | JavaScript | SQL | PHP | Angular | HTML
Add: But if our HashSet does not contain the String, we add it to the HashSet (so it is detected next time).
Result: We also build up our result ArrayList. This collection will retain the ordering of the original, but not duplicates.
ArrayListInfo: RemoveDuplicates() could be easily adapted to work upon Integers or other types. Try changing the String type to Integer.
Java program that removes duplicates
import java.util.ArrayList;
import java.util.HashSet;
public class Program {
static ArrayList<String> removeDuplicates(ArrayList<String> list) {
// Store unique items in result.
ArrayList<String> result = new ArrayList<>();
// Record encountered Strings in HashSet.
HashSet<String> set = new HashSet<>();
// Loop over argument list.
for (String item : list) {
// If String is not in set, add it to the list and the set.
if (!set.contains(item)) {
result.add(item);
set.add(item);
}
}
return result;
}
public static void main(String[] args) {
ArrayList<String> list = new ArrayList<>();
list.add("dog");
list.add("cat");
list.add("dog");
list.add("dog");
list.add("cat");
list.add("bird");
// Remove duplicates from ArrayList of Strings.
ArrayList<String> unique = removeDuplicates(list);
for (String element : unique) {
System.out.println(element);
}
}
}
Output
dog
cat
bird
Note: This code is simpler and shorter. It does not suffer from any huge performance issues.
Reordering: Using this direct and simple approach causes the elements to be reordered. The HashSet will not retain ordering.
Java program that uses HashSet constructor
import java.util.ArrayList;
import java.util.HashSet;
public class Program {
public static void main(String[] args) {
// Create ArrayList with one duplicate element.
ArrayList<String> values = new ArrayList<>();
values.add("cat");
values.add("dog");
values.add("cat");
values.add("bird");
// Create HashSet from ArrayList.
HashSet<String> set = new HashSet<>(values);
// Create ArrayList from the set.
ArrayList<String> result = new ArrayList<>(set);
// The result.
System.out.println(result.toString());
}
}
Output
[cat, bird, dog]
Tip: If an element and a following element are equal, the ArrayList has a duplicate. Otherwise, no duplicates exist.
Tip 2: This loop-based method is sometimes faster than using a set like HashSet. Its performance is poor on large collections.
Thus: Calling hasDuplicates on small lists where no duplicates are likely is often a good optimization.
Java program that checks for duplicates
import java.util.ArrayList;
public class Program {
public static boolean hasDuplicates(ArrayList<String> list) {
for (int i = 0; i < list.size(); i++) {
// Loop over all following elements.
for (int x = i + 1; x < list.size(); x++) {
// If two elements equal, there is a duplicate.
if (list.get(i) == list.get(x)) {
return true;
}
}
}
// No duplicates found.
return false;
}
public static void main(String[] args) {
// This ArrayList has a duplicate.
ArrayList<String> values = new ArrayList<>();
values.add("green");
values.add("blue");
values.add("red");
values.add("pink");
values.add("pink");
// See if duplicates exist.
if (hasDuplicates(values)) {
System.out.println(true);
}
// Remove all pink strings.
values.removeIf(element -> element == "pink");
// No more duplicates.
System.out.println(hasDuplicates(values));
}
}
Output
true
false
Version 1: This code loops over the ArrayList to see if any duplicates exist. It does not try to remove the duplicates.
Version 2: This code removes the duplicates without first checking whether any duplicates are present.
Result: In this benchmark, we find that on a 4-element ArrayList, calling hasDuplicates is much faster than removeDuplicates.
Warning: If lists usually have duplicates, hasDuplicates() will not help performance. And it will be slow on long lists due to looping.
Java program that times duplicate test, duplicate removal
import java.util.*;
public class Program {
static boolean hasDuplicates(ArrayList<String> list) {
// See if duplicates exist.
for (int i = 0; i < list.size(); i++) {
for (int x = i + 1; x < list.size(); x++) {
if (list.get(i) == list.get(x)) {
return true;
}
}
}
return false;
}
static ArrayList<String> removeDuplicates(ArrayList<String> list) {
// Remove Duplicates: place them in new list (see above example).
ArrayList<String> result = new ArrayList<>();
HashSet<String> set = new HashSet<>();
for (String item : list) {
if (!set.contains(item)) {
result.add(item);
set.add(item);
}
}
return result;
}
public static void main(String[] args) {
ArrayList<String> elements = new ArrayList<>();
elements.add("one");
elements.add("two");
elements.add("three");
elements.add("four");
long t1 = System.currentTimeMillis();
// Version 1: test to see if duplicates exist.
for (int i = 0; i < 10000000; i++) {
if (hasDuplicates(elements)) {
System.out.println(false);
}
}
long t2 = System.currentTimeMillis();
// Version 2: process list even if no duplicates.
for (int i = 0; i < 10000000; i++) {
ArrayList<String> copy = removeDuplicates(elements);
if (copy == null) {
System.out.println(false);
}
}
long t3 = System.currentTimeMillis();
// ... Benchmark results.
System.out.println(t2 - t1);
System.out.println(t3 - t2);
}
}
Output
128 ms: hasDuplicates()
1409 ms: removeDuplicates()
Also: Another strategy is to not add duplicates in the first place. A custom method could prevent duplicates from being added.