Golang Remove Duplicates From Slice

These Go programs remove duplicate elements from slices of ints and strings. They use maps and slices together.

Remove duplicates. A slice contains any elements.

It contains different values, but sometimes has duplicate ones. We remove these elements with custom methods.

 

 

With a map, we enforce uniqueness of elements. We can use a map to remove duplicates while preserving element order. If order does not matter, we can ignore it.

 

 

Ints, retains order. Here we introduce a removeDuplicates method for ints that preserves the ordering of ints in a slice. It returns a new slice with just the duplicates removed.

 

Map: We create a map in removeDuplicates. This records ints as we encounter them in the slice.

For: We loop over the original slice in a for-loop. We record each int in the map. We then add all unencountered ints to the result slice.

Result: The slice returned by removeDuplicates has all duplicates removed, but everything else about the original slice is left the same.

Based on:

Golang 1.4

Golang program that removes duplicate elements

package main

import "fmt"

func removeDuplicates(elements []int) []int {
    // Use map to record duplicates as we find them.
    encountered := map[int]bool{}
    result := []int{}

    for v := range elements {
	if encountered[elements[v]] == true {
	    // Do not add duplicate.
	} else {
	    // Record this element as an encountered element.
	    encountered[elements[v]] = true
	    // Append to result slice.
	    result = append(result, elements[v])
	}
    }
    // Return the new slice.
    return result
}

func main() {
    elements := []int{100, 200, 300, 100, 200, 400, 0}
    fmt.Println(elements)

    // Test our method.
    result := removeDuplicates(elements)
    fmt.Println(result)
}

Output

[100 200 300 100 200 400 0]
[100 200 300 400 0]

Strings, ignore ordering. This is an alternative approach. Here we remove duplicate strings in a slice. But we ignore the order of the elements—the resulting slice can be in any order.

First: We add all elements from the string slice to a string map. The value (bool) is not important here.

Finally: We loop over the map and add all keys to a resulting slice. The map may store its keys in any order. This affects the result.

Golang program that removes duplicates, ignores order

package main

import "fmt"

func removeDuplicatesUnordered(elements []string) []string {
    encountered := map[string]bool{}

    // Create a map of all unique elements.
    for v:= range elements {
	encountered[elements[v]] = true
    }

    // Place all keys from the map into a slice.
    result := []string{}
    for key, _ := range encountered {
	result = append(result, key)
    }
    return result
}

func main() {
    elements := []string{"cat", "dog", "cat", "bird"}
    fmt.Println(elements)

    // Remove string duplicates, ignoring order.
    result := removeDuplicatesUnordered(elements)
    fmt.Println(result)
}

Output

[cat dog cat bird]
[cat dog bird]

Nested loops. We do not need a map to detect and remove duplicate elements in a slice. On long slices, a nested loop will perform poorly. But on short slices, this is a fast approach.

Warning: A nested loop has a major algorithmic inefficiency. If a slice becomes long, this Go program will not perform well.

Golang program that removes duplicates with nested loops

package main

import "fmt"

func removeDuplicates(elements []int) []int {
    result := []int{}

    for i := 0; i < len(elements); i++ {
	// Scan slice for a previous element of the same value.
	exists := false
	for v := 0; v < i; v++ {
	    if elements[v] == elements[i] {
		exists = true
		break
	    }
	}
	// If no previous element exists, append this one.
	if !exists {
	    result = append(result, elements[i])
	}
    }
    return result
}

func main() {
    elements := []int{10, 20, 30, 10, 10, 20, 40}
    fmt.Println(elements)

    // Test the nested loop method.
    result := removeDuplicates(elements)
    fmt.Println(result)
}

Output

[10 20 30 10 10 20 40]
[10 20 30 40]

Has duplicates. Often the best optimization in a program is to avoid unneeded work. If a program often has no duplicates, we can first check for duplicates.

And: If no duplicates are found, we avoid creating a new slice. This will optimize programs that rarely need the "dedupe" operation.

 

Often, a program that needs duplicates removed is poorly designed. We can store elements in a map instead of a slice. This will eliminate duplicates as we go along.

 

 

A review. Duplicates can removed in a slice. We can ensure that ordering is retained, or we can discard element orders. A map will keep performance good on large slices.