Next: We apply the Distinct extension method to the array reference, and then assign the result to an implicitly typed local variable.
Finally: We loop over the result and display the distinct elements in the processed array.
C# program that removes duplicate elements
using System;
using System.Linq;

class Program
{
    static void Main()
    {
        // Declare an array with some duplicated elements in it.
        int[] array1 = { 1, 2, 2, 3, 4, 4 };
        // Invoke Distinct extension method.
        var result = array1.Distinct();
        // Display results.
        foreach (int value in result)
        {
            Console.WriteLine(value);
        }
    }
}
Output
1
2
3
4
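Distinct uses deferred execution: the call itself only builds a query, and the array is read when the result is enumerated. The following is a minimal sketch of that behavior, not part of the original program. The class name DeferredDemo and the helper Run are hypothetical names chosen for illustration.

```csharp
using System;
using System.Linq;

class DeferredDemo
{
    // Hypothetical helper for this sketch: returns both results as text.
    public static string[] Run()
    {
        int[] array1 = { 1, 2, 2, 3, 4, 4 };

        // No elements are examined yet: this only builds a lazy query.
        var query = array1.Distinct();

        // ToArray enumerates immediately and stores a snapshot.
        int[] snapshot = array1.Distinct().ToArray();

        // Change the source array after both calls.
        array1[0] = 3;

        // The lazy query reads the modified array; the snapshot does not.
        return new[]
        {
            string.Join(",", query),
            string.Join(",", snapshot)
        };
    }

    static void Main()
    {
        foreach (string line in Run())
        {
            Console.WriteLine(line);
        }
    }
}
```

With the current LINQ to Objects implementation this should print 3,2,4 and then 1,2,3,4. If the source array may change later, calling ToArray or ToList right after Distinct is one way to fix the result in place.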
Note: An IEqualityComparer passed to Distinct decides which elements count as equal. Here we compare each int by its parity (whether it is even or odd), so only one odd and one even element remain.
C# program that uses IEqualityComparer
using System;
using System.Linq;
using System.Collections.Generic;

class EqualityParity : IEqualityComparer<int>
{
    public bool Equals(int x, int y)
    {
        // Consider all even numbers the same, and all odd the same.
        // The lowest bit is used so that negative odd numbers
        // also match positive odd numbers.
        return (x & 1) == (y & 1);
    }

    public int GetHashCode(int obj)
    {
        // Must agree with Equals: same parity, same hash code.
        return (obj & 1).GetHashCode();
    }
}

class Program
{
    static void Main()
    {
        int[] array1 = { 9, 11, 13, 15, 2, 4, 6, 8 };
        // This will remove all except the first odd and the first even element.
        var distinctResult = array1.Distinct(new EqualityParity());
        // Display results.
        foreach (var result in distinctResult)
        {
            Console.WriteLine(result);
        }
    }
}
Output
9
2
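A custom class is not always needed. The framework ships ready-made comparers, and the sketch below passes StringComparer.OrdinalIgnoreCase to Distinct so that strings differing only in case count as duplicates. The class name IgnoreCaseDemo and the helper DistinctIgnoreCase are hypothetical names for this illustration.

```csharp
using System;
using System.Linq;

class IgnoreCaseDemo
{
    // Hypothetical helper: removes strings that differ only by case.
    public static string[] DistinctIgnoreCase(string[] items)
    {
        // StringComparer implements IEqualityComparer<string>.
        return items.Distinct(StringComparer.OrdinalIgnoreCase).ToArray();
    }

    static void Main()
    {
        string[] names = { "cat", "Cat", "dog", "DOG", "bird" };
        foreach (string name in DistinctIgnoreCase(names))
        {
            Console.WriteLine(name);
        }
    }
}
```

As in the parity example, the first occurrence of each group is the one that survives in practice, so this should print cat, dog and bird.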
Version 1: We use the Distinct method. Note how the code is short and easy to read. This is a benefit.
Version 2: A nested loop scans the following elements for a duplicate. An element is added only if no following element is the same, so the last occurrence of each value is the one kept.
Result: In this run, on a six-element int array, the nested loops took about 51 ns per call and Distinct about 185 ns. The outcome will depend on the data given to the methods.
C# program that benchmarks dedupe methods
using System;
using System.Linq;
using System.Collections.Generic;
using System.Diagnostics;

class Program
{
    static IEnumerable<int> Test1(int[] array)
    {
        // Use distinct to check for duplicates.
        return array.Distinct();
    }

    static IEnumerable<int> Test2(int[] array)
    {
        // Use nested loop to check for duplicates.
        List<int> result = new List<int>();
        for (int i = 0; i < array.Length; i++)
        {
            // Check for duplicates in all following elements.
            bool isDuplicate = false;
            for (int y = i + 1; y < array.Length; y++)
            {
                if (array[i] == array[y])
                {
                    isDuplicate = true;
                    break;
                }
            }
            if (!isDuplicate)
            {
                result.Add(array[i]);
            }
        }
        return result;
    }

    static void Main()
    {
        int[] array1 = { 1, 2, 2, 3, 4, 4 };
        const int _max = 1000000;
        var s1 = Stopwatch.StartNew();
        for (int i = 0; i < _max; i++)
        {
            // Version 1: benchmark distinct.
            var result = Test1(array1);
            if (result.Count() != 4)
            {
                break;
            }
        }
        s1.Stop();
        var s2 = Stopwatch.StartNew();
        for (int i = 0; i < _max; i++)
        {
            // Version 2: benchmark nested loop.
            var result = Test2(array1);
            if (result.Count() != 4)
            {
                break;
            }
        }
        s2.Stop();
        Console.WriteLine(((double)(s1.Elapsed.TotalMilliseconds * 1000000) /
            _max).ToString("0.00 ns"));
        Console.WriteLine(((double)(s2.Elapsed.TotalMilliseconds * 1000000) /
            _max).ToString("0.00 ns"));
        Console.Read();
    }
}
Output
185.44 ns Distinct method
51.11 ns Nested for-loops
Therefore: Heap allocations occur when you invoke Distinct. For optimum performance, you could use loops on small collections.
And: With small data sets, the overhead of using iterators and allocations likely overshadows any asymptotic advantage.
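For larger collections the nested loop does work proportional to the square of the length, while a hash-based pass does roughly constant work per element. The sketch below shows one such pass written as a plain loop with HashSet. It was not part of the benchmark above and no timings are claimed for it. The class name HashSetDemo and the helper Dedupe are hypothetical.

```csharp
using System;
using System.Collections.Generic;

class HashSetDemo
{
    // Hypothetical helper: keeps the first occurrence of each value.
    public static List<int> Dedupe(int[] array)
    {
        var seen = new HashSet<int>();
        var result = new List<int>();
        foreach (int value in array)
        {
            // Add returns false when the value is already in the set.
            if (seen.Add(value))
            {
                result.Add(value);
            }
        }
        return result;
    }

    static void Main()
    {
        int[] array1 = { 1, 2, 2, 3, 4, 4 };
        foreach (int value in Dedupe(array1))
        {
            Console.WriteLine(value);
        }
    }
}
```

Unlike the nested loop in Version 2, this keeps the first occurrence of each value rather than the last. It still allocates a set and a list, so on very short arrays it may not beat the nested loop. Measuring with your own data is the only reliable guide.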