Removing duplicates from an array of integers

Given an unsorted array of integers, remove all duplicates from array. For example if input array is A = [2,3,4,4,1,2,3,3], after removing duplicates,
final output should be A = [2,3,4,1] where duplicates of 2,4 and 1 are removed. Notice that it is not necessary that duplicates are clubbed together in array.

Method 1 : Brute force 
First option which comes to mind is: sort given array using quick sort or merge sort with O(n log n) complexity and then in linear time remove duplicates as all duplicates will then be together. Detailed explanation how duplicates from a sorted array can be removed is given in this post : Strings : Remove duplicate characters from string

How about if the constraint is that we cannot sort the array first, other way to put this constraint is to say that elements in output array should preserve the order they were in input array. Some basic algorithm to remove duplicate elements will be to start from first element till the last element repeat following:

1. Start with index  i = 0 to n and repeat 2 
2. Check from the i +1 index to n, if element matches the current element of step 1, move all the elements on right side of this element by one left.

Complexity of this algorithm to remove duplicates from array will be O(n3).

Method 2 : Use binary search tree
Second method to solve problem is to use binary search tree. Construct a binary search tree using elements of array and discard all elements which are duplicates. Tree will contain only non duplicate values. However, order in original array is not maintained in this solution. Complexity of above code is O(n log n) as each node insertion will take O(log n) and there are n nodes to be added.

Method 3 : Using merge sort

Last but most efficient solution is to use merge sort and remove duplicates in merge step. This solution requires extra book keeping while merge step of merge sort.To understand merge sort please refer : Merge sort algorithm

Code to remove duplicates using merge sort.

This is has complexity of O(n log n) and does not preserve the order. Also, it requires O(n) extra space. However, this method is very useful when there is a big array with lots of duplicates to be removed.

  • Dhanaraj D

    We can use hashing technique right ? It will be O(N) time & space. Better than merge method. Am I missing something ?

    • http://algorithmsandme.com/ Jitendra Sangar

      Yes, we can definitely use that. and it will be better than merge sort method,