Last occurrence of number in sorted array

Last occurrence of number in sorted array

In last post, we discussed how to find first occurrence of number in a sorted array. Today’s problem is : given a sorted array, which contains duplicate elements, find last occurrence of given number in that sorted array. For example if A = [2,3,4,4,4,5,6,7], and last occurrence of 4 has to be found, output should be 4 (index where we saw 4 for the last time).

Classic binary search algorithm

In binary search algorithm, we divide array into two parts and check if the middle element is equal to searched key. If yes, we break and return mid index. In case, it is less than or greater than searched key, we discard either one or other half of array and continue search on other part. In this case, we are assuming that there is only one instance of key. What if duplicates are present in array. Then technically, our output is still correct, if question was to find if any occurrence of key. However, things change when array contains duplicates and interviewer asks you to find last occurrence of element.

Binary search algorithm implementation

#include <stdio.h>
  
int binarySearch(int a[], int start, int end, int key){
  
    if(start <= end) {
        int mid  = (start + end)/2;
  
        if(a[mid] == key) return mid;
  
        if(a[mid] < key) 
            return binarySearch(a, mid+1, end, key) ;
        else
            return binarySearch(a, start, mid-1, key) ;
    }
    return -1;
}
  
int main(void) {
    int a [] = {1,2,3,4,5,6,7,8};
    int size = sizeof(a)/sizeof(a[0]);
  
    printf("Element found at  : %d", 
            binarySearch(a, 0, size, 4));
  
    return 0;
}

For more detailed analysis and implementation of binary search algorithm, please refer: binary search algorithm implementation and analysis.

Last instance of number in sorted array

To find last occurrence of number, we should modify our classic binary search algorithm.
In classic BSA, we exit as soon as mid is equals to searched key. To solve this problem, once we find index of searched key, instead of returning, we should check right subarray. If numbers on right on middle index are same as key, move right till either we see different element or last index of array.

This method to find last occurrence has time complexity of O(n) which as good as linear search and is worst than binary search algorithm.

Once mid element matches key, that element is a candidate for solution to last occurrence. However, same number may exists after to mid too as duplicates are allowed. Since last occurrence is to be found, safely discard all elements on left side of mid (even if they are duplicates of key, they cannot be last occurrence of key. In worst case, current middle index can be last instance). However, still check on right side of mid for last occurrence. So, again repeat binary search on right subarray. Take care to check mid+1 index apart from mid for inequality and based on that either search in one of subarray or return the mid.
Condition is as follows.

if( mid == end && a[mid] == key) || (a[mid] == a[mid+1])

Ideally, mid will never be equal to end unless start is equal to end, in that case, we do not recur. So, we can safely omit first part of condition, but it is good to be defensive.

#include <stdio.h>
 
int findLastInstance(int a[], int start, int end, int key){
	
    if(start < end) {
        int mid  = (start + end)/2;
 
 	/*This condition can be written as 
 	( mid == end && a[mid] == key ) || (a[mid] == key && a[mid+1] > key )
 	however, mid will never be equal to mid, to lower bound of division
 	*/
        if(a[mid] == key && a[mid+1] > key ) 
            return mid; 
 
        else if(a[mid] <= key)
            return findLastInstance(a, mid+1, end, key);
        else if(a[mid] > key )
            return findLastInstance(a, start, mid-1, key);
    }
    return a[start] == key ? start : -1;
}
 
int main(void) {
    int a [] = {1,2,3,3,3,3,3,4,4,4,5,5};
    int size = sizeof(a)/sizeof(a[0]);
 
    printf("last instance of element found at  : %d", 
	        findLastInstance(a, 0, size-1, 5));
 
    return 0;
}

Iterative implementation to find first occurrence of number

#include <stdio.h>
 
int findLastInstance(int a[], int start, int end, int key){
 
	while(start < end) {
		int mid  = (start + end)/2;
 
		if(a[mid] == key && a[mid+1] > key) 
		    return mid;
 
		if(a[mid] <= key)
			start  = mid+1;
        else
			end = mid-1;
	}
    return a[start] == key ? start : -1;
}
 
int main(void) {
	int a [] = {1,2,3,3,3,3,3,4,4,4,5,5};
	int size = sizeof(a)/sizeof(a[0]);
 
	printf("First instance of element found at  : %d", 
	        findLastInstance(a, 0, size-1, 3));
 
	return 0;
}

Let’s take an example and see how it works. A = [1,2,3,3,3,3,4,4,4,5,5] Key = 4.
First of all, we will find the middle element, which is 3, which is not equal to the key rather less than key.
last occurrence of number

We can safely discard left part of array and start looking for element in right part.
binary search questions

Now, we adjusted our first and last element of subarray, new middle will be element 4.
binary search problems

Since element is middle index is equal to key, it is candidate for output. But as explained, we cannot be sure that it is the last instance. We can check the mid+1 if that is greater than key, then this middle index is last occurrence of key. In this case, mid+1 is equal to key, so we can safely discard subarray till mid and find new mid for remaining subarray.
last occurrence of element

Now, middle element is greater than key, hence we discard right subarray including mid.
last occurrence in array

Now, since start and end are equal, we check if a[start] is equal to key, we return index as last occurrence of element in array.
binary search algorithm

Complexity of binary search algorithm to find last occurrence of a number in sorted array is O(log n) which is much better than linear search.

Please share if there is something wrong or missing. If you want to contribute to website, please contact us.