First occurrence of number in sorted array

First occurrence of number in sorted array

Today’s problem is : given a sorted array, which contains duplicate elements, find first occurrence of a given number in that sorted array. For example if A = [2,3,4,4,4,5,6,7], and first occurrence of 4 has to be found, output should be 2 (index where we saw 4 for the first time)

Classic binary search algorithm

In binary search algorithm, we divide array into two parts and check if the middle element is equal to searched key. If yes, we break and return mid index. In case, it is less than or greater than searched key, we discard either one or other half of array and continue search on other part. In this case, we are assuming that there is only one instance of key. What if duplicates are present in array. Then technically, our output is still correct, if question was to find if any occurrence of key. However, things change when array contains duplicates and interviewer asks you to find specific occurrence of element.

Binary search algorithm implementation

#include <stdio.h>
  
int binarySearch(int a[], int start, int end, int key){
  
    if(start <= end) {
        int mid  = (start + end)/2;
  
        if(a[mid] == key) return mid;
  
        if(a[mid] < key) 
            return binarySearch(a, mid+1, end, key) ;
        else
            return binarySearch(a, start, mid-1, key) ;
    }
    return -1;
}
  
int main(void) {
    int a [] = {1,2,3,4,5,6,7,8};
    int size = sizeof(a)/sizeof(a[0]);
  
    printf("Element found at  : %d", 
            binarySearch(a, 0, size, 4));
  
    return 0;
}

For more detailed analysis and implementation of binary search algorithm, please refer: binary search algorithm implementation and analysis.

First instance of number in sorted array

To find first occurrence of number, we should modify our classic binary search algorithm.
In classic BSA, we exit as soon as mid is equals to searched key. To solve this problem, once we find index of searched key, we have to check left of it. If numbers on left on middle index are same as key, move left till either we see different element or first index of array.
This method to find first occurrence has time complexity of O(n) which as good as linear search and is worst than binary search algorithm.Let’s optimize it further.

Once mid element matches key, that element is a candidate for solution. However, same number may exists prior to mid too as duplicates are allowed. Since find first occurrence to be found, safely discard all elements on right side of the mid (even if they are duplicates of key, they cannot be first occurrence of key. In worst case, current middle index can be first instance). However, still check on left side of mid for first occurrence. So, again repeat binary search on left subarray. Take care to include middle index as end of subarray when middle index in candidate for first occurrence of key.

#include <stdio.h>
 
int findFirstInstance(int a[], int start, int end, int key){
 
    while(start < end) {
        int mid  = (start + end)/2;
 
        if((mid == start && a[mid] == key) || 
           ( a[mid] == key && a[mid] > a[mid-1]) ) 
           return mid;
 
        if(a[mid] < key)
            start  = mid+1;
        else
	    end = mid-1;
	}
    return a[start] == key ? start : -1;
}
 
int main(void) {
	int a [] = {1,2,3,3,3,4,4,4,4,4,5,5};
	int size = sizeof(a)/sizeof(a[0]);
 
	printf("First instance of element found at  : %d", 
	        findFirstInstance(a, 0, size-1, 3));
 
	return 0;
}

Recursive implementation to find first occurrence of number

#include <stdio.h>
 
int findFirstInstance(int a[], int start, int end, int key){
 
	if(start < end) {
		int mid  = (start + end)/2;
 
		if(( mid == start && a[mid] == key) ||
		    (a[mid] == key && a[mid] > a[mid-1]))
		     return mid;
 
		if(a[mid] < key)
			return findFirstInstance(a, mid+1, end, key);
        else
			return findFirstInstance(a, start, mid-1, key);
		}
    return start;
}
 
int main(void) {
	int a [] = {1,2,3,3,3,3,3,4,4,4,5,5};
	int size = sizeof(a)/sizeof(a[0]);
 
	printf("First instance of element found at  : %d", 
	       findFirstInstance(a, 0, size-1, 5));
 
	return 0;
}

Let’s take an example and see how it works. A = [1,2,3,3,3,3,4,4,4,5,5] Key = 4.
First of all, we will find the middle element, which is 3, which is not equal to the key rather less than key.
first occurrence of a number

We can safely discard left part of array and start looking for element in right part.

discard for find first occurrence

Now, we adjusted our first and last element of subarray, new middle will be element 4.

first occurrence in sorted array

Since element is middle index is equal to key, it is candidate for output. But as explained, we cannot be sure that it is the first instance. Since, the a[mid-1] is not less than key, that implies, it is equal, we will change the end index to mid-1 and continue.

binary search questions

We find that new middle is again equal to key, hence we can safely discard right part of array.
first instance of number in array

If a[mid] is not equal to key, in that case, we discard the left and look into the right part.
In this case, since middle index is equal to key, and a[mid-1] is not less than key, we will reduce end index to mid-1.

Since, middle index is again equal to key and mid == start, hence we return the middle index.

Complexity of binary search algorithm to find first occurrence of a number in sorted array is O(log n) which is much better than linear search.

Please share if there is something wrong or missing. If you want to contribute to website, please contact us.