Search in sorted matrix

Search element in a sorted matrix

Today’s problem search in sorted matrix, that is to find an element in column and row sorted matrix. For example, given matrix is as below and element to be searched is 11, then output should be (2,2).

1,5,9, 13 
2,6,10,14
3,7,11,15
4,8,12,16

This problem is frequently asked to check understanding of binary search algorithm.

Search in sorted matrix : Algorithm

Brute force solution would be to scan all mXn elements and find the element. This approach has complexity of O(n2). Can we optimize it?
As mentioned in problem statement, each row and column is sorted.How can we use this information? Based on value of key and current element, we can discard some part of matrix.
Start from left-bottom most element of matrix. We know that it is greatest in first column and least in last row.
Check if element is key, if yes return index (i,j)
If it is not equal to key, then check if it is less than key, then move column to right as all above element in this column will be less than current column element and hence Key can not be in this column.
If current element is greater than key, check in row above current row as all elements on below of current element are greater and key can not be there.
Repeat these steps till we either find the key or we have examined all possible candidates.

Search in sorted matrix : implementation

#include <stdio.h>
 
int search_matrix(int a[4][4], int N, int M, int key){
 
    int i, j;
    int element;
    i = M-1;
    j = 0;
    while(i>=0 && j<N){ element = a[i][j]; 
    if(element == key) { 
         printf("Element found at (%d, %d)", i, j); return 0;
    } 
    if(element > key){
        /* If current element is greater than Key,
            we move up in matrix */
            i--;
        }
        else
        /* If current element is less than key,
           we move right of the matrix */
           j++;
    }
 	printf("Nothing found!!");
    return -1;
}
 
int main(void) {
	int a[4][4] = {
		       1,2,3,4,
		       5,6,7,8,
		       9,10,11,12,
		       13,14,15,16
	};
	
   	search_matrix( a, 5,5, 11);
	return 0;
}

Let’s see how it works. We will use the same matrix as in code and we are searching for 11.
We start from left bottom most cell. Element at that cell is 13, which is greater than key. In this case, we should be moving up the column, means decrease the row.
search in sorted matrix

Now, the element to be considered is 9 which is less than key, hence we move in same row to right.
binary search problems

Next is element 10, which is less than key, hence we again move right.
binary search questions

Now when move right, we hit element equals to key and we return the indices.

Complexity of binary search algorithm to find an element in sorted matrix is O(M+N) where M and N are rows and column of the matrix respectively.

Please share if there is something wrong or missing. If you are interested to contribute and earn money, please contact us.

Last occurrence of number in sorted array

Last occurrence of number in sorted array

In last post, we discussed how to find first occurrence of number in a sorted array. Today’s problem is : given a sorted array, which contains duplicate elements, find last occurrence of given number in that sorted array. For example if A = [2,3,4,4,4,5,6,7], and last occurrence of 4 has to be found, output should be 4 (index where we saw 4 for the last time).

Classic binary search algorithm

In binary search algorithm, we divide array into two parts and check if the middle element is equal to searched key. If yes, we break and return mid index. In case, it is less than or greater than searched key, we discard either one or other half of array and continue search on other part. In this case, we are assuming that there is only one instance of key. What if duplicates are present in array. Then technically, our output is still correct, if question was to find if any occurrence of key. However, things change when array contains duplicates and interviewer asks you to find last occurrence of element.

Binary search algorithm implementation

#include <stdio.h>
  
int binarySearch(int a[], int start, int end, int key){
  
    if(start <= end) {
        int mid  = (start + end)/2;
  
        if(a[mid] == key) return mid;
  
        if(a[mid] < key) 
            return binarySearch(a, mid+1, end, key) ;
        else
            return binarySearch(a, start, mid-1, key) ;
    }
    return -1;
}
  
int main(void) {
    int a [] = {1,2,3,4,5,6,7,8};
    int size = sizeof(a)/sizeof(a[0]);
  
    printf("Element found at  : %d", 
            binarySearch(a, 0, size, 4));
  
    return 0;
}

For more detailed analysis and implementation of binary search algorithm, please refer: binary search algorithm implementation and analysis.

Last instance of number in sorted array

To find last occurrence of number, we should modify our classic binary search algorithm.
In classic BSA, we exit as soon as mid is equals to searched key. To solve this problem, once we find index of searched key, instead of returning, we should check right subarray. If numbers on right on middle index are same as key, move right till either we see different element or last index of array.

This method to find last occurrence has time complexity of O(n) which as good as linear search and is worst than binary search algorithm.

Once mid element matches key, that element is a candidate for solution to last occurrence. However, same number may exists after to mid too as duplicates are allowed. Since last occurrence is to be found, safely discard all elements on left side of mid (even if they are duplicates of key, they cannot be last occurrence of key. In worst case, current middle index can be last instance). However, still check on right side of mid for last occurrence. So, again repeat binary search on right subarray. Take care to check mid+1 index apart from mid for inequality and based on that either search in one of subarray or return the mid.
Condition is as follows.

if( mid == end && a[mid] == key) || (a[mid] == a[mid+1])

Ideally, mid will never be equal to end unless start is equal to end, in that case, we do not recur. So, we can safely omit first part of condition, but it is good to be defensive.

#include <stdio.h>
 
int findLastInstance(int a[], int start, int end, int key){
	
    if(start < end) {
        int mid  = (start + end)/2;
 
 	/*This condition can be written as 
 	( mid == end && a[mid] == key ) || (a[mid] == key && a[mid+1] > key )
 	however, mid will never be equal to mid, to lower bound of division
 	*/
        if(a[mid] == key && a[mid+1] > key ) 
            return mid; 
 
        else if(a[mid] <= key)
            return findLastInstance(a, mid+1, end, key);
        else if(a[mid] > key )
            return findLastInstance(a, start, mid-1, key);
    }
    return a[start] == key ? start : -1;
}
 
int main(void) {
    int a [] = {1,2,3,3,3,3,3,4,4,4,5,5};
    int size = sizeof(a)/sizeof(a[0]);
 
    printf("last instance of element found at  : %d", 
	        findLastInstance(a, 0, size-1, 5));
 
    return 0;
}

Iterative implementation to find first occurrence of number

#include <stdio.h>
 
int findLastInstance(int a[], int start, int end, int key){
 
	while(start < end) {
		int mid  = (start + end)/2;
 
		if(a[mid] == key && a[mid+1] > key) 
		    return mid;
 
		if(a[mid] <= key)
			start  = mid+1;
        else
			end = mid-1;
	}
    return a[start] == key ? start : -1;
}
 
int main(void) {
	int a [] = {1,2,3,3,3,3,3,4,4,4,5,5};
	int size = sizeof(a)/sizeof(a[0]);
 
	printf("First instance of element found at  : %d", 
	        findLastInstance(a, 0, size-1, 3));
 
	return 0;
}

Let’s take an example and see how it works. A = [1,2,3,3,3,3,4,4,4,5,5] Key = 4.
First of all, we will find the middle element, which is 3, which is not equal to the key rather less than key.
last occurrence of number

We can safely discard left part of array and start looking for element in right part.
binary search questions

Now, we adjusted our first and last element of subarray, new middle will be element 4.
binary search problems

Since element is middle index is equal to key, it is candidate for output. But as explained, we cannot be sure that it is the last instance. We can check the mid+1 if that is greater than key, then this middle index is last occurrence of key. In this case, mid+1 is equal to key, so we can safely discard subarray till mid and find new mid for remaining subarray.
last occurrence of element

Now, middle element is greater than key, hence we discard right subarray including mid.
last occurrence in array

Now, since start and end are equal, we check if a[start] is equal to key, we return index as last occurrence of element in array.
binary search algorithm

Complexity of binary search algorithm to find last occurrence of a number in sorted array is O(log n) which is much better than linear search.

Please share if there is something wrong or missing. If you want to contribute to website, please contact us.

First occurrence of number in sorted array

First occurrence of number in sorted array

Today’s problem is : given a sorted array, which contains duplicate elements, find first occurrence of a given number in that sorted array. For example if A = [2,3,4,4,4,5,6,7], and first occurrence of 4 has to be found, output should be 2 (index where we saw 4 for the first time)

Classic binary search algorithm

In binary search algorithm, we divide array into two parts and check if the middle element is equal to searched key. If yes, we break and return mid index. In case, it is less than or greater than searched key, we discard either one or other half of array and continue search on other part. In this case, we are assuming that there is only one instance of key. What if duplicates are present in array. Then technically, our output is still correct, if question was to find if any occurrence of key. However, things change when array contains duplicates and interviewer asks you to find specific occurrence of element.

Binary search algorithm implementation

#include <stdio.h>
  
int binarySearch(int a[], int start, int end, int key){
  
    if(start <= end) {
        int mid  = (start + end)/2;
  
        if(a[mid] == key) return mid;
  
        if(a[mid] < key) 
            return binarySearch(a, mid+1, end, key) ;
        else
            return binarySearch(a, start, mid-1, key) ;
    }
    return -1;
}
  
int main(void) {
    int a [] = {1,2,3,4,5,6,7,8};
    int size = sizeof(a)/sizeof(a[0]);
  
    printf("Element found at  : %d", 
            binarySearch(a, 0, size, 4));
  
    return 0;
}

For more detailed analysis and implementation of binary search algorithm, please refer: binary search algorithm implementation and analysis.

First instance of number in sorted array

To find first occurrence of number, we should modify our classic binary search algorithm.
In classic BSA, we exit as soon as mid is equals to searched key. To solve this problem, once we find index of searched key, we have to check left of it. If numbers on left on middle index are same as key, move left till either we see different element or first index of array.
This method to find first occurrence has time complexity of O(n) which as good as linear search and is worst than binary search algorithm.Let’s optimize it further.

Once mid element matches key, that element is a candidate for solution. However, same number may exists prior to mid too as duplicates are allowed. Since find first occurrence to be found, safely discard all elements on right side of the mid (even if they are duplicates of key, they cannot be first occurrence of key. In worst case, current middle index can be first instance). However, still check on left side of mid for first occurrence. So, again repeat binary search on left subarray. Take care to include middle index as end of subarray when middle index in candidate for first occurrence of key.

#include <stdio.h>
 
int findFirstInstance(int a[], int start, int end, int key){
 
    while(start < end) {
        int mid  = (start + end)/2;
 
        if((mid == start && a[mid] == key) || 
           ( a[mid] == key && a[mid] > a[mid-1]) ) 
           return mid;
 
        if(a[mid] < key)
            start  = mid+1;
        else
	    end = mid-1;
	}
    return a[start] == key ? start : -1;
}
 
int main(void) {
	int a [] = {1,2,3,3,3,4,4,4,4,4,5,5};
	int size = sizeof(a)/sizeof(a[0]);
 
	printf("First instance of element found at  : %d", 
	        findFirstInstance(a, 0, size-1, 3));
 
	return 0;
}

Recursive implementation to find first occurrence of number

#include <stdio.h>
 
int findFirstInstance(int a[], int start, int end, int key){
 
	if(start < end) {
		int mid  = (start + end)/2;
 
		if(( mid == start && a[mid] == key) ||
		    (a[mid] == key && a[mid] > a[mid-1]))
		     return mid;
 
		if(a[mid] < key)
			return findFirstInstance(a, mid+1, end, key);
        else
			return findFirstInstance(a, start, mid-1, key);
		}
    return start;
}
 
int main(void) {
	int a [] = {1,2,3,3,3,3,3,4,4,4,5,5};
	int size = sizeof(a)/sizeof(a[0]);
 
	printf("First instance of element found at  : %d", 
	       findFirstInstance(a, 0, size-1, 5));
 
	return 0;
}

Let’s take an example and see how it works. A = [1,2,3,3,3,3,4,4,4,5,5] Key = 4.
First of all, we will find the middle element, which is 3, which is not equal to the key rather less than key.
first occurrence of a number

We can safely discard left part of array and start looking for element in right part.

discard for find first occurrence

Now, we adjusted our first and last element of subarray, new middle will be element 4.

first occurrence in sorted array

Since element is middle index is equal to key, it is candidate for output. But as explained, we cannot be sure that it is the first instance. Since, the a[mid-1] is not less than key, that implies, it is equal, we will change the end index to mid-1 and continue.

binary search questions

We find that new middle is again equal to key, hence we can safely discard right part of array.
first instance of number in array

If a[mid] is not equal to key, in that case, we discard the left and look into the right part.
In this case, since middle index is equal to key, and a[mid-1] is not less than key, we will reduce end index to mid-1.

Since, middle index is again equal to key and mid == start, hence we return the middle index.

Complexity of binary search algorithm to find first occurrence of a number in sorted array is O(log n) which is much better than linear search.

Please share if there is something wrong or missing. If you want to contribute to website, please contact us.

Find first circular tour that visits all petrol pumps

First circular tour that visits all petrol pumps

Assuming there are n petrol pumps situated at circumference of a circle.  Each petrol pump can fill certain amount of petrol to your truck or bike. Given the information capacity of each petrol pump and distance of this petrol pump to next one, find first circular tour that visits all petrol pumps. You cannot move from one petrol pump to next if distance is more than petrol you have. Assumption is truck goes 1 unit of distance for 1 unit of petrol and it has infinite capacity tank.
circular tour that visits all petrol pumps

Example inout to problem be : (4,6),(7,5),(6,5). First set gives capacity of first petrol pump i.e 4 and distance to next petrol pump i.e. 6. As we can see that this cannot be the starting point of tour as distance to travel is more than petrol i.e  net available petrol is 4-6 =2. What if we start from next station which has 7 liter of petrol, and distance to next pump is 5. We can travel to station 2. Available petrol with us is 7-5 = 2.  We get 6 liters more at station 2, petrol = 8 and distance to be travelled is only 5. When we reach at station 0 (mind you it’s a circle), we have available petrol as 8-5 =3. If we want to go to station 1, to complete the circle (we started with station 1), we have petrol  = 3 + 4 = 7 and distance to travel is 6 and hence, we can move. To sum up, station 1 should be our starting point of circular tour which visits all petrol pumps.

Circular tour that visits all petrol pumps

Brute force approach as explained in example will be to start with  a random petrol pump as starting point for circular tour. Our end should be immediately next petrol pump. As we move in circle, our start and end points move too, based on conditions. When end becomes to start, that mean we have finished the circle.

StartPoint = 0; endPoint = 1;

How to decide if our truck can reach to next petrol station? If capacity of start station is more than the distance to be travelled, then yes. We keep starting point as same, and end point moves forward to next to next station. Update available petrol too.

endPoint = (endPoint + 1)%numStations
availablePetrol = petrol[startPoint] - distance[startPoint]

What if that is not the case, i.e distance to be travelled is more than petrol available? Then, we cannot move from current start point to next station. Hence, change start point to next station and start all over again.

startPoint = (startPoint + 1)%numStations
endPoint = (endPoint + 1)%numStations

For each petrol pump, we have to possibilities, first, we have available petrol, and we can move to next station, or we don’t have enough available petrol and we change our starting point.

Brute force approach as discussed above has complexity of O(n2), as we check each petrol pump as start point and find if circular tour that visits all petrol pump is possible or not.

Let’s try optimizing it. We can use a queue to enqueue all the petrol pumps which are already visited while there was available petrol. If we reach a petrol pump where available petrol becomes negative,  that means, we have to travel more distance that petrol available. Start dequeueing from queue station, till available petrol become positive again. The station when available petrol becomes positive is the new start point to consider for circular tour.

Circular tour that visits all petrol pumps : Queue based approach

In queue based implementation, we have to used an auxiliary queue which adds to extra space complexity. We can use same array as queue and get rid of this extra space requirement. We can use start and end stations in tour as front and rear of queue.

Implementation using input array as queue

#include <stdio.h>
 
typedef struct station
{
  int petrol;
  int distance;
}Station;
 
int circularTour(Station stations[], int n)
{
    int startPos = 0;
    int end =  1;
 
    int availablePetrol = stations[startPos].petrol 
                          - stations[startPos].distance;
 
    while (end != startPos || availablePetrol < 0)
    {
        while (availablePetrol < 0 && startPos != end)
        {
            availablePetrol -= stations[startPos].petrol 
                               - stations[startPos].distance;
            startPos = (startPos + 1)%n;
 
            if (startPos == 0)
               return -1;
        }
        
        availablePetrol += stations[end].petrol - stations[end].distance;
 
        end = (end + 1)%n;
    }
 
    // Return starting point
    return startPos;
}
 
// Driver program to test above functions
int main()
{
    int n = 3;
    Station stations[n];
    
    stations[0].petrol = 6;
    stations[1].petrol = 3;
    stations[2].petrol = 7;

    stations[0].distance = 4;
    stations[1].distance = 6;
    stations[2].distance = 3;
 
    printf("Start point  : %d ", circularTour(stations, n));
 
    return 0;
}

There is another good solution to problem which does not use queue at all and is based on greedy algorithm.

1. If sum of petrol at all stations is more than sum of distances between all stations, it imply that there is no solution possible.

2. We start at i, and hit available petrol as negative at j. As we know TotalSum(petrol) – TotalSum(distance) is greater than 0. Discard i to j and start from j + 1 station.

Circular tour that visits all petrol pumps : Greedy algorithm

Complexity to find circular tour that visits all petrol pumps using queues and greedy approach is O(N).

#include <stdio.h>

typedef struct station{
	int petrol;
	int distance;
} Station;

int circularTour(Station stations[], int n) {
	
	int total = 0;
	int availablePetrol = 0;
	int startPos = 0;
    
	for(int i=0; i<n; i++) {
		int delta = stations[i].petrol - stations[i].distance;
		if(availablePetrol >= 0)
			availablePetrol += delta;
		else {
			availablePetrol = delta;
			startPos = i;
		}
		total += delta;
	}
 
        return total >= 0 ? startPos : -1;
}
 
int main() {
	int n = 3;
	Station stations[n];
	
	stations[0].petrol = 6;
	stations[1].petrol = 3;
	stations[2].petrol = 7;

	stations[0].distance = 4;
	stations[1].distance = 6;
	stations[2].distance = 3;

	printf("Start point  : %d ", circularTour(stations, n));
      
        return 0;
}

Please share if there is something wrong or missing. If you are interested in contributing to algorithms and me, please contact us and earn money and respect of fellow coders.

Program to check if a string is a Pangram

What is a Pangram?

A pangram is a sentence containing every letter of the alphabets. Our job today is to try to find if the given string is a pangram or not. This will be an O(n) time and O(2n) space complexity problem with n being the length of the string. Reason being that we need to traverse the string all the way to understand if all the alphabets are in the string or not. One can add an exit condition where if the pangram is found, we ignore the rest of the string but again, that means we put in additional checks after each find of an alphabet. Not worth the extra computations required except if we are talking of very big strings. Lets try to tackle that in part 2.

Our simple code converts all the characters in a given string to lower case first. We use the std::transform function in <algorithms> to do that. We also have initially declared a map of all the alphabets with value set to false. Now our task is pretty simple. We iterate through the string and mark the value of the alphabet in the map (i.e. change it to true) if that alphabet is encountered in the map. Accessing map elements is pretty fast with a runtime complexity of O(1). Similarly, accessing array elements in string is O(1). We need to iterate over the whole string. Lastly, we iterate over our map and see if all the elements are set to true. If yes, the string is a pangram otherwise it is not.

Code to find if sentence is pangram or not?

Remember that we skip constants in calculating space and time complexity. Time complexity in this case is O(n) and space complexity is O(2N).

Visit me at: https://www.naresh.se/
Follow me on Twitter: @wolverine2k

This post is contributed by Naresh, please contact us if you are willing to contribute to algorithms and me.

Reverse a string O(n/2) time, O(n) space

Reverse a string

We all know how to reverse a string right? But the quest to always find something better than existing algorithms makes us quite excited. But before jumping into the code, let’s try and see the best case i.e.(Ω) for this problem. It will allow us to find out the best possible time we can have to solve the problem. Nothing else faster than that can be achieved.

Algorithm to reverse a string

1. Iterate from beginning of string to n/2 
2. Swap the elements boundary elements 
3. If string is of even length, continue till n/2 and swap that one 
4. Else we let the middle element remain as it is (i.e. continue to (n-1)/2)

This algorithm to reverse a string gives best possible runtime of Ω(n/2). Can we do any better? Looks like not, since we will need to traverse the string halfway anyways. So, if we implement an algorithm to have a Big-O of n/2 as well, then our Θ i.e. average time will also be n/2. That’s a relief, we will have a linear time complexity growth. Our space complexity will always be O(n) since we will not allocate any space and do in-place replacements in the string. I have not been able to find anything else faster than this one. But write in the comments if there is something faster. I am thinking for maybe dynamic programming or multi-threading but we address that in the next post.

Let’s see if I can explain it better. Suppose the string is “ABCDE”. Our loop will iterate 2 times (5/4). In the first iteration, A will be swapped with E. And the 2nd iteration B will be swapped with D. E will remain at its place. So the reversed string will be “EDCBA”. Looks like that is the one we want. Try similar with an even string and it will still be the same.

Now the code! I am using c++ and will use c++14 where possible. Of course we can use the std::reverse as well BUT that one is a time complexity O(n). Check the logic here. The code for our logic is as below:

Cheers,
Naresh

visit me at: https://www.naresh.se/
Follow me on twitter: @wolverine2k

Thanks Naresh for contributing to Algorithms and Me, if you want to contribute, please contact us and refer publishing at Algorithms and Me.

Lowest common ancestor with Range Minimum Query

Lowest common ancestor with Range Minimum Query

We already have discussed least common ancestor and range minimum query. In this post, we will discuss how to use RMQ to find Least common ancestor of two given nodes.  This is post will concentrate on finding Lowest common ancestor with Range Minimum Query. LCA of (u,v) is the node which is furthest from root and u and v are descendent of it. For example, LCA of (5,9) is 2.

LCA using RMQ

In earlier solutions, we were scanning tree each time a query if fired to find LCA of two nodes which has complexity of O(n). If this query if fired frequently, this operation may become bottleneck. One way to avoid processing all nodes when LCA is asked, is to preprocess tree and store that information in such a way that LCA can be found instantly, say in O(1).

This pattern is very similar to Range Query Minimum algorithm. Can we reduce lowest common ancestor problem to range query minimum problem?

Reduction of LCA to RMQ

Let’s revise what is RMQ : Given array A of length n.  RMQA(i,j) – returns the index of the smallest element in the subarray A[i..j].  Suppose an algorithm has:  Preprocessing time –  Query time –  Notation for the overall complexity of an algorithm: <f(n), g(n) >.

Let’s find LCA of two nodes 5 and 8 manually in above graph. We notice that LCA(u,v) is shallowest common node (in terms of distance from root) which is visited when u and v are visited using depth first search of tree.  Important thing to note is that we are interested in shallowest, which is minimum depth, node between u and v. Sounds like RMQ?

Implementation wise, tree is traversed as Euler tour. At most there can be 2n-1 nodes in Euler tour of tree, store this tour in an array of  2n-1. Then we require depth of each node while doing Euler tour, so we store this information in another arrays of size 2n-1. Any depth would do as far as consistency is maintained, but we will store depth when node was visited first.

1. Euler Tour [1,..,2n-1] – The nodes visited in an Euler tour of T. Euler[i] is the label of the ith node visited in the tour. 
2. Depth[1,..2n-1] – The level of the nodes tour. Level[i] is the level of node Euler[i]. (level is defined to be the distance from the root).
3. FirstVisited[1,..n] – FirstVisited[i] will hold value when node is first visited.

For example of this computation

lowest common ancestor using range minimum query example

To compute LCA(u,v):  All nodes in the Euler tour between the first visits to u and v are E[FV[u],..,FV[v]] (assume FV[u] < FV[v]). 

The shallowest node in this sub-tour is at index RMQ D(FV[u],FV[v]), since D[i] stores the depth of node at E[i]. 

RMQ will return index of shallowest node between u and v, thus output node will be E[RMQ D(FV[u],FV[v])] as LCA(u,v)

lca using rmq example

lowest common ancestor using range minimum query implementation

Beauty of this algorithm to find lca using rmq is that it can be used to find LCA of any tree not just only binary tree or BST. Complexity of algorithm to find lowest common ancestor using range minimum query is <O(n), O(1)> with additional space complexity of O(n).

Please share if there is something wrong or missing, we would love to hear from you. If you are interest in contributing to algorithms and me,please refer to publisher page.

Range minimum query (RMQ)

What is Range minimum query?

Sometimes we are asked to find index of minimum element within range of an array, this operation is called as range minimum query (RMQ). For example, if given array A = [2,3,4,6,8,1,5,9], ask is to find index of minimum value between index 2 and 7, returned answer would be 5.

range minimum query

Going by the brute force, every time a query if fired, we scan the range and find the minimum in given range in same way as we do for entire array. Complexity of each query being answered is O(N) where in worst case range is  entire array.

Can we preprocess our data, so that our query operations are less costly? If we do so, there are two parts to the solution now, first, preprocessing and second query. Let’s assume complexity of each step is f(n) and g(n) respectively, then complexity of solution can be denoted as ( f(n), g(n)).

What kind of preprocessing can be done? Basic idea is to calculate minimum index of all the ranges possible in array. How many ranges are possible for an array with N elements? It’s N X N ranges. Why?

So, to store minimum index of each range, O(N2) order space is required and time complexity goes to O(N3). However, complexity of query is O(1). So overall complexity of solution is ( O(N3), O(1) ).

With application of dynamic programming, complexity of preprocessing step can be reduced to O(N2).

Can we do better for preprocessing step while trading off query step? If we divide array into smaller chunks and store minimum element index in those chunks, will it help? And what should be size of chunk? Let’s divide array in √n parts, where n is size of part.

Dividing into square root parts rmq

Now, find minimum element index in each of this chunk, and store it. Extra space required is (√n). Finding minimum for each chunk has complexity of (√n * √n) as O(n).

To find minimum element index in given range, follow three steps:

1. Find minimum index of all chunks lying between start and end of given range. max √n operations
2. Find minimum index in chunk where start of range lies max √n comparisons from start of range to end of chunk.
3. Find minimum index in chuck where end of range lies from start of chunk to end of range.
4. Compare these three values and return minimum

No matter, how big or small range is to find minimum index, worst case will be O(√n) as there are only 3*√n operations.

Let’s take an example and see how it works. Find minimum in range (2,7)

range minimum query example

To get RMQ(2,7) in the array, what are the chunks with are lying with in range, it’s just one, chunk 1. , minimum index of chunk 1 is M[1] = 5. Now find minimum index in chunk 0 where start of range lies (starting from start of range which 2). It’s index 2.  Last find minimum from start of chunk where end of range lies to end of range. It’s index 6.

At end compare A[5] (minimum of all chunks between start and end of range ), A[2] (minimum in chunk where start of range lies) and A[6] (minimum in chunk where end of range lies) and we have the answer as 5 as it minimum of three.

Aggregating all thing, we found a way to optimize solution of range minimum query with complexity as ((o(n), O(√n)).

Range minimum query using sparse table

Method 3 uses only O(√n) space however query time complexity is also O(√n). To reduce query time at the expense of space, there is another method called as sparse table method. This method uses features of method 2 (dynamic programming) and feature 3 (find minimums of chunks).

In this approach, we split inout array into chunks of size 2j where j varies from 0 to log n and n is number of elements in array. There will be n log n such chunks and hence the space complexity becomes O(n log n).

spare table implementation of RMQ

After splitting, find minimum in each chunk and store it into look up table. M[i][j] stores minimum in range from i  with size 2j. For example, M[0][3] store minimum index between 0 and 7 (8 elements). Now problem is how to create this lookup table? This table can be created using dynamic programming from bottom up. Specifically, we find minimum in a block of size 2j by comparing the two minima of its two constituent blocks of size 2j-1. More formally,

M[i,j] = M[i, j -1] if A[M[i, j -1]] <= A[M[i+2j-1, j -1]] 
M[i, j] = M[i+2j-1, j -1] otherwise.

How to find minimum index in given range now? Idea is to find two sub ranges which cover the entire range and then find minimum of minimum of these two ranges. For example, find RMQ(i,j). If 2k be size of largest block that fits into the range from i to j, then k = log(j – i + 1).
Now, we have two parts to look in from i to i+2k + 1 (already computed as M[i,k] ) and from j-2k+1 (already computed as M[j-2k+1, k],).
Formally,

    RMQ(i,j) =  M[i][k] if A[ M[i][k] ] <= A[M[j-2k+1, k]]
    RMQ(i,j) =  M[j-2k+1, k]

These two block entirely cover the range and since only once comparison required, complexity of lookup will be O(1).

In this post we discussed various ways to implement range minimum query based on space and time complexity trade off. In future posts, we will discuss about applications of RMQ such as segmented trees and least common ancestor problem.

Please share if something is wrong or missing, we would love to hear from you.

Intersection of two arrays – Three ways

Intersection of two arrays

Today’s problem involves two arrays, which are not sorted. All we need to find is intersection of two arrays.  By finding intersection, we mean find common elements in given two arrays. For example,

Array1 => 1,4,3,2,5,6
Array2 => 3,2,1,5,6,7,8,10

Intersection of array1 and array2 => 1,3,2,5,6

Ways to find intersection of two arrays

Method 1 : Sort array and then use binary search

As arrays given are unsorted, sort one of the array, preferably the larger one. Then search each element of other array in first sorted array using binary search. If element is present, it goes into intersection array else not.

Complexity of this method to find intersection is O(N log N) for sorting and then M log N for searching. Effective time complexity becomes O((N+M) log N) which is not optimal.

Method 2 : Sort and use merge to find common elements

Again this method, sort two arrays first. Then use two pointers to scan both arrays simultaneously. (Please refer to merge part of merge sort). Difference is we have to put in result only those elements which are common, instead of all.

Complexity of this method is O(N log N) + O ( M log M ) for sorting and then O(M+N) for scanning both arrays which is worst then the first method.

Method 3: Use hash

Create a hash with key as elements from smaller array (saves space). Then scan through other other and see if the element is present in hash. If yes, put into intersection array else not.

This method has complexity of O(N) where N is number of elements in bigger elements and extra space complexity of O(M) where M is number of elements in smaller array.

These method to find intersection of two arrays do not work when there are duplicate elements in any of the array as they will be part of intersection array only once. While in other methods, duplicate elements will be present more number of times then they should be.

Please share if there is something wrong or missing. we would love to hear from you. If you want to contribute to website, please refer to publishing at algorithm and me

Minimize maximum lateness : Greedy algorithm

Minimize maximum lateness

Since we have chosen the greed with problem like interval partitioning, coin change problem, let’s continue with it for one more problem at least. Today’s problem is to minimize maximum lateness of a task. Let me clarify problem in detail: We have a processor which can process one process a time and as always given list of processes to be scheduled on that process, with intention that maximum late process should be minimized. Contrary to previous problems, this time, we are not provided with start time and end time, but we are given length of time (ti) process will run and deadline it has to meet (di), fi is actual finish time of process completion.

Lateness of a process is defined as

l(i) = max{0, fi − di}, i.e. the length of time past its deadline that it finishes.

Goal here schedule all jobs to minimize maximum lateness L = max  {l(i)} For example:

minimize maximum lateness

Minimize maximum lateness : Optimization strategy

Let’s decide our optimization strategy. There are some order in which jobs can be decided: shortest job first, earliest deadline first, least slack time first.

Let’s see if any of above strategy works for optimal solution. For shortest processing time first, consider example P1 = (1,100) P2 = (10, 10). If we schedule shortest job first as in order (P1, P2), lateness will be 91, but it we take them as (P2, P1), lateness will be 0. So, clearly taking shortest process first does not give us optimal solution.

Check for the smallest slack time approach. See if you can come up with some counter example that it does not work.

That’s leaves us with only one option, take the process which has most pressing deadline, that is the one with smallest deadline and yet not scheduled. If you have noticed, example given for problem statement is solved using this method. So, we know it works.

Minimize maximum lateness : Greedy algorithm 

1. Sort all job in ascending order of deadlines
2. Start with time t  = 0;
3. For each job in list
   3.1 Schedule the ob at time t
   3.2 Finish time = t + processing time of job
   3.3 t = finish time
4. Return (start time, finish time) for each job

Code for greedy algorithm returns a schedule and maximum lateness which will occur with chosen schedule.

Minimize maximum lateness implementation

Complexity of implementation is dominated by sort function , which if good sorting algorithm is used is O(n log n), rest of processing takes O(n).

Please share your suggestions or if you find something is wrong in comments. We would love to hear what you have to say. I you find this post interesting, please feel free to share or like.