Heaps : Merge K sorted arrays each having N elements in sorted array

Problem statement

Given K sorted arrays each having N elements, we need to merge them all in a N*K size array in sorted order.

Analysis 

Since all the arrays are sorted, candidate for the first result array position will be among the first element in all arrays. How can we find the minimum among all the elements plucked from the first index of each array ? Easy, take those K elements (there are K array, hence K first elements) and build a min heap. The root of the mi heap the list element in all arrays and hence the first element in the resultant array.

Great, now what? 
Now we need to get the next element from the array from which the previous element was selected.
Guess what, we need to maintain the information with the data, which array it came and what was the position of information in that particular array. So we will have a struct as heap element and not as only integers as we seen in previous examples/posts.

Now, based on the information stored in the heap structure, get the next element from the appropriate array (we have stored array no with the info in heap node).

If we have not already considered all elements of that array, put the next element at root of min heap and again heapify. This step will give us second least element in all K elements.

Follow the procedure for N*K times. Once an array elements are considered, we need to replace min heap root with INT_MAX so that it is not considered again 🙂

Algorithm

1. Build min heap with first element of all K arrays.
2. Pick the root of min element and put it in the resultant array.
3. If there are remaining elements in the array,
    Put that element at root of min heap and heapify again
4. If all elements are already considered, put INT_MAX and heapify again.
5. Repeat step 2,3,4 for N*K times.

Code

Complexity analysis

Complexity of code will be O(N * K * logN).

Heaps : Find K smallest element from a given array

Problem statement

Given an unsorted array of N integers, find K smallest element from it.

Analysis

We can do this very easily in O(nlogn) complexity by just sorting the array and returning first K elements. We can even get them without actually sorting all elements of the array.
Refer to following post for detailed explanation : Find Kth smallest element in unsorted array

Here we are looking at an algorithm which is as close to O(N).Heap (min heap) data structure helps us here.
There is two step algorithm.

1. First create a min heap from the given array in place. 
2. Now just perform K deletes from the created min heap and we have K smallest elements. 

Code 

Complexity analysis 

Creation of heap takes O(N) time and deletion of K elements from min heap takes O(KlogN).
So overall complexity of algorithm is O(N+ KlogN).

Heaps : Algorithms

Heap

Heap is a kind of data structures based on complete tree principle. By definition, a complete binary search tree of N levels has at least 2^N-1 nodes in it. 
There two properties heap holds :
1. Structural property : This means every level of heap will be completely full except the last level.
Last level will be filled from left to right order.
2. Heap property : It means parent node will be either greater or smaller than its children nodes.

There are two kinds of heaps based on second property:

Max Heap : It maintains the property that every parent node is greater than its children. Root of the max heap is greatest element.

Heap
Max Heap

Min Heap : It maintains the property that every parent node is less than its children. Root of the min heap is least element.

min heap
Min Heap

Usually, heaps are implemented with an array, which eases traversing from child to parent and parent to child; and viewed as tree. Height of a heap with N elements will be O(logN). 

If we want to find child of a parent node i, it would be 2i or 2i+1.

If we want to find parent of a node say i, it would be [i/2].

Given that multiply and divide by 2 can be very efficiently implemented by left and right shifting, these operations are very efficient.

Maximum elements in a heap with height h will be 2^h -1 while minimum number of elements in heap with same height will be 2^(h-1) +1.(One more than nodes with height h-1)

Watch video on heaps : Insertion and deletion in heaps


Some procedures on heaps

1. Insertion in heap

Insertion in heap happens at the end of heap. Once node is added, heapify property is restored using heapify routines.

Let’s work out an example
Current state of heap is as following, all nodes are satisfying max heap property.

Current Max heap
Now node 10 is added to the heap. We need to check if new node violates max heap property. 
We need to check that parent of currently added node is greater than it. In this case, that does not hold.
So we will swap parent node with current node.
Max heap property violated

After swapping nodes as explained in last step, we should check for if new parent in this case node 14, satisfies heap property. If not, we again swap the largest child and repeat step till we are at root node.. In our case, node 14 satisfies max heap property so we stop there. 

restored heap property

Above mentioned steps are self explanatory and there is no need of algorithm. In above heapify method we are moving down the tree, in insert operation we will move up.

Code

Complexity of this procedure is for O(logN).

2. Deletion in heap

Deletion of node is usually done at root. For deletion of root, replace root node with the last node in heap and then do down shift. As new node replacing root may violate heap property, we need to check with its children. In Max heap, check if new node is greater than both its children. If not then swap it with largest child and then again repeat it till node replacing root finds its valid place.

Algorithm (This is for max heapify)
  1. Let i be the index which we doubt that might violate heap property.
  2. left = 2i, right = 2i+1, largest = a[i]
  3. Check if left child is with in array limits, if yes, check if it is greater than the a[i].
  4. If left is greater than a[i], then largest till now is left, largest <– left.
  5. Check if right is with in array limit, if yes, check if it is greater than largest, then change the largest, largest <– right.
  6. Swap largest and a[i].
  7. Now, repeat step 1 to 6 with largest, till we see an element which does not violate heap property.

 Let’s take an example:

Current heap


Now let’s say we delete node 14. 

First step is to replace it with last node. In this case 6 will replace 14.

Last node replacing root

Node 6 violates max heap property. So, find the largest child of 6, node 12 in this case.Swap node 6 and node 12.

Heap after swapping

Again repeat step with node 6 and node 7. Final heap will be like

Final heap

Code

3. Building a heap from a given array of elements.

This operation can easily be done using heapify methods explained above. 

Algorithm

  1. Start from the middle element of the array, let’s say i
  2. Heapify with given index.
  3. Decrease index by one. Repeat step 2 till we reach first element.

Code

Complexity of this procedure is O(N).How this complexity becomes O(N) while for adding each node will need log N time to heapify and if there are N nodes, it should be O(N log N).
Idea here that nodes at the lower most level do not move at all. Nodes at one level above lower most, move at most by one level and so on.
If height of heap is h, number of node will be 2^h. So 2^h/2 = 2^h-1 nodes never move.
2^h-2 nodes move by one level and so on.

4. Heapsort

Heap sort combines property of both insertion sort (no extra space required) and merge sort (time complexity being O(NlogN)). 
It can be easily achieved by using above two procedures.

Algorithm

  1. Build MAX heap from the given array.
  2. Swap first and last element of the array. Last element is now at its proper position.
  3. Decrease the effective size of heap to be heapify.
  4. Heapify with first element of the array.
  5. Repeat step 2 , 3 and 4 till we reach the second element of the array.

Code

Complexity of above procedure is O(NlogN).

Strings : Permutations and combinations of string

Permutations and combinations of string

Permutation is arrangement of objects in various positions where order of objects will matter i.e. ab is not same as ba.

Analysis

Let us start with an example  : S  = “ABC”  What are the possible permutation?
ABC      ACB     CAB    CBA     BCA     BCA    

Fix the first character and find permutation of N-1 characters.
We fix A, permutation of BC are BC and CB.
Now place A at all the places possible, in this case it will be at start, middle and at the end.
BC leads to ABC, BAC, BCA
CB leads to ACB, CAB, CBA.
It can be easily code with the help of recursion.
Take each character starting with 0, let say current character is Kth character in the string, we need to fit this character in all the substrings of length N-K (N is length of string).  
We first swap the Kth character with the a ith position where i starts from K and reaches to N (position in remaining substring). Before going for all position of Kth character, we will move forward and check for K+1 character for its all positions in N-K-1 length substring.
Base case would be when we reach at the end of string. Hence remaining substring would be of length zero. Return.

So when at penultimate character of string, we have already varied all the positions of last character. Similarly when at Kth character back, we would have adjusted all permutations of N-K characters following it.

If we see closely at code, we would have each character of string as each position of string at some point of time as we are swapping each character with every position when moving forward.
When there are duplicate characters in string, just sort the string and check if are processing duplicate character. If yes then just skip the character.

Code to print permutations of string

Similar logic can be applied to combination problem also where we have to first output the string with the character and then without it. Following code does that.

Code to print combinations of string

Complexity of algorithm to print permutations and combinations of string is O(N!) as we need to generate N! permutations.

Strings : Reverse words in a string

Reverse words in a string

A string can contain many words which are separated by space. We need to reverse words of string.
For example if string is “this is test string” , output should be “string test is this”

Analysis

What do i get if reverse the whole string first?
In above example we get “gnirts tset si siht”. If see closely we get that the first word in the reverse string is exact reverse of the first word of the output string. Similarly second word is exact reverse of the second word of the output string.
So we can just reverse individual words in the string and get the output like string.
Hence the algorithm involves two simple steps:

1. Reverse the whole string first.
2. Reverse individual words of the output string given in step 1.

Same algorithm can be used when we want to rotate an array with a given index as pivot.
Below is code for that too.

Code to reverse words of a string

Test cases

1. A Normal string with multiple words
s = “This is test string”

2. A string with only one word
s =”string”

3. An empty string
s =””

4. A NULL pointer is passed
s =  NULL

Problem 2 : Rotate an array with a given index as pivot

Complexity of above code is O(N) without using any extra space. 

Strings : Removing particular combination of characters

Introduction : Remove particular combination of characters from string.

Given a string, we need to remove a particular combination of characters, for example, 

if s = “aacbbdccaac” and if combination is “ac” then our output will be “abbdcca”.
if s = “aacaaccd” and if combination is “ac”, then output should be “acd”.
Following will be the execution 
aacaaccd ——-> aacd ——>ad 

Note that this has to be done in linear time and without any extra space.

Analysis

Here there are two sub part of the problem:

One, we need to keep track we have already encountered the first character in the combination of characters needed to be removed.

Second we need to keep track of the position where the next character should be placed in the string if it is not to be eliminated.

First problem can be easily solved by maintaining a state machine, where we have two states and one moves from one state to another based on the input.
In current example, we have states like state “INITIAL” and “MOVED”, Whenever we encountered “a”, we move from INITIAL to MOVED. Then if we get “c” in “MOVED” state we are sure that we have encountered the whole pattern and they need to be removed. If we are in “MOVED” state and if we don’t get “c”, we move back to “INITIAL” state.

State machine

For the second part we can simply follow the approach we have used in this post, that is to maintain two pointers, one to point to character to be processed, and other to point position where current character to be placed if it is not be eliminated.
The problem can be extended with multiple characters in pattern, with change in state machine accordingly. 

Same state machine method can be used to count number of words in a given string. Whenever we encounter the separator, we move to OUT state and as soon as we see first character in OUT state we move to IN state till we again see separator.

Code 

#define INITIAL 1
#define MOVED 2

char * remove_pattern(char *s, char * pattern){

 int state = INITIAL;
        
 if(s == NULL) return NULL;
 if(pattern == NULL) return s;

 char * source = s;
 char * destination  = s;
 while(*source != ”){

/*If character is not equal to the first character of the pattern, copy it as such */
 if(state == INITIAL && *source != *pattern){
   *destination = *source;
   destination++;
 }       
 else if( *source == *pattern && state == INITIAL){
  /* Move state to MOVED */
   state = MOVED;
 }       
 else if(state == MOVED && *source != *(pattern + 1)){
 /* If next character is not as per pattern, copy previous character first. */
      *destination =  *pattern;
      destination++;
 /* If next character is first character of the pattern, move state to INITIAL */ 
     if(*source != *pattern){
         *destination  = *source;
          destination++;
          state =  INITIAL;
      }
 }      
  /* If we hit the pattern, start again and move state to INITIAL */
  else if(state == MOVED && *source == *(pattern + 1)){
       state = INITIAL;
  }
/* After moving the characters, check if they don’t make pattern again */
      
if((int)(destination-s) >= 2 && *(destination -2) == *pattern && 
*(destination-1) == *(pattern +1))     {
            destination -=2;
     source++;
}
   *destination = ”;

   return s;
}

Test cases

1. When pattern is present
s= “aaacacacbbb” pattern = “ac”

2. When pattern is not pattern
s =”aaaabdcaaa” pattern  = “ac”

3. When movement leads to pattern again
s= “abacaccdc” pattern =”ac”

4. When string is NULL or pattern is NULL
s= NULL
pattern = NULL

5. When return string is empty
s = “acacacac” pattern =”ac”

6. Just first character in pattern
s= “aaaaaaaaa” pattern =”ac”

Complexity Analysis

Complexity of above code is O(N) and no extra space used.

Strings : Remove duplicates from a string

A string is a collection of characters terminated by a special character ”. String can have duplicate characters in it. Today’s problem requires purging of those duplicate characters from string and return the string.

Problem statement 

Given a string S, remove all duplicate characters from it. 
For example,
S = “aaabbacbccd”, then output should be “abcd”. Note the output is not  “d”, that means we need to maintain once instance of the duplicate character. 

Analysis

There are two parts involved in this problem.
First, we need to keep track whether we have already processed a particular character.
Second we need to properly place the character in destination string.

For the first part hashing is the perfect technique. We can create a hash with key as character (total 256 characters) and set the value when we first encounter the character. Next time when we encounter the character, we would find the value set against that character and hence we can skip that character.

For second part we can have an auxiliary string to store the non-duplicate characters.
We can do it in-place too.
Keep two indexes, one which keeps track the character being processed in input string, other which points to the next place in the input string where the current character can be put if it is not processed already.

Code

Test cases

1. When input string contains duplicate characters
S  = “aaabaaabbbcccd”

2. When input string does not contain duplicate characters
S = “abcdefg”

3. Input string with no characters
S = “”

4 Input string pointer is NULL
S = NULL;

Execution of these test cases can be seen here.

Complexity Analysis 

Complexity of code is O(N) in time and constant memory for hash as it does not depend on the size of the input string.