Lowest common ancestor in binary search tree

We saw how finding the Kth smallest element in a binary search tree can be easily solved using the BST property, which states that all nodes in the left subtree of a node are smaller than the node and all nodes in the right subtree are greater than the node itself. Another problem which uses this property of the binary search tree (BST) is finding the lowest common ancestor, commonly known as the LCA.

The problem statement is: given two nodes of a BST, find the lowest node which is an ancestor of both given nodes, that is, their lowest common ancestor (LCA). For example, in the following tree, the LCA of 25 and 37 is 30.

[Figure: lowest common ancestor of 25 and 37 in a BST]

Can we use the property "all nodes in the left subtree are smaller and all nodes in the right subtree are greater than the node" to solve this problem? The lowest common ancestor of two nodes is the node where the paths to the two nodes diverge, i.e. one node is smaller than the current node and the other is greater. For each candidate LCA, there are four cases to be considered:

Case 1 : Both nodes are less than the current node.
In this case, there must be a node lower down in the left subtree where the paths of the two nodes diverge. Hence, look for the LCA in the left subtree of the current node.

Case 2 : Both nodes are greater than the current node.
In this case, there must be a node lower down in the right subtree where the paths of the two nodes diverge. Hence, look for the lowest common ancestor in the right subtree of the current node.

Case 3 : One of the two nodes is an ancestor of the other.
In this case, the LCA is that ancestor node.

Case 4 : One node is on the left and the other on the right side of the current node.
Then the current node is the lowest common ancestor of the two nodes.

Implementation of lowest common ancestor algorithm

#include<stdio.h>
#include<stdlib.h>
 
struct node{
	int value;
	struct node *left;
	struct node *right;
};
 
typedef struct node Node;
 
int leastCommonAncestor(Node *root, int val1, int val2){

    if(!root)
        return -1;

    /* Case 3 : current node is one of the two values,
       so it is an ancestor of the other */
    if(root->value == val1 || root->value == val2)
        return root->value;

    /* Case 4 : one value is less and the other greater than
       the current node; paths diverge here, we found the LCA */
    if((root->value > val1 && root->value < val2) ||
       (root->value < val1 && root->value > val2)){
        return root->value;
    }
    /* Case 1 : both values are less than the current node,
       look in the left subtree */
    else if(root->value > val1 && root->value > val2){
        return leastCommonAncestor(root->left, val1, val2);
    }
    /* Case 2 : both values are greater than the current node,
       look in the right subtree */
    else{
        return leastCommonAncestor(root->right, val1, val2);
    }
}
 
Node * createNode(int value){
  Node *newNode =  (Node *)malloc(sizeof(Node));
 
  newNode->value = value;
  newNode->right= NULL;
  newNode->left = NULL;
 
  return newNode;
 
}
 
Node * addNode(Node *node, int value){
  if(!node) return createNode(value);

  if (node->value > value)
        node->left = addNode(node->left, value);
  else
        node->right = addNode(node->right, value);
 
  return node;
}
 
/* Driver program for the function written above */
int main(){
  Node *root = NULL;
  //Creating a binary tree
  root = addNode(root,30);
  root = addNode(root,20);
  root = addNode(root,15);
  root = addNode(root,25);
  root = addNode(root,40);
  root = addNode(root,37);
  root = addNode(root,45);
 
  printf("\n least common ancestor: %d ",
              leastCommonAncestor(root, 15, 25));
 
  return 0;
}

The candidate LCAs reduce by half with every comparison. In the worst case, the number of comparisons equals the depth of the tree. The depth of a tree, provided the tree is not completely skewed, is roughly log N, hence the complexity of this algorithm is O(log N). If the tree is completely skewed and there is an ancestor-descendant relationship between the given nodes, we may need to compare N nodes and the complexity becomes O(N) in the worst case.
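Since the search only ever follows one child pointer, the recursion above can also be replaced by a simple loop, bringing the extra space down to O(1). A minimal iterative sketch, assuming the same Node type as above:

int leastCommonAncestorIterative(Node *root, int val1, int val2){
    while(root){
        if(root->value > val1 && root->value > val2)
            root = root->left;     /* both values lie in the left subtree */
        else if(root->value < val1 && root->value < val2)
            root = root->right;    /* both values lie in the right subtree */
        else
            return root->value;    /* paths diverge here, or root matches one value */
    }
    return -1;                     /* empty tree */
}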

Please share if there is something wrong or missing. If you are interested in contributing to the website or have an interview problem to share, please contact us.

Find Kth smallest element in binary search tree

Finding the Kth smallest element in an array is a well-known problem. Today's problem is to find the Kth smallest element in a binary search tree.

By the Kth smallest node/number, we mean that there are K-1 nodes in the tree with values less than this node. For example, the first smallest node is the smallest node in the tree, the second smallest is one greater than the smallest, and so on.

In the below tree, the 5th smallest node is 37.

[Figure: Kth smallest element in a binary search tree]

Finding the Kth smallest node uses the property of a binary search tree (BST) that all nodes on the left side of a node are less than the node and all nodes on the right are greater than it.
If we know the number of nodes on the left side of the current node, we can decide whether the Kth smallest node is the current node itself, or is present in its left or right subtree. Based on this information, we can discard one half of the binary search tree and recursively look for the node in the left or right subtree only.

Algorithm to find Kth smallest element in BST

1. Start from the root, call it currentNode.
2. Count the number of nodes in the left subtree of currentNode, call it X.
   2.1 If X == K-1, return currentNode.
   2.2 If X >= K, search for the Kth smallest in the left subtree.
   2.3 If X < K-1, search for the (K-X-1)th smallest in the right subtree.

Kth smallest element in binary search tree implementation

#include<stdio.h>
#include<stdlib.h>
 
struct node{
	int value;
	struct node *left;
	struct node *right;
};
 
typedef struct node Node;
 
/* This function calculates the size of the subtree rooted at node */
int sizeOfTree(Node *node){
  if(!node)
    return 0;
 
  return 1 + sizeOfTree(node->left)
           + sizeOfTree(node->right);
}
 
int findKthNode(Node *root, int K){
 
    if(!root)
      return -1;   /* not found */
 
    int no_left = sizeOfTree(root->left);
 
    /* If there are exactly K-1 nodes in the left subtree,
       the current node is the Kth smallest */
    if(no_left == K-1){
      return root->value;
    }
    /* If there are K or more nodes in the left subtree,
       the Kth smallest lies there */
    else if(no_left > K-1){
      return findKthNode(root->left, K);
    }
    /* Otherwise skip the left subtree and the current node, and
       look for the (K-no_left-1)th smallest in the right subtree */
    else{
      return findKthNode(root->right, K-no_left-1);
    }
}

Node * createNode(int value){
  Node *newNode =  (Node *)malloc(sizeof(Node));
 
  newNode->value = value;
  newNode->right= NULL;
  newNode->left = NULL;
 
  return newNode;
 
}
 
Node * addNode(Node *node, int value){
  if(node == NULL){
      return createNode(value);
  }
  else{
     if (node->value > value){
        node->left = addNode(node->left, value);
      }
      else{
        node->right = addNode(node->right, value);
      }
  }
  return node;
}
 
/* Driver program for the function written above */
int main(){
  Node *root = NULL;
 
  //Creating a binary tree
  root = addNode(root,30);
  root = addNode(root,20);
  root = addNode(root,15);
  root = addNode(root,25);
  root = addNode(root,40);
  root = addNode(root,37);
  root = addNode(root,45);
 
  printf("\n Kth smallest node : %d ",
            findKthNode(root, 4));
 
  return 0;
}

Another method, easy to implement and suggested by skeptic in the comments, is to do an inorder traversal of the binary search tree and keep a count of the nodes visited. Once the count reaches K, we have found the desired node.

#include<stdio.h>
#include<stdlib.h>
 
struct node{
	int value;
	struct node *left;
	struct node *right;
};
 
typedef struct node Node;
 
 
/* Inorder traversal with a running counter *n;
   returns the Kth smallest value, or -1 if not found */
int findKthNode(Node *root, int *n, int K){
    if(!root) return -1;
 
    int left = findKthNode(root->left, n, K);
    if(left != -1) return left;
 
    (*n)++;
    if(*n == K)
        return root->value;
 
    return findKthNode(root->right, n, K);
}
 
Node * createNode(int value){
  Node *newNode =  (Node *)malloc(sizeof(Node));
 
  newNode->value = value;
  newNode->right= NULL;
  newNode->left = NULL;
 
  return newNode;
 
}
 
Node * addNode(Node *node, int value){
  if(node == NULL){
      return createNode(value);
  }
  else{
     if (node->value > value){
        node->left = addNode(node->left, value);
      }
      else{
        node->right = addNode(node->right, value);
      }
  }
  return node;
}
 
/* Driver program for the function written above */
int main(){
  Node *root = NULL;
  int n = 0;
  //Creating a binary tree
  root = addNode(root,30);
  root = addNode(root,20);
  root = addNode(root,15);
  root = addNode(root,25);
  root = addNode(root,40);
  root = addNode(root,37);
  root = addNode(root,45);
 
  printf("\n Kth smallest node : %d ",
              findKthNode(root, &n, 4));
 
  return 0;
}

This inorder-with-counter method is simple and efficient: it keeps track of the number of nodes visited and stops the traversal as soon as K nodes have been traversed. The same method can be used to find the Kth largest element, where we do the inorder traversal in reverse (right subtree, node, left subtree).
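A minimal sketch of the Kth largest variant, assuming the same Node type and counter convention as the code above:

/* Kth largest via reverse inorder traversal: right, node, left.
   *n counts visited nodes; returns -1 if not found. */
int findKthLargest(Node *root, int *n, int K){
    if(!root) return -1;
 
    int res = findKthLargest(root->right, n, K);
    if(res != -1) return res;
 
    (*n)++;
    if(*n == K)
        return root->value;
 
    return findKthLargest(root->left, n, K);
}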

For the first approach, at any point of time we discard half of the candidates, so ideally it should be O(log N). However, this is not the case: for every node on the search path we also traverse its entire left subtree to count the nodes in it. Even on a balanced tree these counts cost N/2, then N/4, and so on down the path, which sums to O(N); on a completely skewed tree the repeated counting can cost O(N²) in the worst case. In the second approach, we traverse at most H + K nodes, where H is the height of the tree (the initial descent plus K counted nodes), hence the complexity is O(H + K).

Please share if there is something missing or wrong. If you are willing to contribute to the website or have an interview experience to share, please contact us.

Find Kth smallest element in array

Given a set of numbers in unsorted order, find the Kth smallest element in the array. The Kth smallest element means there are K-1 elements in the set which are smaller than it, and N-K which are greater.

We can solve this problem in two simple steps:
1. Sort the input set.
2. Read the element at the Kth position of the sorted array; it is the Kth smallest element (see the sketch below).
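For reference, these two steps map directly onto the C library's qsort; a minimal sketch (the comparator and wrapper names are just illustrative):

#include<stdio.h>
#include<stdlib.h>
 
/* comparator for qsort: ascending order of integers */
static int cmpInt(const void *a, const void *b){
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x > y) - (x < y);
}
 
/* Kth smallest by sorting first: O(N log N) time */
int kthSmallestBySorting(int A[], int n, int K){
    qsort(A, n, sizeof(int), cmpInt);
    return A[K-1];   /* K is 1-based */
}
 
int main(){
    int A[] = {4,2,1,7,5,3,8,10,9,6};
    printf("%d\n", kthSmallestBySorting(A, 10, 5)); /* prints 5 */
    return 0;
}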

The thing to ponder here is: can we do better? Can we avoid the second step? Can we reduce the size of the input to be sorted, or avoid sorting the input at all? The answer to all of the above questions is yes. Think quicksort.

Algorithm to find Kth smallest element

Quicksort does not completely sort the two subarrays at the point it gives us the correct position of the pivot. By the partition property of quicksort, we know that all elements on the left side of the pivot are less than the pivot.

Let's say the pivot ends up swapped into the jth position of the array, so there are j elements less than the pivot. Can we use this information to find the Kth smallest element? Absolutely yes.
If j is less than K-1, the left part is too small and we need to look in the right part of the input for the Kth smallest element; note that K stays the same, because the pivot's position is measured with respect to the whole array (see the example below).
We have not sorted the left or right part, and yet we have cut the candidates roughly in half.
If j is greater than K-1, we need to look in the left part of the input and can discard the right part.
As we partition and compute pivot positions, we never need to scan the array afterwards: as soon as the position of a pivot equals K-1, we have found our element.

Let's say we have an array with the elements 4,2,1,7,5,3,8,10,9,6 and we need to find the fifth smallest element in this array. In this example the output will be 5.
[Figure: finding the Kth smallest element in an array]

We take the first element, 4, as the pivot and rearrange the remaining array so as to find the correct position of 4.

Once 4 is in its correct position, all the elements on its left side are less than the pivot, i.e. 4, and those on the right side are greater.
Now check the position of the pivot: if it is K-1 (the array is zero-indexed), the pivot is our element.
Here the final position of 4 after the first run of the partition function is index 3.
[Figure: array after the first partition]

Since the pivot position 3 is less than K-1, i.e. 4, we look for the element on the right side of the pivot. Notice that we do not reduce K, as the pivot position is given with respect to the whole array and not the subarray under consideration.

In the second iteration, 5 is the pivot. After this second partition, the pivot index is 4, which equals K-1. Hence the 5th smallest element in this array is 5. Note that we have not sorted the complete array.

Kth smallest element in array implementation

#include<stdio.h>
#include<stdlib.h>
#include<math.h>

void swap(int *a, int *b){

    int temp = *a;
    *a = *b;
    *b = temp;

}
/* Lomuto-style partition; end is exclusive. Returns the final
   (absolute) index of the pivot A[start]. */
int partition(int A[], int start, int end){
    int pivot = start;
    int i = start+1;
    int j = start+1;   /* next slot for an element smaller than the pivot */
    for(; i<end; i++){
        if(A[i]<A[pivot]){
            swap(&A[i],&A[j]);
            j++;
        }
    }
    swap(&A[pivot],&A[j-1]);
    return j-1;
}

void quick_sort(int A[], int start, int end, int K){
    if(start < end){
        int part = partition(A, start, end);
        if(part == K-1){
            printf("Kth smallest element : %d\n", A[part]);
            return;   /* found it, stop recursing */
        }
        if(part > K-1){
            quick_sort(A, start, part, K);
        }
        else{
            quick_sort(A, part+1, end, K);
        }
    }
}

/* Driver program for the function written above */
int main(){
    int array[] = {4,2,1,7,5,3,8,10,9,6};
    int n = sizeof(array)/sizeof(array[0]);
    quick_sort(array, 0, n, 5);   /* 5th smallest, as in the example above */
    return 0;
}

As we reduce the candidates by roughly half each time, while paying a linear partition cost, the recurrence is T(n) = T(n/2) + O(n).

This solves to O(n) on average, since we recurse into only one side of each partition.
There is a catch here: if the input is already sorted, the worst-case complexity becomes O(n²). We could first scan the input and, if it is sorted, simply count along and read off the Kth smallest element directly. Please refer to the quicksort article for how the pivot can be selected so that quicksort avoids its worst-case behavior irrespective of the nature and order of the input.

Finding the Kth smallest element is a typical application of quicksort. We can apply the quicksort partitioning idea to any problem where we need the relative position of an element with respect to the others without actually sorting the input.

Please share if there is something wrong or missing. If you want to contribute to Algorithms and Me, please contact us.

Quicksort algorithm in C

Quicksort is a sorting algorithm which uses the divide and conquer technique. The basic idea of the algorithm is to partition the input around a pivot and then sort the two parts, one on the left side of the pivot and the other on the right.

In this article we will learn the quicksort algorithm, write code for it, and see how a modified quicksort can be applied to other problems. The best part of quicksort is that it removes the need for the extra space required by merge sort. However, it comes with a caveat: the input space may not always be divided into two equal halves, which can degrade the performance of the algorithm considerably. We will see when this case arises.

Quicksort algorithm analysis

We will understand the algorithm using an example. Let's say we have an array of integers A = [3,4,10,5,6,7,9,10,19,2,1]. There are three parts to the algorithm:

1. Select a pivot and place it at its correct position.
2. Once the pivot is in its place, sort the left part of the array.
3. Then sort the right part.

Steps 2 and 3 are easily implemented as a recursive function.
Step 1: Partition
Select the first element of the input, A[0], as the pivot: pivot = A[0]. Move the pivot into its correct position such that all the elements on the left side of the pivot are smaller than the pivot and all on the right side are greater.

a. Start from the next element in the input and find the first element which is greater than the pivot selected above. Let that be the ith position.
b. Start from the end of the array and find the first element from the end which is smaller than the pivot. Let it be the jth position.
If i and j have not crossed each other, i.e. i < j, then swap the elements at the ith and jth positions, and repeat a and b.
Once i and j cross each other, swap the pivot with the element at the jth position. After this step, the pivot will be at its correct position and the array will be divided into two parts, one on the left side of the pivot and the other on the right side.

For the input array A = [3,2,19,10,6,7,10,1,4,5], the diagram below shows the pivot and the execution of the partition function.
[Figure: quicksort partition walkthrough]

We can see that all the elements on the left side of the pivot are less than the pivot and those on the right are greater. So our pivot is now at its correct position.

int partition(int a[], int start, int end){
    int pivot = start;
    int i = start + 1;
    int j = end;    /* here end is the index of the last element */

    while(i < j){
        /* find first element from the left greater than the pivot */
        while(i <= end && a[i] <= a[pivot])
            i++;
        /* find first element from the right smaller than the pivot */
        while(j > start && a[j] > a[pivot])
            j--;
        if(i < j)
            swap(&a[i], &a[j]);
    }
    /* i and j have crossed; put the pivot at its final position */
    swap(&a[pivot], &a[j]);
    return j;
}

Step 2 : Sort
Apply the quicksort algorithm to the left part of the array. Repeat step one until a part has only one element left.

Step 3 : Sort
This is the same as step two, applied to the right part of the array.

Quicksort implementation in C

#include<stdlib.h>
#include<stdio.h>

void swap(int *a, int *b){
    int temp = *a;
    *a = *b;
    *b = temp;
}

/* Lomuto partition; end is exclusive. j tracks the last
   position holding an element smaller than the pivot. */
int partition(int A[], int start, int end){

        int i = start+1;
        int j = start;
        int pivot = start;

        for(; i<end; i++){
            if(A[i]<A[pivot])
                swap(&A[i],&A[++j]);
        }
        swap(&A[pivot],&A[j]);

        return j;
}

void quickSort(int A[], int start, int end){
        int part ;
        if(start <end) {
                part  =  partition(A, start, end);
                quickSort(A, start,part);
                quickSort(A, part+1, end);
        }
        return;
}

/* Driver program */

int main(){
        int a[]= {0,1,1,1,0,0,0,3,3,3,4,4};
        int n  =  sizeof(a)/sizeof(a[0]);

        quickSort(a,0,n);

        /* print the sorted array */
        for(int i = 0; i < n; i++)
                printf("%d ", a[i]);
        printf("\n");

        return 0;
}

Thanks Prince Mishra for Python implementation of quicksort

# Classic divide and conquer
# partition and combine

def partition(arr, start, end):
    pivot = start

    # all elements smaller than pivot should be to its left
    # all elements greater than pivot should be to its right

    i = start + 1
    j = end

    while i < j:
        while i<end and arr[i] <= arr[pivot]:
            # find first element greater than pivot
            i+=1
        while j>start and arr[j] > arr[pivot]:
            # find first element smaller than pivot
            j-=1
        if i < j:
            # swap if they haven't crossed
            arr[i], arr[j] = arr[j], arr[i]
    ret = pivot
    if arr[j] <= arr[pivot]:
        arr[pivot], arr[j] = arr[j], arr[pivot]
        ret = j
    return ret

def quicksort(arr, start, end):
    # terminal condition
    if start >= end:
        return

    # propagation
    partition_index = partition(arr, start, end)
    quicksort(arr, start, partition_index-1)
    quicksort(arr, partition_index+1, end)


test_arr = [8,3,4,1,8,7,4,6,6,9]
quicksort(test_arr, 0, len(test_arr)-1)
print(test_arr)

The complexity of quicksort depends on the partitions. If the partitions are balanced, quicksort performs as well as merge sort, whereas if the partitions are unbalanced it becomes as bad as insertion sort.
The worst-case behavior of quicksort is notoriously O(n²), which occurs when the input is already sorted and the pivot is the first or last element of the array. In that case, the array is divided into two parts, one with no elements and the other with n-1 elements.
Similarly, when we work on the subarray of n-1 elements, it is again divided into subarrays of size 0 and n-2. To completely sort the array it splits n-1 times, and each split traverses the remaining elements to find the correct position of the pivot. Hence the recurrence relation is T(n) = T(n-1) + T(0) + O(n), where O(n) is the partition cost.
Solving this equation gives T(n) = O(n²).
Hence the overall worst-case complexity of quicksort is O(n²).

The worst case of quicksort happens when the input array is already sorted in increasing or decreasing order. Even though the worst case is O(n²), in the average case, if the pivot is picked wisely, quicksort can sort in O(n log n). The best part is that no extra space is involved. The hidden constants are also very small compared to merge sort, and it works very efficiently with virtual memory.

Best-case partitioning happens when the two partitions are roughly equal in size, i.e. of size n/2. In that case the recurrence relation becomes T(n) = 2T(n/2) + O(n), which solves to T(n) = O(n log n).

Average case of quicksort is closer to best case than worst case.

Selection of Pivot
If the array is completely sorted, the worst-case behavior of quicksort is O(n²). So here comes another problem: how can we select the pivot so that the two subarrays are nearly equal halves of the input array? There have been many solutions proposed:

1. Taking the median of the array as the pivot. How to select the median of an unsorted array is a problem we will look into separately, but yes, it guarantees two halves of the same size.
2. Selecting the pivot randomly, as sketched below.
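A minimal sketch of the randomized variant, reusing the swap and partition functions defined above (remember to seed the generator once, e.g. with srand(time(NULL)), before sorting):

#include<stdlib.h>
#include<time.h>

/* Move a randomly chosen element into the pivot slot, then
   partition as usual; end is exclusive as in the code above */
int randomizedPartition(int A[], int start, int end){
    int r = start + rand() % (end - start);
    swap(&A[start], &A[r]);
    return partition(A, start, end);
}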

This is a basic article explaining the concept of quicksort. In another post we see how this algorithm can be used to find the Kth largest or smallest element in an array of integers.

Things to ponder on

  1. Is quicksort stable? No, it is not.
  2. Is it suitable for parallel execution?
  3. Can we switch to insertion sort when a subarray has a small number of elements? Left as an exercise.

For question 2: yes, we can sort the left and right subarrays on parallel machines and then combine the results to form the larger array.
Please leave your comment in case you find something wrong or you have some improved version. If you are interested in contributing to website or have an interview experience to share, please contact us.

Count sort : Sorting in linear time

Are there any sorting algorithms with a worst-case complexity of O(n)? There are a few, such as count sort, radix sort and bucket sort. In this post we discuss count sort and a couple of problems where this counting sort algorithm can be applied. Counting sort was invented by Harold H. Seward.
To apply counting sort, we need to keep the following assumptions in mind:

  1. Values in the input repeat, i.e. there are duplicate values in the input.
  2. There are at most K different values in the input.
  3. The input values range from 0 to K.

Count sort stops being linear when K is much larger than n, e.g. of the order of n², which happens when very few elements are duplicates and the values spread over a wide range. So the above three conditions are necessary for this algorithm to work in linear time.

Count Sort - Approach

Let's see how it works. There are three steps involved in the algorithm:

  1. Sampling the data: count the frequency of each element in the input set.
  2. Accumulating the frequencies to find out the relative position of each element.
  3. Distributing each element to its appropriate place.

Now let's take an example and go through the above three steps:
The input is an array A = [1,2,3,2,2,3,3,1,1,1,1,3,2,2]. Here we can see that K = 3.
Let's perform the first step of the method, i.e. count the frequency of each element. We keep a count array and track each element's count. Since values are bounded above by K, a count array of size K+1 suffices, initialized to zero. Below, A is the input array and n is the size of the array.
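A sketch of the counting step (count is assumed to have K+1 entries, indexed by value):

/* Step 1: count the frequency of each value; values assumed in 0..K */
void countFrequencies(int A[], int n, int count[], int K){
    for(int i = 0; i <= K; i++)
        count[i] = 0;
    for(int i = 0; i < n; i++)
        count[A[i]]++;
}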

The count array now looks like count = [5,5,4]. The complexity of this step is O(n).

The second step accumulates the frequencies, adding the frequencies of previous elements to the current element:
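A sketch of the accumulation step:

/* Step 2: prefix sums; count[v] becomes the number of
   elements less than or equal to v */
void accumulateFrequencies(int count[], int K){
    for(int i = 1; i <= K; i++)
        count[i] += count[i-1];
}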

The second step runs K times, hence it has a complexity of O(K).
The third step is distribution. Let's create an output array called B, of size n.
For every element in A, scanning the input from its end, we check the corresponding count and place the element just before that position. So a 1 goes to position 4 (there are five 1s in total and the array is indexed from 0), a 2 to position 9, and a 3 to position 13; subsequent occurrences fill the lower positions of each span, as sketched below.
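A sketch of the distribution step:

/* Step 3: place each element at its final position in B,
   scanning A from the end so that equal elements keep their
   original relative order (stability) */
void distribute(int A[], int B[], int n, int count[]){
    for(int i = n-1; i >= 0; i--){
        B[count[A[i]] - 1] = A[i];
        count[A[i]]--;
    }
}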

This step has a complexity of O(n).
The total complexity of the algorithm is O(K) + O(n) + O(K) + O(n) = O(K+n). Note that K+n remains of the order of n as long as K < n; if K grows to the order of n², the complexity of the algorithm becomes polynomial instead of linear.
Applications
There is an immediate application of this algorithm in the following problem: an array contains black, white and red balls, and we need to arrange the balls so that all black balls come first, then white, and red last. Hint: assign black 0, white 1 and red 2, and see.
Next we discuss how extra space can be saved and how the initial positions of elements can be maintained, going through an interesting problem to discuss these optimizations.
Complete Code
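A complete sketch putting the three steps together for the example input (here K = 3):

#include<stdio.h>
#include<stdlib.h>

#define K 3   /* maximum value in the input */

void countSort(int A[], int B[], int n){
    int count[K+1] = {0};

    /* Step 1: count frequencies */
    for(int i = 0; i < n; i++)
        count[A[i]]++;

    /* Step 2: accumulate; count[v] is now the number of
       elements less than or equal to v */
    for(int i = 1; i <= K; i++)
        count[i] += count[i-1];

    /* Step 3: distribute, scanning the input from the end
       so the sort stays stable */
    for(int i = n-1; i >= 0; i--){
        B[count[A[i]] - 1] = A[i];
        count[A[i]]--;
    }
}

int main(){
    int A[] = {1,2,3,2,2,3,3,1,1,1,1,3,2,2};
    int n = sizeof(A)/sizeof(A[0]);
    int B[14];

    countSort(A, B, n);

    for(int i = 0; i < n; i++)
        printf("%d ", B[i]);
    printf("\n");

    return 0;
}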
Why fill each span from its end, scanning the input from the end? Because this is what makes count sort a stable sort. Here we have seen a sorting algorithm which gives us output in linear time, provided some particular conditions are met on the input data.
Let's consider a problem where we have nationals of different countries in a room and our task is to arrange them in such a way that all the nationals of the same country are clubbed together.
A simple application of count sort, isn't it? Count the nationals of each country and place them in order. Now let's add some complexity: what if all the nationals of a country we encountered earlier must come before the nationals of a country we encountered later? So for the input A = [I,I,I,U,U,S,S,E,E,E,M,M,M,I,U,M,E]
the output should be: A = [I,I,I,I,U,U,U,S,S,E,E,E,E,M,M,M,M]
The additional information we need to track is the order in which the countries were first seen while traversing the input; the rest follows count sort as before. How do we keep track of that order? We keep an array of countries, placing countries in it so that a country encountered earlier has a lower index than a country encountered later. We can build it while counting the nationals of each country: if the current national is the first we have encountered from their country, we append that country to the country array.
Code
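A minimal sketch of this order-preserving variant, representing each national by a single character (the names and the MAX_COUNTRIES bound are illustrative):

#include<stdio.h>

#define MAX_COUNTRIES 26   /* assumed upper bound on distinct countries */

/* Arrange nationals so that countries appear in the order in
   which they were first encountered in the input */
void groupByFirstSeen(char A[], int n){
    char order[MAX_COUNTRIES];        /* countries in first-seen order */
    int count[MAX_COUNTRIES] = {0};
    int m = 0;                        /* number of distinct countries */

    /* count nationals, recording each new country on first sight */
    for(int i = 0; i < n; i++){
        int j;
        for(j = 0; j < m; j++)
            if(order[j] == A[i]) break;
        if(j == m)
            order[m++] = A[i];
        count[j]++;
    }

    /* rewrite the array, one country at a time, in first-seen order */
    int pos = 0;
    for(int j = 0; j < m; j++)
        for(int c = 0; c < count[j]; c++)
            A[pos++] = order[j];
}

int main(){
    char A[] = {'I','I','I','U','U','S','S','E','E','E',
                'M','M','M','I','U','M','E'};
    int n = sizeof(A)/sizeof(A[0]);

    groupByFirstSeen(A, n);

    for(int i = 0; i < n; i++)
        printf("%c ", A[i]);
    printf("\n");

    return 0;
}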