**Merge K sorted arrays (K-way merge sort)**

Given K sorted arrays each having N elements, merge them all in one N*K size array in sorted order. This process is called as ** K-way merge sort** or merge k sorted arrays and used in external merge sort as explained here:Merge sort algorithm

** Merge K sorted array**

Since all input arrays are already sorted, candidate for first position in output array will be among first elements of input arrays. How can we find the minimum among all elements at first index of each array? Easy, take those K elements (there are K array, hence K first elements) and build min heap. The root of min heap is least element in all arrays and hence should be the first element in output array.

Great, we have first element selected for output. Also, we have K-1 element min heap. Now what? Now, get second element of array from which the minimum first element was selected.

For this to happen, we need to know which element was selected from which array. Maintain this information with element to point to array it came from and what was position of element. To store additional information, we need complex structure rather than primitive integer. We will use struct available in c, heap will be now of structures and not integers. This comes with more problems, we will see later in post.

Now, based on information stored in heap structure, get next element from appropriate array (we have stored array number in information in heap node). If we have not already considered all elements of that array, put next element at root of min heap and again heapify. This step will gives us second least element in all K elements.

Follow this procedure for N*K times. Once an array elements are considered, we need to replace min heap root with INT_MAX so that it is not considered again.

**Algorithm to merge K sorted arrays**

1. Build min heap with first element of all K arrays.

2. Pick root of min element and put it in output array.

3. If there are remaining elements in array, put next element at root of min heap and heapify again

4. If all elements of an array are already considered, put INT_MAX and heapify again.

5. Repeat step 2,3,4 for N*K times.

** Merge K sorted arrays : Implementation**

#include<stdio.h> #include<stdlib.h> #define INT_MAX 1000 typedef struct heapNode{ int data; int arrayNum; int itemNum; } heapNode ; int leftChild(int i){ return (i*2)+1; } int rightChild(int i){ return (2*i)+2; } heapNode * createNode( int data, int arrayNum, int itemNum){ heapNode * newNode = (heapNode *)malloc(sizeof(heapNode)); newNode->data = data; newNode->arrayNum = arrayNum; newNode->itemNum = itemNum; return newNode; } void swap(heapNode * a[], int i, int j){ heapNode * temp = a[i]; a[i] = a[j]; a[j] = temp; } void minHeapify(heapNode * a[], int i, int len){ int smallest =i; int left, right; left = leftChild(i); right = rightChild(i); if(left <= len && a[i]->data > a[left]->data){ smallest = left; } if(right <= len && a[smallest]->data > a[right]->data){ smallest = right; } if(smallest != i){ swap(a, i, smallest); minHeapify(a, smallest, len); } } void buildHeap(heapNode *a[], int len){ int i = len/2 +1; for(; i>=0; i--){ minHeapify(a,i, len); } } void mergeKSortedArrays(int a[5][5], int N, int K, int result[]){ int i; heapNode *minHeap[K]; /* Put all elements in an array */ for(i=0;i<K; i++){ minHeap[i] = createNode(a[i][0], i, 0); } /* Build min heap with those entered elements */ buildHeap(minHeap,K-1); /* Now we have to take every element one by one and replace root with that and heapify again */ for(i=0; i<N*K; i++){ heapNode * minNode = minHeap[0]; /* Put the root of min heap in result array */ result[i] = minNode->data; /* If there are elements to be considered in that array */ if(minNode->itemNum + 1< N){ minHeap[0] = createNode( a[minNode->arrayNum][minNode->itemNum + 1], minNode->arrayNum, minNode->itemNum + 1 ); } /* else put INT_MAX at root */ else{ minHeap[0] = createNode(INT_MAX, minNode->arrayNum, minNode->itemNum + 1); } /* Heapify again */ minHeapify(minHeap, 0, K-1); } } int main(){ int N = 5; int K = 4; int a[4][5] = { 2,3,5,6,7, 4,7,8,9,10, 3,5,11,13,45, 1,4,5,7,8 }; int result[K*N]; mergeKSortedArrays(a, N, K, result); for(int i=0; i<N*K; i++){ printf("%d ", result[i]); } return 0; }

Complexity of algorithm to merge k sorted arrays or __ K-way merge of arrays__ will be O(N * K * logN).

Please share if there is something wrong or missing. If you are interested in contributing to website, or have an interview experience to share, please contact us.

Pingback: Amazon interview experience 3 | Algorithms and Me()

Pingback: Merge sort implementation - Algorithms and Me()