Merge sort algorithm

Merge Sort Algorithm and implementation

Merge sort algorithm is unique sorting algorithm in the sense that it requires additional space for sorting data. However, merge sort is effective while sorting huge amount of data which cannot be brought in main memory. In this post will learn the algorithm and merge sort algorithm implementation in C.

We can classify sorting algorithms based on their complexity : selection sort, bubble and insertion sort have complexity of O(n2) while heap sort, merge sort and quicksort (with some assumptions) have complexity of  O(nlogn) and count and Radix sorts are linear O(n) algorithms.

Merge Sort Algorithm

Merge sort algorithm falls in divide and conquer class of algorithms. In divide and conquer strategy, input space is divided into subproblems, then subproblems are solved individually, combined together to solve original problem.

merge sort algorithm implementation

Split step
In merge sort algorithm, problem is divided into two subproblems repeatedly, to achieve smaller subproblem. When subproblem can not be further divided, it is solved (in this case sorted). Then solution is combined up till all of original input is sorted. (Problem to start with is solved)
Which is smallest or unbreakable subproblem in this algorithm? When input space cannot be further divided, that must be the smallest problem. In case of an sorting on array of integers, smallest subproblem will be with only one element to be sorted.

Merge step
We learned split (divide) and process subproblems, how to combine solutions to split problems to solve original problem? It’s the conquer part of divide and conquer strategy.
Sort smallest problem input space and combine them together with merge operation. In merge operation, scan two sorted inputs (which were made in split step), compare elements one by one and put smaller element in output.

If an input is completely scanned, copy remaining elements of other input to output. Copy this sorted output array back to original input so that it can be combined with bigger problem input.

Let’s see merge sort example step by step.
merge sort example
merge sort algorithm example

Merge sort algorithm implementation


int merge(int a[], int start, int mid, int end){
    int i,j,k;
    int temp[end-start+1];
    i = start; j = mid+1; k =0;
    while(i <= mid && j <= end){
        if(a[i]< a[j]){
            temp[k++]= a[i++];
        else {
            temp[k++]  = a[j++];
    while(i <= mid){
        temp[k++] = a[i++]; 
    while(j <= end){
        temp[k++] = a[j++]; 
    for(i=0; i<k; i++){
        a[i+start] = temp[i];

int mergeSort(int a[], int start, int end ){
    int mid  = (start + end)/2;
        //sort the left part
        mergeSort(a, start, mid);
        //sort right part
        mergeSort(a, mid+1, end);
        //merge them together
        merge(a, start, mid, end);

int main(){
	int a[] = {2,3,4,1,8,7,6,9};
	int size = sizeof(a)/sizeof(a[0]);
	mergeSort(a, 0, size-1);
	for(int i=0; i<size; i++){
		printf("%d ", a[i]);

Merge sort algorithm implementation details

In mergesort() function, three parameters are received, input array a[], start index and end index of array. Start and end index change in every recursive invocation of mergesort() function.
Find middle index from start and end index of array and recursively call mergesort() function for two subarray; one from start to mid index and other being from mid+1 to end index of original array.
When there is only element in input array, it’s the termination condition for recursion. Start winding up and call merge function. Merge function takes four parameters, input array, start, end and middle index. Merge function merges two subarrays (one from start to mid and other from mid+1 to end) into an single output array with size end-start +1. Output array is then returned to calling function which sorts two bigger parts of array.

Complexity of merge sort implementation

With each successive call to function, input space is divided into two parts. Tree of this input space division looks like:
analysis of merge sort algorithm implementation

For each level from top to bottom,  each level calls merge method on n/level elements for level times. For example, at level 2, call to merge method on 2 subarrays of length n/2 each. The complexity here is 2 * (cn/2) = cn.  At level 3, calls merge method on 4 subarrays of length n/4 each. The complexity here is 4 * (cn/4) = cn and so on.

Height of recursion tree is (logn + 1) for a given n. Thus, overall complexity is (logn + 1)*(cn).  Hence, total complexity of merge sort implementation will be O(nlogn).

However, there is one caveat, merge step requires an auxiliary array to temporarily store array elements. It incurs O(n) space complexity on merge sort algorithm.

Thanks Prince Mishra for sharing python code.

# recursive merge sort

def merge(left, right):
    ret = []
    i = 0
    j = 0
    while i<len(left) and j<len(right):
        if left[i] <= right[j]:
            j += 1

    ret += left[i:]
    ret += right[j:]
    return ret

def merge_sort(arr):
    # terminal condition
    if len(arr)<2:
        return arr

    # propagation
    mid = len(arr)/2
    left = merge_sort(arr[:mid])
    right = merge_sort(arr[mid:])
    return merge(left, right)

test_arr = [2,3,4,1,8,7,4,6,6,9]
print merge_sort(test_arr)

There are some improvements which can be done in implementation of merge sort algorithm.
1. When number of elements are less than some threshold, use insertion or selection sort to sort those numbers. Why? Because, when n is small, O(n2) is as good as O(nlogn) and it saves extra overhead of split and merging. All in all, using insertion sort in input array with small size, gives better performance.
2. Before calling merging, check if all elements in right array are greater then left array, if yes, no need of merging. This can be easily checked by comparing a[mid] with a[mid+1]. If a[mid] is less than a[mid+1],  two sub arrays are already sorted and we don’t need to perform merge.

External Merge sort algorithm

Merge sort algorithm is best used with huge volume data as compared to available main memory. For example, sorting 2GB data with 100 MB of physical memory. Data cannot fit in memory, so kept on to disk. During processing, data is fetched in chunks and merged.

There are two passes in the external merge sort algorithm : Sort pass and merge pass.
external merge sort

Sort pass

1. Divide the input in N chunks where N  = Size of total data/Size of available memory.
2. Load each chunk in main memory and sort it with any conventional sorting algorithm. 
3. Load a predefined block of the sorted chunks into memory again. keep a buffer to store sorted output.

Merge Pass

1. Now have N-way merge and put output on to buffer. As buffer get full, write that onto disk. 
2. If any of the small chunk taken if exhausted, one can fill the next block from the same chunk.

External merge sort implementation can be improved significantly using parallelism. If data chunks are written on different disks, read and write can be performed in parallel. Also, different chunk can be sorted on different processors in parallel. If processors are not available, merge routine can take advantage of multithreading.

There is one problem which is classic example of usage of external merge sort is mentioned here in problem number 5 and solved here : External merge sort application.

If you have any other optimization trick or better way for merge sort implementation, please share. Please share your views and suggestions.
If you are interested in contributing to algorithm and me, please refer to publishing on algorithms and me