# Median of stream of numbers

Given a continuous stream of integers,** find median of stream of numbers** received so far. Median of array of integers is middle element of array if number of elements in array are odd and average of middle elements if number of elements is even.

Median of stream of numbers can be queried at multiple times at different point of time. Insertion and median functions can be called in any order. For example:

add(1) add(2) findMedian() -> 1.5 Median of stream of numbers till now add(3) findMedian() -> 2 Median of stream of numbers till now

Since, this algorithm which finds median at a given point of time for a stream of integer is called online algorithm.

First solution be to store the stream received in an unsorted array. Finding median of stream of numbers will take O(N log N) (because we have to sort array first before returning median), time complexity for insertion will be O(1).

How about storing integer in a sorted array? Insertion will take O(N) as we need to shift elements once correct place is identified for new integer. Median can be found O(1).

Storing stream in linked list will have same complexity as of unsorted array.

Consider some advance data structure. Here, we need some kind of order of elements. We can divide integers in two groups: all integers less than median, let it be group 1 and all elements greater than median, call it group 2.

So greatest integer in group 1 (less than median) should be less than least integer in group 2. Now, problem reduces to find greatest integer in one group and smallest on other side. Which data structure gives that kind of information in constant time? Well, answer is heaps.

Maximum element in group 1 can be found using max heap, while minimum element in group 2 can be found using min heap in constant time. We require two heaps. Max heap and min heap. Max heap contains all integers less than current median and min heap contains integers greater than current median.

How to insert an new integer in heaps and what invariant they should always possess. Since, median is middle of all integers if number for elements are odd, we need to keep size difference of two heaps either equal 0 or 1. If size difference becomes more than 1, adjust elements in two heaps.

If size of max heap is 2 more than min heap, extract maximum element from max heap and put it in min heap. If size of min heap is 2 more than max heap, extract minimum element from min heap and put it in max heap.

Now how to calculate median of stream of numbers? It’s simple again!

If number of elements in both heaps are equal, take average of max and minimum elements in heaps and return as median of stream of numbers till now. If sizes of two heaps are different, we select median as root of heap which contains more elements.

## Median of stream of numbers implementation in C

#include <stdio.h> #define MAX_SIZE 10 typedef struct heap { int size; int items[MAX_SIZE]; }Heap; int parent (int i){ return (i-1)/2; } int leftChild(int i){ return (i*2)+1; } int rightChild(int i){ return (2*i)+2; } int swap(int a[], int i, int j){ int temp = a[i]; a[i] = a[j]; a[j] = temp; } void moveUp(int heap[], int size, short type){ int current = size-1; int p = parent(size); if(type == 0){ while(p >=0 && heap[p] < heap[current]){ /*If we parent is greater than the current element, then swap parent with current */ swap(heap,p, current); // move up the tree. current = p; p = parent(current); } } else{ while(p >=0 && heap[p] > heap[current]){ /*If we parent is greater than the current element, then swap parent with current */ swap(heap,p, current); // move up the tree. current = p; p = parent(current); } } } void heapify(int a[], int i, int length){ int largest = i; int left, right; left = leftChild(i); right = rightChild(i); if(left < length && a[i] < a[left]){ largest = left; } if(right < length && a[largest] < a[right]){ largest = right; } if(largest != i){ swap(a, i, largest); heapify(a, largest, length); } } void minHeapify(int a[], int i, int length){ int smallest = i; int left, right; left = leftChild(i); right = rightChild(i); if(left < length && a[i] > a[left]){ smallest = left; } if(right < length && a[smallest] > a[right]){ smallest = right; } if(smallest != i){ swap(a, i, smallest); minHeapify(a, smallest, length); } } int delete(int heap[], int *size, short type){ int currentSize = *size; int deletedValue = -1; if(currentSize < 1) return -1; else{ deletedValue = heap[0]; swap(heap, 0, currentSize-1); currentSize--; if(type == 0){ heapify(heap, 0, currentSize); } else{ minHeapify(heap, 0, currentSize); } } *size = currentSize; return deletedValue; } void insert(int heap[], int *size, int value, short type){ heap[(*size)++] = value; moveUp(heap, *size, type); } void insertInMaxHeap(Heap *maxHeap, int val){ insert(maxHeap->items,&(maxHeap->size),val, 0); } void insertInMinHeap(Heap *minHeap, int val){ insert(minHeap->items,&(minHeap->size),val, 1); } int deleteFromMaxHeap(Heap *maxHeap){ return delete(maxHeap->items,&(maxHeap->size), 0); } int deleteFromMinHeap(Heap *minHeap){ return delete(minHeap->items,&(minHeap->size), 1); } /* 1. Add to max heap. 2. Check if max_heap.count- min_heap.count >=2 remove top of max heap and add it to min_heap */ void addNumber(Heap *minHeap, Heap *maxHeap, int value){ // Insert in max heap insertInMaxHeap(maxHeap,value); /* If difference between two heap size is more than 1 or the maximum element in more than least element in min heap then remove from max heap and add to min heap*/ if((maxHeap->size - minHeap->size) >1 || (minHeap->size != 0 && maxHeap->items[0] > minHeap->items[0])){ int top = maxHeap->items[0]; deleteFromMaxHeap(maxHeap); insertInMinHeap(minHeap,top); } /* If difference between two heap size is more than 1 */ if((minHeap->size !=0 && (minHeap->size - maxHeap->size) > 1)){ int top = minHeap->items[0]; deleteFromMinHeap(minHeap); insertInMaxHeap(maxHeap,top); } } void printMedian(Heap minHeap, Heap maxHeap){ if(maxHeap.size == minHeap.size){ printf("\n Median %d", (maxHeap.items[0] + minHeap.items[0])/2 ); } else{ if(maxHeap.size > minHeap.size){ printf("\n Median : %d ", maxHeap.items[0]); } else{ printf("\n Median : %d ", minHeap.items[0]); } } } //Driver program int main(){ Heap maxHeap, minHeap; maxHeap.size = 0; minHeap.size = 0; for(int i=1; i<=20; i++){ if(i%4 == 0){ printMedian(minHeap, maxHeap); } addNumber(&minHeap,&maxHeap,i); } return 0; }

Execution trace of program to find median of stream of numbers. Please take pen and paper and try to go through the flow.

============================================ Processing element : 1 Max heap now: 1 ============================================ Processing element : 2Max heap now: 2 1Size of max heap greater than min heap Removing max element 2 from max heap Adding max element 2 from max heap to min heapMin heap now :2============================================ Processing element : 3Max heap now: 3 1Top of max heap is greater than top of min heap Removing max element 3 from max heap Adding max element 3 from max heap to min heapMin heap now :2 3Median : 2============================================ Processing element : 4Max heap now: 4 1Top of max heap is greater than top of min heap Removing max element 4 from max heap Adding max element 4 from max heap to min heapMin heap now :2 3 4Size of min heap greater than max heap Removing min element 2 from min heap Adding min element 2 from min heap to max heapMax heap now: 2 1============================================ Processing element : 5Max heap now: 5 2 1Top of max heap is greater than top of min heap Removing max element 5 from max heap Adding max element 5 from max heap to min heapMin heap now :3 4 5============================================ Processing element : 6Max heap now: 6 2 1Top of max heap is greater than top of min heap Removing max element 6 from max heap Adding max element 6 from max heap to min heapMin heap now :3 4 5 6Size of min heap greater than max heap Removing min element 3 from min heap Adding min element 3 from min heap to max heapMax heap now: 3 2 1============================================ Processing element : 7Max heap now: 7 3 1 2Top of max heap is greater than top of min heap Removing max element 7 from max heap Adding max element 7 from max heap to min heapMin heap now :4 6 5 7Median : 4 ============================================ Processing element : 8Max heap now: 8 3 1 2Top of max heap is greater than top of min heap Removing max element 8 from max heap Adding max element 8 from max heap to min heapMin heap now :4 6 5 7 8Size of min heap greater than max heap Removing min element 4 from min heap Adding min element 4 from min heap to max heapMax heap now: 4 3 1 2============================================ Processing element : 9Max heap now: 9 3 4 2 1Top of max heap is greater than top of min heap Removing max element 9 from max heap Adding max element 9 from max heap to min heapMin heap now :5 6 8 7 9============================================ Processing element : 10Max heap now: 10 3 4 2 1Top of max heap is greater than top of min heap Removing max element 10 from max heap Adding max element 10 from max heap to min heapMin heap now :5 6 8 7 9 10Size of min heap greater than max heap Removing min element 5 from min heap Adding min element 5 from min heap to max heapMax heap now: 5 3 4 2 1============================================ Processing element : 11Max heap now: 11 3 5 2 1 4Top of max heap is greater than top of min heap Removing max element 11 from max heap Adding max element 11 from max heap to min heapMin heap now :6 7 8 10 9 11Median : 6

Complexity of finding **median of stream of numbers **is O(1) and insertion will be O(log N).

Please share if something is wrong or missing. Also, if you are interested to contribute to site, please refer to Publishing at Algorithms and Me and contact us. We would love to have you as contributor.