# Interval partitioning problem

In continuation of greedy algorithm problem, (earlier we discussed : even scheduling and coin change problems) we will discuss another problem today. Problem is known as interval partitioning problem and it goes like : There are n lectures to be schedules and there are certain classrooms. Each lecture has a start time si and finish time fi. Task is to schedule all lectures in minimum number of classes and there cannot be more than one lecture in a classroom at a given time. For example, below partitioning needs us to have 4 classes for scheduling 9 lectures from 9:30 AM to 5:00 PM.

10 lectures can also be scheduled using three classrooms as shown below.

So, second solution optimizes the output.

Another variant of interval partitioning problem is :  You want to schedule jobs on a computer. Requests take the form (si , fi) meaning a job that runs from time si to time fi. You get many such requests, and you want to process as many as possible, but the computer can only work on one job at a time.

First thing to note about interval partitioning problem is that we have to minimize something, in this case, number of classrooms. What template this problem fits into? Greedy may be? Yes, let’s try our greedy strategy, but to apply it, we need some parameter on which particular lecture will be selected for a classroom. There are some parameters for each lecture like start time, end time, shortest lecture, or lecture with minimum number of conflicts with others. Problem statement is bit similar to event scheduling problem but here optimization is done on resource where processing can be done simultaneously on different resources, where as in event scheduling, one one event can be processed at a time as we had only one resource.

By hit and trial method we can come to conclusion the best strategy to follow is to pick lecture based on earliest start time. So, here goes the algorithm : At any given time, pick up a lecture with least start time and yet not scheduled and then assign it to first available class. Will it work? Sure it does.

## Interval partitioning problem: greedy algorithm

1. Sort all lectures based on start time in ascending order.
2. Number of initial classrooms = 0
3. While lecture to be scheduled:
3.1 Take first lecture yet not scheduled,
3.2 If there a already a class available for lecture's start time
Assign lecture to the class.
3.3 If not, then allocate a new classroom
number of classroom = number of classroom + 1
4. Return number of classrooms.

Before jumping into the code, let’s discuss some data structures which we can use to implement this algorithm.

Understand that we have to find a compatible classroom for a lecture. There are many classrooms, we need to check if the finish time of lecture in that classroom is less than start time of new lecture. If yes , then classroom is compatible, if there is no such class, allocate a new class. If we store our allocated classrooms in such a way that it always gives classroom with least finish time of last lecture scheduled there, we can safely say that if this classroom is not compatible, none of the others will be.(Why?) Every time we assign a lecture to a classroom, sort the list of classroom, so that first classroom is with least finish time.

Think a bit hard, what I explained above is typical use case of min heap. Every time finish time of last lecture changes for a classroom, heap is readjusted and root gives us classroom with min finish time. Complexity (n log n). In other words, we can say we will be using priority queue keyed on finish time of last lecture.

• To determine whether lecture j is compatible with some classroom, compare sj to key of min classroom k in priority queue.
• When a lecture is added to a classroom,  increase key of classroom k to fj.

Well know we have algorithm and data structure to implement in, so let’s code it.

PrioritityQueue implementation is given below:

Classroom class is implemented as follows.

This algorithm takes overall time of O(n log n) dominated by the sorting of jobs on start time. Total number of priority queue operations is O(n) as we have only n lectures to schedule and for each lecture we have push and pop operation.

Reference :

# Coin change problem

Today, we will discuss a very common problem which can be solved using greedy algorithm. If you are not very familiar with greedy algorithm, here is the gist : At every step of the algorithm, you take the best available option and hope that everything turns optimal at the end which usually does. The problem at hand is coin change problem, which goes like : Given coins of denominations 1,5,10,25,100; find out a way to give a customer an amount with fewest number of coins.   For example, if I ask you to return me change for 30, there are more than two ways to do so like

Amount :  30
Solutions : 3 X 10  ( 3 coins )
6 X 5   ( 6 coins )
1 X 25 + 5 X 1 ( 6 coins )
1 X 25 + 1 X 5 ( 2 coins )

The last solution is the optimal one as it gives us change only with 2 coins, where as all other solutions provide it in more than two coins. Hope now that problem is clear, let jump on to solution.

Solution for coin change problem using greedy algorithm is very intuitive and called as cashier’s algorithm. Basic principle is : At every iteration for search of a coin, take the largest coin which can fit into remain amount to be changed at that particular time. At the end you will have optimal solution. 🙂 That’s greedy algorithm for you.

## Coin change problem : Greedy solution

1. Sort n denomination coins in increasing order of value.
2. Initialize set of coins as empty. S = {}
3. While (amount) is not zero:
3.1 Ck is largest coin such that x > Ck,
if there is no such coin return "no viable solution"
else:
amount = amount - Ck
S =  S  U { Ck }

## Coin change problem implementation

What will complexity of the code? First we are sorting an array of coins of size n, hence complexity with O(n log n), and then while loop, worst case of it is O(amount), if all we have is coin with 1-denomination. Overall complexity for coin change problem becomes  : O(n log n) + O(amount).

Will this algorithm work for all sort of denominations? Answer is no. It will not give any solution if there is no coin with denomination 1. So be careful while applying this algorithm.

Please share if you have any suggestion or if you want me to write on specific topic. If you liked the post, share it!

Some other greedy algorithm problems:
Interval partitioning
Minimize maximum lateness

# Stable marriage problem

Stable marriage problem is an interesting problem and there are so many variants which are usually asked in interviews. The aim is not to check if you know the exact solution and code it, but to check if you can think of something and prove that what you have suggested will always give solution.

Given a set of n men and a set of n women, a pair of (man, woman) is called as a match. Problem is to  find a “stable” matching with below information given

• Participants rank members of opposite sex.
• Each man lists women in order of preference from best to worst.
• Each woman lists men in order of preference from best to worst

Example data could be like:

A matching is called as perfect matching when number of pairs = number of men = number of women.

What is a stable match in the problem? To understand stable match, let’s see what is unstable match. A pair is said to be unstable condition below is true (given m is man and w is woman in some other pairs):

m prefers w to his current partner and w prefers m to her current partner

If there is no unstable pair in a perfect matching then matching is called as stable matching.

Let’s see an example. We will take above example data for preferences. If make pairs like (Alice => Amy), (Brajesh => Barkha) and (Prasad => Cally). It’s a perfect matching as number of pairs (3) =  number of men (3) = number of women (3). But is it stable matching?

To check if it is stable matching, check if all pairs stable? In this case answer is No. Why? Because Cally prefers Brajesh more than Prasad (her current partner) and Brajesh also prefers Cally over Barkha (his current partner). So, Brajesh and Cally are part in unstable pairs.

Let’s make pairs like (Alice, Amy), (Brajesh, Cally) and (Prasad, Barkha). There is no unstable pair there is no man or woman meeting conditions mentioned above.

How to solve for stable pairs or all matches which are stable? There is very simple but perfect algorithm to solve this problem known as Gale-Shapley algorithm.

## Stable marriage problem :Gale-Shapley algorithm

1. Assign each man as free
2. While some man (m) is free
2.1 Let w be the first woman on list whom m has not proposed.
2.2 if w is free
Assign w to m, they are now a pair
2.3 else if w is engaged, and w prefers m over her current partner,
Assign w to m (m,w) are pair now, w's current partner is free
2.4 else
w rejects m and m remains free
3. Return all the pairs, they will be definitely stable

To understand the proof of termination and correctness of this algorithm, please refer http://www.cs.cmu.edu/afs/cs.cmu.edu/academic/class/15251-f10/Site/Materials/Lectures/Lecture21/lecture21.pdf

## Stable marriage problem implementation

Implementation wise, we need two two-dimensional arrays, to represent preference of each man and woman. For each woman, we need to find one thing which is if she prefers new man over her current partner. So, we need to store current pairs, indexed on woman. Also, to find if she prefers new man over current pair, we may need to traverse through all her preferences. To avoid that, there is one concept called as inverse. In this we don’t store man at given precedence, but store precedence for a given man.

inverse(w):
for each index i in preference list
inverse[preference[i]] = i;
return inverse

Once inverse array is in place, to find if woman prefers other man over current partner, just check if inverse[current partner] > inverse[new man].

Now, that we have algorithm and data structures ready, let’s code it.

There are several other variants of this problem, we will discuss them in future posts. Stay tuned, please share your thoughts if something can be improved or if something is wrong.

# Celebrity Identification Problem

Consider a case where you enter into a party, there are all unknown people. All you need to know is if there is some celebrity in the party as you are a newspaper reporter for Page 3. There are few things which can help you : a) If there is a celebrity in the party, all the people know him/her. b) Celebrity does not know anybody in the party. This problem is known as celebrity identification problem.

To make things more simpler, people at party always tell the truth if they know or don’t know other person when asked about it. Now, your task is to find celebrity in party by asking these people only one question : Do you know person X? Person replies in either ‘Yes’ or ‘No’. You can ask this question as many times you want. Only thing is you want to minimize the number of questions for your own sake. 🙂

Now, that problem is clear, let’s think of a solution. Typically, when there is some connection involved between entities, graph is the first thing to think about. Entities usually become nodes of graph and there will be an edge between two entities if there is some connection between them. In this particular problem, each person in the party becomes a node. What should be the edge? If a person knows other person, then there is an edge between two persons.  Well, should it be directed or undirected edge? I leave it to you think over. Cool, we got our graph ready!

How to get celebrity out of this graph? Hint 1 : Every body in the party knows celebrity. So there must be N-1 edges (incoming) for N-1 persons in party. Is it necessary but is it sufficient condition for determining celebrity?  No. Because, there is another condition which says celebrity does not know anybody. In this case there should not be any outgoing edge from the celebrity node. (Hint for whether edge should be direct or undirected). Typical graph for a party with celebrity will be like :

So look for such a node in graph and then you are done!! If there is no such node, there is no celebrity in party and you are wasting your time.

How number of questions did you ask to solve this? To create graph, you need to ask each and every person in party about each and every person present in party. If there are N persons, you would ask N * (N-1) questions to create graph, which will make complexity of your solution as O(n2). Can you please do better?

## Celebrity identification problem by elimination method

There are two things to note about any candidate celebrity in party. If there are two person u and v in party and you ask u if he knows v. There are two outcomes : either u know v in that case u is definitely not celebrity as celebrity does not know anybody. Second, if u does not know v,  then v is not celebrity as everybody in party knows celebrity. Any which way, you are eliminating one person of two, other goes back to list of candidates for celebrity. To start with all present at party are candidates.

At the end you will be left with one candidate. Will that be surely a celebrity? Answer is no. Because, we still need to check if he does not anybody else. Consider a case, that we never asked our last candidate the question and he survive till end, however, he knows somebody at the party. To avoid this case, let ask this last candidate the same question for each person. If he does know somebody, he is not celebrity or else he is.

How many questions did you ask? With each question, you just eliminated one persons, saving potential (n-1) questions. Total number of questions asked during elimination phase is n-1. We ask n-1 questions again to last candidate. Therefore, total number of questions asked are  2 (n-1) in worst case.

## Celebrity identification problem algorithm

Elimination algorithm
1. Create a list of all persons at the party.
2. While there are at least two persons in list:
2.a Take out two persons from list, call them u and v
2.b Remove u from list and insert v in list if u knows v
2.c Else remove v from list and put u in list again.
3. Return list.

Verification algorithm
1. For each person u at party, check if candidate c know u.
1.a. If yes, then return false
2. Return true

## Celebrity identification problem implementation

Reference : Princeton Course problem

# First non repeating character in string

Given a string, find first non repeating character in a string. For example, string  = “abcbdbdebab”, the first non repeating character would be c. Even though ‘e’ is also non repeating in string, ‘c’ is output as it is first non repeating character.

The first point which should be taken into account here is that there needs to be some kind of counting. Counting of characters. For a non repeating character, this count should be exactly 1. For keeping count against a character, some kind of mapping is required. This mapping will come from a hash. So, let’s have a hash with character as keys and count as value. But wait, for ASCII characters, do we need an hash? Well no. We can do it with an array with characters used as indices. This, however, puts a restriction that input can have only ASCII characters, for UNICODE, this solution won’t work and we map have to go for actual hash.

Now, that we have counted number of occurrence of each character in string, how to use it to find first non repeating character? Let’s go back to our string, for each character, check what is count for that character. If the count is 1, return that character. Simple as that!!

## Code to find first non repeating character in a string

Complexity of algorithm is O(n) with O(1) space complexity. There is always some confusion here about space complexity, we think as 256 characters are used, should it not be counted as space complexity. Definitely. But in asymptotic notations, this space is independent of size of input, so space complexity remains O(1).

One more thing, even though time complexity is O(n), input string is scanned twice, first time to get count of characters and second time to find first non repeating character in that string.

And little better approach

In case, string is too long, i mean millions of characters, most of them repeated, very few non repeated, this solution may turn slow in the last step. In other words, we need to find some mechanism so that we don’t need to scan input string again.  In this case, some information needs to be stored with count, so that we can figure out if the character is first non repeating or not.

To achieve above mentioned solution, the extra information to be stored is index of first occurrence of each character. Now, once we have already scanned through string, scan through the array. Check for characters with count as 1. If character has count as 1, see if index of it’s occurrence in string is less than index of previous character with count as 1. If yes, change character to be returned to new character. Once, we are done with scanning count array, we will have first non repeating character.

## Find first non repeating character in string : Algorithm

1. Start with 0 index of string till end
2. Count occurrence of each character and store them in array.
With count,if the char is seen first time, store it's index too.
3. Now, scan the count array again.
4. If a character has count as 1, check if index of this character.
5. If index is less than previous or previously char does not exist, change non repeating character to this character.
6. return non repeating character.

## Find first non repeating character in string : Implementation

Complexity of this algorithm remains same, however, second scan of string is now not required. In other words, second scan is now independent of size of input.

Python code
Learning it so bear with some non optimized code or silly things 🙂

Implementation of second method in python

Hope this explains the doubts around this problem. Please share your views in comments if you have some different method or idea. If you liked the post, please share it and spread the knowledge.

# 3-way quicksort

In post here : Dutch National Flag Problem, we discussed how can we sort and array in linear time by partitioning it in four parts, red, white, unknown and blue. Can we apply the same technique on quick sort. As we all know that quick sort relies on partitioning of input. Usually, input is partitioned in  two parts. What if we divide input space in three parts? Then it becomes 3-way quicksort.

The 3-way partition variation of quick sort has slightly higher overhead compared to the standard 2-way partition version. Both have the same best, typical, and worst case time bounds, but this version is highly adaptive in the very common case of sorting with few unique keys.

## Quicksort basics and limitation

Before going ahead with 3-way partition method, I would strongly recommend that you go through usual quick sort algorithm : Quick sort algorithm in C

A big limitation of quick sort is that it has O(n2) worst case running time. Some improvements have been suggested as given below:

• Cutoff to insertion sort. Switch to insertion sort for arrays with size less than a pre-decided limit. We follow quick sort and partition array. Once, the size of partitioned array goes lower than limit, apply insertion sort on that array. Limit varies from system to system and typically it is between 5 to 15.
• Median-of-three partitioning. Use median of a small sample of items taken from the array as the partitioning item. Doing so will give a slightly better partition, but at the cost of computing the median.

There is another optimization which is called as Entropy-optimal sorting or 3-way partition quick sort. This method is useful when inout array contains lot of duplicate values which is very frequent in real world. Idea is to divide array in three parts rather than two. Let’s say P be pivot. First part contains all numbers which are less than p, second part contains number equal to p and last part contains numbers which are greater than p.

## 3-way quicksort algorithm

3-way quicksort is optimization which works in specific cases like when input to be sorted contains few unique keys, in this case having traditional approach of one pivot does not perform well compared to 3-way quicksort.
Some of the properties of 3-way quicksort are:
It is not stable, when stability is the requirement, avoid using quicksort in general.
It uses O(lg(n)) extra space, why? Because of the recursion.
Worst case run time is as same as classic quicksort, O(n2), but typically O(nlog(n)) time
Best part is it takes O(n) time when O(1) unique keys.

This algorithm partitions array into three parts:
1)one with is less than pivot
2)equal to pivot
3)one with elements greater than pivot

## 3-way quicksort Implementation

#include <stdio.h>

int min(int a, int b){
if(a > b)
return b;
return a;
}

void swap(int a[], int i, int j){
int temp = a[i];
a[i] = a[j];
a[j] = temp;
}

void quicksort(int a[], int low, int high){
int pivot = a[high];

int lt = low;
int gt = high;

lt++;
}
gt--;
}
else{
}
}

int minimum = min( gt-lt, high-gt+1);
for(int i=0; i<minimum; i++){
swap(a, lt+i, high-minimum+1+i);
}

if( low < high){
quicksort(a, low, lt-1);
quicksort(a, high-gt+lt, high);
}
return ;
}

int main(void) {
int a[] = {4,3,3,2,7,9,2,3,5,6,7,4};
int size = sizeof(a)/sizeof(a[0]);

quicksort(a, 0, size-1);

printf("\n");
for(int i=0; i<size; i++){
printf("%d ", a[i]);
}

return 0;
}


Complexity of 3-way quicksort is O(n2).

Please share if something is wrong or any other suggestion or query. We would love to hear what you have to say.

# Dutch National Flag Problem

Given an array with 0s,1s and 2s, sort array such that 0s,1s and 2s are together. Other way to pose this problem is to say : There are three colors Red, Blue and White. Arrange these colors in such that Red comes first, White next and Blue last. This is typically know as Dutch national flag problem. Example:

A = [0,1,0,1,1,2,0,2,0,1]
Output = [0,0,0,0,1,1,1,1,2,2]

A = [R,B,B,W,B,R,W,B,B]
Output = [R,R,W,W,B,B,B,B]

## Count to sort an array of 0s,1s and 2s

We have already seen similar problem before: Segregate 0s and 1s in an array. There also, we explored how to count elements and then re-write them again on to array. Let’s apply same method here too. Take an array of three integers, index store corresponding count for that number. E.g count[0] stores count of 0 and count[1] stores count of 1. Scan through array and count each element. Last, again re-write those numbers back on to array starting from index 0, according to their count. For example, if there are 4 zeros, then starting from 0 index, write 4 zeros, followed by 1s and then by 2.

Complexity of counting method is o(n), notice that we scan array twice.

## Dutch national flag problem algorithm

In this method, keep three pointers : reader, low and high. Reader and low start with one of array and high starts from opposite end of array. Follow below three steps and at end, array will be in required position.

1. If number pointed by reader is 0, swap with number at low index and move low one step ahead.
2. If number pointed by reader is 1, do nothing, increment reader.
3. If number pointed by reader is 2. swap it with high index and decrement high.

Actually, three pointers divide array into four parts.  Red, white, unknown and Blue. Every element is taken from unknown part and put into its right place. So all three other parts expand while unknown part shrinks.

## Dutch national flag problem implementation

Complexity of algorithm to solve Dutch National Flag problem is O(n). In this case, we scan the array only once.

Please share if you have some suggestions or comments. Sharing is caring.

# Inversions in array

Let A[0…n – 1] be an array of n distinct positive integers. If i < j and A[i] > A[j] then the pair (i, j) is called an inversion of A. Given n and an array A, find the number of inversions in array A. For example:

A = [2,4,1,3,5]
There are three inversions : (2,1) (4,1) and (4,3)

A = [2,5,1,7,9]
There are two inversions (2,1) and (5,1)

There are no inversions in completely sorted array and nC2 inversions in completely inverted array.

## Methods to find inversions in array

Brute force

Scan through array elements, and check all elements on right of it. If any element on right side is greater than the element being scanned, increase inversion count. Implementation is given below.

Complexity of this method to find inversion in array is O(n2).

Using merge and count

In this method, we will use merge step of merge sort algorithm with a bit modification. Algorithm will be as follows

1. Split array into two parts and merge sort two. Split will happen as it is done in merge sort. Base condition is also same.
2. While merging count number of inversions.

Once, we know inversions in n/2 sized array, then those can be combined to get inversions for n sized array. Below figure explains this.

What kind of inversions are not counted here? Yes, the inversions which are occurring when merge is being done. Every time A[i] is appended to the output of merge sort, no new inversions are encountered, since A[i] is smaller than everything left in array B.  If B[j] is appended to the output, then it is smaller than all the remaining items in A, we increase the number of count of inversions by the number of elements remaining in A.

Complete method is shown in figure.

## Algorithm to count inversions in array

MergeAndCount(int A, int B):
1. Let A,B two input sorted arrays
2. Output list
3. i,j current pointers to A and B array, start at 0
4. Count number of inversion, initially 0
5. While i < length(A) and j < length(b)
5.a if b[j] < a[i] : count += number of element remaining in A
5.b Increment j; j++
5.c else increment i; i++
6. Return count

Above method is replaced as last merge part of merge.

SplitAndCount(L):
1. if L has one element return 0
2. else
2.a divide L into A, B
2.b inversionLeft  = MergeSortAndCount(A)
2.c inversionRight = MergeSortAndCount(B)
3.d inversionMerge = MergeAndCount(A,B)
return inversionLeft + inversionRight + inversionMerge

## Inversions in array implementation

Complexity of finding inversion of arrays using merge sort method is       O(n log n).

References: Lecture 8

# Segregate 0s and 1s in an array

Given an array of 0s and 1s, segregate 0s and 1s in such as way that all 0s comes before 1s. This problem is very close to Dutch national flag problem. For example:

A = [1,0,0,0,1,1,0]
Output = [0,0,0,0,1,1,1]

A = [0,0,1,0,1,1]
Output = [0,0,0,1,1,1]

## Methods to segregate 0s and 1s in array

Counting 0s and 1s.
Count occurrence of 0s and 1s in array and then rewrite o and 1 onto original array those many times.

Complexity of this method is O(n) with no added space complexity.

Using two indices.

Take two indices, one starting from index 0 (left) and other from n-1 (right) where n is number of elements in array. Move left forward till it encounters 1, similarly decrement right till 0 is encountered. If left is less than right, swap two and continue again.

## segregate 0s and 1s implementation

Complexity of this method to segregate 0s and 1s in an array is O(n).

## Segregating odd and even numbers in array

Method 2 can also be used to solve this problem. Algorithm is given below:

1. Set left = 0 and right = n-1
2. While left < right
2.a if a[left] is even left++
2.b if a[right] is odd right--;
2.c if left < right, swap a[left] and a[right]

Implementation is very similar to earlier version of segregating 0s and 1s.

# Median of two sorted array

In Find median of sorted arrays, we discussed how to find median of two sorted arrays by calculating medians of two arrays individually and comparing them. In this post, let’s discuss another method to find median of two sorted arrays using merge method.

To start with, how to find median of a sorted array? It’s O(1) operation, we return either mid element or (mid element+ ( mid + 1 element )) /2, depending on if array has odd number of elements or even number of elements. For example for array A = [1,5,9,12,15], median is 9.

To find median of two sorted arrays in no more simple and O(1) operation. For example, A = [ 1,5,9,12,15] and B = [ 3,5,7,10,17], median is 8. First, we merge two sorted arrays into one sorted array, in this example, it will be C = [1,3,5,5,7,9,10,12,15,17]. Now, problem reduces to find median in a sorted array. In current example, since number of elements is even, we take average of two middle elements. Although to find median in a sorted array is O(1), merge step takes O(N) operations. Hence, overall complexity would be O(N).

## Median of two sorted arrays : Merge

Merge method to find median of two sorted array as explained is quite simple. Merge part is as same as of merge sort algorithm. We start from beginning of two arrays and advance the pointer of array whose current element is smaller than other. This smaller element is put on to output array which is sorted merge array. Merge will use an additional space to store N elements (Note that N is here sum of size of both sorted arrays). Best part of this method is that it does not consider if size of two arrays is same or different. It works for all size of arrays.

This can be optimized, by counting number of elements, N, in two arrays in advance. Then we need to merge only N/2+1 elements if N is even and N/2 if N is odd. This saves us O(N/2) space.

There is another optimization, do not store all  N/2 or N/2+1 elements while merging, keep track of last two elements in sorted array, and count how many elements are sorted. Average of last two, when N/2+1 elements are sorted is our median if N is even, else N/2 element is median . With this optimization, time complexity remains O(N), however, space complexity reduces to O(1).

## Median of two sorted arrays implementation

Complexity to find median of two sorted arrays using merge operation is O(n). Also, O(n) extra space. There is another method, using binary search which has complexity of O(log n), let’s discuss it in next post.

Optimized version to find median of two sorted arrays