Find Minimum Cost Path in a 2D Matrix: DP

Problem statement is : Given a 2D matrix, Cost[][], where Cost[i][j] represent the cost of visit cell Cost[i][j], find minimum cost to reach a given cell Cost[n][m], where a cell can be reach from it’s left (by moving one step right) or from top (by moving one step down).

Finding Minimum Cost Path in a 2D Matrix

This problem is very similar to what we saw in Finding possible paths in grid. As mentioned there, grid problem reduces to smaller subproblems one we make a choice at a cell. In this problem, to reach a cell (i,j), we can either come from cell ( i-1, j ) from the top, or from cell (i, j-1) from left, whichever is minimum.

CostToMove(i,j) = Min(CostToMove(i-1,j), CostToMove (i, j-1)) 
                 + Cost(i,j)

This equation gives hint at recursion. What should be terminating condition for recursion. It’s obvious, when we reach desired cell.

findCost(i,j, cost) = cost(i,j) + Min( findCost(i-1,j, cost),
                                       findCost(i,j-1, cost))

Recursive implementation of recursion equation to find minimum cost to reach a given cell in matrix.

#include <stdio.h>
#include <stdlib.h>

int min( int a, int b){
    if(a > b) return b;
    return a;
}

int findMinimumCostPath(int Cost[3][3], int i, int j){
	
    if( i == 0 && j == 0 )
        return Cost[0][0];
	
    if( i == 0 )
	return findMinimumCostPath(Cost,i, j-1)
               + Cost[i][j];
    if( j == 0) 
    	return findMinimumCostPath(Cost,i-1, j)
               + Cost[i][j];
    	
    return min(findMinimumCostPath(Cost,i-1, j), 
    		   findMinimumCostPath(Cost,i, j-1)
                   + Cost[i][j]);
}
int main()
{
    int M,N; 
    
    M = N = 3; 
    int Cost[3][3] = {
        1,3,4,
    	5,3,2,
        3,4,5
    };
    printf("Minimum cost of path : %d" ,
         findMinimumCostPath(Cost, M-1, N-1));
    
}

Complexity of recursive method is exponential.

Notice that this problem satisfies two basic conditions to apply dynamic programming.

Optimal Sub-structure:- Optimal solution to bigger problem involves optimal solutions to subproblems.

Overlapping Subproblems:- Subproblems once computed can be stored in a table for further use. This saves the time needed to compute the same sub-problems again and again.

To find more about these properties, please refer : Dynamic programming basics.

Solution to this problem depends on solution to subproblems and we can see that there are many subproblems which are calculated again and again. So, this problem is ideal candidate for applying dynamic programming.

Let’s create a two dimensional matrix and fill it up bottom up. Let’s consider the top most row in matrix. Any cell in top most row can be reach only from left.

MinCost(0,j) = MinCost(0,j-1) + Cost[0][j]

Similarly, cell in leftmost column can only be reached from top.

MinCost(i,0) = MinCost(i-1,0) + Cost[i][0]

For all other cells,

MinCost(i,j) = Min( MinCost(i-1),j), MinCost(i, j-1)) + Cost[i][j]

Implementation of above equation is below:

#include <stdio.h>
#include <stdlib.h>

int min( int a, int b){
    if(a > b) return b;
    return a;
}

int findMinimumCostPath(int Cost[3][3], int M, int N){
	
    int MinCost[M][N]; //declare the minCost matrix

    MinCost[0][0] = Cost[0][0];

    // initialize first row of MinCost matrix
    for (int i=1; i<N; i++){
        MinCost[0][i] = MinCost[0][i-1] + Cost[0][i];
    }

    for (int i=1; i<M; i++){
        MinCost[i][0] = MinCost[i-1][0] + Cost[i][0];
    }
    
    for (int i=1;i<M; i++){
    	for (int j=1; j<N; j++){
           MinCost[i][j] = min(MinCost[i-1][j],
                           MinCost[i][j-1]) + Cost[i][j];
        }
    }

    return MinCost[M-1][N-1];
    
}
int main()
{
    int M,N; 
    M = N = 3; 
    int Cost[3][3] = {
    	1,3,4,
    	5,3,2,
    	3,4,5
    };
    printf("Minimum cost of path : %d" ,
            findMinimumCostPath(Cost, M, N));
    
}

Complexity of dynamic programming approach to find minimum cost path in grid is O(n2) with additional space complexity of O(n2).

There is extension to the problem which is to find actual path that lead to the destination. Solution is simple, start from the destination cell, as that will be part of path anyways, start moving either to cell to left or top whichever is minimum till you reach origin cell.

There is one more variant of this problem, when we can move from left to right, top to down and diagonally as well. Nothing changes in solution except that we need to take minimum of three cells (left, top and diagonal).

Please share if there is something wrong or missing. If you want to contribute, please write to us 

Boolean Parenthesization Problem

Counting Boolean Parenthesis

Let’s say there is a boolean expression, a string which contains True or False as operands and between each pair of operand we have boolean operator (and, or and xor).  An example of a boolean expression is T and F or T xor F. Ask of boolean parenthesization problem is to find number of ways in which this boolean expression can be parenthesized so that expression evaluates to True. For example :

Expression : T ^ F & T
Two ways : ((T ^ F) & T) and (T ^ (F & T))

T | T & F ^ T
Four ways : ((T|T)&(F^T)), (T|(T&(F^T))), (((T|T)&F)^T) 
and (T|((T&F)^T))

Let’s understand the concept first for solution and then we will see how can we implement it. If we start with an expression with only one boolean value T or F, how many way we can parenthesized it? Well, there are two answers to it. For T, there is one way, (T), whereas for F, there no way we can parenthesize to evaluates True. First take away for this insight is that we got the base case for recursion, if our solution is recursive. Second take away is that there can be two different outputs an expression or subexpression can evaluates and we have to store them both.

if T(i,j) is number of ways expression from i to j can be parenthesized so that it evaluates to True. Similarly, F(i,j) is number of ways expression evaluates to False. From above expression we know that :

T(i,i) = 1 if operand is T
         0 if operand is F

F(i,i) = 0 if operand is T
         1 if operand is F

How to calculate T(i,j)?  Let’s try to split this problem into smaller subproblems and reach at base case. If take an index k such that i<k<j, we have two smaller subproblems to solve now, T(i,k) and T(k+1,j). How can solution to these subproblems can be combined to get solution to bigger problem T(i,j). This is tricky part and depends on operator (and, or, xor) after k operand.

If there is “and” operand between two subexpressions (i,k) and (k+1,j) then how can expression (i,j) be True? Yes, only if expression (i,k) and expression (k+1,j) are True. For “and” operator, T(i,j) is:

T(i,j)  = Summation ( T(i,k) * T(k+1,j)) for all k such that i < k < j

How about expression (i,j) being evaluates to False? Simple enough, one of the two expression should evaluate to False.

F(i,j) = Sum ( Total (i,j) - T(i,k)* T(k+1)) for all k for i< k< j

What is Total(i,j) here?

Total(i,j) =   Total(i,k) *  Total( k+1, j) )
Total(i,k) =   T(i,k) + F(i,k)
Total(k+1,j) = T(k+1,j) + F(k+1,j)

In short, Total(i,j) = T(i,j) + F(i,j)

When operand between (i,k) and (k+1, j) is or:

T(i,j) = sum ( Total(i,j) - F(i,k)* F(k+1,j)) for k such i<k

In the same vein, T(i,j) and F(i,j) when operand is xor will be

T(i,j) = sum(T(i,k)*F(k+1,j) + F(i,k)* T(k+1,j)) for k such i<k

To find solution to problem all we need to find is T(1,n). Now, let’s see how to implement the approach.

Dynamic programming implementation for boolean parenthesization problem

#include<stdio.h>
#include<string.h>
 
int countNumberOfWaysToParenthesize(char operands[], char operators[], int n)
{
    int F[n][n], T[n][n];
 
    for (int i=0; i<n; i++){
        F[i][i] = (operands[i] == 'F')? 1: 0;
        T[i][i] = (operands[i] == 'T')? 1: 0;
    }
 
    for (int L=1; L<n; L++) {
    	for (int i=0, j=L; j<n; ++i, ++j){
            T[i][j] = F[i][j] = 0;
            for (int count=0; count<L; count++){
                int k = i + count;
                int totalIK = T[i][k] + F[i][k];
                int totalKJ = T[k+1][j] + F[k+1][j];
                if (operators[k] == '&') {
                    T[i][j] += T[i][k]*T[k+1][j];
                    F[i][j] += (totalIK *totalKJ - T[i][k]*T[k+1][j]);
                }
                if (operators[k] == '|'){
                    F[i][j] += F[i][k]*F[k+1][j];
                    T[i][j] += (totalIK*totalKJ - F[i][k]*F[k+1][j]);
                }
                if (operators[k] == '^'){
                    T[i][j] += F[i][k]*T[k+1][j] + T[i][k]*F[k+1][j];
                    F[i][j] += T[i][k]*T[k+1][j] + F[i][k]*F[k+1][j];
                }
            }
        }
    }
    return T[0][n-1];
}
 
// Driver program to test above function
int main()
{
    char operands[] = "TTFT";
    char operators[] = "|&^";
    int n = strlen(operands);
 
    printf ("\n Number of ways to parenthisize expression : %d", 
                 countNumberOfWaysToParenthesize(operands, operators, n));
    return 0;
}

We will have two strings, one string operands[] represents all operands (T or F) and other string operations[] representing all operator (‘&’, ‘|’ , ‘^’).  Operands[i] is inserted at operands[i+1] to generates the expression. Below is the code which uses this definition and provides implementation of method discussed earlier.

Complexity of this approach to find ways to parenthesize a boolean expression so that it evaluates to True is O(n3). There is also O(n2) space complexity.

Please share if there is something missing or wrong. If you want to contribute to algorithms and me, please contact us.

Dynamic Programming : Matrix chain multiplication

Matrix chain multiplication using dynamic programming

What is matrix multiplication? To read on that please refer to Wiki.
However, today’s problem is not about actually multiplying matrices, but to find out an order in which we multiply matrices so that number of multiplication done are minimum.

There are two basic properties of a matrix : number of rows and number of columns in matrix. Row and columns are called as dimensions of matrix.
Given N matrices of (MxN dimensions). Find an order in which these matrices are multiplied, so that total number of scalar multiplications are least. and dimensions of each matrix given in an array P where P[i-1] and P[i] denote rows and column respectively of ith matrix.

Matrix chain multiplication is a typical problem which is used to explain dynamic programming. Approach learnt here can be easily applied to many other problems.

Before going further, lets understand some points about matrix multiplication

1. Matrix multiplication is associative i.e.  A* (B*C) = (A*B) *C
2. It is not commutative i.e  A * (B*C) not equal to A * (C * B)
3. To multiply two matrices, they should be compatible i.e. no of columns in first matrix should be equal to number of rows of second matrix.
No of columns of first matrix = No. of rows of second matrix

Let’s get back to problem at hand.  Look at this problem as this : Figure out a way to put parenthesis around matrices so that total number of multiplication are least.

Brute force way is to find out every possible combination of parenthesis and check which one is minimum.
This is approach can be implemented in recursive way, once we put a parenthesis, problem is reduced putting parenthesis around N-1 matrices to be multiplied. However, code will be exponential in time and hence of no use for every large input.

Let’s see if dynamic programming fits in. We can see that cost of multiplying matrices Ai to Aj  is cost of

Cost (Ai, Aj) = Cost(Ai,Ak) + Cost(Ak+1,Aj )+(P[i-1] * P[k] * P[j])

Idea is to find out K such that cost(Ai, Aj) becomes minimum. If M[i,j] represents the cost to multiply matrix i to matrix j, then,

M[i,j]  = M[i,k] + M[K+1,j] + ((P[i-1] * P[k] * P[j])

When calculating M[i,j]; M[i,k] and M[k+1,j] should be already available. Also, M[i,i] = 0 as cost of multiplying a single matrix will be 0.

Looking at the requirement to calculate M[i,j], it depends on length of chain. Length of chain from matrix i to matrix j would be i+j-1. We start with length L = 2 and go on to solve the table entry for length N. M[1, N] will give us the final cost.

We will start with L =2, chain we are considering is of length of 2 as j =  i + L -1  = 1 +2 -1 =2.  We will start with i =1 and loop runs to i = N-L +1 and considering two matrices at time.

Length being considered will vary for L =2 to L = N. Based on L, j will also change and we will be finding minimum cost(i,j) for each i and j. To find minimum cost(i,j), we need to find a K such that expression

Cost (Ai, Aj) = Cost(Ai,Ak) + Cost(Ak+1,Aj )+(P[i-1] * P[k] * P[j])

becomes minimum. This is top down filling of table where M [1, 2] will be filled first and then going up to fill till M [1,N].

Implementation for multiplication of matrix problem

#include<stdlib.h>
#include<stdio.h>

#define MAX_INT 10000000

int matrixMultiplication(int p[], int N){

  int L,i, j, temp;

  int M[N][N];

  for(i=0; i<N; i++){
  	for(j=0; j<N; j++){
       M[i][j] = 0;
  	}
  }       

  for(L=2; L<N; L++){
    /* For every position i, we check every chain of len L */
    for(i=1; i<N-L+1; i++){
    	
        j = i+L-1;
        M[i][j] = MAX_INT;
        /* For matrix i to j, check every split K */
            for(int k=i; k<j; k++){
                temp = M[i][k] + M[k+1][j] + p[i-1] * p[k] * p[j];
               /* Check if the current count is less than minimum */
                if(temp < M[i][j]){
                    M[i][j] = temp;                 
                }
            }
        }
    }
    for(i=1; i<N; i++){
    	for(int k=1; k<N; k++){
    		printf("%d ", M[i][k]);
    	}
    	printf("\n");
    }
    
 return M[1][N-1];
}
/* Driver program to run above code */
int main(){

    int p [] ={10, 20, 30, 40, 30};
    int n = sizeof(p)/sizeof(p[0]);
    
    printf("%d\n", matrixMultiplication(p,n));
    
    return 0;
}

Let’s run through an example and understand how does this code work?

P = {10, 20,30,40,30}, 
dimensions of matrix [1] = 10X20, 
dimensions of matrix [2] = 20X30, 
dimensions of matrix [3] = 30X40,
dimensions of matrix [4] = 40X30

Function will be called with array P and size which is 5.
Table of N size is declared. M[6][6] because we will need to store M[1][4].

Diagonal of the table is set to zero, as cost of multiplication of single matrix is 0.

M[0][0] = M[1][1] =M[2][2]=M[3][3]=M[4][4]= M[5][5]=M[6][6] =0

Start with for loop with L =2. Inner for loop will run from i =1  to N-L+1 = 5-2+1 =4, j will be i +L -1. which is 2.
Inner most loop runs with K=1 (i) till k = 2 (j). M[1,2] is INT_MAX.

temp = M[1,1] + M[2,2] + P[0] *P[1]*P[2] = 6000.

Since it is less than INT_MAX M[1,2] = 6000.
Similarly, for i =2, j = 2 + 2 -1 =3. Above process is followed and M[2,3] = 24000 and so on.

M[3,4] = P[2] * P[3] * P[4] = 36000.

Coming to L =3. i =1, j = 4-1 =3. K =1 to K <3.
For K =1,

temp =  M[1,1] + M[2,3] + P[0] * [1] * P[3] = 24000 +8000 = 32000

For K=2,

temp = M[1,2] + M[3,3] + P[0]*[2]*P[3] = 6000 + 12000 = 18000

Hence M[1,3] = min(32000, and 18000) = 18000 with i = 1.

With i =2. j =2+3-1 =4. K =2 to K <4
For K =2,

temp = M[2,2] + M[3,4] + P[1] * P[2]*P[4] =  36000 + 18000 = 54000.

For K =3,

temp = M[2,3] + M[4,4] + P[1] * P[3]*P[4] =  32000 + 24000 =  66000.

Hence M[1,3] remains 18000.
Same process is followed and the entries are filled. Finally, matrix will look like this :

0 6000 18000 30000 
0 0 24000 48000 
0 0 0 36000 
0 0 0 0

M[1,N-1] will be solution to our problem.

Time complexity of matrix chain multiplication using dynamic programming is O(n2). Also space complexity is O(n2).
Reference
http://www.personal.kent.edu/~rmuhamma/Algorithms/MyAlgorithms/Dynamic/chainMatrixMult.htm

If you want to contribute to website, please contact us. Please share if you there is something wrong or missing.

Longest common substring

Given two string A and B, find longest common substring in them. For example, A = “DataStructureandAlgorithms” and B=“Algorithmsandme”, then longest common substring in A and B is “Algorithms”

Brute force solution is to find all substrings of one string and check any of these substring are substring of second string, while comparing, keep track of the longest one we found. There can be n2substring for a string with length n and to find if a string is substring of another, it takes another m operations, where m is length of second string. Hence, overall complexity of this method is O(n2m).

Can we do better than that?

Longest common substring using dynamic programming

This solution is very similar to Longest common subsequence. Difference between two problems is that a subsequence is collection of characters, which may or may not be contiguous in string, where for a substring, characters must be contiguous. Based on this difference, out solution will vary a bit.

Let’s create a two dimensional array called LCS with dimensions as n and m. LCS[i][j] represents the length of longest common substring in A[0..i] and B[0..j].

As in case of longest common subsequence, we will start with smaller case first. What if one of the string is empty?  If one of the string is empty then, LCS = 0. Hence, LCS[i][0] = 0 and LCS[0][j] = 0.

How to fill LCS[i][j]?

1. Check if A[i] is equal to B[j] 
   1.1 If yes, LCS[i][j] = 1 + LCS[i-1][j-1]
( Because new character is added to already common substring, 
     if any, till A[0...i-1] and B[0,,j-1])
   1.2 if both characters are not same, LCS[i][j] = 0,
       ( Because if characters are not same, there cannot be any
         common substring including A[i] and B[j]. 
2. Traverse the matrix and find the maximum element in it, that will be the length of Longest Common Substring.
(This step can be optimized by keeping track of max length while calculating LCS[i][j]).

Implementation

#include <stdio.h>
#include <string.h>
 
int max(int a, int b){
	return a>b ? a:b;
}
 int longestCommonSubstring(char * A, char * B){
 	int lenA = strlen(A);
 	int lenB = strlen(B);
 
	int LCS[lenA+1][lenB+1];
 
	for (int i=0; i <= lenA; i++){
		LCS[i][0] = 0;
	}
 
	for (int j=0; j <= lenB; j++){
		LCS[0][j] = 0;
	}
 
	int maxLength = 0;
 
	for (int i=1; i<= lenA; i++){
		for (int j=1; j <= lenB; j++){
			if (A[i] == B[j]){
				LCS[i][j] = 1 + LCS[i-1][j-1];		
				maxLength = max( maxLength, LCS[i][j] );
			} 
			else {
				LCS[i][j] = 0;
			}
	    }
	}
	return maxLength;
}
 
int main(void) {
	char *a = "ABCDEFGSE";
	char *b = "EBCDEFGV";
 
	printf("\n Longest common substring : %d",
			longestCommonSubstring(a,b));
 
	return 0;
}
 

Time complexity of dynamic programming approach to find length of longest common substring in two string is O(n*m) and space complexity is O(n*m) where n and m are lengths of two given strings.

In next post, we will discuss suffix tree method to find LCS which is more optimized than DP solution and can be easily be generalized for multiple strings.

Please share if you find something wrong or missing. If you want to contribute to site, please refer contact us. We would be happy to publish your work and in turn will pay you too.