Check if string is interleaved of two strings

Given three strings A,B and C, find if C string is interleaved of two strings. A and B. C is said to be interleaved if it contains all the characters of A and B and order of characters in respective strings is maintained. For example string C in figure is interleaved string of A and B.

check string is interleaved of two strings

Find if string is interleaved of two strings

Simple algorithm to find if a string is interleaved of two other string is to start checking length of C as length of A + length of B. If it is not true return false (why?).

If length criteria passes, check first character of C and A. If they match, move to second character of C and A. Keep B at first character. If character in A does not match with character in C, check if first character of C and B match, if they do, then move to second character of C and B. A remains at first character.

Once done, repeat above steps with next characters of strings, while character in C matches character in A or B. If both current character in C does not match with any of the current characters in A and B, return false.

With above algorithm, with each comparison, problem reduces to smaller size, with C with length N-1, one of A or B with M-1. From the description, it is clear that recursion can be used to solve interleaving problem.

isInterleaved(A,B,C)  = isInterleaved(A+1, B, C+1) 
                        || isInterleaved(A, B+1,C+1)

What shall be base case for above recursion?
If end of C is reached, all characters in C are checked, return true if all characters in other two strings are also considered. If not returned false.
Other way, if both string are A and B are fully considered and there are still characters left in C to be considered, then return false. (This case is anyways caught at length check).

Recursive Implementation of interleaved strings algorithm


#include <stdio.h>
#include <string.h>
#define true 1
#define false 0; 
 
int isInterleaved(char *c, char *a, char *b){
 
    if(!(*c) && !(*a) && !(*b))
        return true;
 
    if(*c == '\0'){
        return false;
    }
	 // if character of a and c match
    return ((*c == *a) && isInterleaved(c+1,a+1,b)) || 
    		// if character of b and c match
            ((*c == *b) && isInterleaved(c+1,a,b+1)); 
}
 
int main(void) {
	char *c = "abxycdz";
	char *a = "abcd";
	char *b = "xyz";
	if(isInterleaved(c,a,b)){
	    printf("\n String is interleaved");
	}
	else{
	    printf("\n String is not interleaved");
	}
	return 0;
}

Iterative implementation is as follows

#include <stdio.h>
#include <string.h>
#define true 1
#define false 0; 
 
int isInterleaved(char *c, char *a, char *b){
 
    while(*c != '\0'){
    	if(*c == *a){
            a++;
        }
        else if(*c == *b){
            b++;
        }
        else{
            return false;
        }
        c++;
    }
    if(*a != '\0' || b != '\0')
        return false;
 
    return true;
}
 
int main(void) {
	// your code goes here
	char *c = "abxycdz";
	char *a = "abcd";
	char *b = "xyz";
	if(isInterleaved(c,a,b)){
		printf("\n String is interleaved");
	}
	else{
		printf("\n String is not interleaved");
	}
	return 0;
}

Iterative implementation will not work with input with common characters in string A and B, for example A = XXY and B = XXZ and if C = XXZXXY will return false.

Complexity of iterative implementation will be linear O(N), N being length of string C, where as complexity of recursive solution will b O(2N) but it does not fail in when two strings have common character.

Dynamic programming approach to interleaved strings problem

Look closely and notice that many subproblems are being calculated again and again. Recursion tree for input A = XXY and B = XXZ and C = XXZXXY

interleaved strings with dynamic programming

Looking at the recursion tree, it is clear that if results of subproblems are stored, they need not be calculated again and again. Create a two dimensional table. Table(i,j) = true only if C[i+j-1] is interleaved string if A[i] and B[j]. Empty string is interleaved of two other strings so, Table[0][0] = true
If one of the strings was empty: Table(i,0) = A[i] == C[i] && Table(i-1, 0) that is to say if till i-1 characters C was interleaved of A, then for ith character it will be true if ith character matches ith character of A. Note that B is empty. Again if string A is empty, then Table(0,j) = Table(0, j-1). With these base cases, we can fill table bottom up as follows

Table(i,j) = Table(i-1,j)  if (A[i] == C[i+j]) && (B[j] != C[i+j])
Table(i,j) = Table(i,j-1) (B[i] == C[i+j])&& (A[i] != C[i+j])
Table(i,j) = Table(i-1,j) || Table(i, j-1) if (A[i] == C[i+j]) && (B[j] == C[i+j])

Dynamic programming solution to find if string is interleaved of two strings

#include <stdio.h>
#include <string.h>
#define true 1
#define false 0; 
int isInterleaved(char *c, char *a, char *b){
 
    int lenA = strlen(a);
    int lenB = strlen(b);
    int i,j;
 
    int Table[lenA+1][lenB+1];
    // Initialization
	for(i=0; i<=lenA; i++){
        for(j=0; j<=lenB; j++){
        	Table[i][j] = false;
        }
	}
    for(i=0; i<=lenA; i++){
        for(j=0; j<=lenB; j++){
        	// Both strings are empty
            if(i==0 && j==0)
                Table[i][j] = true;
    		// string A is empty, compare characters in C and B
            if(i==0 && c[j-1] == b[j-1]){
                Table[i][j] =  Table[i][j-1];
            }
            // string B is empty, compare characters in C and A
	        else if(j==0 && c[i-1] == a[i-1]){
                Table[i][j] =  Table[i-1][j];
            }
            // Both strings are not empty
            //1. If character of A matches with character of C
            // but not of B
            else if (a[i-1] == c[i+j-1] && b[j-1] != c[i+j-1]){
                Table[i][j] = Table[i-1][j];
            }
            //2. If character of B matches with character of C
            // but not of A
            else if (a[i-1] != c[i+j-1] && b[j-1] == c[i+j-1]){
                Table[i][j] = Table[i][j-1];
            }
            //1. If character of A matches with character of C
            // and character of B also matches with C
            else if (a[i-1] == c[i+j-1] && b[j-1] == c[i+j-1]){
                Table[i][j] = Table[i-1][j] || Table[i][j-1];
            }
        }
    }
    return Table[lenA][lenB];
}
 
int main(void) {
	// your code goes here
	char *c = "abxycdz";
	char *a = "abcd";
	char *b = "xyz";
	if(isInterleaved(c,a,b)){
		printf("\n String is interleaved");
	}
	else{
		printf("\n String is not interleaved");
	}
	return 0;
}

Complexity of dynamic programming algorithm to find if a string is interleaved of two strings will be O(n2).

For more dynamic programming problems, please refer : Dynamic programming problems

Please share if there is something wrong or missing. We would love to hear what you have to say. If you are interested in contribute to website, please contact us.

Strings : Ransom Note from Magazine

Ransom note from magazine

Kidnapper kidnaps you and writes a ransom note. He does not write it with hand, as handwriting can put him in, so reaches to a magazine at table and creates ransom note. We need to find out given ransom string and magazine string, is it possible to create given ransom note. Kidnapper can use individual characters of words.

Analysis

We need to check if magazine string contains all characters and in equal or greater number, present in ransom note.
How can we keep track which characters are present and how many instances of characters are present in magazine string? Simple standard technique. Create a hash table with 256 values, where key is character, and value as the number of instances it occurs.
Now go through each character in ransom note, and decrement corresponding value in hash table. If we try to decrease a value already zero,  we return false. If we reach at the end of ransom note, we return true.

Code ransom note problem

Complexity of code is O(M) where M is length of magazine string.

We can better this. In that method we don’t scan magazine string and ransom note separately but simultaneously. We scan character from ransom note, and check in hash table, if we find good. If not, we scan magazine string till we find the desired character. 
If we reach end of magazine string, return false.
If we reach end of ransom note, return true.
 

Variation

What if kidnapper cannot use individual characters, but he can use complete words?
 
Simple again, store all words in magazine in trie with leaf node containing the count which words occurs in it.
Every time we need a word for ransom note, we find word in trie and decrement the count if word is present. If we are trying to decrement a count which is already zero, return false.
 
I am leaving implementation for now. Basic of trie can be read here. Trie Basics

Break string into dictionary words

Break string in meaningful dictionary words

This problem is commonly asked in Google and Amazon interview.  We all know that if typed string in Google search box does not make sense, Google breaks that into meaningful words and asks us back if we meant those words instead of single word. This post discusses how Google breaks a long string into meaningful dictionary words and gives suggestions. Formal statement for this problem is :Given a string and dictionary of words, we need to break given string into words which are present in dictionary.

For example, if given string is “dogcatruck” and dictionary is {“dog”, “cat”, “truck”}, then result should contain all three words.

Analysis

Here, we are assuming that dictionary is implemented and has an exact look up operation, i.e. if we pass a word to dictionary look up operation, it will return true if word is present and false if not.

First let’s start with breaking string into two meaning full words. Break the string at every possible location i and then see if the string 0..i and i+1 to end are meaningful words present in dictionary? If yes, return those words. Simple code will be like:


Simple right? Let’s now concentrate on the generic solution which gets any number of words.

In above solution, we are printing prefix and suffix if both are present in dictionary, what if suffix contains two words which are individually present in dictionary but suffix as whole is not present. 
Break the suffix further and add it to the prefix.

Rings bell? Yes we need to recursively break the suffix in meaningful words and associate them with prefix in consideration.

Code to break string in dictionary words

Worst case complexity of the above algorithm is O(2^N). Consider a case when we have a dictionary having all words like ‘a’,’aa’, ‘aaa’, ‘aaaa’ …. and we have an input string as n-1 a’s followed by b.
There is dynamic solution available which runs in acceptable complexity but used extra space. 

Reference

http://thenoisychannel.com/2011/08/08/retiring-a-great-interview-problem/

Longest Palindromic Substring

Longest Palindromic Substring

Given a string S, find longest palindromic substring in S. For example, if S = ‘ABDCBABCBAA’, longest substring which is palindrome is “CBABC”.

Longest palindrome substring in a string can easily be found using brute force. Find all possible substrings for given string and check if that substring is palindrome or not. There are C(N,2) substring possible for a string of length N. Complexity of this method is O(N3).

With slight optimization, brute force solution can be O(N2).
Starting from each character in string and check on left and right side of the character till both character at left and right same. Once they differ, check if number of characters in substring centered character is greater than length of earlier palindrome substring. If current length is greater, update longest palindrome substring length and look for subsequent characters till the end of string.

Finding longest palindromic substring algorithm

MaxLength = 0
For each index c in string
    i = c-1
    j = c+1
    currentLength = 0
    While i greater than or equal to 0 and j less than length of string
       if string[i] == string[j]
          currentLength++
          j++;
          i-- 
       else
         if maxLength less than currentLength
            maxLength = currentLength

Repeat all steps in above Pseudo code, starting with 1 for i (not with zero). (For taking into account even length string).

#include <stdio.h>
#include <string.h>
 
#define true 1
#define false 0
 
int longestPalindrome(char *s){
 
  int i,j,k, n;
  int longestEnd =0, longestStart=0;
 
  n = strlen(s);
/* This is case which handles odd length string */
   for(i=0; i<n; i++){
       for(j=i-1, k=i+1; j>=0 && k<=n; ){
   /* If characters are equal, update left and right index. */
           if(s[j] == s[k] ){
                k++;
                j--;
           }
           else
                break;
       }
  /* Check if current sub-string length is greater 
    than earlier max, If yes, update it */
      if(longestEnd - longestStart < k-j)
      {
          longestEnd = k;
          longestStart = j;
      }
 }
/* This is case which handles even length string */
  for(i=1; i<n; i++){
       for(j=i, k=i+1; j>=0 && k<=n; ){
            if(s[j] == s[k] ){
                k++;
                j--;
            }
            else
                break;
       }
       if(longestEnd - longestStart < k-j)
       {
            longestEnd = k;
            longestStart = j;
       }
  }
  return longestEnd - longestStart - 1;
}
int main()
{
    char str[] = "ABCDCBEA";
    printf("\nLength is: %d\n", longestPalindrome( str ) );
    return 0;
}

Complexity of the above code is O(N2).

Finding longest palindromic substring with dynamic programming 

Can we do better than brute force solution? Fundamental for applying Dynamic programming to any problem is that it should satisfy two basic conditions : First, problems should be solved by solving it’s subproblems and second, solutions to those subproblems must overlap.

Coming back to over problem, can finding longest palindrome substring be sub-divided into smaller problems? If we find a palindrome substring in string with length N-1, what effect it has on string N? What if N =1?
Every character in itself is a palindrome string, string with length one is palindrome.

P(i,i) = TRUE

If N =2, string with two characters is palindrome if both characters are equal.

P(i,i+1) = TRUE if str[i] == str[i+1]

For each length L greater than 2 and less than N, find substring of length L starting with each index of substring.
So Palindrome(i,j) represents substring with L, with starting index as i and end index as j. How can we say that this substring is palindrome? If substring from i+1 to j-1 was palindrome i.e Palindrome(i+1,j-1) = TRUE and str[i] == str[j], then Palindrome(i,j) = TRUE. Store L as length and compare it with maximum length found so far.

P(i,j) = ( P(i+1, j-1) and str[i] == str[j] )

Table is filled starting from (0,0) to (n,n) and when table if filled, we have longest palindrome substring length in a given string. For sake of information, we can also store where does this palindrome substring starts in string.

#include <stdio.h>
#include <string.h>
 
#define true 1
#define false 0
 
int longestPalindrome(char * s) {
  int n = strlen(s);
 
  int longestBegin = 0;
  int maxLen = 1;
  int table[n+1][n+1];
  for (int i = 0; i <= n; i++) {
  	for (int j = 0; j <= n; j++) {
  		table[i][j] = false;
  	}
  }
  for (int i = 0; i < n; i++) {
    table[i][i] = true;
  }
  for (int i = 0; i < n-1; i++) {
    if (s[i] == s[i+1]) {
      table[i][i+1] = true;
      longestBegin = i;
      maxLen = 2;
    }
  }
  for (int len = 3; len <= n; len++) {
    for (int i = 0; i < n-len+1; i++) {
      int j = i+len-1;
      if (s[i] == s[j] && table[i+1][j-1]) {
        table[i][j] = true;
        longestBegin = i;
        maxLen = len;
      }
    }
  }
  return maxLen;
}

int main()
{
    char str[] = "ABCDCBE";
    printf("\nLength is: %d\n", longestPalindrome( str ) );
    return 0;
}

Complexity of dynamic programming approach is same as brute force algorithm and is O(N2), It also uses extra memory though in order O(N2).

Please share if there is something wrong or missing, we would love to hear from you.
If you want to contribute to the website, please contact us.

Strings : Permutations and combinations of string

Permutations and combinations of string

Permutation is arrangement of objects in various positions where order of objects will matter i.e. ab is not same as ba.

Analysis

Let us start with an example  : S  = “ABC”  What are the possible permutation?
ABC      ACB     CAB    CBA     BCA     BCA    

Fix the first character and find permutation of N-1 characters.
We fix A, permutation of BC are BC and CB.
Now place A at all the places possible, in this case it will be at start, middle and at the end.
BC leads to ABC, BAC, BCA
CB leads to ACB, CAB, CBA.
It can be easily code with the help of recursion.
Take each character starting with 0, let say current character is Kth character in the string, we need to fit this character in all the substrings of length N-K (N is length of string).  
We first swap the Kth character with the a ith position where i starts from K and reaches to N (position in remaining substring). Before going for all position of Kth character, we will move forward and check for K+1 character for its all positions in N-K-1 length substring.
Base case would be when we reach at the end of string. Hence remaining substring would be of length zero. Return.
 
So when at penultimate character of string, we have already varied all the positions of last character. Similarly when at Kth character back, we would have adjusted all permutations of N-K characters following it.
 
If we see closely at code, we would have each character of string as each position of string at some point of time as we are swapping each character with every position when moving forward.
When there are duplicate characters in string, just sort the string and check if are processing duplicate character. If yes then just skip the character.

Code to print permutations of string

Similar logic can be applied to combination problem also where we have to first output the string with the character and then without it. Following code does that.

Code to print combinations of string

Complexity of algorithm to print permutations and combinations of string is O(N!) as we need to generate N! permutations.

Reverse words in a string

Reverse words in a string

A string can contain many words which are separated by space. We need to reverse words in a string. For example if string is “this is test string” , output should be “string test is this”

Reverse words in a string

What do i get if reverse the whole string first? In above example, we get “gnirts tset si siht”. Notice that first word in reverse string is exact reverse of first word of the output string. Similarly second word is exact reverse of the second word of the output string.

So we can just reverse individual words in the string and get the output like string. Hence the algorithm involves two simple steps:
1. Reverse the whole string first.
2. Reverse individual words of the output string given in step 1.
Same algorithm can be used when we want to rotate an array with a given index as pivot.

Reverse words of a string implementation

#include <stdio.h>
#include <stdlib.h>

/* This function reverses the whole string */
void reverse(char s[]){
        int i=0, j;
        int end =  strlen(s);

        for(i=0, j=end-1; i<j; i++, j--){
                char temp = s[i];
                s[i] = s[j];
                s[j] = temp;
        }

}
/* This function reverses a given word, we need to 
provide stat and end of the word */
void reverse_word(char * start, char * end){
        while(start < end){
                char temp = *start;
                *start++ = *end;
                *end-- = temp;
        }
}

void reverse_words(char *string){
        char * source = string;
        char *begin = string;

        if(string == NULL) return;

        /* Step 1 : Reverse the whole string */
        reverse(string);

        /* Now reverse individual words in reversed string */
        while(*source != '\0'){
                if(*(source +1) == ' ') {
                        reverse_word(begin, source);
                        begin =  source + 2;
                        source++;
                        
                }
             /*Case when it is last word of string */
                else if(*(source + 1) == '\0'){
                        reverse_word(begin, source);
                }
                source++;
        }       
}

/* driver program to run above code */
#define MAX_LENGTH 100
int main(){
    char string[MAX_LENGTH] = "This is a test string";
    reverse_words(string);
    return 0;
}

Test cases

1. A Normal string with multiple words
s = "This is test string"

2. A string with only one word
s ="string"

3. An empty string
s =""

4. A NULL pointer is passed
s =  NULL

Rotate an array with a given index as pivot

void reverseArray(int a[], int start, int end){

    while(start < end){     
        int temp =  a[start];
        a[start] = a[end];
        a[end] = temp;
        start++;
        end--;
    }
}

void rotateArray(int a[], int pivot, int len){
        int i;
        /*Reverse the whole array */
        reverseArray(a, 0, len);

        /* Reverse from 0 to pivot and pivot to end */
        reverseArray(a,0, pivot);
        reverseArray(a,pivot+1,len);
}

int main(){
   int a[] = {1,2,3,4,5,6,7};
   int size = sizeof(a)/sizeof(a[0]);

   rotateArray(a, 3, size);

   return 0
}

Complexity of method to reverse words in a string or rotate an array is O(N) without using any extra space. 

Please share if there is something missing or wrong. If you are interested in contributing to website please contact us.

Strings : Removing particular combination of characters

Remove particular combination of characters from string.

Given a string, we need to remove a particular combination of characters, for example, 
if s = “aacbbdccaac” and if combination is “ac” then our output will be “abbdcca”.
if s = “aacaaccd” and if combination is “ac”, then output should be “acd”.
Following will be the execution 
aacaaccd ——-> aacd ——>ad 
Note that this has to be done in linear time and without any extra space.

Analysis

Here there are two sub part of the problem:

One, we need to keep track we have already encountered the first character in the combination of characters needed to be removed.
Second we need to keep track of the position where the next character should be placed in the string if it is not to be eliminated.
First problem can be easily solved by maintaining a state machine, where we have two states and one moves from one state to another based on the input.

In current example, we have states like state “INITIAL” and “MOVED”, Whenever we encountered “a”, we move from INITIAL to MOVED. Then if we get “c” in “MOVED” state we are sure that we have encountered the whole pattern and they need to be removed. If we are in “MOVED” state and if we don’t get “c”, we move back to “INITIAL” state.

State machine

For the second part we can simply follow the approach we have used in this post, that is to maintain two pointers, one to point to character to be processed, and other to point position where current character to be placed if it is not be eliminated.

The problem can be extended with multiple characters in pattern, with change in state machine accordingly.  Same state machine method can be used to count number of words in a given string. Whenever we encounter the separator, we move to OUT state and as soon as we see first character in OUT state we move to IN state till we again see separator.

Code 

Test cases

1. When pattern is present
s= “aaacacacbbb” pattern = “ac”
 
2. When pattern is not pattern
s =”aaaabdcaaa” pattern  = “ac”
 
3. When movement leads to pattern again
s= “abacaccdc” pattern =”ac”
 
4. When string is NULL or pattern is NULL
s= NULL
pattern = NULL
 
5. When return string is empty
s = “acacacac” pattern =”ac”
 
6. Just first character in pattern
s= “aaaaaaaaa” pattern =”ac”
Complexity of above code is O(N) and no extra space used.

Strings : Remove duplicates from a string

Remove duplicates from a string

A string is a collection of characters terminated by a special character ”. String can have duplicate characters in it. Today’s problem requires purging of those duplicate characters from string and return the string. Given a string S, remove all duplicate characters from it.
For example, S = “aaabbacbccd”, then output should be “abcd”. Note the output is not  “d”, that means we need to maintain once instance of the duplicate character.

Analysis

There are two parts involved in this problem.  First, we need to keep track whether we have already processed a particular character.
Second we need to properly place the character in destination string.
 
For the first part hashing is the perfect technique. We can create a hash with key as character (total 256 characters) and set the value when we first encounter the character. Next time when we encounter the character, we would find the value set against that character and hence we can skip that character.
For second part we can have an auxiliary string to store the non-duplicate characters.
We can do it in-place too. Keep two indexes, one which keeps track the character being processed in input string, other which points to the next place in the input string where the current character can be put if it is not processed already.

Code

Test cases

1. When input string contains duplicate characters
S  = “aaabaaabbbcccd”
 
2. When input string does not contain duplicate characters
S = “abcdefg”
 
3. Input string with no characters
S = “”
 
4 Input string pointer is NULL
S = NULL;
 
Execution of these test cases can be seen here.
Complexity of code is O(N) in time and constant memory for hash as it does not depend on the size of the input string.

Strings : Toeknize a string separated by delimiter

Tokenize a string

Continuing the previous post, let’s look at the second problem. Problem statement is Find all tokens in a string which are separated by given delimiter.

Analysis

We have to traverse the string and for each character, check if that character is present in delimiter string. If it is, then it is end of the token started after previous encounter of the delimiter. Here forward we should scan all characters till the time we first encounter a character which is not a part of given delimiter string, that would be start of the next token.
Let’s look at it step by step.
Step 1 : Assume we have string as s and delimiter string as delim.
Search in s the first character which is not present in delim.
If there is no such byte, we are done, return NULL.
If there is such byte, then that would be start of our first token.
Step 2: Scan through the found token.
From the character found in step 1, scan through s till we again encounter a character present in delim.
If there is no such character, we are done, we have only one token and return the start position of that token.
If we have such character, that would be end of the token.
Great we have found one token, How about next one? 
For next token to be generated, we need to keep track where to start looking from in the string as we would have scanned a part of it in the previous tokens. So we keep track of the pointer where to start looking from.
Step 3 : Once we have found the end of token in step 2, again scan out all subsequent characters which are part of delim. Take the index of first non-delimiter character. This would be the starting point of our next token search.
 
Implementation note : To distinguish between whether it is first call to the function or subsequent call, we pass NULL pointer to the source string parameter indicating that it is subsequent call and not initial call. Similar approach can be used to remove all characters which are present in a string from another string.

Code

Test cases

Complexity of above code is O(N * M), where N is length of string and M is length of delimiter string.

Strings : Search functions

Find substring in a string

Many a times we need to find something or the other in a given string like find all words punctuated by given delimiter or find the location of a given sub string in bigger string etc. Under this topic we will discuss finding position of sub string in a given string.

Given a searched string S, and a substring s, check if s is present in S and if yes return the position in searched string.

Analysis

This is classic problem of traversing two string simultaneously and keeping track of characters in each.

One of the well known algorithm for this is needle and haystack algorithm. Steps of the algorithm are as follows.

1. Take begin of substring to be search.
2. Starting from begin of larger string S, do following
2.a For each position pos in S, check if substring s can start from that position.
2.b If the substring s can start, keep traversing both strings till either we find mismatch          or the substring is traversed completely.
2.c. If substring is traversed completely, return the position where we started  i.e pos.
2.d  If mismatch occurs, try matching from the next position.
2.e  Repeat step 2.a to 2.d till we have traversed whole bigger string S.

Code to find substring in given string

There is one optimization where we can stopping matching substring once we know that the number of characters left in search string to match are less than the characters in substring. Test cases

1. When substring is present
Complexity of above code is O(N^2).


There are other algorithms like KMP pattern matching algorithms which do this is linear time, we would discuss that too in separate post.