Graphs : Minimum Spanning Tree Prim’s Algorithm

Prim’s Algorithm for Minimum Spanning Tree

Second most used algorithm to find minimum spanning tree is Prim’s algorithm.
Basic idea is to have two sets of nodes of graph, one which are already included into minimum spanning tree and other which are not included yet. Now find an edge with least weight which connects any node or vertex in set 1 to any vertex in set 2. Such an edge is commonly called as cut.
Let two sets be A which includes all vertices of graph which are included in MST and V is set which contains all vertices which are not included in MST yet. Let S be set of all vertices of a graph.
Relationship between these three would be
A = SV
To start with A will be empty and V will be equal to S. Start with one vertex and add it to A. Also cost of all nodes initially is INFINITE. Once we add the vertex to A, update the cost of all neighbor vertex based on cost to reach vertex added in A and cost of edge between this vertex and neighboring vertex.
Remove the vertex from V.
Once update is done, select the edge which is minimum and connects one vertex in A and other in V, more simply said does not form a cycle. (If both the ends of an edge are in same component of the graph there are bound form a cycle.)

Prim’s algorithm for minimum spanning tree

Let’s see prim’s algorithm and then we will see how can we implement it.
  1. Initialize cost of each vertex with INFINITE. Except from the first vertex where we will start.
  2. For each neighbor, if cost of reaching neighbor node from current vertex is less than current cost, update the cost of neighboring node.
  3. Select the minimum cost neighbor vertex among these vertex and see if edge forms a cycle.
  4. If it forms a cycle take another vertex till we considered every neighboring vertices.
  5. If it does not form cycle, add it to MST and go to step 2.
  6. Repeat all above steps till all the vertices are not included in tree.
See the working of algorithm.
minimum spanning tree using prim's algorithm
Working of Prim’s algorithm

Let’s see minimum spanning tree algorithm’s implementation.
First thing we need to do is to keep track of two sets. We can do so by keeping track of each node in MST. Create an array and every time we add an node in MST, mark it as added. While considering any edge in future, just check if MST[i] is true, if it is then node is already added in MST and should not be added again.
Second thing is to keep track of distance of each node. Initially all nodes are at INFINITE distance.
As we traverse the graph, update each node’s distance from current node. Keep an array of integers initialized with INFINITE and every time we add a node in MST, update distance of neighboring nodes based on cost of edge from current node to them.
Last step is to find the minimum cost edge between two sets. We can scan through the distance array which we discussed in above paragraph and get a node which has least cost and not already in MST (using the array values in discussed in paragraph 1)
Finding minimum can be implemented using min heap or priority queues, that would reduce the complexity of overall algorithm.
Add edge logic can be found at Kruskal algorithm

Code

Complexity Analysis

Complexity of Prim’s algorithm to find minimum spanning tree in graph is O(V^2).

Graphs : Minimum Spanning Tree : Kruskal Algorithms

Problem statement

Given a weighted connected undirected graph, find minimum spanning tree in that graph.

Analysis

Let’s first understand what is spanning tree. Spanning tree is tree which connects each vertex of the graph. When graph is weighted i.e each edge of the graph has some weight to move from one node to another, spanning tree with minimum cost is called as minimum spanning tree. If all the edges contain distinct weights, there will be unique minimum spanning tree for that graph, however, if two edges can have same weight then there can be more than one minimum spanning tree for given tree.
For graph shown below, second figure shows spanning trees for that graph.

Given undirected graph

Spanning trees for given graph

Let’s see some of the applications of minimum spanning trees:

1 . Broadcast tree preparation for broadcasting in internet. It is implemented using spanning tree protocol.
2.  Preparing telecommunication networks, water supply networks etc.
There are two widely used algorithms to find minimum spanning tree for a graph, both based on greedy algorithm. Basic idea of greedy algorithm is that at every step we chose the option which gives us maximum profit at that step without worrying about future steps. 
Let’s see these greedy algorithms.

1. Kruskal’s algorithm

Basic idea is that we take each edge one by one in increasing order of weight. For each edge check if it makes a cycle in existing tree. If it does not add it to minimum spanning tree formed till now. If it forms a cycle discard the edge. Repeat above steps till all nodes are added in tree. Resulting tree will be minimum spanning tree.
  1. Sort all edges based on weights
  2. Start with minimum cost edge. Add it to T.
  3. For each edge in graph, repeat following steps.
  4. If current edge forms a cycle, discard the edge.
  5. If current edge does not form a cycle, add it to T.

Code 

Complexity Analysis

Complexity of the above algorithm is dominated by the sorting which takes O(E log E) where E is number of edges. To detect whether edge forms a cycle or not, we can use Union and Find method.

Prim’s Algorithm is discussed here :  Graphs : Minimum Spanning Tree Prim’s Algorithm

Graphs : Connected components

Connected components in graph

Given an undirected graph, find out number of connected components in it. Connected component of an undirected graph is subgraph in which every node is connected to other node in that subgraph and not connected to any other vertex in supergraph. Every graph has at least one connected component that is graph itself. For example in below graph, there are two connected components {1,2,3,4} and {5, 6}.
Connected components in graph
Input graph

Analysis

There are two methods to find connected components. First way is to do depth first traversal of graph. If we start afresh after terminating the recursion for any node, that means that node is not connected to any node in nodes traversed yet. Increment the count. To keep track of nodes in each connected component, we can color different components of graph with colors. To check if two nodes belong to same connected component, we just check if color of both nodes is same.
Complexity of above code will be complexity of depth first search O(V+E) with extra space of O(N) to store color of connected components.

Code to find connected components in graph

Code fro graph implementation can be found at: Graphs : Basics and representation

Second Method : Union and Find method

Second method is to use Union and find method as discussed in previous post : 
We will go through all the nodes of graph and create disjoint sets. Once we are done with all nodes, just check how many disjoint sets are formed. If we need to check if nodes are connected, just check if both nodes are in same disjoint sets. Since we have already seen the concept and implementation of disjoint set we can directly go to code.

Code to find connected components in graph using Union and Find method

Complexity Analysis

Complexity of Union and find method is again O(V+E) with extra space to store representative element of each disjoint set.

Graphs : Detecting cycle in undirected graph

Detect cycle in undirected graph

Given an undirected graph, check if there is a cycle in it or not.
For example below graph has cycles as 3->6->5->3, 1->4->3->6->2->1

Detecting cycle in a graph
Graph with cycle

Analysis

Simple definition of a cycle in undirected graph would be : If while traversing graph, we reach at a node which we have already traversed to reach current node, then there is a cycle in graph.
How can we detect the above condition? It’s simple application of Depth First Search.
Do depth first traversal of a graph, if we detect any back edge while traversing, then there is a cycle. Back edge in a graph is an edge which is from a decedent to ancestor. For example in above graph, edge from 2 to 1 is a back edge.
While depth first traversing a graph, keep track of the nodes which are on the stack at present. If there is an edge from current node to any one of the node which is on recursion stack, we say cycle is detected in graph.

Code

Complexity Analysis

Complexity of algorithm to find cycle in graph code is equal to complexity of DFS which is O(V+E) where V is number of vertices and E is number of edges.

Detecting cycles in graph using disjoint sets

Another method to detect cycle in an undirected graph is Union-Find method. To understand that method, we need to first understand a data structure called disjoint-set data structure.
It is a data structure, which keeps track of all the sets of elements partitioned in disjoint subset. Subsets are said to be disjoint if there is intersection between them is NULL. For example set {1,2,3 } and {4,5,6 } are disjoint sets, but {1,2,3 } and {1,3,5} are not.
Another important thing about disjoint set is that every set is represented by a member of that set called as representative.
There are some operations which are useful on this data structure:
1. Make Set : creates a new set with one element {x}
2. Union : Merges two sets containing X and Y and destroys the original sets.
3. Find : Returns the representative of the set which element belongs to.

Let’s take an example and see how it can be used to detect cycle in graph.
Principle here is that if two nodes forming an edge are in same disjoint set, then there is a cycle in graph.

Make set will create a set with each node with representative element as -1, which means there is only one element in each disjoint subset. Now one by one we check each edge and update our data structure accordingly. For edge 0-1, both are in different subsets, we need to merge them together. So sets become {0,1} and {2}. Representative of set {0,1} be {1}. Now consider another edge, 1-2. Node 1 belongs to set represented by 1 and 2 belongs to set represented by -1. Both are different subsets, we will merge them together. So set becomes {0,1,2} represented by 2. 
Now take edge 0-2, where 0 and 2 are in set which is represented by 2, hence there is a cycle in the graph.

How can we implement it? There are two ways to do so. One is to have a linked list of all elements of a set for each set. Each element in turn points to the representative element. So find operation will be O(1), however union or merge operation will be O(N) as we need to modify representative element in each element of the list.

Other method is to use array. This auxiliary array stores the parent of node i at array[i]. Initially all elements in array are -1. Now whenever we merge two set, we update the corresponding index with representative element.


Find operation

function Find(x)
   if x.parent == -1
return x
else
return Find(x.parent)

Union Operation
 function Union(x, y)
xRoot := Find(x)
yRoot := Find(y)
xRoot.parent := yRoot
Make set operation

function MakeSet(x)
x.parent := -1

Code

add_edge definition can be found at post : Graphs Basics and representation

Stacks : Next greater number for every element

Problem statement

Given an array of integers, for each a[i] element print a[j] such that a[j] > a[i] and j>i. Also j should be as close to i as possible. For the rightmost element, value will be -1. If for any i if there is no such j, then result for that too will be -1. In simple terms, we need to print nearest greater number for each element in given array.

Analysis

Simple brute force method would be to scan array in two loops. For each element in array, scan remaining right side elements and find out the greater number. Complexity of such code will be O(N^2).

How can we do better?
If we keep track of all the numbers for which we have yet not encountered a greater number, we keep improve. Idea is that whenever we examine a number, we check if there are elements which are already visited in array and yet there is no greater number assigned to them. If there are such elements, we check if current element is greater than them. If yes, then for all those numbers which are less than current element, next greater element will be current number.

To keep track of the numbers which have yet not assigned greater number, we can use stack. Steps are simple.

  1. If stack is empty, push the element on to stack.
  2. If stack is not empty, pop all elements from the stack which less than current element.
  3. Print these numbers paired with current number.
  4. If top of the stack is not less than current element or stack is empty, push the current element.(All the elements present in stack will be greater than current element why?)
  5. After scanning all elements in array, print all elements in stack paired with -1 as there is no greater element on right side of these elements.

Let’s workout an example:
A  = {5,6,3,35,23,6,8}
Output should be {(5,6), (6,35), (3,35), (6,8), (8, -1), (23, -1), (35, -1)}

Code

Stack implementation can be referred in this post : Stack implementation

Complexity Analysis

Complexity of above code will be O(N) with extra space complexity of O(N) for stack.

Graph Traversals

There are two kinds of traversals which are usually done on graph.
1. Depth First traversal, commonly called DFS.
2. Breadth First traversal, commonly called as BFS.

Depth First Traversal (DFS)

Depth First is a technique in which we explore the graph depth wise, that is we move down the graph from one node to another till the time we are out of all unexplored edges at a node, once we reach that condition, we backtrack and move to node one before and so on.

In this traversal, we start with a node S, and mark it as visited. Now S be our current node u. 

Move to v where there is an edge (u,v).
If v is already visited, we move back to u.
If v is not visited, mark v as visited and make v as current node and repeated above steps.
At some point all the edges at u will lead to already visited node, then we drop u and move to node which was visited before u. (remember we need to keep track of that :))
Once with this backtracking, we reach at start node S again and there is no edge to be explored at S, that would be the end of traversal.
Depth First traversal of a graph
Applications of DFS

  1. Minimum spanning tree
  2. To check if graph has a cycle.
  3. Topological sorting
  4. To find strongly connected components of graph
  5. To find bridges in graph

Code

Graph code

Driver code

Above code will work only if there graph has only one connected component. If there are multiple connected components we need to do dfs function call for every node, we will see that part once we do connected component problem of graph.

Breadth First Search

In this traversal, we start with node S as first node. Let S be node u. Mark it as visited
Now, first visit all the nodes which are at one edge distance from u.
Once all nodes are visited, make each neighbor node as u one by one and repeat above steps till the time we have some unexplored edges.
Breadth First Traversal of graph
Applications of BFS

  1. To find shortest path between two nodes u and v
  2. To test bipartite-ness  of a graph
  3. To find all nodes within one connected component

Code
Driver program
Graph creation code can be used from DFS.

Complexity Analysis

Complexity of Depth First Search is O(m+n) where m is number of vertex and n is number of edges.

Complexity of Breadth First Search is O(m +n) where m and n have same meaning as above.