## Find number of connections of person till nth level

This is typical social network problem, where one has to find all friends of a person who are at max n distance away from person. In other words, friends of person are at 1st level, friends of friends of p are at 2nd level and so on. This question was asked recently in Microsoft interview.

For example, in below graph, if we have to find connections of node 1 till 3rd level, result will :

```Level 1 : 2,3,4 (friends)
Level 2 : 5,6,7,8 (friends of friends)
Level 3 : 9,10,11```

## Find friends of a person till Kth level

The standard way to find connections or friends in graphs is to do breadth first search on graph. Start BSF from the person given, till you have visited nodes with max k distance from starting node. Only extra thing needs to be done other than standard breadth first search is to keep track of number of levels we have moved.

Implementation is quite simple.

Complexity of algorithm to find friends of person till nth level is O( |V| + |E|) where V are number of vertices and E is number of edges in graph representation of person’s social network.

## Breadth First Search (BFS) on a graph

In last post Depth first search, we discussed recursive and iterative way to traverse a graph depth first. In this post, we are going to learn another traversal method called Breadth First Search (BFS).

First of all, what is breadth first search? In BFS, all neighbors of a node are visited before any of neighbor of those neighbors. Basically, we traverse graph layer by layer. For example, BFS of below graph would be 1->2->6->3->5->4

In depth first search, we go deep into the graph till there is not further move possible and then backtrack and again do the same process with another neighbor of parent and so on till we don’t have any further node to visit.

In breadth first search, as we traverse layer by layer, there is no backtracking required. Before reaching to layer i, all the nodes till layer   i-1 would have been already visited.

Start from node S. Let S be node u. Mark it as visited. Visit all nodes which are at one edge distance from u. Once all nodes are visited, make each neighbor node as u one by one and repeat above steps till the time there is some unexplored edges. Below figure explains step by step how BFS done.

Implementation wise, BFS uses queue unlike DFS which uses stack. While processing node, all neighbors of that node are enqueued in queue and processed one after the other in FIFO order. This ensures that all nodes at same level are processed before next level is started.

```Breadth-First-Search(Graph,root):
for each node u in Graph:
create empty queue Q
Q.enqueue(root)
while Q is not empty:
current=Q.dequeue()
print current->value
visited[v] = true
for each v that is adjacent to current and not visited
Q.enqueue(v)
```

Below figure explains how breadth first implementation works

Applications of BFS
1. To find shortest path between two nodes u and v in an unweighted graph.
2. In Peer to Peer Networks like BitTorrent, Breadth First Search is used to find all neighbor nodes.
3. Search engine crawlers, each page leads to more pages linked to current page and so on.
4. Social networks, BFS is used to find degree of separation, friends within K distance etc.
6. Broadcasting protocol in networks uses BFS to broadcast packets to all neighboring devices.
7. To test bipartite-ness of a graph
8. To find all nodes within one connected component of graph.
9. Search for connected components in the graph for O(E+V). To do this, run start BFS from each vertex, except for vertices visited in previous runs. Do not reset each time the array visited[], due to which every time we get a new connected component, and the total running time will be still O(E+V).
10. Finding solution to problem defined by statement with the least number of moves,  each state of game is represented by a vertex of graph, and the transitions from one state to the other represented as edges of graph.

Finding the shortest cycle in a directed unweighted graph: produce a breadth-first search from each vertex; as soon as we try to bypass the process to go from the current peaks on some edge to an already visited vertex, then it means that we have found the shortest cycle, and stop bypassing wide found among all such cycles (one from each startup bypass) choose the shortest.

Complexity of Breadth First Search is O(|V| +|E|) where V is number of vertices in graph and E is number of edges in graph.

Please share if something is wrong or missing. If you are interested in contributing to website and share you knowledge, please refer to Publishing on Algorithms and Me and contact us. We would love to publish you articles and pay you too.

# Depth First Search (DFS)

In last post, we learned graph basics and how graphs are represented in programming model. We studied various advantages and disadvantages of adjacency matrix and adjacency list representation of graphs. Today, let’s us discuss traversals of graph.  Two kinds of traversals are possible on graph : Depth First search, commonly known as DFS and Breadth First Search, commonly called as BFS.

Depth First Search or traversal explores a given graph depth wise, that means we go down the graph from one node to another till there is no unexplored node down. Once we reach the end of a path going down, backtracking starts and parent of current node is looked for any other unexplored child nodes, and so on.

For above graph depth first search will be 1->6->5->2->3->4

## Depth First Search Algorithm

```1. Start with a node S, and mark it as visited.
2. Current node u = S
3. While there is edge to be explored
3.1 Move to v where there is an edge (u,v).
3.2 If v is already not visited, mark v as visited
3.3 change u to v and repeat.
4. Move to parent of u.```

Recursive implementation of above algorithm of depth first search on graph is very simple as shown below

```1  procedure DFS(G,v):
2      label v as discovered
3      for all edges from v to w in G.adjacentEdges(v) do
4          if vertex w is not labeled as discovered then
5              recursively call DFS(G,w)```

Iterative implementation of depth first search

```1  procedure DFS-iterative(G,v):
2      let S be a stack
3      S.push(v)
4      while S is not empty
5          v = S.pop()
6          if v is not labeled as discovered:
7              label v as discovered
8              for all edges from v to w in G.adjacentEdges(v) do
9                  S.push(w)```

Below figure explains how iterative depth first traversal works using stacks.

This implementation works only if graph has only one connected component. If graph has more than connected components, depth first search is done starting with each node. For more details on connected components, refer : connected components in graphs

Complexity of depth first search of graph is O(V+E) where V is number of vertices and E is number of edges.

Applications of depth first search
Minimum spanning tree
To check if graph has a cycle.
Topological sorting
To find strongly connected components of graph
To find bridges in graph

# Degree of separation

We all are present on one or other social network like Facebook, LinkedIn, Google+ etc. On these there are friends of ours, then friends of friends and then friends of friends. Connections between people are represented as graphs where persons are nodes and if two persons are connected, then there will be an edge between them. Number of edges between two nodes is called as degree of separation.

## What is degree of separation?

“Degrees of separation” is a concept that refers to how many people it takes to connect you and another person. Implementation wise, it is minimum number of edges between two persons. Simply understood, It’s the shortest path between two persons where each edge has weight of 1. Foe example, in above graph, degree of separation between Sarah and Alice is 2, as Sarah is connected to Ivana and Ivana in turn is connected to Alice.

Find degree of separation between two persons

First of all, we will store information of persons and their connections in graph. To find degree of separation between persons, breadth first traversal of graph  is used.

Start from one person and check it’s all neighbors. If one of neighbor is second person, then degree of separation is 1. If not, increment degree by 1 and then look in neighbors of each neighbor in last step. This goes on till either we find second person or there is no node to be visited in graph (that means other person does not exist in our network ).

There is small optimization to above algorithm, that is to start breadth first traversal from both the nodes at same time. While traversing levels, check if neighbors of one person contains other person.

```1. Given person1 = (A); person2 = (B); Init degree = 1
2. While there are nodes to be traversed
2.1 Find all neighbors of person 1
neighborsOfPerson1 = neighbors(person1)
2.2 If neighborsOfPerson1 contains person2 return degree
2.3 else increment degree
2.4 Find all neighbors of person2 = neighbors(person2)
neighborsOfPerson2 = neighbors(person2)
2.5 If neighborsOfPerson2 contains person1 return degree
2.6 else increment degree```

Simplistic implementation would be to use adjacency matrix to store neighbor information. To understand how a graph can be represented as adjacency matrix, please refer to post : Graphs basics However, adjacency matrix wastes lot of space, so it is good to represent information in graph built based on adjacency list.

In implementation, a sentinel is inserted every time a level is finished and new level has to be searched. At same time, we increment the degree too.

Complexity of this algorithm to find degree of separation between two persons is O(N) as we have to scan all nodes in words case. Also, there is space complexity involved of order O(N).

After from this Breadth first traversal,  Dijkstra’s shortest path algorithm between two nodes can be used . Below is implementation using Dijkstra’s algorithm to determine degree of separation.

Please share if something is wrong or missing, we would love to hear from you.

# Clone graph

Problem is given a graph, clone graph as a new graph. New graph should have same vertices and same edges as original one. This problem is very simple when we have access to values of node, we can just read one value, create a a node with corresponding value, scan all neighbors and then add them one by one to neighbor list. Once node is done, take the next node and go to the list corresponding to value and create new node. Scan all neighbors and add it to new node. No difficulties, very simple solution.

Problem arises when value of node is not available and nodes are not store in array of adjacency lists. Every node contains list of pointers of it’s neighbors. Neighbors of neighbor can be accessed using pointers. This is in fact true representation of graph.

Representation of graph node is as follows:

```class graphNode{
public:
int value;
vector<graphNode *>neighbors;
graphNode(int val){
this->;value = val;
}
};
```

## Clone a graph : Algorithm

To clone a graph, first we have to traverse the graph. There are two types of traversals on graph, depth first and breadth first traversal. For this particular problem, let’s take BFS. In BFS, node is visited and all the neighbors of it are visited before any of the child of that node is visited. Take a node and create a corresponding node for clone graph. Again scan all the neighbors from list, and create corresponding nodes and add them to neighbor list of new node. Only thing we need to take care of is that we should not allocate a node which is already allocated. For that we will use hash. Whenever, we allocate a new node corresponding to a node in original graph, we will store that mapping in a hash as

`hash[old_node] = new_node;`

## Clone graph : Implementation

```graphNode * clone(){
queue<graphNode *>q;
q.push(this);
map<graphNode *, graphNode *>hashMap;

graphNode *graphClone = new graphNode(this->value);
hashMap[this] = graphClone;

while(!q.empty()){
graphNode *node = q.front();
q.pop();
int n = node->neighbors.size();
for (int i = 0; i < n; i++) {
graphNode *neighbor = node->neighbors[i];
/* Node is already node allocated.
allocate it and set the mapping. */
if (hashMap[neighbor] == NULL) {
graphNode *p = new graphNode(neighbor->value);
hashMap[node]->neighbors.push_back(p);
hashMap[neighbor] = p;
q.push(neighbor);
}
else {
/* a copy already exists, then just put that
to neighbor list */
hashMap[node]->neighbors.push_back(hashMap[neighbor]);
}
}
}
return graphClone;
}
```

Complexity to clone a graph from original graph is O(V + E), where we need to visit each node and edge at least once.

## Ford Fulkerson Algorithm for Maximum Flow

Ford Fulkerson Method for Maximum Flow in graphs

# Ford Fulkerson Method for Maximum Flow

Before jumping directly on to Ford Fulkerson algorithm,we will have to understand concepts of network flow. So, we will start by some definitions and stating some important properties and then go on to the Ford Fulkerson Algorithm for Maximum Flow.
This particular topic requires the previous knowledge of Graph Theory. So, it is recommended to study basic algorithms of graph theory first.

Network Flow

Let us consider a situation, where we have to transfer water from source to the sink in a network of pipes. Figure 1 shows the situation. Each pipe has some capacity associated with it. The flow through the pipe cannot be more than its capacity. The source ‘s’ supplies water and the sink ‘t’ consumes it. We want to transfer maximum water through the network form source to the sink.
Now, talking in terms of graph theory, we are given a network – a directed graph, in which every edge has a capacity ‘c’ associated with it, a starting vertex source ‘s’ and an ending vertex sink ‘t’.
we have to associate some flow ‘f’ to the edges such that f ≤ c for every vertex. Note that, all the vertices other than source and the sink, must have the sum of the flow entering into the vertex equal to the sum of the flow leaving from the vertex i.e. there is no net flow coming out of any node and no net flow is consumed by the node other than source and sink. The flow coming out of the source  and flow going into the sink must be greater than zero. Figure 2 shows the graph network.
Mathematically :
f(x, y) ≤ c(x, y)

y∈V(x,y) = 0 = ∑y∈V(y,x) ∀ x∈V-{s,t}         where ‘V’ is a set of all vertices
f(s,y) ≥ 0 ∀ y
f(y,t) ≥ 0 ∀ y

There is a notion of  Residual Capacity. Suppose there is an edge between nodes x and y with capacity 10 and flow 7, as shown in figure 2. There is also an edge between the same nodes from y to x with capacity 20 and flow 0. So, here we have capacity of 3 to add in the direction from x to y. Therefore residual capacity from x to y is 3. We have residual capacity of 20 in the direction form y to x, but whatever 7 units of flow is in x to y direction can also be neutralized, so that can also be added in the direction y to x which makes the net residual capacity in the direction y to x 27. So we define residual capacity Cf  as :

Cf(x,y)= C(x,y) – f(x,y).

If Cf(x,y) = 0 then, we say that edge (x,y) is saturated because we cannot send more flow through this edge. Let U and W be the subsets of V, then flow f(U,W) is donated by sum of the flow of all the edges with one vertex in U and the other one in W.
f(U, W) = ∑x∈Uy∈Wf(x,y)

The figure shows the graph of the network flow.Note that the flow in all the vertices other than source and the sink  is conserved i.e. net flow going into a vertex is equal to the net flow coming out of the vertex.

Cut
A cut is basically a partition of the vertices of a graph into two subsets such that the source and the sink are in different subsets.
So, a cut is defined as a touple (S, V-S), where S is a set of vertices such that source s∈S and t∈(V-S). Therefore
C = (S, V-S) such  that  s∈S, t∈(V-S)

Edge Set E of a cut is the set of all the edges such one of the vertex of the edge is in S and the other one is in V-S. It contains the edges that point form the set containing the source to the set containing the sink. Therefore
E(S, V-S) = {(x, y) | x∈S, y∈(V-S)}
Capacity C of a cut (S, V-S) is the sum of capacities of all the edges of the in the Edge set of the cut.
C(S, V-S) = ∑ C(x, y) | x∈S, y∈(V-S)
Let us understand a very important property (let us call it Property A), which states that :
Property A : An Edge Set E’⊂E (where E is the complete set of edges) is an edge set of a cut if and only if E’ is the minimal set with the property that any path from source ‘s’ to sink ‘t’ passes through atleast one edge of E’. Here, by “minimal” i mean to say that this property breaks down if even one edge of E’ is withdrawn.
Proof : Let us say E’ is the edge set of  some cut (S, V-S). Now, let us take some arbitrary path from source s to sink t. The figure below shows the path from s to t.

So, if we travel from any path s to t, there has to be an edge (Vi, Vi+1) that has vertex Vi in set V and vertex Vi+1 in set (V-S). Therefore, if Vi+1 is the first vertex from left that belongs to set (V-S), then vertex Vi∈S and edge (Vi, Vi+1) belongs to edge set (V, V-S),
Now, we have to prove that this is a minimal such set, let us take off some edge (x,y) from this set and call the new set (after taking off an edge from E’) E”. So
E” = E’ – {(x, y)}                 where x∈S and y∈(V-S)
So, now consider a path  s, x, y, t. In this path, edge (s, x)∈S and edge (y, t)∈(V-S). It is clear that neither of the edges (s,x) and (y, t) belongs to the edge set of the cut and since edge (x,y) also does not belong to the edge set, the edge set E” fails to satisfy the property because now there exists a path the does not pass through the edge set. Hence, E’ is a minimal set having the Property A.
The converse of this property is also true, which says that if E’ is a set that satisfies Property A and also it is minimal, then the set E’ has to be an edge set E(S, V-S) of a cut.
Since, set of vertices ‘S’ which contains source vertex ‘s’ does not contain any edge that is in E’ which is the minimal Edge set of a cut, we can say that every vertex in set ‘S’ is reachable from source ‘s’ without passing through any edge of E’. So
S = {x | x’ is reachable from source ‘s’ without passing through any edge of E’}
Now let us consider an edge (x, y) belongs to edge set E(V, V-S) but does not belong to minimal edge set E’, where x∈S and y∈(V-S), Mathematically :
(x, y)∈E(S, V-S)-E’ where  x∈S and y∈(V-S) …………………(1)
So, we must have a path from source ‘s’ to vertex ‘x’ without passing through any edge of E’, now, because we have assume that edge (x, y) does not belong to E’, so that path can be extended to y, which makes ‘y’ also a part of set ‘S’, which contradicts our assumption above in (1). This means, there is no edge that belongs to edge set E(V, V-S) but does not belong to minimal edge set E’. Therefore, E’ contains every edge of the edge set of the cut E(S, V-S).
E(S, V-S) ⊆ E’.
This proves that E(S, V-S) also satisfies the Property A. and if E(S, V-S) is strictly smaller than E’, then E’ cannot be a minimal edge set of the cut. Therefore E’ must be equal to E(S, V-S). Mathematically
E’ = E(S, V-S)
Hence the converse is also proved that if an Edge set of a cut follows Property A, then it has to be minimal. That also means that no proper subset of E’ satisfies Property A.
Net Flow
Net Flow of a floe network is defined as the total flow supplied by the source ‘s’, it is also equal to the total flow consumed by the sink ‘t’. It is denoted by |f|. Therefore,
|f| = ∑ f(s, y)   where y∈V.
Also, for any cut (S, V-S), capacity of the the cut is greater than or equal to the flow of that cut which is equal to the net flow. Mathematically
C(S, V-S) ≥ f(S, V-S) = |f|
The above result states that whatever net flow has emerged out of the source ‘s’ is equal to the net flow that goes from S to (V-S).

Problem
We are given a flow network with a source ‘s’ and sink ‘t’, we have to compute a flow ‘f ‘, such that |f| is maximum.
Now, to calculate the result we will have to consider Max flow Min Cut Theorem.

Max flow Min Cut Theorem : This theorem states that if ‘f ‘ is the maximum flow for a given network, then the value of the flow |f|  is the minimum cut capacity.
So, value of |f| is equal to the minimum value of the capacity of a cut over all the cuts. Therefore
|f|  = min{C(S,V-S)} over all the cuts.
Now, let us consider a flow ‘f ‘ and an edge set E1 = {(x,y) | (x, y) is saturated in f}. Then edge set E1 has to follow Property A. Because if it does not follow property A, then there must exist a path P = s, x1, x2 ….., xk, t,  such that none of the edges of E1 are in the path. So, every edge in E1 has some greater than zero residual capacity. Let ‘r’ be the minimum of all the residual capacities in path P. Then every flow ‘f ‘ in every edge can be increased to f ‘ by ‘r’. So,
f ‘ (x, y) = f (x, y) + r.
The above expression is also valid for vertex ‘x’ being source vertex ‘s’, this means that the net flow |f| also increases to |f ‘| by ‘r’. which is absurd because it was our assumption that |f| was maximum. Hence E1 has to follow the Property A.
Now, let us consider a set E1′ ⊆ E1 and E1′ is also minimal, Then E1′ also has to follow the Property A and every edge of E1 is saturated because it is a subset of E1 which itself contains all the saturated edges . Since, E1′ also follows the Property A and it has saturated edges, the capacity through E1′ is equal to the net flow.

This tells us that as long as there is room for some improvement in the flow, or we can say that the flow is not maximum, there exists a path from source ‘s’ to sink ‘t’. Note that by saying that there exists a path from source ‘s’ to sink ‘t’, we mean that there is no saturated edge in the path.

Residual Graph
In a Residual Graph, an edge (x, y) has the value equal to the residual capacity of that edge. Residual Graph is made from the original graph. It does not contain the edges with residual capacity equal to zero. The figure below shows a residual graph
We always have to find a path in the original graph on which the flow can be improved. Since, the residual graph only contains the edges that have positive residual capacity, we will only have to find a path in the residual graph and update the residual capacity which is the weight of each edge. This path is called “Augmenting Path“. So, the flow will be maximum if and only if the residual graph has no path form source ‘s’ to sink ‘t’.
So, now according to the theorem : ‘f ‘ is a maximum flow if and only if  the residual graph has no path from source ‘s’ to sink ‘t’.

## Ford-Fulkerson Method

The algorithm for the Ford-Fulkerson algorithm is shown below.
The algorithm below works only for the integer capacities.
The Augmenting path in the step 2 can be found out using Breadth First Search or Deapth First Search. If the former is used, the algorithm is called as Edmonds-Karp.

### Complexity Analysis

By adding flow in the augmenting path to the existing flow in the graph, maximum flow will be reached when there is no more augmenting paths found in the graph. However, it is not certain that the maximum flow will be found every time, because if the capacities are irrational, the flow might not even converge to the maximum flow.
The complexity of the ford-fulkerson algorithm (ford-fulkerson uses Deapth First Search) is O(E*f) where ‘E’ is the number of edges in the graph and ‘f’ is the maximum flow in the graph. This is because each augmenting path can be found in O(E) time and increases the flow by an integer amount by atleast 1.
A variation in ford-fulkerson method is Edmond-Karp algorithm which has complexity O(VE2).

The code of the algorithm will be added later.

# Strongly connected components in graph – Tarjan Algorithm

What are strongly connected components in a graph? A directed graph is said to be strongly connected if we can reach any vertex from every other vertex in graph. Strongly connected components of directed graphs is subset of the vertices in the given graph such that any two vertices of this subset are reachable from each other ,i.e
$Forall u, v in C$
$u mapsto v, v mapsto u$ where u,v are the vertices of the graph and  $Mapsto$ denotes reachability ,i.e existence of path from first vertex to the second vertex.You can see it from the diagram below:

You can see different strongly connected components in the given graph shown by different colors.

## Strongly connected components : Algorithm

There are two algorithm to find strongly connected components one is Kosaraju’s algorithm and another one is Tarjan algorithm . Tarjan algorithm requires only one depth first search traversal to find out the all strongly connected present in the graph.So it is efficient as compared to Kosaraju’s algorithm as it requires two Depth First traversal (DFS).
For studying the algorithm we should know DFS traversal creates a DFS tree. By DFS tree I mean, a node is not encountered more than once while traversing graph depth first. For example if take above graph and do a DFS on it. node being traversed will be in this order:

We start at ‘a’, there is only one way to move forward and that is to go to ‘b’. At this point, our DFS tree has two node as shown below.

Now at ‘b’ there are three choices. Either we can go to ‘c’ or ‘f’ or’e’. This will result in three children of node ‘b’ in DFS tree as shown below. Again, any one of the three nodes will be selected. Let’s say it’s ‘e’. Is there any node we can go from node ‘e’? No, hence we will move back to it’s parent ‘b’ and explore other nodes. DFS tree at this point will be

‘f’ will be selected next and then ‘g’ is visited and then ‘h’. At h there is no node which can be visited. Move back to ‘g’, no further node to be traversed. DFS tree is like this

Now we traverse node ‘c’ and its descended the same fashion. Final DFS tree is shown below.

How DFS tree is related to Strongly connected Components(SCC)?
A sub tree with a node as root, is a strongly connected components if there is no back link from any descendant of the node to any ascendant of the node. A bit confusing. In simple term there should not be a link between a node which is traversed after the root node which allows to traverse a node which is traversed earlier than root node. Below figure explains concept of back links, descendant and ascendant clearly.

If  ‘x’ is the root of the sub tree and ‘y’ is descendant of ‘x’ and there is an edge from ‘y’ to ‘z'(ancestor of  x) then we can say that edge z-y together with the paths in the DFS tree from z to x and from x to z form a cycle, which must all be part of the same component. But y is higher or equal in the tree than x, so x can not be the head of this component thereby contradicting our above statement.
Explanation of Algorithm?

Vertex are placed in stack S in order in which they are visited so when  DFS recursively explores a vertex v and it’s descendants ,those nodes are not necessarily popped from stack S before this recursive call returns.Thing which we will consider in building the stack is that a vertex will remain in stack after exploration if and only if it has a path to some vertex earlier in stack. At the end of recursive call that explores v and it’s descendant ,we know whether v itself has a path or not to any vertex earlier on the stack.If so,call returns ,leaving v on the stack S.If not then v must be the root of its strongly connected component at this stage,stack S consist of v together with it’s descendants(these nodes have paths back to v but not to any earlier node,because if they have path to earlier node then v would also have paths to earlier nodes which is false thereby v will not be the root of this subtree) .Now all elements which are popped from the stack at this stage which are strongly connected components.

For doing this ,we define two variables for each vertex, one is low_index and other is index.

For example,if we are exploring descendants of ‘u’ then value of low_index of the child ‘v’ will be changed in two ways:
1)If node ‘v’ is visited for the first time:
low_index(v) = min(low_index(u),low_index(v))

2)If node ‘v’ is already visited(back edge):
low_index(v) = min(low_index(u),index(v))

In second case we are considering index(v) instead of low_index(v) because v is?

How can we determine that there is no edge from v or decedent of v to u or its ancestor? For that we need to maintain time when a node was discovered during depth first search, call is d[].  d[u] is time when node u was discovered using DFS. If d[u] < d[v], means u was discovered before v. Below is graph with d[] filled for each node.

Now, figure out the lowest d[x] which can be reached from each node. Reason to find that is to see if there is a node x which is reachable from children of v and has d[x] less than u,i.e. x is ancestor of u reachable from children of v. Store lowest DFS ancestor reachable from node u in an array low[u]. So low[u] = min(low[u], low[v])  for edge (u,v)

Idea here is that if (u,v) is an edge, then either there is no back edge from sub tree of v to u or ancestor of u, in that case low[u] will be lowest. If there is a back edge to x from sub tree of v, then minimum d[x] reached by node in sub tree will be low[u].

Below diagram shows calculation of low[] in graph.

Finally, if low[v] > disc [u] that means if discovery time of u is less than least ancestor that can be reached from sub tree of v, we have a bridge, because there is no way we can reach to any ancestor of u once we disconnect edge (u,v).

### Strongly connected components : Tarjan Algorithm implementation

```#define MAX 10000
stack<int> S;
int disc = 0;  //discovery time of vertex
int in_stack[MAX]; //used to check whether element is present in stack

void tarjan_algorithm(int u,vector< vector<int> > g,
int index[],int low_index[]){

index[u] = low_index[u] = ++disc;
in_stack[u] = 1;
S.push(u);

for(int i = 0;i < g[u].size();i++){
int v = g[u][i];
if(index[v] == -1) {
//node is not visited yet
tarjan_algorithm(v,g,index,low_index);
low_index[u] = min(low_index[v],low_index[u]);
}
//back edge(in the tree)
else if(in_stack[v] == 1){
//element 'v' is already present in stack
low_index[u] = min(low_index[u],index[v]);
}
}
if(index[u] == low_index[u]) {
//strongly connected component.
while(S.empty() == false && index[u] == low_index[S.top()] ){
cout<<S.top()<<" ";
in_stack[S.top()] = 0;
S.pop();
}
cout<<endl;
}
}
```

As we see from implementation that only one DFS traversal is done, therefore complexity of Tarjan algorithm to find strongly connected components in a graph will be  O(|V| + |E|), here V and E are total number of vertex and edges present in the graph.

# Euler path or circuit in graph: Fleury’s algorithm

In last post,Graphs: Find bridges in connected graphs, we discussed how we can find bridges in an undirected graph. In this post, we will be discussing an algorithm which uses bridges to find Euler path in graph, algorithm is called as Fleury’s algorithm.

What is Euler path or circuit in graph?  Euler path is trail in graph which visits each edge only once from source to destination. For example, in following graph, Euler path will be :

If source and destination are same node, then it becomes Euler’s circuit.
For undirected graphs:
For an undirected graph to have Euler path, each and every node of the graph should have even degree except from start and end node of path which may odd degree. To understand what is degree,please follow the link at the top which explains basics about graphs. To have Euler circuit all nodes should have even degree with no exception.

For directed graphs:
For a directed graph to have Euler path, on each node, number of incoming edges should be equal to number of outgoing nodes except start node where out degree is one more than in degree and end node where incoming is one more than outgoing.
To have Euler circuit, all nodes should have in degree equal to out degree.

We have to keep in mind that for both directed and undirected graphs, above conditions hold when all nodes with non zero degree are part of strongly connected component of graph. To read more on strongly connected component, please follow this link :  Connected components in graph

For implementation, count in degree and out degree of each node and check for conditions mentioned earlier. Based on output, Euler path exists or doesn’t .

### To check if Euler path exists implementation

```#include<stdio.h>
#include<stdlib.h>

#define NUM_NODE 6
#define TRUE 1
#define FALSE 0
typedef struct node{
int value;
struct node *next;

}Node;

Node *graph[NUM_NODE];

Node * createNode(int j){

Node * newNode = (Node *)malloc(sizeof(Node));
if(newNode){
newNode->value = j;
newNode->next = NULL;
}
else{
printf("\n Node cannot be allocated");
}
return newNode;
}

Node * currentNode = graph[i];
if(!currentNode){
graph[i] = createNode(j);
}
else{
while(currentNode->next){
currentNode = currentNode->next;
}
currentNode->next = createNode(j);
}
currentNode = graph[j];
if(!currentNode){
graph[j] = createNode(i);
}
else{
while(currentNode->next){
currentNode = currentNode->next;
}
currentNode->next = createNode(i);
}
}

int visited[NUM_NODE + 1];

void dfs(Node * current){
if(!current) return;

visited[current->value] = TRUE;
Node *temp = current->next;

while(temp){
if(visited[temp->value] != TRUE){
dfs(graph[temp->value]);
}
temp = temp->next;
}
}
int isStrongComponent(Node * graph[], int degree[]){
int i = 0;
for(i=1; i<=NUM_NODE; i++){
if(degree[i] != 0)
break;
}
if(i == NUM_NODE) return TRUE;

dfs(graph[0]);

for(int i=1; i<=NUM_NODE; i++){
if(degree[i] !=0 && visited[i]== FALSE)
return FALSE;
}
return TRUE;
}

int EulerPathExists(Node * graph[]){
int degree[NUM_NODE+1];
int count =0;

for(int i=1; i<=NUM_NODE; i++){
degree[i] =0;
}
for(int i=1; i<=NUM_NODE; i++){
count =0;
Node * current  = graph[i];
while(current){
current = current->next;
count++;
}
degree[i] = count;
}
int connected  = isStrongComponent(graph, degree);
count = 0;
for(int i=1; i<=NUM_NODE; i++){
if(degree[i]%2){
count++;
}
}
if(count !=0 && count !=2 && !connected) return FALSE;

return TRUE;
}
int main(){

int isEulerPath = EulerPathExists(graph);

if(isEulerPath)
printf("\n Graph contains Euler Path");
else
printf("\n Graph does not contain Euler Path");
return 0;
}
```

Complexity of algorithm to is O(V+E). Once we have figured out that there is a Euler path, we need to find and print the actual path. There are two algorithms for it.
Fleury’s algorithm and Hierholzer’s algorithm. Let’s discuss Fleury’s algorithm.

## Fluery’s algorithm to find Euler path or circuit

1. Start from the source node, call it as current node u. If there are nodes with odd degree (there can be max two such nodes), start any one of them. Else start from any node in graph.
2. Traverse any edge (u, v) from current node which is not a bridge edge.
3. Set current as v and go to step 2
4. When all edges which are not bridge edges are traversed, start traversing bridges edges.
5. Stop when all nodes are traversed.

### Walk through Euler path step by step

1. Graph is as follows, there are two nodes 1 and 6 which have odd degrees, so our candidates for start will be either one of these. Start from 6.

2. Traverse to 5 and then to 1 and then to 3 followed by 2 one by one as these are not bridge edges.

3. At 2, there are three edges (2,1), (2,4) and (2,4). However, if we follow edge (2,1) then we will burn the bridge and never be able to come back to rest of the graph, hence we will not be going to that edge. So, we will be moving edge (2,4).

4. At 4 again, we have two edges (4,2), (4,6) and (4,3), But if we go through (4,2) we will again burn bridge as (4,2) is bridge edge. Hence, we will traverse (4,3).

5. At 3, we are left with only edge and we will follow it, which will be followed by edge (6,4).
At 4, bridge edge now has to be followed at 4 because there is no option and same for edge (2,1).

Hence, Euler path will be 6-5-1-3-2-4-3-6-4-2-1

Print Euler path in graph :  Implementation

```//This function removes a specific edge from undirected graph
void removeEdge(int start, int end){
Node * current  = graph[start];
Node * previous = current;
Node * temp = NULL;
if(current->value == end){
graph[start] = current->next;
}
else{
while(current && current->value != end){
previous = current;
current = current->next;
}
previous->next = current->next;
}
free(current);

current = graph[end];
previous = current;
if(current->value == start){
graph[end] = current->next;
}
else{
while(current && current->value != start){
previous = current;
current = current->next;
}
previous->next = current->next;
}
free(current);
}

// Creating status for finding bridges in graph after removal of edge
int DFSUtil(int u, int low[],int d[], int parent[], int *time){
Node * current  = graph[u];
Node * v = NULL;
d[u] = low[u] = ++(*time);
visited[u] = TRUE;
for(v = current; v != NULL; v = v->next){
if(visited[v->value] != TRUE){
parent[v->value] = u;
DFSUtil(v->value,low,d, parent,time);

low[u] = min(low[u], low[v->value]);
}
if(v->value != parent[u])
low[u] = min(low[u], low[v->value]);
}
}

Node * getValidNode(int source, int low[], int d[], int visited[]){
int i =0;
Node * temp = graph;
// Find non bridge edge first, if not available then take any edge.
while(temp){
if(low[temp->value] <= d){
return temp;
}
temp = temp->next;
}
temp = graph;
return temp;
}

void dfs(int current, int low[], int d[], int parent[]){
int time =0;
int i;
//Traversing the node
printf("\nCurrent : %d" , current);
// Getting the next edge to be traversed. It should not be a bridge
Node *temp = getValidNode(current, low, d, visited);
while(temp){
int a = temp->value;
// Remove the edge which we have traverse from the graph
remove_edge(current, temp->value);
for(i=1; i<=NUM_NODE; i++){
parent[i] = -1;
low[i] = -1;
d[i] = -1;
visited[i] = FALSE;
}
// After removal of node, find if calculation
// for finding if edge is bridge or not.
DFSUtil(a, low, d, parent, &time);
// Again go down to the edge which is eligible
dfs(a, low, d,parent);
// Find next eligible edge
temp = getValidNode(current, low, d, visited);
}
}

// Main function to print Euler path
void printEulerPath(Node * graph[]){

int startNode = EulerPathExists(graph);

int parent[NUM_NODE +1];
int low[NUM_NODE +1];
int d[NUM_NODE +1];
int time=0;

for(int i=1; i<=NUM_NODE; i++){
parent[i] = -1;
low[i] = -1;
d[i] = -1;
visited[i] = FALSE;
}
// Create book keeping to fin if edge is a bridge or not.
DFSUtil(startNode, low, d, parent, &time);
dfs(startNode,low, d, parent);
}

```

Complexity of  Fluery’s algorithm to find Euler path is O ((V + E)* (V+ E)), as for each vertex V, we need to check each edge whether it is bridge or not which take O (V+E) time.

# Bridges in connected graphs

Given a connected or disconnected graph, find bridges in graph. An edge is called as bridge edge if and only if on removal of that node, graph becomes disconnected if it was connected graph and if it was disconnected then number of components increase by one.  For example in below graphs, bridges are shown in red.

Concept of detecting bridges in graph will be useful in solving Euler path or tour problem.

Depth First Search of graph can be used to see if graph is connected or not. We can use same concept, one by one remove each edge and see if graph is still connected using DFS. If yes, then edge is not bridge edge, if not, then edge is bridge edge. However, this method entails quite a complexity of O(E * (V+E)) where E is number of edges and V is number of vertices.
Let’s think something better. Consider that we are looking at edge (u,v) in graph. In what condition, we can say that it is bridge edge?
If we can somehow reach node u or any ancestor of u from any node which is decedent of v, that means graph is still connected and (u,v) is not a bridge edge. If above condition is not possible, then (u,v) is bridge.

How can we determine that there is no edge from decedent of v to u or its ancestor? For that we need to maintain time when a node was discovered during depth first search, call is d[].
d[u] is time when node u was discovered using DFS. If d[u] < d[v], means u was discovered before v.
Below is graph with d[] filled for each node.

Now, figure out the lowest d[x] which can be reached from each node. Reason to find that is to see if there is a node x which is reachable from children of v and has d[x] less than u,i.e. x is ancestor of u reachable from children of v. Store lowest DFS ancestor reachable from node u in an array low[u].

So low[u] = min(low[u], low[v])  for edge (u,v)
Idea here is that if (u,v) is an edge, then either there is no back edge from sub tree of v to u or ancestor of u, in that case low[u] will be lowest. If there is a back edge to x from sub tree of v, then minimum d[x] reached by node in sub tree will be low[u].
Below diagram shows calculation of low[] in graph.

Finally, if low[v] > disc [u] that means if discovery time of u is less than least ancestor that can be reached from sub tree of v, we have a bridge, because there is no way we can reach to any ancestor of u once we disconnect edge (u,v).

In above diagram, if we remove edge (3,4), still graph is connected as low[4] = 2 which is less than d[3] which is 3. Hence edge (3,4) is not a bridge.

Lots of theory,lets code it. We will be modifying Depth First Search implementation to keep track of d[] and low[].

## Code for finding bridges in graph

```#define min(a,b) a>b ? b:a

typedef struct node{
int value;
struct node *next;
}Node;

Node *graph[NUM_NODE];

Node *createNode(int j){
Node * newNode = (Node *)malloc(sizeof(Node));
if(newNode){
newNode->value = j;
newNode->next = NULL;
}
else{
printf("\n Node cannot be allocated");
}
return newNode;
}

Node * currentNode = graph[i];
if(!currentNode){
graph[i] = createNode(j);
}
else{
while(currentNode->next){
currentNode = currentNode->next;
}
currentNode->next = createNode(j);
}
}

//This function removes a specific edge from undirected graph
void removeEdge(int start, int end){
printf("Deleting edge (%d, %d)", start, end);
Node * current  = graph[start];
Node * previous = current;
Node * temp = NULL;
if(current->value == end){
graph[start] = current->next;
}
else{
while(current && current->value != end){
previous = current;
current = current->next;
}
previous->next = current->next;
}
free(current);

current = graph[end];
previous = current;
if(current->value == start){
graph[end] = current->next;
}
else{
while(current && current->value != start){
previous = current;
current = current->next;
}
previous->next = current->next;
}
free(current);
}

/*Creating status for finding bridges
in graph after removal of edge */
int DFSUtil(int u, int low[],int d[], int parent[], int *time){
Node * current  = graph[u];
Node * v = NULL;
d[u] = low[u] = ++(*time);
visited[u] = TRUE;

for(v = current; v != NULL; v = v->next){
if(visited[v->value] != TRUE){
parent[v->value] = u;
DFSUtil(v->value,low,d, parent,time);

low[u] = min(low[u], low[v->value]);
}
if(v->value != parent[u])
low[u] = min(low[u], low[v->value]);
}
}

Node * getValidNode(int source, int low[], int d[], int visited[]){
int i =0;
Node * currentNode = graph;

// Find non bridge edge first,
//if not available then take any edge.
while(currentNode){
if(low[currentNode->value] <= d){ return currentNode; }
currentNode = currentNode->next;
}
currentNode = graph;
return currentNode;
}

void dfs(int current, int low[], int d[], int parent[]){
int time = 0;

/* Getting the next edge to be traversed.
It should not be a bridge */

Node *currentNode = getValidNode(current, low, d, visited);
while(currentNode){
int a = currentNode->value;

// Remove the edge which we have traverse from the graph
removeEdge(current, currentNode->value);
for(int i=1; i<=NUM_NODE; i++){
parent[i] = -1;
low[i] = -1;
d[i] = -1;
visited[i] = FALSE;
}

/*After removal of node, find if calculation for
finding if edge is bridge or not. */
DFSUtil(a, low, d, parent, &time);

/* Again go down to the edge which is eligible */
dfs(a, low, d,parent);

/* Find next eligible edge */
temp = getValidNode(current, low, d, visited);
}
}

// Main function to print Euler path
void printEulerPath(Node * graph[]){
int startNode = EulerPathExists(graph);
int parent[NUM_NODE +1];
int low[NUM_NODE +1];
int d[NUM_NODE +1];
int time = 0;

for(int i=1; i<=NUM_NODE; i++){
parent[i] = -1;
low[i] = -1;
d[i] = -1;
visited[i] = FALSE;
}

// Create book keeping to find if edge is a bridge or not.
DFSUtil(startNode, low, d, parent, &time);
dfs(startNode,low, d, parent);
}

int main(){

printEulerPath(graph);

return 0;
}
```

Complexity of finding bridges in graph code is O(V + E) where V is number of vertices and E is number of edges in graph.

# Topological sort of graph

Given a directed graph G (V,E), list all vertices such that for all edges (v,w), v is listed before w. This problem is called as topological sort on graph.

Simple example we can take is that of courses. Let’s say university provides four different courses, A, B, C and D. For taking B, A is prerequisite. Similarly for taking C, A and B are prerequisites. For D, C and B are prerequisite. If created a node for each course and have a directed edge from prerequisite course to other courses, we will get a graph to work with.

If somebody asks us to figure out in which order he/she must take these course, we just do the topological sorting of the graph, it will give us answer. Topological sort order will be A, B, C ,D

### Topological sort algorithm

First we need to figure where to start. What do we understand from above example?
Yes, we should take course first which has no prerequisites. If a course has no prerequisites, then there will be no incoming edges to that course. In graph terminology, node would have zero in-degree.

OK, we got the first node. Now how to proceed. Lets go back to our example again. If we have taken a course, what we can do is remove edges from that course to all other courses for which it was prerequisite. Why? Because if we have taken this course we are eligible to take all courses which have this course as prerequisite. Correction : If this course was only prerequisite. So when we visit a node, what we do is we decrease the in-degree of all adjacent nodes.

Now again search for course, which has no prerequisite and process as above. In graph terminology, search for the node in update graph, which has zero in-degree and repeat above steps till we have no such node which has zero in-degree. If there are still some nodes which are not visited, it will be in case of cycles in graph,  topological sorting is not possible.

Let’s take an example and work it out.

## Topological sort on graphs : Implementation

```#include<stdio.h>
#include<stdlib.h>

#define NUM_NODE 7
#define true 1
#define false 0

#define INFINITE 1000

typedef struct node{
int value;
int wt;
struct node *next;
}Node;

Node *graph[NUM_NODE + 1];
int visited[NUM_NODE + 1];

int findNextNode(int indegree[]){
for( int i=0; i<= NUM_NODE; i++){
if(indegree[i] == 0 && visited[i] == 0){
return i;
}
}
return -1;
}
void topologicalSort(Node * graph[]){

int indegree[NUM_NODE+1];
int i;

Node * current = NULL;
Node * next = NULL;
for(i=1; i<=NUM_NODE; i++){
indegree[i] =0;
}
/* Find indegree of each node. go through all edges.
complexity : O(E) */

for(i=1; i<=NUM_NODE; i++){
current = graph[i];
if(current){
next = current->next;
while(next){
indegree[next->value]++;
next = next->next;
}
}
}

/* We will traverse each node */
for(i=1; i<=NUM_NODE; i++){
/* find a node with zero indegree */
int node  = findNextNode(indegree);
/* If there is no node with zero indegree,
topological sort not possible */
if(node != -1){
printf("%d ", node);
visited[node] = 1;
/* decrease in degree of each node adjacent
to node wih zero indegree */
current = graph[node];
if(current){
next = current->next;
while(next){
indegree[next->value]--;
next = next->next;
}
}
}
else{
printf("\n Topological sort no possible");
}
}
return ;
}

Node * createNode(int j, int wt){

Node * newNode = (Node *)malloc(sizeof(Node));
if(newNode){
newNode->value = j;
newNode->next = NULL;
newNode->wt = wt;
}
else{
printf("\n Node cannot be allocated");
}
return newNode;
}

void addEdge(int i, int j, int wt){

Node * currentNode = graph[i];
if(!currentNode){
graph[i] = createNode(j, wt);
}
else{
while(currentNode->next){
currentNode = currentNode->next;
}
currentNode->next = createNode(j, wt);
}
}

int main(){

for(int i=1; i<=NUM_NODE; i++){
graph[i] = createNode(i,0);
}

topologicalSort(graph);

return 0;
}
```

Complexity of topological sort algorithm is  O(V2 +E). For each vertex we find vertex with zero in-degree, hence the quadratic time.
Let’s think of something which can make this update and search less complex. Whenever we update in-degree of all adjacent node, we can store all vertices for which in-degree becomes zero in a queue. So, we don’t have to separately search for nodes with zero in-degree, we can take front of the queue.

### Optimized implementation of topological sort

```#include<stdio.h>
#include<stdlib.h>

#define NUM_NODE 7
#define true 1
#define false 0

typedef struct node{
int value;
int wt;
struct node *next;
}Node;

Node *graph[NUM_NODE + 1];
int visited[NUM_NODE + 1];

typedef struct Queue{
Node *front;
Node *rear;
}Queue;

int isEmpty(Queue *q){
if(!(q->front || q->rear) ) return true;

return false;
}

void enqueue(Queue *q, int value){

Node * newNode = (Node *)malloc(sizeof(Node));
newNode->value = value;

if(isEmpty(q)){
q->front = newNode;
q->rear  = newNode;
}
else{
if(q->rear){
q->rear->next = newNode;
q->rear 	  = newNode;
}
else{
printf("\n Error ");
}
}
}

int dequeue(Queue *q){
if(!isEmpty(q)){
int data = q->front->value;
Node * currentFront = q->front;
if(q->front == q->rear){
q->rear = NULL;
q->front = NULL;
}
else{
q->front = currentFront->next;
}
free(currentFront);
return data;
}
}
void initializeQueue(Queue **q){
*q = (Queue *)malloc(sizeof(Queue));
(*q)->front = NULL;
(*q)->rear = NULL;
}

void topologicalSort(Node * graph[]){

int indegree[NUM_NODE+1];
int i;

Queue *q = NULL;

initializeQueue(&q);

Node * current = NULL;
Node * next = NULL;
for(i=1; i<=NUM_NODE; i++){
indegree[i] =0;
}
/* Find indegree of each node. go through all edges.
complexity : O(E) */
for(i=1; i<=NUM_NODE; i++){
current = graph[i];
if(current){
next = current->next;
while(next){
indegree[next->value]++;
next = next->next;
}
}
}

for(i=1; i<=NUM_NODE; i++){
if(indegree[i] == 0){
enqueue(q, i);
indegree[i]--;
break;
}
}

while(!isEmpty(q)){
int node  = dequeue(q);
printf("%d ", node);
/* decrease in degree of each node adjacent to node wih zero indegree */
current = graph[node];
if(current){
next = current->next;
while(next){
indegree[next->value]--;
if(indegree[next->value] == 0){
/* If indegree of any node becomes zero, enqueue it in queue. */
enqueue(q,next->value);
indegree[next->value]--;
}
next = next->next;
}
}
}

for(i=1; i<=NUM_NODE; i++){
if(indegree[i] > 0){
printf("\n Topological sort not possible");
}
}
return;
}

Node * createNode(int j, int wt){

Node * newNode = (Node *)malloc(sizeof(Node));
if(newNode){
newNode->value = j;
newNode->next = NULL;
newNode->wt = wt;
}
else{
printf("\n Node cannot be allocated");
}
return newNode;
}

void addEdge(int i, int j, int wt){

Node * currentNode = graph[i];
if(!currentNode){
graph[i] = createNode(j, wt);
}
else{
while(currentNode->next){
currentNode = currentNode->next;
}
currentNode->next = createNode(j, wt);
}
}

int main(){

for(int i=1; i<=NUM_NODE; i++){
graph[i] = createNode(i,0);
}