# Dijkstra’s algorithm to find shortest path

Given a graph, directed or undirected and two nodes source and destination, find shortest path between these nodes using Dijkstra’s algorithm. Mathematically, problem can be stated as

Given a digraph G = (V, E), edge weights ℓe ≥ 0, source s ∈ V, and destination t ∈ V, find the shortest directed path from s to t.

Additional questions which follow are : can we use same algorithm to find shortest path from one source to all destination nodes in graph? Can we find shortest path from each node to each other node in graph using Dijkstra’s shortest path algorithm?
Coming back to original problem, let take an example, in below graph, shortest path between 1 and 4 is 1-3-2-4 with cost of 9.

Other paths are 1-2-4 with cost 11, 1-5-4 with cost 14 and so on.
Shortest path from a single source to a given destination is a classic graph problem. It finds application in routing table generation in routers, optimum path from one point to another in maps, robot navigation, routing of telecommunication messages and various other places.

In maps, there can be multiple routes from one source to given destination, but it highlights the shortest path. Similarly, consider a network where every path from each router has some weight associated with it. We wanted to forward packets on to a route so that  total sum of those weights is least. All these problem are nothing but single source to destination shortest path problems.

There are many algorithms proposed to solve this problem. One of the algorithms is Dijkstra’s algorithmTo apply Dijkstra’s algorithm to any graph two conditions are necessary :

• Graph should be connected.
• And weights on the edges should be non negative.

It does not matter if the graph is directed or undirected.

Important observation on which Dijkstra’s algorithm works is that every sub-path of an optimal path will also be optimal. Intuition is: take each vertex in the increasing order of their distance from source node and step by step create the tree adding one edge at a time.  New edge is added in tree based on the criteria that is creates a shortest path to current vertex.

## Dijkstra’s algorithm

Dijkstra’s algorithm is widely published and is as below.

```1. Initialize distance of all nodes from start node as INFINITE and all nodes as not finalized.
2. Take source node to start with, let's say s.  Distance from source or start node to itself will be zero.
3. Mark u as considered and distance finalized. Now for all neighbor nodes v of it, update the distance if the current distance of v from u is more than distance of  u + weight of (u,v). For example, if the current node A is marked with a distance of 6, and the edge connecting it with a neighbor B has length 2, then the distance to B (through A will be 6 + 2 = 8. If B was previously marked with a distance greater than 8 then change it to 8. Otherwise, keep the current value.{From Wikipedia}
4. Now select a node for which distance is not finalized and distance is minimum till now and make it u.
5.Go to step 3.
```

### Mathematical explanation of Dijkstra’s shortest path algorithm

Distance of vertex v is estimated and stored in d[v] for each vertex v. If there is a path already discovered till v, then d[v] will be equal to less than then final distance between source and node v. If there is no path yet discovered, then d[v] will be infinity.

At beginning, d[s] will be 0 and d[v] will be infinity for all nodes.

How do we proceed forward and find shortest path to destination then? Each node (u) is process and new edge is added into tree. Processing means : for each node v adjacent to u, distance to v is updated (distance from s to u + distance from u to v) if that is less than d[v]

`d[v]  = d[u] + w(u,v) if d[u] is greater than d[u] + w(u,v)`

If new d[v] is less than earlier d[v] stored then update d[v]. This step is also called as relaxation of edge. In programming terms relaxation function is :

```void relax(){
if(d[v] > d[u] + w(u,v)){
d[v]  = d[u] + w(u,v);
pred[v] = u;
}
}
```

Vertex of graph are divided into two sets; set S is set of all the vertices whose real distance from source is known and there is another set VS whose real distance is still not known. Initially S ={} and all vertices belong to other set VS. As  and when node is processed, it moves from VS to S.

What is the best method to find the next vertex to process? Dijkstra algorithm uses greedy strategy to select the next vertex to process. The next vertex will be u such that d[u] is minimum.

Let’s work out an example and see how shortest path algorithm works using greedy strategy.

Execution of above algorithm on this graph will be

To find path which was followed to reach destination from source, we can have an auxiliary array to keep track of parent node of current node whenever distance to current node is updated. By reverse tracking parents from destination to source, we can find out shortest path.

## Implementation of Dijkstra’s algorithm

```#include<stdio.h>
#include<stdlib.h>
#include<string.h>

#define NUM_NODE 7
#define NUM_WORDS 10
#define NUM_CHAR 4
#define true 1
#define false 0

#define INFINITE 1000

typedef struct node{
int value;
int wt;
struct node *next;
}Node;

Node *graph[NUM_NODE + 1];

int findMinimumNode(int visited[], int dist[]){
int min = INFINITE;
int index = -1;
int i;
for(i=1; i<= NUM_NODE; i++){
if(visited[i] == false && min > dist[i]){
min = dist[i];
index = i;
}
}
return index;
}

void dijstras(Node * graph[], int start, int end ){
int i;
int parent[NUM_NODE +1];
int distance[NUM_NODE+1];
int visited[NUM_NODE+1];

for(i=1; i<=NUM_NODE; i++){
visited[i] = false;
distance[i] = INFINITE;
parent[i] = -1;
}
// Mark distance of start as 0.
distance[start] = 0;

for(i=1; i<=NUM_NODE; i++){
int index  = find_minimum_node(visited, distance);
if(index != -1){
Node * node = graph[index];
Node * current  = node->next;

while(current){
/*If neighbor node is not visited and
its current distance is more than distance
of current node + cost of edge between
current node and this node, update the distance */

if(visited[current->value] == false
&& distance[current->value] >
distance[node->value] + current->wt){

distance[current->value] = distance[node->value]
+ current->wt;
parent[current->value] = node->value;
}
current = current->next;
}

visited[node->value] = true;
if(node->value == end)
break;
}
else{
break;
}
}

printf("\nDistance between %d and %d : %d",
start , end, distance[end]);

/* Printing path in reverse order,using stack,
we can print it normal order to.*/

printf("\nPath is  (In reverse order): ");
int currentParent =0;
while(currentParent != -1){
printf("%d ", end );
currentParent = parent[end];
end = currentParent;
}
printf("\n");
}

Node *createNode(int j, int wt){

Node * newNode = (Node *)malloc(sizeof(Node));
if(newNode){
newNode->value = j;
newNode->next = NULL;
newNode->wt = wt;
}
else{
printf("\n Node cannot be allocated");
}
return newNode;
}

void addEdge(int i, int j, int wt){

Node * currentNode = graph[i];
if(!currentNode){
graph[i] = createNode(j, wt);
}
else{
while(currentNode->next){
currentNode = currentNode->next;
}
currentNode->next = createNode(j, wt);
}
}

//driver program
int main(){

int i,j;
for(i=1; i<=NUM_NODE; i++){
graph[i] = createNode(i,0);
}
// creating graph with weighted edges.

dijstras(graph, 1, 6);

return 0;
}
```

Complexity of  Dijkstra’s algorithm to find shortest path in a given directed graph is O(V2) where V is number of vertices in graph. This can be reduced to O(E log V) by using heaps. Heaps will reduce complexity of searching minimum weight cost from O(V) to O(log V).

Implementation provided above can be optimized using priority queues. Use priority queue to store all nodes which are still not finalized. To start with put all nodes on to priority queue with key as d[v] and take out the minimum node, which will be source as d[s] =0 and rest all are infinity.
Implementation using priority queue can be done as

```1. Create an empty priority queue.
2. For each node v ≠ s : d(v) ← ∞; d(s) ← 0.
3. For each node v ∈ V : insert v with key d(v) into priority queue.
4. while priority queue is not empty
4.1 u ← delete-min from priority queue.
4.2 For each edge (u, v) ∈ E leaving u:
4.2.1 If d(v) > d(u) + w(u, v)
decrease-key of v to d(u) + w(u,   v) in priority queue.
d(v) ← d(u) + w(u, v).```

It is very easy to implement this algorithm in any high level language like Python which provides ready to use implementation of priority queues.

Limitation of algorithm
1. It does not work with negative weights.
Web resources
Wikipedia
Animation

## Word ladder algorithm using graphs

To understand word ladder algorithm, let’s see problem in detail which can be solved using this algorithm first. Given two words A and B and a dictionary of words, task is to transform first word into second word by changing one character at a time. Character should be changed in such a way that new word should be in dictionary, meaning it should be a valid word. Find minimum number of such hops from one word to reach second word. This is called as word ladder problem. For example, lets say given word are  CAT and DOG

Sequence of steps will be CAT –> COT–>COG–>DOG, hence three substitutions and we are done.
To convert COLD to WARM (from Wikipedia)
Sequence of steps will be COLD–>CORD–>CARD–>WARD–>WARM.

There are two strings/words given as input. And hoping can be done only if there is one condition : the destination of hop should differ at most one character with source. So if two words differ only in one character, we have a relationship between them.

So if take those two words as two nodes and connect them only if they differ by one character, we get a graph with two nodes and one edge.
Apply same principle to all words in dictionary and create a graph, nodes representing words and edges the relationship.
Once we have created graph as per above conditions, start breadth first traversing from the node which has values as first word and go on scanning till we get a node with value as second word.

That’s all. Simple analysis! Implementation is bit tricky. Let’s look at implementation.
For each word, if we go an scan every word in dictionary to figure out if they differ by one character in order to create edge will be too inefficient with complexity of O(N2).

If we create buckets of words with one character of that replaced with _ and put all similar words into that bucket, we will segregate words with one character different. Let’s take an example.
Dictionary given is  {“CAT”, “BAT”, “COT”, “COG”, “COW”, “RAT”, “BUT”,”CUT”, “DOG”, “WEB”}
Now for word CAT, there are three buckets possible.

Now take word BAT, if we replace B with _ it becomes _AT, for which we already have a bucket, so we can put CAT and BAT in same bucket.
Similarly when we replace second character of COT, it becomes C_T, for which we already have a bucket, hence CAT and COT are placed in once bucket.

With above given words buckets will be like below.

Now connect all words in same buckets with each other.Final graph will be like below:

```#include<stdio.h>
#include<stdlib.h>
#include<string.h>

#define NUM_WORDS 10
#define NUM_CHAR 4
#define true 1
#define false 0

typedef struct node_s{
char value[NUM_CHAR];
int wt;
struct node_s *next;
}Node;

Node *graph[NUM_WORDS + 1];
int visited[NUM_WORDS + 1];

Node * createNode(char *str){

Node * newNode = (Node *)malloc(sizeof(Node));
if(newNode){
strcpy(newNode->value, str);
newNode->next = NULL;
}
else{
printf("\n Node cannot be allocated");
}
return newNode;
}

Node * currentNode = graph[i];
if(!currentNode){
graph[i] = createNode(str);
}
else{
while(currentNode->next){
currentNode = currentNode->next;
}
currentNode->next = createNode(str);
}
}

/* Check if there is already exisitng bucket */
int searchBuckets(char buckets[NUM_WORDS * NUM_CHAR][NUM_CHAR],
int count, char *str){

for(int i=0; i<count; i++){
if(!strcmp(buckets[i], str))
return i;
}
return -1;
}

/* This adds string to a bucket */
void  addString(Node *preGraph[], int index, char * str){

Node * currentNode = preGraph[index];

if(!currentNode){
preGraph[index] = createNode(str);
}
else{
while(currentNode->next){
currentNode  = currentNode->next;
}
currentNode->next = createNode(str);
}
}

/* This function finds the index of the word in dictionary
Inefficient, can be done using binary search as words are in sorted
order in dictionary */

int findIndex(char dict[NUM_WORDS][NUM_CHAR], char * str){
int i;
for(i=0; i<NUM_WORDS; i++){
if(!strcmp(dict[i], str))
return i;
}
return -1;
}

/* This function links all words in buckets to create edges of graph */

void createGraph(Node *preGraph[NUM_WORDS * NUM_CHAR],
char dict[NUM_WORDS][NUM_CHAR], int count){

int index, indexA;
for(int i=0; i<count; i++){
Node * current =  preGraph[i];
while(current){
index  = findIndex(dict, current->value);
if(!graph[index]){
}

Node *next = current->next;
while(next){
indexA  = findIndex(dict, next->value);
if(!graph[indexA])
next = next->next;
}
current = current->next;
}
}
}

/*This function creates buckets and using those buckets also
creates graph where all neighbor nodes differ by at most one
character */
void createBuckets(char buckets[NUM_WORDS * NUM_CHAR][NUM_CHAR],
char dict[NUM_WORDS][NUM_CHAR]){

int count = 0;
Node * preGraph[NUM_WORDS * NUM_CHAR];

for(int i=0; i<NUM_WORDS * NUM_CHAR; i++){
preGraph[i] = NULL;
}
for(int i=0; i<NUM_WORDS; i++){
for(int j = 0; j<NUM_CHAR-1; j++ ){
char str[NUM_CHAR];
strcpy(str, dict[i]);
str[j] = '_';

int index = searchBuckets(buckets, count, str);
if(index != -1){
}
else{
strcpy(buckets[count], str);
count++;
}
}
}
createGraph(preGraph, dict, count);
for(int i=0; i<NUM_WORDS; i++){
printf("Nodes adjacent to %d : ", i);

Node * current  = graph[i];
while(current){
printf("%s ", current->value);
current = current->next;
}
printf("\n");
}
}

typedef struct queue{
Node *front;
Node *rear;
}Queue;

int isEmpty(Queue *q){
if(!q->front && !q->rear) return true;
return false;
}

void enqueue(Queue *q, char value[]){
Node * newNode = (Node *)malloc(sizeof(Node));
strcpy(newNode->value, value);
if(isEmpty(q)){
q->front = newNode;
q->rear = newNode;
}
else{
if(q->rear){
q->rear->next = newNode;
q->rear = newNode;
}
else{
printf("\n Error ");
}
}
}

void dequeue(Queue *q, char str[]){
if(!isEmpty(q)){
strcpy(str, q->front->value);
Node * curr = q->front;

if(q->front == q->rear){
q->rear = NULL;
q->front = NULL;
}
else{
q->front = curr->next;
}
free(curr);
}
}
void initializeQueue(Queue **q){
*q = (Queue *)malloc(sizeof(Queue));
(*q)->front = NULL;
(*q)->rear = NULL;
}

void bfs(Node *current, char *second,
char dict[NUM_WORDS][NUM_CHAR]){

Queue *q = NULL;
initializeQueue(&q);

char str[NUM_CHAR];
int index ;

if(!current) return;

enqueue(q, current->value);

while(!isEmpty(q)){
dequeue(q, str);
printf("\n%s ", str);
index = findIndex(dict, str);
visited[index] = true;

if(!strcmp(str, second)){
return;
}
Node *current = graph[index]->next;
while(current){
index = findIndex(dict, current->value);
if(visited[index] == false){
visited[index] = true;
enqueue(q, current->value);
}
current = current->next;
}
}
}

//Driver program
int main(){
char buckets[NUM_WORDS * NUM_CHAR][NUM_CHAR];
char dict[NUM_WORDS][NUM_CHAR] = {"CAT", "BAT", "COT", "COG", "COW",
"RAT", "BUT","CUT", "DOG", "WEB"};

createBuckets(buckets, dict);

bfs(graph[0], "CUT", dict);
return 0;
}
```

# Minimum Spanning Tree: Prim’s Algorithm

We discussed in previous post, we discussed Kruskal algorithm to find minimum spanning tree in a given graph. Please refer Minimum Spanning Tree with Kruskal Algorithm.
for details.

Second algorithm to find minimum spanning tree is Prim’s algorithm.

Basic idea in Prim’s algorithm is to have two sets of nodes of graph, one which are already included into minimum spanning tree and other which are not included yet. Now find an edge with least weight which connects any node or vertex in first set to any vertex in second set. Such an edge is commonly called as cut.

Let two sets be A and V where set A contains all vertices of graph which are included in MST and V contains all vertices which are not included in MST yet. Let S be set of all vertices of a graph. Relationship between these three would be S = AV. To start with A will be empty and V will be equal to S.

Start with one vertex and add it to A. Also, cost of all nodes initially is INFINITE. Once we add the vertex to A, update the cost of all neighbor vertices based on cost to reach vertex added in A and cost of edge between this vertex and neighboring vertex. Remove vertex from V.

Once update is done, greedy strategy comes into effect. Select the edge which has minimum cost and connects one vertex in A and other in V, more simply said does not form a cycle. (If both the ends of an edge are in same component of the graph there are bound form a cycle. Details you can find here: Find cycles in graph )

## Minimum spanning tree : Prim’s algorithm

Let’s see prim’s algorithm and then we will see how can we implement it.

```1. Initialize cost of each vertex with INFINITE. Except from the first vertex where we will start.
2. For each neighbor, if cost of reaching neighbor node from current vertex is less than current cost, update the cost of neighboring node.
3. Select the minimum cost neighbor vertex among these vertex and see if edge forms a cycle.
4. If it forms a cycle take another vertex till we considered every neighboring vertices.
5.If it does not form cycle, add it to MST and go to step 2.
6.Repeat all above steps till all the vertices are not included in tree.
```

Let’s see how algorithm work.

Let’s see minimum spanning tree algorithm’s implementation.
First, we need to do is to keep track of two sets. Since both sets are mutually exclusive, keeping track of one set will automatically will give the other. Let’s keep track of each node in MST. Create an array of nodes of graph and every time node is added in MST, mark it as added in array. While considering any edge in future, just check if MST[i] is true, if it is then node is already added in MST and should not be added again.

Second, keep track of distance of each node. Initially all nodes are at INFINITE distance. As we traverse graph, update each node’s distance from current node. Keep an array of integers initialized with INFINITE and every time we add a node in MST, update distance of neighboring nodes based on cost of edge from current node to them.

Last step is to find the minimum cost edge between two sets. We can scan through the distance array which we discussed in above and get a node which has least cost and not already in MST (using the array values in discussed in paragraph 1)
Finding minimum can be implemented using min heap or priority queues, that would reduce the complexity of overall algorithm.

## Minimum spanning tree : Prim’s algorithm implementation

```#include<stdio.h>
#include<stdlib.h>

#define NUM_NODE 6
#define MAX_VALUE 999
#define true 1
#define false 0

//Definition of node in adjacency list formation
typedef struct node{
int value;
int wt;
struct node *next;
}Node;

//Edge definition for sorting and storing minimum spanning tree
typedef struct edge{
int start;
int end;
int wt;
}Edge;

Node *graph[NUM_NODE + 1];

//This function creates a node in adjancency list representation with weight
Node * createNode(int j, int wt){

Node *newNode = (Node *)malloc(sizeof(Node));
if(newNode){
newNode->value = j;
newNode->next = NULL;
newNode->wt = wt;
}
else{
printf("\n Node cannot be allocated");
}
return newNode;
}
void addEdge(int i, int j, int wt){

Node *currentNode = graph[i];
if(currentNode == NULL){
graph[i] = createNode(j, wt);
}
else{
while(currentNode->next){
currentNode = currentNode->next;
}
currentNode->next = createNode(j, wt);
}
}

void updateWeights(int currentVertex, int weights[],
int predecessors[], int visited[]){

Node *currentNode = graph[currentVertex];
if(currentNode){
//update distance for all neighbor nodes
Node * itr = currentNode->next;
while(itr){
if(weights[itr->value] > itr->wt
&& visited[itr->value] == false ){
weights[itr->value] =  itr->wt;
predecessors[itr->value] = currentVertex;
}
itr = itr->next;
}
}
}

int getMinimumVertex( int weights[], int visited[], int numNodes ){
int currentMinWeight = MAX_VALUE;
int currentMinVertex = 1;

for( int i = 2; i<=numNodes; i++ ){
if( !visited[i] && weights[i] < currentMinWeight  ){
currentMinWeight = weights[i];
currentMinVertex = i;
}
}

return currentMinVertex;
}

//Real implementation of Prim's algorithm
int primsMSTAlgorithm(Node * graph[], int numNodes){

//Step 1 : Initialize all arrays
int weights[numNodes+1];
int visited[numNodes+1];
int predecessors[numNodes+1];

for( int i=1; i<=numNodes; i++ ){
weights[i] = MAX_VALUE;
}
for( int i=1; i<=numNodes; i++ ){
visited[i] = 0;
}
for( int i=1; i<=numNodes; i++ ){
predecessors[i] = MAX_VALUE;
}

weights[1] = 0;
int totalCost = 0;

//Step 2
for( int i=0; i <=numNodes; i++ ){
//Greedy strategy
int closestVertex      = getMinimumVertex( weights, visited, numNodes );
totalCost+= weights[closestVertex];
visited[closestVertex] = true;
updateWeights( closestVertex, weights, predecessors, visited  );
}

}

//Driver program
int main(){

int i;
for(i=1; i<=NUM_NODE; i++){
graph[i] = createNode(i,0);
}

int totalCost = primsMSTAlgorithm(graph, NUM_NODE);
printf("Cost of minimum spanning tree : %d", totalCost);
return 0;
}
```

Complexity of Prim’s algorithm to find minimum spanning tree in graph is O(V2).

# Minimum Spanning Tree with Kruskal Algorithm

Before , let us first understand what is a spanning tree. Spanning tree is subset of graph G which contains all nodes of graph and edges which makes graph minimally connected. By minimal connection, we mean if any edge of that graph is removed, graph will be disconnected. Also, it is maximally cyclic, means if add extra edge to subset it will create a cycle. There are n-1 edges in spanning tree of a graph with n nodes.

Below figure depicts a spanning tree of graph in thick edges.

So, what is minimum spanning tree?
Given a connected graph G = (V, E) with edge weights wi, an Minimum Spanning Tree is a subset of the edges T ⊆ E such that T is a spanning tree whose sum of edge weights is minimized.

Now, coming to over problem:  Given a weighted connected undirected graph, find minimum spanning tree in that graph.

Let’s first understand what is spanning tree. Spanning tree is tree which connects each vertex of the graph. When graph is weighted i.e each edge of the graph has some weight to move from one node to another, spanning tree with minimum cost is called as minimum spanning tree. If all the edges contain distinct weights, there will be unique minimum spanning tree for that graph, however, if two edges can have same weight then there can be more than one minimum spanning tree for given tree.
For graph shown below, second figure shows spanning trees for that graph.

### Applications of minimum spanning trees

1 . Broadcast tree preparation for broadcasting in internet. It is implemented using spanning tree protocol.
2.  Preparing telecommunication networks, water supply networks etc.

There are two widely used algorithms to find minimum spanning tree for a graph, both based on greedy algorithm. Basic idea of greedy algorithm is that at every step we chose the option which gives us maximum profit at that step without worrying about future steps.

Let’s see these greedy algorithms.

## 1. Kruskal’s algorithm for minimum spanning tree

Basic idea is that we take each edge one by one in increasing order of weight. For each edge check if it makes a cycle in existing tree. If it does not add it to minimum spanning tree formed till now. If it forms a cycle discard the edge. Repeat above steps till all nodes are added in tree. Resulting tree will be minimum spanning tree.

```1. Sort all edges based on weights.
3. Add it to T. For each edge in graph, repeat following steps.
3.1 If current edge forms a cycle, discard the edge.
3.2 If current edge does not form a cycle, add it to T.
```
```#include<stdio.h>
#include<stdlib.h>

#define NUM_NODE 5
#define true 1
#define false 0

//Definition of node in adjacency list formation
typedef struct node{
int value;
int wt;
struct node *next;
}Node;

//Edge definition for sorting and storing minimum spanning tree
typedef struct edge{
int start;
int end;
int wt;
}Edge;

Node *graph[NUM_NODE + 1];

//This function creates a node in adjancency list representation with weight
Node * createNode(int j, int wt){

Node * new_node = (Node *)malloc(sizeof(Node));
if(new_node){
new_node->value = j;
new_node->next = NULL;
new_node->wt = wt;
}
else{
printf("\n Node cannot be allocated");
}
return new_node;
}
void addEdge(int i, int j, int wt){

Node * currentNode = graph[i];
if(!currentNode){
graph[i] = createNode(j, wt);
}
else{
while(currentNode->next){
currentNode = currentNode->next;
}
currentNode->next = createNode(j, wt);
}
}

//Below functions are used for Union and Find method on disjoint set
int find(int rep[], int value){
if(rep[value] == -1 || rep[value] == value)
return value;
else find(rep, rep[value]);
}

void unionSet(int rep[], int x, int y){

int xroot = find(rep, x);
int yroot = find(rep, y);

rep[xroot] = yroot;
}

void makeSet(int rep[]){
int i;
for(i=0; i<= NUM_NODE; i++){
rep[i] = -1;
}
}
//This copies all edges to array so as to sort them on weight
void copyEdges(Edge edges[], Node *graph[]){
int i, count =0;

for(i=1; i<= NUM_NODE; i++){
Node * u = graph[i];
if(u){
Node * v = u->next;
while(v){
edges[count].start =  u->value;
edges[count].end =v->value;
edges[count].wt = v->wt;
count++;
v = v->next;
}
}
}
}

//Implementation of quick sort
void swap(Edge edges[], int i, int j){
int temp_start = edges[i].start;
int temp_end = edges[i].end;
int wt = edges[i].wt;

edges[i].start  = edges[j].start ;
edges[i].end  = edges[j].end ;
edges[i].wt  = edges[j].wt;

edges[j].start  = temp_start;
edges[j].end  = temp_end ;
edges[j].wt  = wt;
}
int partition(Edge edges[], int start, int nEdges){
int i,j;

int pivot =  edges[start].wt;
i = start+1;
j = nEdges;
while(i<=j){
while(i<=nEdges && edges[i].wt < pivot)
i++;
while(j>start && edges[j].wt >= pivot)
j--;

if(i<j){
swap(edges, i, j);
}
}
if(j>start)
swap(edges,start, j);

return j;
}

void quicksort(Edge edges[], int start, int nEdges){

if(start >=nEdges) return;
int pivot = partition(edges, start, nEdges);

quicksort(edges, start, pivot);
quicksort(edges, pivot+1, nEdges);

}

//Real implementation of krutskal algorithm
void kruskal(Node * graph[], int num_edges){

int i, count =0, wt=0;

Edge edges[num_edges];

Edge mst[num_edges];

copyEdges(edges, graph);

quicksort(edges, 0, num_edges-1);

int rep[NUM_NODE + 1];

makeSet(rep);
for(i=0; i<num_edges; i++){
//Check if edge creates cycle in the tree, If not add that to tree
if(find(rep, edges[i].start) != find(rep, edges[i].end)){
unionSet(rep, edges[i].start, edges[i].end);
mst[count++] = edges[i];
wt += edges[i].wt;
}
}
// Print the minimum spanning tree
for(i=0; i<count; i++){
printf("(%d, %d)\n", mst[i].start, mst[i].end);
}
}
//Driver program
int main(){

int i;
for(i=1; i<=NUM_NODE; i++){
graph[i] = createNode(i,0);
}

kruskal(graph, 5);

return 0;
}

```

Complexity of kruskal algorithm to find minimum spanning tree is dominated by sorting which takes O(E log E) where E is number of edges. To detect whether edge forms a cycle or not, we can use Union and Find method.
Prim’s Algorithm is discussed here :  Graphs : Minimum Spanning Tree Prim’s Algorithm

# Connected components in graph

Given an undirected graph, find out number of connected components in it. Connected components in undirected graph are subgraphs where each node is connected to other node in subgraph and not connected to any other vertex in graph. Every graph has at least one connected component that is graph itself. For example in below graph, there are two connected components {1,2,3,4} and {5, 6}.

## Algorithms to find connected components

There are two methods to find connected components in undirected graph. First is to do depth first traversal of graph. If we start recursion afresh after terminating previous recursion for any starting node, then it means current node is not connected to any of the nodes traversed with recursion previously. Every time, this case happens, increment counter. To keep track of nodes in each connected component, color components of graph with different colors. To check if two nodes belong to same connected component, check if color is same.

## Find connected components in graph implementation

```#include<stdio.h>
#include<stdlib.h>

#define NUM_NODE 7
#define true 1
#define false 0

typedef struct node{
int value;
int wt;
struct node *next;
}Node;

Node *graph[NUM_NODE + 1];
int visited[NUM_NODE + 1];

void dfs(Node * current){
if(!current) return;
visited[current->value] = true;
Node *temp = current->next;
while(temp){
if(visited[temp->value] != true){
dfs(graph[temp->value]);
}
temp = temp->next;
}
}

int connected_component(Node * graph[]){
int i, count =0;
Node * current = NULL;
for(i=0; i< NUM_NODE; i++){
current  = graph[i];
if(current && visited[current->value] == false){
dfs(current);
count++;
}
}
return count;
}

Node * createNode(int j){

Node * newNode = (Node *)malloc(sizeof(Node));
if(newNode){
newNode->value = j;
newNode->next = NULL;
}
else{
printf("\n Node cannot be allocated");
}
return newNode;
}

Node * currentNode = graph[i];
if(!currentNode){
graph[i] = createNode(j);
}
else{
while(currentNode->next){
currentNode = currentNode->next;
}
currentNode->next = createNode(j);
}
}

//Driver program
int main(){

for(int i=1; i<=NUM_NODE; i++){
graph[i] = createNode(i);
}

for(int i=1; i<=NUM_NODE; i++){
visited[i] = false;
}

int count = connected_component(graph);
printf("\n Number of connected component : %d", count);

return 0;
}
```

Complexity of above code will be complexity of depth first search O(V+E) with extra space of O(N) to store color of connected components.

## Connected components : Union and Find method

Second method to undirected graph is to use Union and Find method as discussed in previous post :Graphs : Detecting cycle in undirected graph

Go through all nodes of graph and create disjoint sets. After that, check how many disjoint sets are formed, that gives us number of connected components. To check if two nodes are connected, see if both nodes are in same disjoint set.

### Find connected components in graph using Union and Find method Implementation

```#include<stdio.h>
#include<stdlib.h>

#define NUM_NODE 6
#define true 1
#define false 0

typedef struct node{
int value;
struct node *next;
}Node;

Node *graph[NUM_NODE + 1];

int find(int rep[], int value){
if(rep[value] == -1 || rep[value] == value)
return value;
else find(rep, rep[value]);
}

void unionSet(int rep[], int x, int y){
int xroot = find(rep, x);
int yroot = find(rep, y);

rep[xroot] = yroot;
}

void makeSet(int rep[]){
for(int i=1; i<= NUM_NODE; i++){
rep[i] = -1;
}
}

int connectedComponent(Node * graph[]){

int rep[NUM_NODE + 1] ;
int count = 1;

makeSet(rep);
for(int i=1; i<= NUM_NODE; i++){
Node * u =  graph[i];
if(u){
Node *v = u->next;

while(v){
int x = find(rep, u->value);
int y = find(rep, v->value);

unionSet(rep, x,y);
v = v->next;

}
}
}

for(int i=1; i<= NUM_NODE; i++){
if(rep[i] == -1 ){
count++;
}
}
return count;
}

Node * createNode(int j){

Node * newNode = (Node *)malloc(sizeof(Node));
if(newNode){
newNode->value = j;
newNode->next = NULL;
}
else{
printf("\n Node cannot be allocated");
}
return newNode;
}

Node * currentNode = graph[i];
if(!currentNode){
graph[i] = createNode(j);
}
else{
while(currentNode->next){
currentNode = currentNode->next;
}
currentNode->next = createNode(j);
}
}
//Driver program
int main(){

for(int i=1; i<=NUM_NODE; i++){
graph[i] = createNode(i);
}

int count = connectedComponent(graph);
printf("\n Number of connected components in graph : %d", count);

return 0;
}
```

Complexity of Union and find method is again O(V+E) with extra space to store representative element of each disjoint set.

# Detect cycle in undirected graph

Given an undirected graph, check if there is a cycle in undirected graph.

For example below graph has cycles as 3->6->5->3, 1->4->3->6->2->1

Simple definition of a cycle in undirected graph would be : If while traversing graph, we reach at a node which we have already traversed to reach current node, then there is a cycle in graph.

How can we detect the above condition? It’s simple application of Depth First Search.
Do depth first traversal of a graph, if we detect any back edge while traversing, then there is a cycle. Back edge in a graph is an edge which is from a decedent to ancestor. For example in above graph, edge from 2 to 1 is a back edge.

While depth first traversing a graph, keep track of the nodes which are on the stack at present. If there is an edge from current node to any one of the node which is on recursion stack, then there is a cycle in graph.

## Cycle in undirected graph

```#include<stdio.h>
#include<stdlib.h>

#define NUM_NODE 7
#define true 1
#define false 0

typedef struct node{
int value;
int wt;
struct node *next;
}Node;

Node *graph[NUM_NODE + 1];
int visited[NUM_NODE + 1];

int onStack[NUM_NODE + 1];

void dfs(Node * current){
if(!current) return;
onStack[current->value] = true;
visited[current->value] = true;
Node *temp = current->next;
while(temp){
if(onStack[temp->value] == true){
printf("\n Cycle detected at node %d", temp->value);
}
else if(visited[temp->value] != true){
dfs(graph[temp->value]);
}
temp = temp->next;
}
onStack[current->value] = false;
}

Node * createNode(int j){

Node * newNode = (Node *)malloc(sizeof(Node));
if(newNode){
newNode->value = j;
newNode->next = NULL;
}
else{
printf("\n Node cannot be allocated");
}
return newNode;
}

Node * currentNode = graph[i];
if(!currentNode){
graph[i] = createNode(j);
}
else{
while(currentNode->next){
currentNode = currentNode->next;
}
currentNode->next = createNode(j);
}
}
//Driver program
int main(){

for(int i=1; i<=NUM_NODE; i++){
graph[i] = createNode(i);
}

for(int i=1; i<=NUM_NODE; i++){
visited[i] = false;
}
dfs(graph[1]);

return 0;
}

```

Complexity of algorithm to find cycle in graph code is equal to complexity of DFS which is O(V+E) where V is number of vertices and E is number of edges.

## Detecting cycles in graph using disjoint sets

Another method to detect cycle in an undirected graph is Union-Find method. To understand that method, we need to first understand a data structure called disjoint-set data structure.

It is a data structure, which keeps track of all the sets of elements partitioned in disjoint subset. Subsets are said to be disjoint if there is intersection between them is NULL. For example set {1,2,3 } and {4,5,6 } are disjoint sets, but {1,2,3 } and {1,3,5} are not.

Another important thing about disjoint set is that every set is represented by a member of that set called as representative.
There are some operations which are useful on this data structure:
1. Make Set : creates a new set with one element {x}

2. Union : Merges two sets containing X and Y and destroys the original sets.
3. Find : Returns the representative of the set which element belongs to.

Let’s take an example and see how it can be used to detect cycle in graph.
Principle here is that if two nodes forming an edge are in same disjoint set, then there is a cycle in graph.

Make set will create a set with each node with representative element as -1, which means there is only one element in each disjoint subset. Now one by one we check each edge and update our data structure accordingly. For edge 0-1, both are in different subsets, we need to merge them together. So sets become {0,1} and {2}. Representative of set {0,1} be {1}. Now consider another edge, 1-2. Node 1 belongs to set represented by 1 and 2 belongs to set represented by -1. Both are different subsets, we will merge them together. So set becomes {0,1,2} represented by 2.

Now take edge 0-2, where 0 and 2 are in set which is represented by 2, hence there is a cycle in the graph.

How can we implement it? There are two ways to do so. One is to have a linked list of all elements of a set for each set. Each element in turn points to the representative element. So find operation will be O(1), however union or merge operation will be O(N) as we need to modify representative element in each element of the list.

Other method is to use array. This auxiliary array stores the parent of node i at array[i]. Initially all elements in array are -1. Now whenever we merge two set, we update the corresponding index with representative element.

Find operation

`functionFind(x)`
```
if x.parent == -1
return x
else
return Find(x.parent)```

Union Operation

```
function Union(x, y)
xRoot := Find(x)
yRoot := Find(y)
xRoot.parent := yRoot
```

Make set operation

```function MakeSet(x)
x.parent := -1```

## Detect cycle in undirected graph implementation with disjoint sets

```#include<stdio.h>
#include<stdlib.h>

#define NUM_NODE 7
#define true 1
#define false 0

typedef struct node{
int value;
int wt;
struct node *next;
}Node;

Node *graph[NUM_NODE + 1];
int visited[NUM_NODE + 1];

int onStack[NUM_NODE + 1];

#define NUM_NODDE 6
#define true 1
#define false 0

int find(int rep[], int value){
if(rep[value] == -1)
return value;
else find(rep, rep[value]);
}

void unionSet(int rep[], int x, int y){
int xroot = find(rep, x);
int yroot = find(rep, y);

rep[xroot] = yroot;
}

void makeSet(int rep[]){
for(int i=0; i<= NUM_NODE; i++){
rep[i] = -1;
}
}

int detectCycleInGraph(Node * graph[]){
int rep[NUM_NODE + 1] ;
int i;

makeSet(rep);
for(i=1; i<= NUM_NODE; i++){
Node * u =  graph[i];
if(u){
Node *v = u->next;
while(v){
int x = find(rep, u->value);
int y = find(rep, v->value);

if(x == y) return true;

unionSet(rep, x,y);
v = v->next;
}
}
}
return 0;
}

Node * createNode(int j){

Node * newNode = (Node *)malloc(sizeof(Node));
if(newNode){
newNode->value = j;
newNode->next = NULL;
}
else{
printf("\n Node cannot be allocated");
}
return newNode;
}

Node * currentNode = graph[i];
if(!currentNode){
graph[i] = createNode(j);
}
else{
while(currentNode->next){
currentNode = currentNode->next;
}
currentNode->next = createNode(j);
}
}
//Driver program
int main(){

for(int i=1; i<=NUM_NODE; i++){
graph[i] = createNode(i);
}

if(detectCycleInGraph(graph)){
printf("\n Cycle detected ");
}

return 0;
}
```

## Graph representation : adjacency list and matrix

A graph is collection of edges E [e1,e2,e3…..en] and vertices V = [v1,v2,v3…vn]. Mathematically, graph is represented as  G = (V,E). An edge is pair of vertices (vi, vj) where i,j <= N. There are many applications of graph in computer science, like representing connection of roads, flight paths, internet etc. In this post, we will learn graph representation in computer programs.Before that, let’s see some terminology.

Undirected and directed graphs
A graph can be undirected or directed graph. A directed graph is where edge is only one way from one vertex to another, whereas undirected graph has two way edges; that is, there is no arrow head at the end of edge.

Number of edges
A directed graph can have at most N2 edges, one of each pair, while undirected graph will have C(N+1,2) edges. Sparse graph is a graph where number of edges is less than N2 and C(N+1,2) for directed and indirected graph respectively.

Paths
Path in a graph is a sequence of vertices [v1,v2….Vk] such that there exists an edge between consecutive vertices. Formally, v1,v2,…vk is a path if (vi, vi+1) belongs to edges.

In-degree and out-degree
In-degree of a vertex is number of edges which end on vertex. Out-degree is number of edges going out of vertex. Let’s see how this terminology applies for below graph.

```Vertex list v = {1,2,3,4,5,6}
Edge list e = {(2,1)(2,6)(1,4)(4,3)(3,5)(3,6)(5,6)}
Path p = {1,4,3,5,6}
In degree of 3 is 1, out degree of 3 is 2```

In adjacency matrix representation, two dimensional matrix is used to represent graph. Matrix(i,j) is set to true if there is an edge between vertex i and vertex j.

Advantage of adjacency matrix representation is that it is very simple to implement. However, when number of edges between vertices are sparse, lot of cells remain vacant in matrix which lead to wastage of space.
Complexity of graph algorithms is measured in terms of E and V where E is number of edges and V is number of vertices.
In adjacency matrix representation, memory used to represent graph is O(V2).

There are some optimization which are done to save space. For example, if graph is undirected, an edge between (u,v) is also an edge between (v,u). In that case, transpose of adjacency matrix is same as original. We can save half of space while representing an undirected graph using adjacency matrix. Moreover, if there are no weighted edges in graph, instead of using one byte per edge, use only one bit to indicate edge.

```#include<stdio.h>
#include<stdlib.h>

#define NUM_NODE 6
#define true 1
#define false 0

}

int main(){

for(int i=0; i<=NUM_NODE; i++){
for(int j=0; j<=NUM_NODE; j++){
}
}

for(int i=1; i<=NUM_NODE; i++){
printf("Nodes adjacent to %d : ", i);
for(int j=1; j<=NUM_NODE; j++){
printf("%d ", j);
}
}
printf("\n");
}

return 0;
}
```

## Adjacency list graph graph representation

In adjacency list representation of graph, a table of all vertices of graph is maintained. Each entry in that table will have list of vertices which it is connected to (all it’s neighbors in graph). It is useful where graph is sparse.

Memory required for this graph representation is O( V +E ). Adjacency list representation can be easily extended to represent graphs with weighted edges. Node contains another parameter wt. For edge (u,v) node in adjacency list of u will have weight of edge.

```#include<stdio.h>
#include<stdlib.h>

#define NUM_NODE 6

typedef struct node{
int value;
struct node *next;
}Node;

Node *graph[NUM_NODE];

Node *createNode(int j){
Node * newNode = (Node *)malloc(sizeof(Node));
if(newNode){
newNode->value = j;
newNode->next = NULL;
}
else{
printf("\n Node cannot be allocated");
}
return newNode;
}

Node * currentNode = graph[i];
if(!currentNode){
graph[i] = createNode(j);
}
else{
while(currentNode->next){
currentNode = currentNode->next;
}
currentNode->next = createNode(j);
}
}

int main(){

for(int i=1; i<=NUM_NODE; i++){
printf("Nodes adjacent to %d : ", i);
Node * current  = graph[i];
while(current){
printf("%d ", current->value);
current = current->next;
}
printf("\n");
}

return 0;
}
```

Please share if there is something wrong or missing. Also, if you want to contribute to website by sharing your knowledge, please write to us, we would love to have you as contributor.