Anatomy of a Process

A process is a program in execution; associated with it are a process context and executable instructions.
A process has its own code, data, stack, registers and memory space. Every process has its own virtual memory address range, I/O resources, open files etc.

We can list all processes running on a (Linux) system using the ps command, for example ps -ef.

Process states

A process is not always running once we execute the program. There is scheduling involved, which depends on the priority of the process. Following are the states a process can be in:


1. RUNNING : Simply put, this is the state when the process has got the processor and is being executed.

2. WAITING or UNINTERRUPTIBLE SLEEP : The scheduler has moved the process out of the processor while the process waits for some input or action from I/O or other resources.
3. INTERRUPTIBLE SLEEP : The process has gone to sleep while waiting on an event which will send a signal or interrupt when it occurs.
4. TERMINATED : The process has completed its execution; a process can terminate normally or abruptly due to a signal like SIGKILL, SIGSEGV etc.
5. ZOMBIE : The process has completed execution but its return status has not yet been collected by the parent process.

The following diagram shows the transitions between the various states and the events which trigger those transitions.


[Diagram: process state transitions between Running (R), Sleeping (S), Terminated (T) and Zombie (Z). A running process moves to sleep on an I/O operation, a sleeping process is moved back to running when scheduled, a process that completes moves to terminated, and a terminated process whose exit status is not read becomes a zombie.]

Process states and events

Properties of a process

When a process is moved out of the processor and a new process is brought in for execution, a process context switch happens. The process's current state is saved before it moves out of the CPU, and it is resumed from the point where it left off once it gets hold of the processor again. A process context switch is an expensive operation, and algorithms have been developed to minimize it. We will discuss them later.


Understand that in Linux every process except the INIT process has a parent; hence we can view all processes in the system as nodes of a tree. INIT is the root of that tree. Processes which are children of the same parent are called siblings. Every process also has an associated process ID. Since it is traditionally a 16-bit integer, the maximum number of processes we can have in the system at a time is around 32K.

At the program level we can easily get the process ID and its parent's ID using the getpid() and getppid() APIs.
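A minimal sketch of the two calls (both are standard POSIX APIs):

#include <stdio.h>
#include <unistd.h>

int main(){
    /* getpid() returns the PID of the calling process,
       getppid() the PID of its parent */
    printf("My PID : %d, parent's PID : %d\n", (int)getpid(), (int)getppid());
    return 0;
}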

What else is associated with a process? A process has a UID, the ID of the user who owns the process, and a GID, the ID of the group on whose behalf the process is running.


A process in memory looks like this:


[Figure: Process snapshot — layout of a process in memory]

Creation of a process

How is a process created? The most widely used method to create a process is the 'fork' and 'exec' pair of system calls. As mentioned earlier, every process has a parent; the parent uses the fork system call to create an exact copy of itself. Once the new process is scheduled, it can use the exec system call to execute any program it wants to.

It is interesting to understand the fork system call. It is a call where one process goes in and two come out. They both start their execution from the statement just after the fork call (remember the new process is an exact copy, hence its program counter will be the same). How do we distinguish between parent and child? fork's return value comes to the rescue: the call to fork returns the child process's PID to the parent process, and zero to the child process. By checking the return value of the fork system call we can figure out which process is the parent and which is the child.
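A minimal sketch of this return-value check (standard POSIX calls):

#include <stdio.h>
#include <unistd.h>

int main(){
    pid_t pid = fork();

    if(pid < 0){
        perror("fork");     /* fork failed, no child created */
    } else if(pid == 0){
        printf("Child : my PID is %d\n", (int)getpid());
    } else {
        printf("Parent : child's PID is %d\n", (int)pid);
    }
    return 0;
}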

Now, fork could be a very expensive call, as the OS has to duplicate a whole lot of information, especially the virtual memory and the pages currently used by the parent process. This is where 'Copy on Write' comes in: the fork system call will not copy any of the pages until one of the processes tries to modify a page. This arrangement makes the fork system call fast.


The other system call is exec(). It is used to start a new program; it replaces the contents of the process with those of the program binary. There are many versions of the same system call, used for varying purposes:

  1. The calls with v in the name take an array parameter to specify the argv[] array of the new program.
  2. The calls with l in the name take the arguments of the new program as a variable-length argument list to the function itself.
  3. The calls with e in the name take an extra argument to provide the environment of the new program; otherwise, the program inherits the current process’s environment.
  4. The calls with p in the name search the PATH environment variable to find the program if its name doesn't contain a directory (i.e. it doesn't contain a / character); otherwise, the program name is always treated as a path to the executable.
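For example, a minimal sketch combining the 'l' and 'p' conventions (standard POSIX calls):

#include <stdio.h>
#include <unistd.h>

int main(){
    /* 'l' variant : arguments passed as a NULL-terminated list,
       'p' variant : the binary is looked up in PATH */
    execlp("ls", "ls", "-l", (char *)NULL);

    /* The equivalent 'v' variant would be :
       char *args[] = {"ls", "-l", NULL};
       execvp(args[0], args);                */

    perror("execlp");   /* exec returns only on failure */
    return 1;
}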

When a process creates a child process, it may or may not wait for the return status of the child process.
To wait for the return status, the parent process uses the wait() system call. It blocks the parent process until one of its children returns a status. Usually the return status of the child process is used to check whether the child terminated normally or abnormally. A child process informs its parent of its exit using the SIGCHLD signal.
There are variants of wait() such as wait3() and wait4(), which can be made non-blocking on the parent process (by passing the WNOHANG option).

Code
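The original listing is not reproduced here; below is a minimal sketch that ties fork, exec and wait together (all standard POSIX calls):

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

int main(){
    pid_t pid = fork();

    if(pid < 0){
        perror("fork");
        exit(1);
    }
    if(pid == 0){
        /* Child : replace this process image with 'ls' */
        execlp("ls", "ls", "-l", (char *)NULL);
        perror("execlp");    /* reached only if exec failed */
        exit(1);
    }

    /* Parent : block until the child terminates, then
       collect and inspect its exit status */
    int status;
    waitpid(pid, &status, 0);
    if(WIFEXITED(status))
        printf("Child %d exited with status %d\n",
               (int)pid, WEXITSTATUS(status));
    else
        printf("Child %d terminated abnormally\n", (int)pid);
    return 0;
}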

Difference between zombie and orphan process

When a child process exits and its parent is not waiting to receive its exit status, the child process is termed a zombie process. This process continues to exist in the process table; its PID is not freed even though it has terminated.


If the parent process exits before the child process, the child process is called an orphan process. Such a process is put directly under the INIT process and its parent process ID becomes 1. The slight difference between a zombie and an orphan is that a zombie process's parent may not have exited, while an orphan process's parent has terminated.
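To see a zombie for yourself, here is a minimal sketch (standard POSIX calls; while it runs, the child shows up as <defunct> in ps output from another terminal):

#include <stdio.h>
#include <unistd.h>

int main(){
    if(fork() == 0){
        /* Child : exit immediately */
        return 0;
    }
    /* Parent : stay alive without calling wait(), so the
       child's exit status is never collected and the child
       remains a zombie for these 30 seconds */
    sleep(30);
    return 0;
}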


In the next post we will look closely at the Linux code for process implementation.

Least Recently Used cache

Least Recently Used cache implementation

First of all, let's understand what a cache is. In plain computer architecture terms, a cache is a small buffer of pages the OS maintains in order to avoid more expensive main memory accesses. A cache is usually much faster than main memory. Since caches are very small compared to main memory, there is a fair chance that we need to swap pages between the cache and main memory. Whenever a page is not found in the cache and needs to be brought in from main memory, we call it a cache miss. There are many approaches used to decide which page goes out of the cache to make room for a new page, like First In First Out, Least Recently Used and Least Frequently Used.

What is least recently used cache ?

In the 'First In First Out' approach, the OS selects the page which is oldest in the cache and swaps it with the new page.
In the 'Least Recently Used' approach, the OS selects the page which was not accessed for the longest period of time.

In the 'Least Frequently Used' approach, the OS selects the page which has been accessed the least number of times up to a given point in time.

In this post, we will concentrate on the Least Recently Used approach and implement it.

If we look closely, LRU is more or less similar to FIFO, except that we need to do some juggling when a page is accessed again. If a page entered the cache first, it is the first candidate to go out, provided it is not accessed again in the meantime. So the additional requirement is that we need to keep track of when each page was accessed. FIFO can be best implemented using queues. Any doubt?

Let's take an example. We have a cache of size 4. The order of accessing pages is 1, 2, 3, 4, 5, 2, 6, 6. The first 4 steps simply add pages to the queue.

Now comes page 5 and there is no space in the cache. We simply remove page 1, as it is the least recently used page among the four pages in the cache.

When page 2 is accessed, we have a cache hit. Fine, we do not swap anything here. When page 6 is accessed, it is a cache miss and we need to swap a page out of the cache. If we follow the previous approach of swapping out the page at the front of the queue, we would be wrong, as 2 is the most recently used page.
How can we avoid swapping out page 2? We can remove it from its current position and put it at the tail of the queue when it is accessed. That leaves 3 at the front when page 6 is accessed, and hence we would correctly remove the least recently used page. Insertion and removal are now fine with O(1) complexity.

How do we check for a cache hit or miss efficiently? Scanning through the whole queue would be O(N). How can we make the lookup fast? The best way is to use a hash table.
We will have a hash table, but what will the key-value pair be? The key can be the page number, while the value is a pointer to the node in the queue. With this in place, we can look up any node in the queue in O(1).

Algorithm for least recently used cache

If the cache has a free entry, add the page entry to the queue.
If it is a cache hit, remove the page from its present location and add it to the end of the queue.
If the cache is full and it is a cache miss, remove the page at the front and insert the new page at the end of the queue.
To check hit or miss, use the hash table.

Least recently used cache implementation

#include <stdio.h>
#include <stdlib.h>

#define MAX_CACHE_SIZE 4
#define true 1
#define false 0

typedef struct Queue{
    int data;
    struct Queue *next;
    struct Queue *prev;
}Queue;

typedef struct DummyNode{
    Queue * front;
    Queue * tail;
    int size; /* Extra parameter to keep track of number of pages */
}DummyNode;

/* Initialize the queue */
void initializeQueue(DummyNode **q){
    *q  = (DummyNode *)malloc(sizeof(DummyNode));
    if(*q){
        (*q)->front = NULL;
        (*q)->tail = NULL;
        (*q)->size =0;
    }   
}   

/* Checks if queue is empty */
int isEmpty(DummyNode *q){
    if( q->front == NULL && q->tail == NULL)
        return true;

    return false;
}

/* Enqueue the element and returns the pointer to be 
stored in hash */
Queue * enqueue(DummyNode *q, int elem){
    Queue *newNode = (Queue *) malloc(sizeof(Queue));
    if(newNode){
        newNode->data = elem;
        newNode->next = NULL;
        newNode->prev = q->tail;
        if(q->tail){
            q->tail->next = newNode;
        }
        q->tail = newNode;
        if(!(q->front))
            q->front = newNode;
    
        q->size++;
    }
    return newNode;
}

/* Dequeue the element from queue */
int dequeue(DummyNode *d){

    if(isEmpty(d)){
        printf("\n Queue is empty");
            return -1;
        }

    Queue *q  = d->front;
    d->front = q->next;
    if(q->next)
        q->next->prev = NULL;
    else
        d->tail = NULL;

    int data = q->data;
    free(q);
    d->size--;
    
    return data;
}

/* This function modifies queue when there is cache hit */
void updateCache(DummyNode *d, Queue *hash[], int page){

    /* If the page is at the front of queue, dequeue it and
    enqueue it again; hash must point to the new node */
    if(d->front == hash[page]){
        dequeue(d);
        hash[page] = enqueue(d, page);
    }
    /* If it is last in queue, it is already the most recently
    used page, do not do anything */
    else if(d->tail == hash[page]){
        return;
    }
    /* Else unlink the node and move it to the tail of the queue */
    else{
        Queue * q = hash[page];
        q->prev->next = q->next;
        q->next->prev = q->prev;
        q->prev = d->tail;
        d->tail->next = q;
        d->tail = q;
        q->next = NULL;
    }
}

void printCache(DummyNode *d){
    Queue *current = d->front;
    for(int i = 0; i < d->size && current; i++){
        printf("%d ", current->data);
        current = current->next;
    }
}

/* This function implements the LRU cache */
int lruCache(DummyNode *d, Queue *hash[], int page){

    /* Cache hit : move the page to the tail of the queue */
    if(hash[page] != NULL){
        updateCache(d, hash, page);
    }
    /* Cache miss and there is free space : add the page */
    else if(d->size < MAX_CACHE_SIZE){
        hash[page] = enqueue(d, page);
    }
    /* Cache miss and cache is full : evict the LRU page at
       the front, clear its hash entry and add the new page */
    else{
        int evicted = dequeue(d);
        if(evicted >= 0)
            hash[evicted] = NULL;
        hash[page] = enqueue(d, page);
    }

    printf("\n Cache state : ");
    printCache(d);
    return true;
}
int main(){

    DummyNode *d;
    Queue *hash[10];

    for(int i=0; i<10; i++)
        hash[i] = NULL;

    initializeQueue(&d);

    int pageOrder[10] = {1,2,3,5,4,2,3,2,5,6};
    int i = 0;
    while(i < 10){
        printf("\n Page Requested : %d ", pageOrder[i]);
        lruCache(d, hash, pageOrder[i]);
        i++;
    }
    return 0;
}

So, we saw how a least recently used cache is implemented using a queue and a hash table.

Process Vs Threads

Process Vs Threads

Process

In simple terms, any program in execution is called a process. Every process has process state associated with it, like memory, resources, code, data etc. In the operating system, a process has a unique process identifier and an associated process control block (PCB). We will see in detail what the PCB contains in later posts.


[Figure: Process execution]

Threads

A thread is an abstract entity which executes a sequence of instructions. Threads are lightweight compared to processes: their creation is efficient, and inter-thread communication is fast because threads use shared memory to communicate instead of IPC. A process context switch is a rather expensive operation compared to a thread switch. Every program has at least one thread of execution. A thread cannot live on its own; it has to live within a process. A process can have multiple threads.


[Figure: Thread execution]

From the definitions we can already see some differences. Let's tabulate them.

Process vs Threads

Process                                                    | Threads
-----------------------------------------------------------|-----------------------------------------------------------
Can live on its own                                        | Always runs within a process
Each process has its own address space and stack           | Threads of a process share its address space; each thread has its own stack and registers
Inter-process communication is done using IPC mechanisms   | Inter-thread communication is done through shared memory
Heavy weight in terms of IPC, context switch and creation  | Light weight in terms of IPC, context switch and creation
Has a data segment and heap of its own                     | Has no data segment or heap of its own; shares the process's
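To make the shared address space row concrete, here is a minimal sketch using POSIX threads (compile with -pthread; the counter and worker names are mine). Two threads of one process update the same global variable, something two processes could do only via an IPC mechanism:

#include <stdio.h>
#include <pthread.h>

int counter = 0;                 /* lives in the process's data segment */
pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

void *worker(void *arg){
    for(int i = 0; i < 100000; i++){
        pthread_mutex_lock(&lock);
        counter++;               /* both threads see the same variable */
        pthread_mutex_unlock(&lock);
    }
    return NULL;
}

int main(){
    pthread_t t1, t2;
    pthread_create(&t1, NULL, worker, NULL);
    pthread_create(&t2, NULL, worker, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("Counter : %d\n", counter);   /* prints 200000 */
    return 0;
}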

There are two types of threads :
1. User space threads.
2. Kernel space threads.

Both have their own uses, depending on the context.
User space threads are very fast to create and switch between, but they are not suitable when we need to make blocking I/O calls. Language-level threads are the best example of this kind of thread.

Kernel space threads are less efficient to create and switch, but a blocking system call in one of them does not block the others. These threads are scheduled by the system scheduler like any normal process. The programmer does not have direct control over these threads.

There are models where one user thread maps to one kernel thread, where many user threads map to many kernel threads, and where many user threads map to one kernel thread. These are termed:
1:1 model
M:N model
M:1 model

These models are realized using virtual processors (VPs). In the 1:1 model, each user thread runs on its own VP. In the M:N model, user threads run on a pool of VPs; this is the default model. In the M:1 model, multiple user threads run on a single VP.

In the next post we will discuss the POSIX library APIs for thread creation, execution and termination, along with a detailed look at the process life cycle and the structures used.

This post highlighted the key points of difference between processes and threads; in future posts, we will dig deeper into these two concepts.

Queues: Linked list based implementation

Queue : Abstract Data Type

In the last post we covered the concept of queues and the basic operations performed on them. We also saw an array-based implementation. The limitation of the array-based implementation is that we need to allocate the array size beforehand, which restricts the number of elements that can be accommodated. The other issue we encountered was correctly gauging whether the queue is empty or full; we had to maintain an extra counter for that purpose.
In this post we discuss a linked list based implementation of a queue.

The advantage of the linked list based approach is that there is no pre-allocated queue size, so the queue is never full. We can easily figure out if the queue is empty by checking the head pointer of the list: if it is NULL, the queue is empty; if not, we have some elements in it.
The only problem we see is that to remove an element we need to traverse to the end of the list. That operation costs O(N), which is not desirable.
Can we think of something? We can store a tail pointer which points to the last node of the list. When we need to remove a node, we go directly to the tail node and delete it. Now a question arises: how do we update the tail to the node previous to the deleted one? Traverse the list again? We are back to square one 🙂
What kind of list can give us the previous node? Yes, a doubly linked list. So the best data structure to implement a queue using a linked list is a doubly linked list.

Code
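The original listing is not reproduced here; below is a minimal sketch of a doubly linked list based queue along the lines described above (structure and function names are mine):

#include <stdio.h>
#include <stdlib.h>

typedef struct Node{
    int data;
    struct Node *next;
    struct Node *prev;
}Node;

typedef struct Queue{
    Node *front;    /* dequeue end */
    Node *rear;     /* enqueue end */
}Queue;

void initialize(Queue *q){
    q->front = q->rear = NULL;
}

int isEmpty(Queue *q){
    return q->front == NULL;   /* NULL head means empty queue */
}

/* Add an element at the rear : O(1) */
void enqueue(Queue *q, int elem){
    Node *n = (Node *)malloc(sizeof(Node));
    n->data = elem;
    n->next = NULL;
    n->prev = q->rear;
    if(q->rear)
        q->rear->next = n;
    q->rear = n;
    if(!q->front)
        q->front = n;
}

/* Remove the element at the front : O(1) */
int dequeue(Queue *q){
    if(isEmpty(q)){
        printf("Queue is empty\n");
        return -1;
    }
    Node *n = q->front;
    int data = n->data;
    q->front = n->next;
    if(q->front)
        q->front->prev = NULL;
    else
        q->rear = NULL;
    free(n);
    return data;
}

int main(){
    Queue q;
    initialize(&q);
    enqueue(&q, 1);
    enqueue(&q, 2);
    enqueue(&q, 3);
    while(!isEmpty(&q))
        printf("%d ", dequeue(&q));   /* prints 1 2 3 */
    return 0;
}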

Comparison

An array-based implementation is good when we are sure that the queue will not grow continuously and will not cross its maximum size at any given point in time. It has all operations in constant time, and memory usage is also low compared to the linked list based implementation. However, if the queue can grow beyond the allocated size, dynamically increasing the array size is quite an expensive operation.

Queues : Array based implementation

Queue Abstract Data Type

A queue in computer programming has the same properties as a queue in real life. First, there is one end where one enters the queue and another from which one exits. Second, the person who entered first exits first.

The end where a person enters is usually called the rear of the queue, and the end where a person goes out is called the front of the queue. The principle by which the person who entered first leaves first is called First In First Out (FIFO). It is exactly the opposite of a stack, which is a LIFO data structure.
The queue abstract data type has the following basic operations:
1. Enqueue() : Adds a new element at the rear of the queue.
2. Dequeue() : Removes the element at the front of the queue.
3. Is_empty() : Checks whether there are any elements in the queue at a given point in time.
4. Queue() : Initializes the queue.

Let us first consider different underlying data structures to implement queue.

Array based implementation

Ideally a queue should not have any limit on the number of elements that can be added to it. But an array inherently has a limit on its size. Let's first see the simple implementation, where we declare the queue full once we reach the end of the array.

Code
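The original listing is not reproduced here; a minimal sketch of this simple version (names are mine) follows. Note how the queue is declared full once rear hits the last slot, even if earlier slots have been dequeued:

#include <stdio.h>

#define MAX_SIZE 5

typedef struct Queue{
    int items[MAX_SIZE];
    int front;    /* index of the first element */
    int rear;     /* index of the last element */
}Queue;

void initialize(Queue *q){
    q->front = 0;
    q->rear = -1;
}

int isEmpty(Queue *q){
    return q->rear < q->front;
}

int enqueue(Queue *q, int elem){
    if(q->rear == MAX_SIZE - 1){
        printf("Queue is full\n");   /* even if slots were freed by dequeue */
        return -1;
    }
    q->items[++q->rear] = elem;
    return 0;
}

int dequeue(Queue *q){
    if(isEmpty(q)){
        printf("Queue is empty\n");
        return -1;
    }
    return q->items[q->front++];
}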

This implementation has a limitation: the queue drifts towards the end of the array as enqueue() and dequeue() operations are performed. We might end up declaring the queue full when there is only one element present in it.

We can fix this by making rear point back to the first slot once the end of the array is reached, that is, by wrapping around. We now need to come up with the full and empty conditions of such a queue.

We can easily wrap around with rear = (rear + 1) % MAX_SIZE.
The queue is full when (rear + 1) % MAX_SIZE == front, but if we look closely, the same condition is true when the queue is empty.

So, initialize rear and front appropriately: we start with front as 0 and rear as MAX_SIZE - 1. Also, keep a counter in order to check whether the queue is full or not, as in the sketch below.
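A minimal sketch of the wrap-around version with a counter (names are mine):

#include <stdio.h>

#define MAX_SIZE 5

typedef struct CircularQueue{
    int items[MAX_SIZE];
    int front;
    int rear;
    int count;    /* number of elements; disambiguates full from empty */
}CircularQueue;

void initialize(CircularQueue *q){
    q->front = 0;
    q->rear  = MAX_SIZE - 1;   /* rear advances before writing */
    q->count = 0;
}

int enqueue(CircularQueue *q, int elem){
    if(q->count == MAX_SIZE){
        printf("Queue is full\n");
        return -1;
    }
    q->rear = (q->rear + 1) % MAX_SIZE;   /* wrap around */
    q->items[q->rear] = elem;
    q->count++;
    return 0;
}

int dequeue(CircularQueue *q){
    if(q->count == 0){
        printf("Queue is empty\n");
        return -1;
    }
    int elem = q->items[q->front];
    q->front = (q->front + 1) % MAX_SIZE;
    q->count--;
    return elem;
}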

Applications of queues are plenty. The OS uses queues in multiple places: the scheduler, waiting queues on semaphores, sending data from one process to another, etc.
In the next few posts we will look at problems which can be solved efficiently using the queue ADT. Before that, we will see the linked list based implementation of a queue in the next post.

Stack with constant time Max() operation

We know that a stack has constant time push and pop operations. Now we need to design a solution where we also get the maximum value in the stack in constant time.

Problem statement

Design a stack which has constant time push, pop and max operation.

Analysis

Push and pop operations on a stack are already constant time. Let's concentrate on the max operation. The first thing we need to do is keep track of the current max. This would suffice if we were never going to pop from the stack. If we pop from the stack, the remaining stack might have a different max; this happens precisely when the popped element was the max.

So we get the first condition: if the popped element is equal to the current max value, we need to change the max.
This gives us the hint: we need to keep track of previous max values too, that is, all the elements which were the max of the stack at some point in time. These max values must be stored in such a way that the most recently encountered max can be accessed in O(1). That is our base requirement, right?
Last in, first out again? Yes, keep another stack to store the maximums.

Insertion order : 1,2,3,4,5 then
Main stack : 1,2,3,4,5
Max stack would be : 1,2,3,4,5

Insertion order : 5,4,3,2,1
Main stack  : 5,4,3,2,1
Max stack : 5,5,5,5,5

From the second example we observe that we are unnecessarily storing 5 repeatedly. We can optimize this by pushing onto the max stack only when the new element is at least as large as the current max (the "at least" matters when duplicates of the max are pushed).

Code
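The original listing is not reproduced here; a minimal sketch using an auxiliary max stack (fixed-size arrays, names mine; overflow/underflow checks omitted for brevity):

#include <stdio.h>

#define MAX_SIZE 100

int stack[MAX_SIZE],    top = -1;
int maxStack[MAX_SIZE], maxTop = -1;   /* auxiliary stack of maximums */

void push(int elem){
    stack[++top] = elem;
    /* push on the max stack only when a new max appears */
    if(maxTop == -1 || elem >= maxStack[maxTop])
        maxStack[++maxTop] = elem;
}

int pop(){
    int elem = stack[top--];
    /* if the popped element was the current max, retire it too */
    if(maxTop >= 0 && elem == maxStack[maxTop])
        maxTop--;
    return elem;
}

int max(){
    return maxStack[maxTop];   /* caller must ensure stack is not empty */
}

int main(){
    int input[] = {5, 4, 3, 6, 2};
    for(int i = 0; i < 5; i++){
        push(input[i]);
        printf("Pushed %d, max : %d\n", input[i], max());
    }
    pop(); pop();                                 /* removes 2 and 6 */
    printf("After two pops, max : %d\n", max());  /* back to 5 */
    return 0;
}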

Complexity analysis

Push, pop and max are all O(1) 🙂

Semaphore : Kernel APIs

Kernel APIs for semaphore

In the last post we discussed the difference between mutex and semaphore. In this post we discuss the various flavors of APIs used with semaphores.

Any locking mechanism has to deal with interrupts, and semaphores are no exception. Since up() may be called from interrupt context, the semaphore implementation protects its internal state with a spinlock taken with interrupts disabled. Separately, a task that should stay responsive to signals while sleeping on a semaphore can use the variants down_interruptible() and down_killable(), described below.

There are two basic operations performed on a semaphore: decrease the counter while acquiring the resource, and increase the counter while releasing it. There are two corresponding APIs:

For decreasing the counter :
void down(struct semaphore *sem)

For increasing the counter :

void up(struct semaphore *sem)


Processing in down() is as follows:
Check if the current counter is greater than 0.
If it is, decrease the counter and take the resource.
If the counter is zero, put the process trying to acquire the lock on the semaphore's wait queue.

In the up() routine, the process increments the counter value and wakes up the next process in the queue. It is not necessary that up() is called by the same process which called down(); any process can call this function and wake up a process sleeping on the semaphore.

If we want the process to wake up when a signal is received, we may use another variant of down(), namely down_interruptible(). The function is similar to plain down(); the only difference is that once the process has gone into the sleep state, a signal can wake it up.

int down_interruptible(struct semaphore *sem)

This function returns -EINTR if the process woke up due to a signal, and 0 if it woke up after acquiring the semaphore.

Another variant of down() is down_killable(). If we use this function, a process waiting for the semaphore can respond to a fatal signal.

int down_killable(struct semaphore *sem)
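As a usage sketch (this only makes sense in kernel-module context; my_sem and the function names are illustrative):

#include <linux/semaphore.h>
#include <linux/errno.h>

static struct semaphore my_sem;

static int my_init(void)
{
    sema_init(&my_sem, 1);   /* counter = 1 : behaves as a binary semaphore */
    return 0;
}

static int my_driver_op(void)
{
    /* Sleep until the semaphore is available, but allow
       a signal to wake us up */
    if (down_interruptible(&my_sem))
        return -EINTR;       /* woken by a signal; semaphore not acquired */

    /* ... critical section ... */

    up(&my_sem);             /* release and wake up the next waiter, if any */
    return 0;
}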


All the above APIs operate on a common structure, struct semaphore. Its definition is:

struct semaphore {
    spinlock_t lock;
    unsigned int count;
    struct list_head wait_list;
};

The spinlock is used in all the above functions to protect the critical section inside the semaphore implementation itself.

Reference
http://www.cs.fsu.edu/~baker/devices/lxr/http/source/linux/kernel/semaphore.c#L204