Signals in operating systems

Signals in operating systems

Signal is kind of a message sent from kernel or process to one or more processes to notify some event occurrence. Every signal has associated number with it.
Kernel generates signals in cases like encountering illegal instruction or file size limit exceeded or I/O is ready.  Example of process sending signal to another process are a child process sending signal to parent to indicate completion of execution.
A process will receive a signal when it is running in user mode. If the receiving process is running in kernel mode, the execution of the signal will start only after the process returns to user mode.
When a signal is received by a process :
1. Process can ignore the signal altogether and it does nothing.
2. Process can accept the signal and appropriate signal handler is executed.
Signal handler can be a default handler or it can be user-defined. When a user-defined handler is used for disposition of a signal, signal is said to be caught.
Note that SIGSTOP and SIGKILL can not be caught or ignore. They will always have default action.
When a signal is accepted, process immediately stops current execution and services the signal.
Once the signal handler is completed, process is resumed again. 0-31 is reserved range for standard signals while range 32 to 64 is used by real time signals. To show all the signals which are predefined in a system use command kill -l

Signals and their meaning

This information is taken form man page of signal(7)

Changing the signal handler

signal() function provides the basic functionality for defining signal action for a particular signal. It has two parameters, first is signal number and second is handler we want to have for that signal
sighandler_t signal (int signum, sighandler_t action)

Action can be SIG_IGN or SIG_DFL for ignoring the signal or using default action for signal.
If handler is provided, that handler will be invoked when signal is received. 
Other function which can also be used is sigaction(). It give more control over how the handle should be invoked. Prototype is 

int sigaction (int signum, const struct sigaction *restrict action, 
               struct sigaction *restrict old-action)
struct sigaction contains following members 
sighandler_t sa_handler : Its is same as in signal function.
sigset_t sa_mask : It indicates set of signals which should be blocked when signal handled is invoked. Note that whenever a signal handler is invoked for a signal, that signal is automatically gets blocked.
int sa_flags : It contains various flags which define the behavior of signal.  It is interpreted as bit-mask.
SA_NOCLDSTOP  : If it is set, kernel does not send signal for child processes which are stopped. 
SA_ONSTACK :  Use a signal stack to send this signal
SA_RESTART : It decide what should happen to primitive library calls like read(), write() when signal handler for this signal returns normally. If it is set, library functions resume, if reset, library function calls fail.
The basic signal function is a feature of ISO C, while sigaction is part of the POSIX.1 standard.
Below is code for changing the handler of a signal

Process : Linux internal representation

In previous post, we discussed about the theory of process in operating system. In this we would discuss how a process is represented and used in Linux kernel.
In Linux kernel, process is represented as a struct ‘task_struct’ in file /linux/include/linux/sched.h.

struct task_struct {
      volatile long state;
      void *stack;
      unsigned int flags;     
       int static_prio
       struct list_head tasks;
       struct mm_struct *mm, *active_mm;
        int exit_state;
        int exit_code, exit_signal;
        pid_t pid;
        struct task_struct __rcu *parent;
        struct list_head children;   
        struct list_head sibling
        char comm[TASK_COMM_LEN];
        struct thread_struct thread;
        struct fs_struct *fs;
        struct files_struct *files;
        struct signal_struct *signal;
        struct sighand_struct *sighand;
        – – –
There are many more members of task structure, I have taken only those which are required for now.
Let’s see one by one what does each member represents in process and what are possible values.

Basic Info

As seen in last post that a process can be in one of the following states : RUNNING, INTERRUPTIBLE_SLEEP, UNINTERRUPTIBLE_SLEEP, TERMINATED or ZOMBIE.
task->state represents current state of the process.
To set a state of a process use
set_task_state(task, state);        /* set task 'task' to state 'state' */
stack represents the kernel stack of the process.
Flags contains the combination of system status flags PF_ALIGNWARN, PF_PTRACED, PF_TRACESYS, PF_STARTING, and PF_EXITING.
static_prio represents nice value of the process. Range of values stored in it is 100 to 139 where 100 maps to -20 and 139 maps to +19.

Task Lists

A task can be part of multiple lists. To point to previous and next tasks in the list which task is member of there is a member called tasks. Tasks is of type list_head which contain prev and next pointers.

To list all the processes in system use this:
list_entry(task->, struct task_struct, tasks)

There are macros available to go to next and previous task. To go to next use next_task(task) and for previous use prev_task(task).

Memory Management

Each process holds their own memory substructure called as mmThe component of memory structure:
struct mm_struct{
unsigned long start_code, end_code, start_data, end_data;
unsigned long start_brk, brk;
unsigned long start_stack, start_mmap;
unsigned long arg_start, arg_end, env_start, env_end;

Identity and relationships

Linux process is identified uniquely using pid. Range of pid is limited by the backward compatibility concern and is 1- 32K (2^32). and it is typically the maximum number of processes  that can be there in system at a time, else pid will roll over and will make the assumption that new process have higher pid invalid.
In Linux, every process has to have a parent except init process which is statically allocated at the boot time which in turn read init scripts and start new process. Every process stores the pointer to its parent in  parent member in its structure. If we need to find out parent process pid:    current->parent->pid

Init process can be accessed as  for (task = current; task != init_task;   task->parent)

Similarly children is list of all the children of the process and sibling is list of all processes which have same parent. comm is command to execute the process and it does not include the whole path.

File System

fs stores the file system pointer for the process while files stores all the files which are currently opened by the process.

Process information

Earlier task structure pointer was stored at the bottom of kernel stack, this helped to calculate the task pointer using stack pointer. Now it has been changed and instead of task pointer being stored at bottom of stack it the thread_info struct pointer which is being stored and task member inside that gives the pointer to actual task struct.
struct thread_info {
        struct task_struct    *task;
        struct exec_domain    *exec_domain;
        unsigned long         flags;
        unsigned long         status;
        __u32                 cpu;
        __s32                 preempt_count;
        mm_segment_t          addr_limit;
        struct restart_block  restart_block;
        unsigned long         previous_esp;
        __u8                  supervisor_stack[0];
Current is macro used to get currently executing process. Some architectures store it in register for fast access. While some use the fact the task pointer can be calculated using thread_info at the end of the stack. Just mask last 13 bits of sp and we get the thread_info pointer. Using that we can easily get the task structure.  Why last 13 bits? Because each process has kernel stack of size 8192 which is 2^13.


signal and signal handlers require a separate post and we will understand them separately.

As we go forward we would look into each and every member in detail.

Anatomy of a Process

A process is a program in execution, with it associated are process context and executable instruction.
A process has its own data, code, stack, register and memory space. Every process has its own virtual memory address range, I/O resources, opened files etc.

We can check what all processes running in system (Linux) using ps command.

Process states

It is not that process is always running once we execute the program. There is scheduling involved which depends on the priority of the process. Following are the states a process can be in :

1. RUNNING : Simple this is the state when process has got processor and it is being executed.

2. WAITING or UNINTERRUPTIBLE SLEEP  : When scheduler has moved process out of the processor while process is waiting for some input or action from I/O or any other resources.
3. INTERRUPTIBLE SLEEP : Process has gone to sleep while waiting on an event which will send signal or interrupt when that occurs.
4. TERMINATED : When process has completed its execution, process can terminated normally or abruptly due to some signal like SIGKILL, SIGSERV etc.
5. ZOMBIE : When process has completed execution but its return status is yet not collected by the parent process.

Following diagram shows transition between various states and events which trigger those transitions.

Process states and events

Properties of a process

When a process is moved out of process and new process is bought in for the execution, process context switch happens. Process’s current state is saved before it moves out of the CPU and it is resumed from the point it left once it gets the hold of processor again. Process context switch is an expensive operation and algorithms are developed to minimize this. We will discuss them later.

Understand that in Linux every process except the INIT process has a parent, hence we can take all process in systems as nodes of a tree. INIT is root of that tree. Processes which are children of same parent are called as siblings. Also every process has associated process ID. Since it is 16 bit integer, maximum number of processes we can have in system at a time is around 32K. 
At program level we can easily get the process ID and its parent ID using getpid() and getppid() APIs.

What else is associated with a process? Process has UID which is user ID which own this process; GID group ID which process is running on behalf of. 
A process in memory looks like this:

Creation of a process

How a process is created? Most widely used method to create a process is to use ‘fork’ and ‘exec’ system calls. As mentioned earlier, every process has parent, parent uses fork system call to create exactly same copy of itself. Once new process is scheduled, it can use exec system call to execute any program it wants to.

It is interesting to understand fork system call. It is a call where one process goes in and two come out. They both start there execution from the statement just after fork call (Remember new process is exact copy, hence its PC will be same). How to distinguish between parent and child process? fork comes to rescue there. Call to ‘fork’ return child process’s PID to parent process while zero to child process. By having check on return value of ‘fork’ system call we can figure out which process is parent and which is child.

Now, fork can be a very expensive call as OS has to duplicate whole lot of information, especially the virtually memory and pages currently used by the parent process. There is one concept which is called ‘Copy on Write’, so fork system call will not copy any of the pages till the time one of the process tries to modify the page. This arrangement makes fork system call fast.

Other system call is exec(). It is used to start a new program, it will replace contents of process with of program binary. There are many versions of the same system call used for varying purposes.

  1. The calls with v in the name take an array parameter to specify the argv[] array of the new program.
  2. The calls with l in the name take the arguments of the new program as a variable-length argument list to the function itself.
  3. The calls with e in the name take an extra argument to provide the environment of the new program; otherwise, the program inherits the current process’s environment.
  4. The calls with p in the name search the PATH environment variable to find the program if it doesn’t have a directory in it (i.e. it doesn’t contain a / character). Otherwise, the program name is always treated as a path to the executable

When a process creates a child process, it may or may not wait for return status of the child process.
To wait for the return status, parent process uses wait() system call. It blocks the parent process till the time one of its child returns status. Usually return status of child process is used to check if the child process terminated normally or abnormally. Child process can inform their exit status using SIGCHILD signal. There are variants of wait() like wait3() and wait4() which are non blocking call on parent process.


Difference between zombie and orphan process

When a child process exits and if parent process is not waiting to receive exit status of it, the child process is termed as Zombie process. This process continues to exist in process table, its PID is not freed even though it has terminated. 

If parent process exits before child process, the child process is called as orphan process. This process is put directly under the INIT process and its parent process ID becomes 1. There is slight difference between zombie and orphan that is zombie process’s parent may not have exited while orphan process’s parent has terminated.
In next post we would closely on linux code for process implementation.

Process Vs Threads

Process Vs Threads


In simple terms, any program in execution is called as process. Every process has a process state associated with it like memory, resources, code, data etc. In operating system, process has unique process identifier and associated process controller block (PCB). We will see in detail what PCB contains in later posts.

Process execution


Thread is an abstract entity which executes a sequence of instructions. Threads are light weight as compared to process, their creation is efficient and inter thread communication is fast as they use memory to communicate with each other instead of IPC. Process context switch is rather very expensive operation as compared to thread switches. Every program has at least one thread of execution.  A thread cannot live on its own, it has to be within some process. A process can have multiple threads.
Thread execution

From the definition we came across some differences. Let’s tabulate them.

Process vs Threads

Can live on its own
Always runs in a process
All process have separate address space and stacks
Threads of a process share address space with each other. Threads have separate stack and registers as process has.
Inter process communication is done using IPC
Inter thread communication is done through shared memory.
Heavy weight in terms of IPC, context switch and creation
Light weight in terms of IPC, context switch and creation.
Process has data segment and heap of its own
Thread does not have them
There are two types of threads :
1. User space threads.
2. Kernel space threads.
Both have their own use and use case for using in a particular context.
User space threads are very fast in terms of create and switch. But they are not suitable when we need to do blocking I/O calls. Language level threads are best example of this kind of threads.
Kernel space threads are inefficient in terms of creation and switch but they do not block on system calls. These threads are scheduled using system scheduler as any normal process. Programmer does not have any direct controller over these threads.
There are models where one user thread maps to one kernel thread and one user thread maps to multiple kernel space threads or multiple user threads maps to one kernel space threads. These are termed as 
1: 1 Model
M: N Model
M : 1 Model
These models are realized using virtual processor (VP).  For first model, all user threads run in one VP.  In second model, user threads run in pool of VPs. This is default model.  While in third model, multiple user threads run in single VP.
In next post we would discuss POSIX library APIs for thread creation, execution and termination along with detailed look at process life cycle and structures used.

This post highlights key points of difference between process and threads, in future posts, we will understand more on these two concepts.

Semaphore : Kernel APIs

Kernel APIs for semaphore

In last post we discussed difference between mutex and semaphore. In this post we will discuss various flavors and APIs used for semaphore usage.

Any locking mechanism has to deal with interrupts. Semaphore too has to deal with it. Since interrupts can call down (we will see it later) and up, any semaphore locking implementation should disable interrupts. At the same time, interrupt may use down when it is sure that it will get the lock, such cases we need to use another variant of down namely down_interruptable() and down_killable.

There are two basic operations which are done on semaphore, namely decrease the counter while acquiring the resource and increase the counter while releasing resource. There are two APIs:
For decreasing the counter :

 void down(struct semaphore *sem)

For increasing the counter :

void up(struct semaphore *sem)

Processing in down() is as follows:

  • Check if the current counter is greater than 0.
  • If it is then decrease the counter  and take the resource.
  • If the counter is less than zero, put the process trying to acquire the lock in the semaphore queue.

In up() routine, process increments the counter value and wakes up next process from the queue. It is not necessary that up() is called by the same process which called down, Any process can call this function and wake up any process sleeping on this semaphore.
If we want process to wake up when any signal is received, we may use another variant of down that is down_interruptable(). Function is similar to simple down, only difference is if the process has gone n sleep state, signal can wake process up.

int down_interruptible(struct semaphore *sem)

This function return -EINTR if process woke up due to signal and 0 if it woke up after acquiring semaphore.
Another variant of down() is down_killable(). If we use this function, process waiting for semaphore can respond to FATAL signal.

int down_killable(struct semaphore *sem) 

All above APIs are using a common structure that is semaphore.
Definition of semaphore struct is :

struct semaphore {
      spinlock_t              lock;
      unsigned int            count;
      struct list_head        wait_list;

spinlock  is used in all above function to protect critical section inside.


Mutex Vs Semaphore

Mutex Vs Semaphore

These are two primitives which everyone confuses almost every time. Let’s clarify on these today and see what are the basic difference between these two. There must be some difference else two of them would not have existed. 🙂 Difference may be subtle but they exist.

First thing we should shed the belief that mutex and semaphore are one and same and can be used interchangeably. No they are not and they cannot be.

Most of use believe that semaphore is just extension of mutex. That is to say mutex is binary version of semaphore, if want to put concurrency control on more than one similar resources, we use semaphores adding count.  It’s wrong.

The toilet analogy has a basic flaw, we have multiple keys to multiple bathrooms in case of semaphore, we have the key, but we still don’t know which bathroom is free. 🙂 So eventually we end up in mechanism where one key is for one bathroom and that is the mutex for us.
Hence adding the count paradigm to mutex does not turn that into semaphore.So hope first misconception is clarified.

Mutex is basically a locking mechanism where a process locks a resource using mutex. Till the time, process has mutex, no other process can have the same resource. (Mutually exclusive for you). Once process is done with resource, it releases the mutex.
Here comes the concept of ownership. Mutex is locked and released by the same process/thread. It cannot happen that mutex is acquired by one process and released by other.
Semaphore is a synchronization mechanism. It is used between two process to synchronize each other. Best example would be producer-consumer problem. Consumer cannot consume till the time producer has produced something. When producer has produced something, it will inform consumer using semaphore primitive.
There is no ownership attached to semaphore. One process can acquire it and other can release it.

In next post we would see some of the APIs used in Linux kernel for mutex and semaphore.

Locking : Spinlocks vs Mutex Part -2

Spinlocks Kernel APIs

In last post we learned about the mechanisms for concurrency control.
Concurrency really comes into picture only in case of Symmetric Multiprocessing systems, where two processes can actually execute a piece of code simultaneously. In uni-processing system, no processes can execute at the same time, even though we need some kind of synchronization because relative ordering of execution of instruction can affect the output. 
In this post we would see what are the various APIs which are available in Linux kernel for spinlocks and when should we use them. 

With interrupt enable/disable

spin_lock_irqsave(lock, flags)
spin_unlock_irqrestore(lock, flags)

spinlock_t my_lock = SPIN_LOCK_UNLOCKED;

unsigned long flags;

spin_lock_irqsave(&my_lock, flags);
/* critical section … */
spin_unlock_irqrestore(&my_lock, flags);

Above API is used to go into busy loop but after disabling interrupts. This may be required because if interrupt itself wants another spinlock, there might be a deadlock. The corresponding unlocking API, restore the interrupts when spinlock is acquired.  This provides protection against both SMP and interrupt issues. If one wants to disable all interrupts unconditionally, spin_lock_irq(spinlock_t *lock) and spin_unlock_irq(spinlock_t *lock) are used.

Without interrupt enable/disable

spin_lock(spinlock_t *lock)
spin_unlock(spinlock_t *lock)

spinlock_t mr_lock = SPIN_LOCK_UNLOCKED;
/* critical section … */

Saving and restoring interrupts can take time. If we are quite sure that critical section is not going to receive any interrupts or data is not going to be used in any interrupt, we might avoid saving and restoring interrupts. There we use above APIs. With soft interrupt enable/disable
spin_lock_bh(spinlock_t *lock)
spin_unlock_bh(spinlock_t *lock)

Above APIs disable soft IRQ on processor but not hardware interrupts.
This is required in code segment which is written outside soft IRQ but used inside it.
These are three major variants of spinlock usage. There is another variant of all these as 

spin_trylock_irqsave(lock, flags) 
spin_trylock(spinlock_t *lock)
spin_trylock_bh(spinlock_t *lock)

These function try to acquire to lock and return value which tell success or failure to process.
For example spin_trylock() does not spin but returns non-zero if it acquires the spinlock on the first try or 0 if not.

Word of wisdom 🙂

Use spinlocks in case where you are sure that lock will be acquired in reasonably small amount of time. If it takes time, we would end unnecessary keeping processor busy spinning in loop. trade off is : If making process sleep and wake up again is affordable than spinning, if yes, use semaphore, if not, use spinlock.
Also never use spinlocks for guarding critical section which are accessing memory or IO or scheduling some other process. We may end up in deadlock.
There are other variants like read-write spinlocks. We would discuss them in next post.

Locking : Mutex vs Spinlocks

Difference between mutex vs spinlocks

I would start with a very basic technical interview question, that is : What is difference between mutex and spin-lock?
First of all, let’s understand what is a mutex?
Mutex is kind of a key to use any resources in the system. If you have mutex, you can use the resource, if you don’t wait till the mutex is released. Process goes to wait queue for that particular resource. 
What is spin-lock? Spin lock is a mechanism in which the process needing a resource poll the lock on resource till it gets it. It is also called as busy-waiting. Process will be busy in a loop till it gets the resource.
So we get the first difference there itself, the way both wait for the resource lock. In case of mutex, process goes in wait state and does not use processor while in spin-lock, process keeps on looping and hence uses the processor even while waiting.
Second difference comes from : when should we use spin-lock or mutex?
Spin-locks are best used when a piece of code cannot go to sleep state. Best example of such code would be interrupts request handlers.
Mutexes are best used in user space program where sleeping of process does not affect the system in catastrophic way.
Spin locks make sense when we have multi-processor systems or we have uni-processor system with preemption enabled, spin locks used in uni-processor systems without preemption enabled, may lead to deadlock where process goes on spinning forever. Following table summarizes the above points.

Mutex Vs Spinlock

Test for lock.
If available, use the resource
If not, go to wait queue
Test for lock.
If available, use the resource.
If not, loop again and test the lock till you get the lock
When to use
Used when putting process is not harmful like user space programs.
Use when there will be considerable time before process gets the lock.
Used when process should not be put to sleep like Interrupt service routines.
Use when lock will be granted in reasonably short time.
Incur process context switch and scheduling cost.
Processor is busy doing nothing till lock is granted, wasting CPU cycles.
In next post we would list some the APIs which are used for using mutex and spinlocks in Linux kernel.