Atomic Behaviour: Coding Best Practice

Atomic Behaviour: Coding Best Practice

Many organizations follow a pattern where they have a master database where updates happen and queries are handled from a separated database which is regularly synchronized. Such synchronization stuff is normally handled at the database level, but sometimes there are occasions when programs are written to achieve this synchronization. This generally happens when frequently changing business rules are used to extract data from the master database that needs to go into the query database. Interestingly most of such implementations that update the query database, I have seen, are implemented in the following manner:
  1. Verify the major input data – reject if inconsistent
  2. Delete the previous data present in the query database
  3. Insert the new data with which the process is invoked
At the outset the above steps look fine, and yes they work in most of the cases except in few scenarios. In one of the instances, when I faced problem with the above, was in a JEE environment. The method having the steps there had the transaction semantics as “TRANSACTION REQUIRED”. This method was being called from a batch program executing in and with transactions controlled by the JEE environment.

Problematic Scenario

The scenario involved entities that can be mapped to User, Privileges and User Privileges and the data being updated into the query database regularly was that of User and User Privileges entities. Privileges were assumed to be more stable and were not synchronized.
The incidents reported for this scenario was that of a particular user having lost some of his privileges was not able to perform some tasks. Not all privileges were lost and the privileges lost were random and inconsistent, eg. losing the view access but still having the update access for the same business functionality.

Causal Analysis

In order to understand why the way the method was written was the cause for the incident, let us look at the methods steps replicated below for the scenario:
  1. Check whether the user is present – reject input if not
  2. DELETE all the existing privileges of the specified user
  3. INSERT all the specified privileges for the user

Note that step one does not validate the presence of privileges in the query database; it assumes that they are present. The database, however, does this checking in step 3, to validate the referential integrity. Consider a scenario, when the one of the privileges of the user is missing from the privileges entity. This will cause the step 3 above to fail. However, note that the step 2 would have succeeded and if the transaction were to be committed after the failure, the database would be in an inconsistent state from business point of view. This is exactly how the batch program was written; it ignored the exception from this method. The rationale for doing so is obvious – one cannot rollback the complete transaction, just because one user’s data failed. The point to note, here is that the failure should not have left the database in an inconsistent state from the business point of view, which in our case it was doing. It was relying on the assumption that the caller would rollback the transaction and hence the problem.
The diagram illustrates the relationship between the data synchronization method and the batch program and highlights the difference from an online update on the same method. A batch program does a lot more transactional tasks than an equivalent online program. Each task cannot be treated as a transaction because that would unnecessarily slow down the system on the whole. Further if any of the tasks fails, a complete rollback too is not possible as that would lead to wasting of time and efforts spent in all the tasks that were successful.
One would argue that this should be handled by the transaction semantics, well that is correct, but then the JEE container managed transactions semantics does not allow for nesting of transaction context.

Learning & Best Practices

There are a few learnings that can be drawn from this episode, which are also the best practices for coding for atomic behaviour. These will certainly not eliminate the risks, but following them would certainly reduce the risks substantially. The best practices are as follows:
  1. Perform all the validations before any update. If the validations are not successful then raise an exception
  2. Perform the unrelated operations in Insert, Updates and then Delete order – Rationale being having incorrect data is better than losing correct data
  3. Perform only the operations that are required
  4. In case of throwing an exception, ensure that the database is in a consistent business state. This may require you to perform some more database operations to bring the database to a consistent business state.

The best practices above generic in nature and should be followed for all development having database.

Binary Indexed Trees

Binary Indexed Trees (Fenwick Tree)

Why Binary Indexed Tree?

Consider an array A : {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16}, you have to find the sum of a range say (1,5). One way of doing this can be by keeping the sum of elements from index 0 to index i, where 0 <= i<= n. So the array with all the cumulative sum is "sum" :{1, 3, 6, 10, 15, 21, 28, 36, 45, 55, 66, 78, 91, 105, 120, 136} and to calculate sum from 1 to 5 we can simply do sum[5] – sum[1]. This will take O(n) precomputing time and O(q) with q queries with O(1) complexity per query.

Let us modify the question now, Suppose we have another query that modifies value at some index i,  this will make us calculate the sum from index ‘i’ to ‘n’ again. Now the complexity will be O(q*n) if there are ‘q’ queries. Segment trees can be used to solve this in O(q*log(n)). (Refer to this post for segment trees).

Coding for segment trees can be a very lengthy and Hectic process, Segment Trees require a very large memory space, Debugging a code of segment tree is very difficult. Another approach to solve the above problem is to use Binary Indexed Tree data structure, which also has O(q*log(n)) complexity but BIT (Binary Indexed Trees) are much easier to code and require very less memory space than segment trees. Binary Indexed trees are also called Fenwick Trees. 

Representation of Binary Indexed Tree

Consider an input array A : {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16}. Binary Indexed Tree or Fenwick tree is represented using an array of size n, where n is the length of the input array. Let’s call the binary indexed tree of this array “tree[n]”. tree[idx] (where idx is some index of BIT) will store the sum of values of the given input array from index (idx – 2^r +1)  to index idx. Here, r is the position of the last set bit (from left to right) in binary representation of the index idx

So, for example idx = 6, binary representation of 6 is 0110, Therefore the last set bit from left to right is the 1st bit (considering 0 based index) which makes r = 1. Therefore tree[6] stores the sum from index 5 to index 6.
The diagram below shows value of r for every index from 1 to 16. The color of rth index is changed for better understanding.

Binary Indexed Tree

Therefore, tree[12] = A[9] + A[10] + A[11] + A[12]. To calculate “tree[idx]”, we can store the cumulative sum from index “0” to index “idx”  where (0 <= idx < n) in an array and then subtract "sum[idx – 2^r + 1]" from "sum[idx]". This will find value of tree[idx] in O(1).
So, tree[idx] = sum[idx] – sum[idx – 2^r + 1]. 

How to Find the value of the last set bit?

Let num be an integer whose last set bit we want to isolate. Note that num can be represented in the form a1b, where a represents the series of bits before last set bit and b represents all the zeros after the last set bit.

Integer (-num) can be found out using 2’s complement of num, which is done by adding 1 to the inverse of num. the expression for finding twos complement is as follows,

(a1b)¯ + 1 = a¯0b¯ + 1.

Since b consists of all zeroes, so b¯  consists all ones.
Therefore, finally we have

-num = (a1b)¯ + 1 = a¯0b¯ + 1 = a¯0(0…0)¯ + 1 = a¯0(1…1) + 1 = a¯1(0…0) = a¯1b

Now, we can easily isolate the last digit, using bitwise operator AND with num and -num

              a 1 b

        &   a¯1 b
     = (0…0)1(0…0)

So, to calculate the value of (idx – 2^r + 1) we just have to do the following operation

idx = idx – (idx & -idx);

Construction of Binary Indexed Tree 

For every index “idx”, tree[idx] is calculated in O(1) complexity using the expression tree[idx] = sum[idx] – sum[idx – 2^r + 1], where “sum[idx]” stores the cumulative sum from index “0” to index “idx”.

The code is shown below.

Sum Between Two Indices :

To calculate sum between two given indices l and r. we will have to calculate sum from index ‘0’ to index ‘r’, then the same thing from index ‘0’ to index ‘l’ and then calculate the difference between of the results obtained.

Let us consider an example of index 13, to calculate sum from index 0 to index 13, array tree will play a major role here, we know that tree[13] will store sum of 13th index only, tree[12] stores sum from 9th index to 12th index and tree[8] stores sum from  index 0 to index 8. So, adding tree[8] + tree[12] + tree[13] will give us cumulative sum from index 0 to index 13.

tree[13] = tree[13] + tree[12] + tree[8]

tree[1101] = tree[1101] + tree[1100] + tree[1000]                   (Binary representation)

Note that, complexity of our algorithm to calculate sum from index 0 to index idx will be O(log(idx)) .

The diagram below illustrate this.

Binary Indexed Tree

The Code to find sum from index 0 to index idx is shown below

Update Value at some position and update BIT : 

If a value at some index idx is added by some value say val, then we will have to update the tree at all those places which are affected by this index. 
For example, if value at 9 is changed, then tree[10], tree[12], tree[16] …so on, will be changed because

tree[10] = tree[9] + tree[10];
tree[12] = tree[9] + tree[10] + tree[11] + tree[12];

while we were reading the sum, we were removing last set bit from index until it became zero. Now, while updating the tree we should add one set bit to the index idx until it becomes greater than or equal to length.

Below is the code to do that.

Binary Indexed Trees easy to code, code length is very short and should be used wherever possible.

The Links of some practice problems are given below :

Virtualization: Hypervisors

Hypervisors in virtualization

As we saw basics of virtualization in this posts: Virtualizaton and it’s role in cloud computing
To understand concept of virtualization, first understand hypervisor concept. In this post will go in detail about hypervisors.
To quote Wikipedia, hypervisor is piece of software, firmware or hardware which enables users to run virtual machines. It allocates CPU, storage and bandwidth to virtual machine and manages execution of them. The machine on which this hypervisor runs is known as host machine and the machine which run on top of it are called as guest machines. There are two kinds of hypervisors available.
First is Type 1 which directly run on bare hardware and have complete access to hardware. Sometimes these are called as bare metal hypervisors. Some implementation of this kind are Xenserver by citrix, Hyper-V by Microsoft and VMWare ESX by VMWare.

Type 1 Hypervisor
Type 1 Hypervisor

The second is type 2 which run on conventional operating system as normal software and allow user to run different virtual machines as guest operating systems, these operating systems can have different file systems and process management however, they see hardware completely dedicated to them even though it is not. Some of the type 2 hypervisors are VMWare’s workstation and Oracle’s Virtualbox.

Type 2 hypervisor
Type 2 Hypervisor

Let’s see these hypervisors in detail

1. VMWare ESXi hypervisor

VMware is the industry leader in virtualizing on Intel platforms. They have a comprehensive suite of software products to managed small and large environments, both for internal IT staff and for service providers offering virtualized cloud infrastructure for their clients. ESXi is type 1 hypervisor.

Now a days, ESXi hypervisor comes embedded into the server hardware and not need to buy server and ESXi CD_ROM separately. VMWare has remove the bulk of the console code which was based on Red Hat Linux kernel code and replaced that with remote command line console. This has greatly reduce the memory footprint of ESXi from 2GB to approximately 70MB.

Basic component of ESXi is it kernel also called as vmkernel. In addition to all basic functionalities of modern day operating systems, it support the feature of sharing hardware seamlessly to many users above it. Architecture diagram of ESXi is given below:

ESXi type 1 hypervisor
ESXi architecture

We will discuss each component of this architecture in detail in next few posts.
To read how WMWare has removed legacy  Linux derived device drivers with their own native device driver, please refer :

Advantages of VMWare
It has been around longer and has developed as an Enterprise-grade virtualization technology.

2. XenServer

Xen is an open source hypervisor, commercialized by Xensource and currently developed by Citrix who is trying to provide a similar set of virtualization and cloud management software as VMware. It is the hypervisor used by Amazon EC2, and is available in many Linux distributions.
Xen is the only type 1 hypervisor which is open source.

Architecture of Xen hypervisor is shown in figure below

Xen Type 1 hypervisor

Xen hypervisor is very lean piece of software which run directly on hardware and manages memory and CPU scheduling. Dom 0 is special VM which has privileges to directly access hardware and it handles all the system IO.
Guest OS can Paravirtualized or Hardware virtualized based on the need and Xen can support both the types.

Advantages of XenServer
  • ability to run “paravirtualized” guests, where the guest machine runs much faster by making modifications to the guest OS rather than translating everything through the hypervisor. 
  • It can run on older processors.
  • Very small footprint (less than 1 MB as it builds on principle of micro kernel)
Good source to learn about Xen project is Xen overview 

3. KVM

KVM (kernel virtual machine) is an open source hypervisor based on QEMU and adopted by the Linux community when Citrix bought Xensource and stopped playing nice with the Linux distributions. 

Advantages of KVM
  • It runs as a kernel module (similar to a .dll file in Windows), so that it can run on any Linux system and has even been ported to FreeBSD and other operating systems. 
  • KVM requires newer processors with hardware-assisted virtualization extensions (Intel-VT and AMD-V)

4. Hyper-V

Need to research on that.

5. Virtualbox.

Virtualbox is now part of Oracle. Virtualbox runs really well as a desktop virtualization platform (similar to Parallels Desktop and VMware Workstation/Fusion). Even though most people run Virtualbox as a desktop system, it works on servers too. It’s based on the QEMU emulator, 
If you’ve got a Mac and want to run Windows, or have Windows and want to try Linux, Virtualbox is thing you are looking for. It’s cross-platform, which is always a winning choice.
This is for now on Hypervisors, I would incrementally add more info to this post as and when I read something good on this topic. If you any material or links that should be included here, please write in comment.

Counting permutations : Programming as an art

Counting permutations : Programming as an art

I am writing this blog on Algorithms to lay emphasis on not only the solution to the problem, but to some other aspects which get completely ignored when writing the implementation of the algorithms. To start with lets first look at the Question:
Implement a program that would find the total number of ways of picking one number each from N bags (each bag containing K numbers) such that the sum of the picked numbers is greater than a given threshold (H)? K & N > 1? Assume that the program is invoked on the command line, with N, K, input-file-name and H as parameters to the program and outputs the result to its standard output.

Brute Force Algorithm

To start with let us look at how we would solve this problem in the simplest of manner. If we could somehow manage to iterate through all the permutations, then we have to simply add the numbers selected in the current permutation and compare it with the threshold and increment the count if found greater.
The algorithm is simple and can be written as follows:

The complexity of this simple algorithm is O(KNN) since there would KN permutations and for each one of them the sum needs to be found for the N numbers in a given permutation.

Think before you Program

I had given this as an exercise to some students. I am reproducing here, without prejudice, one of the answer to highlight some of the other aspects that were completely ignored.

Biggest problem that I think in the above code is to locate where the algorithm is implemented. The program is simple and it is not difficult to locate where the real algorithm is in this program. Even though the implementation is recursive, but it is in essence doing exactly, with some optimization, what the algorithm I have written above. However, it is spread across at multiple places in the program. It is party present in the recursive method and partly in the main method.
Consider the situation when I would have to use this algorithm for some analytical reasoning in some business application and I had asked for POC to be done. Using the above implementation probably would be difficult for me, since the program does not separate the concerns by putting the logically similar concerns together and the one dis-similar separated with loose coupling.
Another big problem here is the field ‘count’ being held in a class scoped variable. This is very dangerous considering the fact one would wants to calculate the count multiple times in parallel when in production.
One concern that an architect would have is to be able to use different algorithm in different situations, and therefore will require a proper modeling of the problem. Since there is none, the architect would have to sit down and do the modeling afresh.

Art or Science

Now, the question is, whether the aspects that were missed completely in the above solution so difficult that they were completely ignored. So lets just put some thought to it and list down the high level concerns in the problem (I think this is know as the CRC [class responsibility collaboration] card). At a very high level, I see the following concerns:

  1. Bag containing the numbers – I intend to model it via a interface Bag which is responsible for the following:
    1. Count of numbers in the bag
    2. Some kind of Iterator though the numbers in the bag
    3. Min, Max numbers in the bag
    4. Mechanism to hold the numbers in a particular order as desired by the algorithm
    5. With a Self Cloning, Self Factory default implementation
  2. Algorithm to find the count – I intend to model it via a interface PermutationCounter which is responsible for the following:
    1. Name of the algorithm
    2. A method to calculate the count for a given array of bags and the threshold
  3. Take problem input from the screen/file and invoke the algorithm for the problem – this will be done by the main method

Simple, was it not? The advantage that one gets from this approach is that the responsibilities have been clearly demarcated and it is any bodies guess which all parts can be very easily reused. Yes, the Bag and PermutationCounter along with all its implementation can be reused. Now, lets look at adjacent diagram which captures the class hierarchy and how it is to be used by the client (for the scope of this problem it is the main method).  In actual, one can experiment and decide on the strategy to follow for using the implementations. Now let me return the the point where I left the iterative algorithm for this problem. So how are the aspects present in the algorithm mentioned in the start actually implemented. The current permutation can be easily held in an integer array of same size as the number of bags. Each element in the array represents the index of the number in the corresponding bag. To start with this array will be initialized to say for N = 4 as [0, 0, 0, -1] and then can be manipulated to produce arrays (for K = 3) [0, 0, 0, 0], [0, 0, 0, 1], [0, 0, 0, 2], [0, 0, 1, 0], and so on till [2, 2, 2, 2]. These two methods can be present in the IterativePermutationCounter class as private method (implementation as below).
Given the above two implementations, implementation of the algorithm is as follows:
For the complete code and some interesting implementations for the problem you can request to the author.

A method to quantify the quality of software

Measurement of quality

Today, the dependence of an enterprise on IT has increased many folds than it used to be twenty years back. The business too, is changing very fast and to remain competitive, the agility of the IT infrastructure and software is essential. Virtualization and cloud provides this much needed agility to IT in the infrastructure area, but when it comes to software and that too custom software the solution is not as simple. It all boils down to how fast the software, specifically the custom software, can be restructured to meet the ever changing demands of the business. Among many factors that influence this restructuring, the biggest that comes in the way of high agility is the quality of the code and hence measurement of quality.

There are many quality metrics in the software industry that are today used to measure some aspect of the code. For example cyclomatic complexity, which measures the number of linearly independent paths through a section of program, gives a measure of the complexity of the corresponding section in some way. Is this the complete measure of the complexity? Obviously the answer would be no, since the complexity dependents on many other factor apart from the linearly independent paths. Some of the key measures are cyclomatic complexity, cohesion, coupling (for example Data Abstraction Coupling and Fan-out Coupling), N path complexity, code coverage, program size, documentation, MOOD metrics, and adherence to standards.

Software quality measurement

The quantities obtained from these quality metrics are different as they measure different aspects of the code. Simply doing a mathematical operation to some of these quantities and then adding them will give us measure e.g. Maintainability Index, but will it balance all concerns of different stakeholders? A single approach to fit all needs would be too simplistic an approach. With years Maintainability Index has been redefined many times. Following are some of its definitions:
  • The original formula:
    MI = 171 – 5.2 * ln(V) – 0.23 * (G) – 16.2 * ln(LOC)
  • The derivative used by SEI:
    MI = 171 – 5.2 * log2(V) – 0.23 * G – 16.2 * log2 (LOC) + 50 * sin (sqrt(2.4 * CM))
  • The derivative used by Microsoft Visual Studio (since v2008):
    sMI = MAX(0,(171 – 5.2 * ln(V) – 0.23 * (G) – 16.2 * ln(LOC))*100 / 171)
The above formulation uses V for Halstead Program Volume; G for Cyclomatic Complexity; LOC: for Lines of Source Code; and CM for Comment Ratio (lines of comment to total number of lines). This formulation for Maintainability index used the experience and skills of the individuals and organizations where they were first conceived. This has for long been an art and highly dependent on the skills of the individuals and the data he/she is working with. Note that with experience only have the individuals/ organization been able to find constants or mathematical functions which have given results matching the expectations on the code at hand.
In my experience with developing and maintaining software for many of my organizations customers, I have seen the concerns change over time in the same engagement. The index such as the above still gives the same measure and therefore becomes less applicable. From engagement to engagement, since the priorities vary, the use of same index again is less applicable. Maintenance engagement, I have noticed, are more focused towards maintainability. So would be the case with products which are mature. Development engagements, however, are more holistic but then tend to focus on the maintainability aspect as the product being developed becomes mature.
The quality metrics sited above are not the universe. There are bound to be changes in them itself and addition of newer and smarter metrics. Architects and managers would certainly want to use them too.
A more mature method is, therefore, required to be developed which is independent of the quality metric in question and treats all the quality metrics in a similar manner. With quantities produced from such a method, it would be easier to alter the index based on the concerns relevant at the time and still be able to correlate it with the indices values obtained in the historical times.

Why should such a quantity exist?

To answer this question, I would like to consider the example of two applications along with the quality metric ‘cyclomatic complexity’. Let me call them A1 and A2. Let these two applications have similar number of code artifacts. Application A1 has most of the code artifacts with cyclomatic complexity in the 1-3 range. While the application A2 has most of the code artifacts with cyclomatic

complexity in the 8-10 range. Note that the Industry best practice for this quality measure is 10. So the question is do the two applications considered in the scenario have the same code quality?

Obviously the answer is no. The artifacts in application A1 have cyclomatic complexity less than that in application A2. This in turn means that the artifacts of application A1 are simpler than that of application A2. The graph of two applications when plotted in the manner shown above makes this fact very obvious.
Notice, in the graph above I compared two applications. Let us for this moment assume that we have a mathematical formulation which can compare two applications in the manner shown in the graph above and give us a quantity. What if we were to compare each application with a hypothetically perfect application of similar size? Now with the assumed mathematical formulation we can obtain a quantity for both the applications and can use it to compare the two applications.
Now, what is such a mathematical formulation? One would be tempted to use average as the formulation, but then average will not cover the essence present in the graph. If one dwells further into the statistical quantities, the quantity that covers the essence of the graph above is the general correlation coefficient. Here, the correlation is on the count of code artifacts having a particular range of values of the quality metric with a hypothetical perfect application. Note that it is very simple to qualify a similar sized perfect application. All the counts would be in the range that is considered best from the quality perspective for that quality metric. The formula that I will use for correlation after deriving it from the general correlation coefficient will be as follows:
The scores ai are derived by subtracting the quality metric value of the code artifact i from the value that is considered best for that quality metric. This is to be done for all artifacts that are not at the desirable levels (It should be ensured that these values are negative). For the ones that are at the desirable levels the value obtained for the quality metric is to be used. However, if the quality metric is grouped in k groups with the ith group having the assigned score as ai and the count of artifacts from the application lying in the ith group is ni (given that ∑ni=1ni = n), the above formula will change to
Now let us look at some good formulations for this quantity for a given quality metric. The table below shows some scenario of different kinds of application where the counts for the quality metric is split into three groups viz. good (2), average(-1) and bad (-2).

Quality Metric Grouping Artifacts Count for Application Classification
Perfect Bad Bad Bad Below Average Below Average Average On the edge Good
Good(2) 50 0 0 0 25 25 25 35 40
Average(-1) 0 50 0 25 0 17 25 15 7
Bad(-2) 0 0 50 25 25 8 0 0 3
Expected quantity < 0.2 < 0.2 < 0.2 < 0.4 < 0.4 < 0.65 < 0.7 > 0.7
τ -1 -1 -0.948 0 0.197 0.316 0.625 0.709
(1+τ)/2 0 0 0.025 0.5 0.598 0.658 0.812 0.854
[(1+τ)/2]2 0 0 0 0.25 0.358 0.433 0.659 0.729

Notice that the spread for bad applications correlation value lies between 0.5 to -1 while for applications average or better the range of correlation lies from 0.5 to 1. This leaves very little scope of identifying good, on the edge, average applications. Thankfully, since the number is between -1 and 1, squaring or cubing the number will result in increasing the range where we want it to be increased. Squaring (and keeping the sign) reduces the range for bad applications by making it from 0.25 to -1 while increasing the range for the rest type of applications by making it from 0.25 to 1. Also notice the calculation (1 + τ)/2 just changes the whole range from [-1, 1] to [0, 1]. Since [(1 + τ)/2]2 gives a very good value in comparison to the value I was expecting for each type of application.


The method to quantify the quality or measurement of quality of software provides a way to calculate a correlation value for different code quality matrices. Since the value obtained are all correlation values, a weighted addition can easily be done to arrive at an overall score. The weights can be so chosen to be in line with the key concerns of various stakeholders relevant to the software. Such a score is not dependent on the skills of the individual and therefore have greater significance and can be used in many ways.