Skip list implementation

In last post : Skip lists : Part 1 , we discussed what is skip list, how nodes are arranged, complexity improvements over singly linked list and how search is performed on skip lists. In this post, we will learn, how nodes are inserted and removed from skip lists, what randomization and what are various applications of skip lists, basically, the skip list implementation.

Randomization in skip lists implementation

What we saw in last post was exactly balance skip list. That means every node as exactly same number of nodes as was theoretically possible. Every i+1 level will have exactly half the nodes at i level. However, problem with such a  data structure is insertion and removal becomes too cumbersome, as we need to maintain the entire structure at every insertion and deletion in skip list.

In skip list, this problem is solved using randomization. In skip list it is not mandatory that every second node at level i should be promoted to level i+1. Each node’s level is decided by probability, by tossing a coin. If it is “head”, node moves to level above or remains at same level otherwise. Probability of “head” coming up in flipping of coin is 1/2. So ideally, level 1 should have n/2 nodes, level 2 should have n/4 nodes and so on. Also, node at each level is equally distributed and not bunched together at one end. So in ideal world, randomized skip list is same as balance skip list.

Insertion and delete in skip list

To insert a key x node in skip list, do a search on skip list to find predecessor of x.  If x is not in skip list, create a new node and insert is the lowest level and flip the coin. Based on output if coin flip, promote node to next level. Continue till we see “tail” on coin or highest level is reached.

To delete node from skip list is very simple, just remove node from all the levels in skip list.

Insert a node in skip list implementation

Complexity analysis of skip list implementation

As we are discussing though out our discussion, complexity of search on skip list is O(log n). Also, complexity of insertion and deletion is also dominated by O(log n).

To understand these complexities, please see that expected number of levels in skip list with n nodes is log n where no node is missing. (This is very similar to a binary tree with n nodes).

Search complexity is interesting and easy to understand when we look at it in reverse way, sometimes it is called as backward analysis. Look at figure below.

Observe that whenever a node is at level i+1, path jumps from level i to i+1. Probability of node being promoted to level i+1 from level i is 1/2. Similarly, there 1/2 probability that node remains at the same level. So the recurrence relation for number of jumps to reach level j will  be

`c(j) = 1 + 1/2 *C(j-1) + 1/2 * C(j)`

Solving equation it becomes

```c(j) = 2 + c(j-1)
which can proved as c(j) = 2j```

Since j is number of level in skip lists, hence search complexity becomes O(log n).

Applications of skip lists

1. Skip lists are used as underlying data store for other data structures like ordered set and ordered map.
2. MemSQL uses skip lists as its prime indexing structure for its database technology.
3.Lucene uses skip lists to search delta-encoded posting lists in logarithmic time.
4. Skip lists are used to implement lockless priority queues in distributed systems.

Reference: http://www.cs.umd.edu/~meesh/420/Notes/MountNotes/lecture11-skiplist.pdf
Please share if you know some other applications of skip lists.

Please comment or write if you find something wrong or think can be improved. Also, if you are interested in contributing to community, please share you articles, we will putting it on site with due credit.