Virtual memory using paging

Virtual memory using paging

In last post in this section, basics of virtual memory were discussed. In this post, we will understand how actually it is implemented in operating system and hardware using paging. As in said earlier there are two methods with which virtual memory is implemented: Demand paging and segmentation. Focus of this post is to discuss demand paging.
What is demand paging?
It is concept where the entire process in not brought into the memory but is divided into number of pages, and only relevant pages which are required for execution of instructions at that time are brought in main memory. Rest all pages reside at disk and as and when they are required by process, are swapped in from disk. Since pages are brought in on demand, it’s called demand paging.
As we know logical address space of a process can be bigger than the real physical memory, there might be scenarios when the required address page is not in the main memory and hence swap in and out are required. Let’s understand the paging and demand paging using a simple example.

A process which has logical address space of 128 bytes. (ridiculously small, but good for an example) and page size of process is 16 bytes, so there are total of 8 pages as shown in figure below.

Now, the main memory has only 64 bytes of memory and page size is same as 16 byte. At a time only four pages can reside in main memory. Mapping of four pages of the process is shown below. Rest four pages are put on to disk. Initial arrangement is as shown below

paging virtual memory

How many bytes are required to represent logical address space of process of 128 bytes? 7 bits right.
If there are 8 pages  then 3 bits are required to identify the page and rest 4 bits can identify the offset in that page raging from 0 to 16 bytes.3 bits give us virtual page number called as VPN and last 4 bits are offset.

logical address space page address

How do system translates VPN into physical page also called as Physical Frame Number?

There is a special per process data structure which stores this translation information which is called as page table. Page table is keyed on VPN and stores the corresponding PFN on to main memory.
One the PFN is found from this table offset is added to that PFN and the required address is accessed from the memory. For the above initial state, page table entries will be like this

Page table layout

Translation happens like this. In real world there are multiple hierarchies of page tables like page directory and then that points to page table and then the offset. For simplicity, those details are skipped here.

page address translation

Let’s take an example : Process wants to access address “34”. Binary representation of “34” is 
“0100010″ . In this VPN bits are 010 which is 2. Process looks in page table corresponding to VPN , PFN is 2  and get PFN as 3 in binary “011” which is added to the offset which is “0010”, So the final address accessed on physical memory will be “0110010” which is address number “50”.

There is a catch here. As mentioned earlier, the PFN given by page table may not be in the memory. In this case page fault is generated and operating system brings in the required page from disk to memory and instruction is restarted.

What if the main memory is already full when page fault occurs? Operating system needs to swap out one of the page to disk to make room for new page. How does operating system decides which page to move out? There are various page replacement algorithms like first in first out, round robin, least recently used, these paging algorithms will be discussed in detail in next post.
The entire process is depicted by figure below.

Page access and fault handling

There are some special bits which are stored in page table along with the PFN for a VPN.
These bits are useful to operating system while deciding on action to be taken for particular page. bits are as follows:
  1. Invalid bit /Present bit : This bit means that page is not valid and moved out the memory.
  2. Dirty bit: If this bit is set that means page has been written by some other process and this is not the latest copy of the page.
  3. Read/Write bit : It indicated the access right of process on this page.
  4. Accessed bit : This indicate that page is accessed since it is in the memory. This helps in page replacement algorithms like least recently used page replacement algorithm.
It is not that every time a process requires a page from memory, it has to read it from memory. There is a concept called as caching, In caching, operating system store some of the pages into faster accessible memory called as cache. Caching works on principle of locality, that means if the process has requested an address, it is more likely that it will access a near by address in next few instruction. So why not cache that page and made it fast accessible, Process first checks the cache in order to fetch the page, then if page is not found in cache then it goes to main memory. This is called as cache miss. Again size of cache is much smaller than main memory and page replacement algorithms need to applied here too.
There are various kind of caches which can be employed for this purpose, details of this later.
This is how virtual memory is implemented using paging, please write in comment if I missed something or something is wrong.