Understanding Virtual Address, Virtual Memory and Paging

Before I answer your questions (I hope I do), here are a few introductory remarks:

Remarks

The problem here is that “virtual memory” has two senses. “Virtual memory” as a technical term used by low-level programmers has (almost) nothing to do with “virtual memory” as explained to consumers.

In the technical sense, “virtual memory” is a memory management system whereby every process has its own virtual address space, and memory addresses in that address space are mapped to physical memory addresses by the OS kernel with hardware support (uses terms like TLB, multi-level page tables, page faults and walks, etc.). This is the sense of VM that you are interested in (described below).

In the non-technical sense, “virtual memory” is disk space used in lieu of RAM (uses terms like swap, backing store, etc.). This is the sense of VM that you’re not particularly interested in, but it seems that you’ve seen some material that deals primarily with this sense of the term or muddles the two.

Question 1

what happens when my programs want to access memory address 0xFFFFFFFFF? I do only have 4GB

In this case, your “Theory 1” is closer.

VM decouples the addresses that your program "sees" and works with (virtual addresses) from physical addresses. Your 4GiB of memory may be at physical addresses from 0x0 to 0xFFFFFFFF (8 F's), but the address 0xFFFFFFFFF (9 F's) is a perfectly valid address in the user-space portion of the (canonical) virtual address space. Provided that 0xFFFFFFFFF is in a block allocated to the process, the CPU and kernel (in concert) will translate the page address 0xFFFFFF000 (assuming 4k pages, we just hack off the lower 12 bits) to a real physical page, which could have (almost) any physical base address. Suppose the physical base address of that page is 0xeac000 (a relationship established when the kernel gave you the virtual page 0xFFFFFF000); then the byte at virtual address 0xFFFFFFFFF is at physical address 0x00eacfff.
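
To make that arithmetic concrete, here is a minimal C sketch of the split the hardware performs, assuming 4k pages; the physical frame 0xEAC000 is just the made-up value from the example above:

    #include <stdint.h>
    #include <stdio.h>

    #define PAGE_SHIFT 12   /* 4 KiB pages: the low 12 bits are the offset */

    int main(void)
    {
        uint64_t vaddr     = 0xFFFFFFFFFULL;  /* the 9-F address above */
        uint64_t offset    = vaddr & ((1ULL << PAGE_SHIFT) - 1);  /* 0xFFF */
        uint64_t vpage     = vaddr & ~((1ULL << PAGE_SHIFT) - 1); /* 0xFFFFFF000 */
        uint64_t phys_page = 0xEAC000ULL;     /* made-up frame from the example */

        printf("vpage 0x%llx + offset 0x%llx -> physical 0x%llx\n",
               (unsigned long long)vpage, (unsigned long long)offset,
               (unsigned long long)(phys_page | offset));
        return 0;
    }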

When your program dereferences 0xFFFFFFFFF (assuming 4k pages), the CPU chops off the lower 12 bits and looks the page up in the dTLB (translation lookaside buffers are virtual-to-physical page-mapping caches; there's at least one for data and one for instructions). If there's a hit, the CPU constructs the real physical address and fetches the value. If there's a TLB miss, the CPU (on x86, its hardware page-table walker) consults, or "walks", the page tables to determine the right physical page, and caches the result in the dTLB (it's highly likely to be reused almost immediately); the access is then retried and succeeds without triggering another walk. A page fault is raised only if the walk finds no valid mapping, at which point the kernel takes over: it either fixes things up (for example, by paging data in or allocating a fresh page) and restarts the instruction, or kills the process (a segmentation fault).
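
Here is a toy model of that lookup in C. It is a sketch only: the names are invented, real TLBs are set-associative rather than direct-mapped, and on x86 the walk is done by dedicated hardware (some architectures trap to the OS instead):

    #include <stdbool.h>
    #include <stdint.h>

    #define TLB_ENTRIES 64
    #define PAGE_SHIFT  12

    struct tlb_entry {
        uint64_t vpn;   /* virtual page number */
        uint64_t pfn;   /* physical frame number */
        bool     valid;
    };

    static struct tlb_entry dtlb[TLB_ENTRIES];

    /* Stand-in for the page-table walk: real x86-64 hardware follows the
     * multi-level tables pointed to by CR3. An identity mapping keeps
     * this sketch self-contained. */
    static uint64_t walk_page_tables(uint64_t vpn)
    {
        return vpn;
    }

    uint64_t translate(uint64_t vaddr)
    {
        uint64_t vpn    = vaddr >> PAGE_SHIFT;
        uint64_t offset = vaddr & ((1ULL << PAGE_SHIFT) - 1);
        struct tlb_entry *e = &dtlb[vpn % TLB_ENTRIES];  /* direct-mapped */

        if (!e->valid || e->vpn != vpn) {        /* miss: walk and refill */
            e->vpn   = vpn;
            e->pfn   = walk_page_tables(vpn);
            e->valid = true;
        }
        return (e->pfn << PAGE_SHIFT) | offset;  /* build the physical address */
    }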

I admit that this description is pretty crummy (reflecting my own level of knowledge). In particular, exactly how a particular process is identified in the TLB is not 100% clear to me and is at least somewhat hardware-specific. It used to be that every context switch needed a full TLB flush, but more recent Intel CPUs have a 12-bit PCID (process-context identifier) field, which means that flushes, while still sometimes required, aren't always required on a context switch. Further crumminess arises from my failure to describe multi-level TLBs and PTEs (page table entries), and to address the significance of all this for data and instruction caching (although I do know that modern hardware can check whether an address could possibly be in some cache level in parallel with the TLB lookup).

Question 2

How processes are put in Virtual Memory? I mean does each process has
0x0 – 0xFFFFFFFFF virtual memory space available for them or there is
only one Virtual Memory address space where all the process are
placed?

Each process has its own completely distinct virtual memory space. This is (almost) the entire point of VM.
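
You can watch this on any POSIX system: after a fork, parent and child print the same virtual address for a global variable yet see different values there, because copy-on-write gives each its own physical page:

    #include <stdio.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int value = 1;  /* same *virtual* address in both processes */

    int main(void)
    {
        if (fork() == 0) {          /* child */
            value = 2;              /* copy-on-write: child gets its own page */
            printf("child:  &value=%p value=%d\n", (void *)&value, value);
            return 0;
        }
        wait(NULL);                 /* let the child finish first */
        printf("parent: &value=%p value=%d\n", (void *)&value, value);
        return 0;
    }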

In the olden days, the TLB was not "process aware" in any sense: every context switch meant that the TLBs had to be flushed completely. Nowadays, TLB entries carry a short "process context" field (on x86, the PCID) and the hardware supports selective flushing. You can kinda/sorta think of the PCID (an identifier the kernel assigns to each address space; it isn't literally derived from the PID) as being prepended to the virtual page address, so the TLB is process aware, and entries only need to be flushed selectively, e.g. when the kernel runs out of PCIDs and has to reuse one that another address space is still tagged with.
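
In the toy model from earlier, process awareness amounts to one extra tag field and a stricter hit test (field sizes and names here are illustrative, not Intel's exact layout):

    #include <stdbool.h>
    #include <stdint.h>

    /* Toy TLB entry, now tagged with an address-space ID. On x86-64 the
     * PCID is 12 bits; uint16_t is just a convenient C type for it. */
    struct tlb_entry {
        uint16_t pcid;   /* which address space this entry belongs to */
        uint64_t vpn;
        uint64_t pfn;
        bool     valid;
    };

    /* A hit now requires the context to match too, so entries from other
     * processes can stay resident across a context switch. */
    bool tlb_hit(const struct tlb_entry *e, uint16_t cur_pcid, uint64_t vpn)
    {
        return e->valid && e->pcid == cur_pcid && e->vpn == vpn;
    }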

Question 3

Is there one big giant page table which includes all the pages for
every process or each process has its own page table?

This is OS-specific, of course, but the usual arrangement (and Linux's) is that each process has its own set of multi-level page tables: the kernel keeps a pointer to each process's top-level table and, on x86-64, loads it into the CR3 register on every context switch. There is no single giant table tagged with PIDs. Per-process tables don't prevent sharing, because a lot of virtual-to-physical mappings are n:1 rather than 1:1 (them all being 1:1 would largely defeat a major purpose of VM): think about shared read-only pages containing the instructions for libraries like libc, or copy-on-write data pages shared between parent and child after a fork. In those cases, several processes' page tables simply contain entries that point at the same physical page, and the kernel separately tracks the reverse (physical-to-virtual) relationships so it can find every mapping when it needs to evict or change a page.
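
On Linux you can even peek at your own translations through /proc/self/pagemap, which holds one 64-bit entry per virtual page (note: since Linux 4.0 the frame number reads as zero without CAP_SYS_ADMIN, so run as root to see a real PFN). A minimal sketch:

    #include <fcntl.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <sys/types.h>
    #include <unistd.h>

    int main(void)
    {
        int x = 42;                          /* any mapped address will do */
        uintptr_t vaddr = (uintptr_t)&x;
        long page_size = sysconf(_SC_PAGESIZE);

        int fd = open("/proc/self/pagemap", O_RDONLY);
        if (fd < 0) { perror("open"); return 1; }

        uint64_t entry;
        off_t off = (off_t)(vaddr / page_size) * sizeof(entry); /* one entry per page */
        if (pread(fd, &entry, sizeof(entry), off) != sizeof(entry)) {
            perror("pread");
            return 1;
        }
        close(fd);

        printf("present=%d pfn=0x%llx\n",
               (int)(entry >> 63),                               /* bit 63: present */
               (unsigned long long)(entry & ((1ULL << 55) - 1))); /* bits 0-54: PFN */
        return 0;
    }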

Where Disk Comes In

Once you have a VM system, it becomes almost trivial to add the ability to retrieve a page from disk when a page fault occurs, and to implement "aging" for PTEs so that the least recently used pages can be written out to disk. Although this is an important feature on memory-constrained systems, it is almost entirely irrelevant to understanding how a VM system actually works.
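
If you want to see demand paging in action anyway, here is a small Linux/POSIX sketch: mmap hands out address space instantly, and the minor-fault counter from getrusage climbs as physical pages are faulted in on first touch:

    #include <stdio.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <sys/resource.h>

    static long minor_faults(void)
    {
        struct rusage ru;
        getrusage(RUSAGE_SELF, &ru);
        return ru.ru_minflt;
    }

    int main(void)
    {
        size_t len = 64 * 1024 * 1024;       /* 64 MiB of address space */
        char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED) { perror("mmap"); return 1; }

        long before = minor_faults();
        memset(p, 1, len);                   /* first touch faults pages in */
        printf("minor faults while touching: %ld\n", minor_faults() - before);

        munmap(p, len);
        return 0;
    }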
