The canonical address rules mean there is a giant hole in the 64-bit virtual address space. 2^47-1 is not contiguous with the next valid address above it, so a single mmap won’t include any of the unusable range of 64-bit addresses.
+----------+
| 2^64-1   |   0xffffffffffffffff
| ...      |
| 2^64-2^47|   0xffff800000000000
+----------+
|          |
| unusable |   not to scale: this part is 2^16 times as large
|          |
+----------+
| 2^47-1   |   0x00007fffffffffff
| ...      |
| 0        |   0x0000000000000000
+----------+
Also, most kernels reserve the high half of the canonical range for their own use (e.g. see x86-64 Linux’s memory map). User-space can only allocate in the contiguous low range anyway, so the existence of the gap is irrelevant to user-space code.
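The layout in the diagram above can be written as a small check. This is only a sketch in Python using plain integers: the names is_canonical and is_user_half are made up for illustration, and the default of 48 virtual address bits is the current-hardware assumption discussed here, not something software should hard-code.

```python
def is_canonical(addr: int, va_bits: int = 48) -> bool:
    """Return True if bits 63..(va_bits-1) of a 64-bit address are all equal."""
    if not 0 <= addr < (1 << 64):
        raise ValueError("addr must fit in 64 bits")
    # With va_bits=48 this keeps bits 63..47, i.e. 17 bits.
    top = addr >> (va_bits - 1)
    all_ones = (1 << (64 - va_bits + 1)) - 1
    return top == 0 or top == all_ones


def is_user_half(addr: int, va_bits: int = 48) -> bool:
    """Return True if addr lies in the low canonical half (bit va_bits-1 clear)."""
    return 0 <= addr < (1 << (va_bits - 1))


if __name__ == "__main__":
    for a in (0x00007fffffffffff, 0x0000800000000000, 0xffff800000000000):
        print(f"0x{a:016x} canonical={is_canonical(a)} user_half={is_user_half(a)}")
```

The first address past the low half, 0x0000800000000000, fails the check with 48 bits but would pass with va_bits=57, which is why the hole shrinks rather than moves when more address bits are added.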
Is there a guarantee by the OS that you will never be allocated memory whose address range does not vary by the 47th bit?
Not exactly. The 48-bit address space supported by current hardware is an implementation detail. The canonical-address rules ensure that future systems can support more virtual address bits without breaking backwards compatibility to any significant degree.
At most, you’d just need a compat flag to have the OS not give the process any memory regions whose high bits (above bit 47) aren’t all the same, like Linux’s current MAP_32BIT flag for mmap, or a process-wide setting. That could support programs that used the high bits for tags and manually redid sign-extension.
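To make "used the high bits for tags and manually redid sign-extension" concrete, here is a rough sketch of that scheme, modeled in Python with integers standing in for 64-bit pointers. The helper names (tag_pointer, get_tag, untag_pointer) are hypothetical, and the constant VA_BITS = 48 is exactly the assumption that such a program bakes in and that a compat flag would have to preserve.

```python
VA_BITS = 48                      # assumption: 48-bit virtual addresses
ADDR_MASK = (1 << VA_BITS) - 1    # low 48 bits hold the real address
MASK64 = (1 << 64) - 1


def tag_pointer(ptr: int, tag: int) -> int:
    """Overwrite the top 16 bits of a canonical pointer with a tag."""
    if not 0 <= tag < (1 << (64 - VA_BITS)):
        raise ValueError("tag must fit in the unused high bits")
    return (tag << VA_BITS) | (ptr & ADDR_MASK)


def get_tag(tagged: int) -> int:
    """Extract the tag stored in the top 16 bits."""
    return tagged >> VA_BITS


def untag_pointer(tagged: int) -> int:
    """Drop the tag and redo sign-extension from bit 47 to get a canonical address."""
    low = tagged & ADDR_MASK
    sign = 1 << (VA_BITS - 1)
    # (x ^ sign) - sign sign-extends from bit 47; the mask wraps back to 64 bits.
    return ((low ^ sign) - sign) & MASK64
```

On a system that handed out addresses needing more than 48 bits, tag_pointer would silently destroy real address bits, which is the case the compat flag is meant to rule out.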
Future hardware won’t need to support any kind of flag to ignore high address bits or not, because junk in the high bits is currently an error. Intel 5-level paging adds another 9 virtual address bits (57 in total), widening the canonical high and low halves. See Intel’s white paper.
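As a worked example of what "widening the canonical halves" means, the boundaries can be computed for both widths. This is just arithmetic in Python; canonical_halves is an illustrative name, not anything from the OS or hardware docs.

```python
def canonical_halves(va_bits: int) -> tuple[int, int]:
    """Return (top of the low half, bottom of the high half) for a given width."""
    half = 1 << (va_bits - 1)
    low_top = half - 1               # highest address with the high bits all 0
    high_bottom = (1 << 64) - half   # lowest address with the high bits all 1
    return low_top, high_bottom


if __name__ == "__main__":
    for bits in (48, 57):
        low_top, high_bottom = canonical_halves(bits)
        print(f"{bits}-bit: low half ends at 0x{low_top:016x}, "
              f"high half starts at 0x{high_bottom:016x}")
```

For 48 bits this gives 0x00007fffffffffff and 0xffff800000000000, matching the diagram; for 57 bits it gives 0x00ffffffffffffff and 0xff00000000000000. Every address that is canonical with 48 bits is still canonical with 57.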
Fun fact: Linux defaults to mapping the stack at the top of the lower range of valid addresses. (Related: Why does Linux favor 0x7f mappings?)
$ gdb /bin/ls
...
(gdb) b _start
Function "_start" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (_start) pending.
(gdb) r
Starting program: /bin/ls
Breakpoint 1, 0x00007ffff7dd9cd0 in _start () from /lib64/ld-linux-x86-64.so.2
(gdb) p $rsp
$1 = (void *) 0x7fffffffd850
(gdb) exit
$ calc
2^47-1
0x7fffffffffff
(Modern GDB can use starti to break before the first user-space instruction executes instead of messing around with breakpoint commands.)