Why is the size of L1 cache smaller than that of the L2 cache in most of the processors?

L1 is very tightly coupled to the CPU core, and is accessed on every memory access (very frequent). Thus, it needs to return the data really fast (usually within on clock cycle). Latency and throughput (bandwidth) are both performance-critical for L1 data cache. (e.g. four cycle latency, and supporting two reads and one write by … Read more

What is the purpose of the “Prefer 32-bit” setting in Visual Studio and how does it actually work?

Microsoft has a blog entry What AnyCPU Really Means As Of .NET 4.5 and Visual Studio 11: In .NET 4.5 and Visual Studio 11 the cheese has been moved. The default for most .NET projects is again AnyCPU, but there is more than one meaning to AnyCPU now. There is an additional sub-type of AnyCPU, … Read more

Can a speculatively executed CPU branch contain opcodes that access RAM?

The cardinal rules of speculative out-of-order (OoO) execution are: Preserve the illusion of instructions running sequentially, in program order Make sure speculation is contained to things that can be rolled back if mis-speculation is detected, and that can’t be observed by other cores to be holding a wrong value. Physical registers, the back-end itself that … Read more