Cache Coloring as a Path to an (even) safer System?

Dec 06, 2023 – PikeOS


The be-all and end-all of safety is deterministic behavior: knowing in advance exactly how a programmable electronic system (in plain terms, a computer) will behave. Hypervisors, separation kernels, and real-time operating systems are designed to guarantee deterministic behavior. The system manages the resources so that developers can go about their work (relatively) carefree. When it comes to memory, the system allocates areas of RAM with the help of the Memory Management Unit (MMU). So why use cache coloring to deviate from this way of working?

Modern CPUs Are Unthinkable without Caches

First of all, it's important to understand how a modern multi-core CPU is built. In addition to several cores, modern multi-core processors also have several intermediate memories called caches. These caches serve as a fast middleman between the comparatively slow RAM and the CPU. Essentially, the task of a CPU cache is to speed up processing by keeping data that is needed again and again close to the core, which shortens execution time.

Each core has at least one cache that only it can access, and typically there is one cache that all cores share. It is this shared last-level cache (typically the L2 or L3 cache) that can cause problems: if processes compete for the same cache resources, a delay is inevitable and, with a bit of bad luck, a conflict arises. If one task occupies a certain area of the cache, data belonging to another process can be displaced, and that process has to wait until its data has been fetched again from RAM. This leads to unpredictable execution times. Because data can in principle end up anywhere in the last-level cache, the contents of different processes also become intermingled, which can cause systems to behave less deterministically.
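The interference described above can be illustrated with a toy model. The following Python sketch does not reflect how any real last-level cache is built: it assumes a direct-mapped cache with 64-byte lines and 1024 sets, and the two addresses are arbitrary values chosen so that both tasks land in the same cache set.

```python
LINE_SIZE = 64    # bytes per cache line (assumed for this toy model)
NUM_SETS = 1024   # number of sets in the shared cache (assumed)


class DirectMappedCache:
    """Toy shared cache: each set holds exactly one line."""

    def __init__(self):
        self.tags = [None] * NUM_SETS

    def access(self, addr):
        """Return True on a hit, False on a miss (the line is then loaded)."""
        line = addr // LINE_SIZE
        index = line % NUM_SETS
        tag = line // NUM_SETS
        if self.tags[index] == tag:
            return True
        self.tags[index] = tag   # evict whatever was stored in this set
        return False


def task_a_misses(neighbour_active, rounds=100):
    """Count the misses task A suffers while re-reading a single address."""
    cache = DirectMappedCache()
    misses = 0
    for _ in range(rounds):
        if not cache.access(0x1000_0000):   # task A (hypothetical address)
            misses += 1
        if neighbour_active:
            cache.access(0x2000_0000)       # task B, same set, different tag
    return misses


print("task A alone:        ", task_a_misses(False), "misses")
print("task A with neighbour:", task_a_misses(True), "misses")
```

In this model, task A misses only once when it runs alone, but misses on every access as soon as the neighbouring task keeps evicting its line. Real caches are set-associative and therefore more forgiving, but the effect is the same in kind.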

Well orchestrated: The Relationship between MMU and Hypervisor/Separation Kernel

Now, the MMU manages memory by establishing connections (called mappings) between virtual memory and physical memory. This allows the programmer to address memory reliably and safely, and an MMU works so dependably that its use in ordinary applications is unproblematic.
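To make the idea of a mapping concrete, here is a minimal Python sketch of an address translation. It is a simplification under stated assumptions: a single-level page table held in a dictionary, 4 KiB pages, and a hypothetical `translate` helper. A real MMU walks multi-level tables in hardware and also checks access permissions.

```python
PAGE_SIZE = 4096  # 4 KiB pages (assumed)


def translate(page_table, vaddr):
    """Translate a virtual address using a {virtual page: physical page} table."""
    vpn, offset = divmod(vaddr, PAGE_SIZE)
    if vpn not in page_table:
        raise LookupError(f"no mapping for virtual address {vaddr:#x}")
    return page_table[vpn] * PAGE_SIZE + offset


# Hypothetical mapping: virtual page 0x00400 is backed by physical page 0x81CE5.
page_table = {0x00400: 0x81CE5}

print(hex(translate(page_table, 0x00400018)))  # → 0x81ce5018
```

The application only ever sees the virtual address; which physical page stands behind it is decided by whoever fills in the table.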

In safety-critical applications, however, relying on the MMU alone is not quite enough. A real-time operating system and hypervisor/separation kernel such as PikeOS goes one step further and partitions resources so that partitions cannot affect each other. Specifically, this means that in normal operation (without cache coloring) the memory areas of the RAM are predefined. PikeOS requires a start address in memory and a size to define a specific memory area (cluster). The mapping itself is still done by the MMU.
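A memory area defined by a start address and a size can be modelled in a few lines. The sketch below is purely illustrative and is not the PikeOS configuration format: the `MemoryRegion` type, the region names, and all addresses are hypothetical. It shows the kind of check a static configuration makes possible, namely that no two predefined areas overlap.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class MemoryRegion:
    name: str
    start: int   # physical start address
    size: int    # size in bytes

    @property
    def end(self):
        """First address after the region."""
        return self.start + self.size


def find_overlaps(regions):
    """Return the names of all pairs of regions that share at least one byte."""
    return [
        (a.name, b.name)
        for a, b in combinations(regions, 2)
        if a.start < b.end and b.start < a.end
    ]


regions = [
    MemoryRegion("control", 0x8000_0000, 0x10_0000),
    MemoryRegion("logging", 0x8010_0000, 0x10_0000),
]
print(find_overlaps(regions))  # → []
```

Because the regions are known before the system starts, such a check can run at configuration time rather than at run time.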

How Cache Coloring works

Cache coloring, however, departs from the normal way memory is assigned. A typical address could be 0x81CE5018, although the concrete layout can differ from architecture to architecture. A certain part of the address is now treated as the "color", like the 5 in 0x81CE5018. Memory pages with the same color lie a fixed distance apart in the address space and occupy the same part of the cache. A memory cluster for one color is therefore not a single contiguous block but is assembled from these equally spaced pages. With the i.MX8X processor family, for example, up to 16 colors can be defined with PikeOS; the actual number of possible colors depends on the hardware used. Because the coloring is applied to memory pages, the term page coloring is used synonymously with cache coloring.
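The address example can be worked through in code. The following Python sketch assumes 4 KiB pages and 16 colors taken from the address bits directly above the page offset, which happens to match the hex digit 5 in 0x81CE5018. On real hardware the bits that determine the color depend on the cache geometry, and the helper names here are hypothetical.

```python
PAGE_SIZE = 4096   # 4 KiB pages (assumed)
NUM_COLORS = 16    # as in the i.MX8X example; hardware dependent


def page_color(paddr):
    """Color of the physical page that contains paddr."""
    return (paddr // PAGE_SIZE) % NUM_COLORS


def pages_of_color(base, color, count):
    """First `count` page start addresses of `color` at or above `base`."""
    page = -(-base // PAGE_SIZE)   # round up to the next page boundary
    found = []
    while len(found) < count:
        if page % NUM_COLORS == color:
            found.append(page * PAGE_SIZE)
        page += 1
    return found


print(page_color(0x81CE5018))  # → 5
print([hex(p) for p in pages_of_color(0x81CE0000, 5, 3)])
# → ['0x81ce5000', '0x81cf5000', '0x81d05000']
```

Under these assumptions, consecutive pages of one color are 16 × 4 KiB = 64 KiB apart, which is the "distance" between the pieces of a colored memory cluster.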

The Rationale behind Cache Coloring

The whole purpose of cache coloring is to make applications behave more deterministically through exact knowledge and definition of the physical memory they use. The idea is to group certain processes (threads) into a partition and assign them to known, previously defined physical memory. The advantage is that spatial separation is enforced even more strictly, so that in the best case interference is eliminated completely. If the system is designed carefully, cache coloring can even achieve efficiency gains because, unlike in normal operation, the cache does not have to be invalidated on partition switches or context switches. Finally, it prevents excessive memory accesses by one task from slowing down other tasks. The result is more deterministic behavior and a reduced risk of side-channel attacks.
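Why colored memory removes interference can be shown with a short calculation. The Python sketch below assumes a physically indexed cache with 64-byte lines and 1024 sets and 4 KiB pages, which yields 16 colors. These numbers are illustrative and do not describe a specific processor. It computes which cache sets the pages of one color can ever occupy.

```python
LINE_SIZE = 64     # bytes per cache line (assumed)
NUM_SETS = 1024    # cache sets, physically indexed (assumed)
PAGE_SIZE = 4096   # 4 KiB pages (assumed)
NUM_COLORS = (NUM_SETS * LINE_SIZE) // PAGE_SIZE   # 16 with the values above


def cache_set(paddr):
    """Index of the cache set that a physical address maps to."""
    return (paddr // LINE_SIZE) % NUM_SETS


def sets_used_by_color(color, pages=8):
    """All cache sets touched by the first `pages` pages of one color."""
    used = set()
    for k in range(pages):
        page_base = (k * NUM_COLORS + color) * PAGE_SIZE
        for offset in range(0, PAGE_SIZE, LINE_SIZE):
            used.add(cache_set(page_base + offset))
    return used


partition_a = sets_used_by_color(5)
partition_b = sets_used_by_color(6)
print(len(partition_a), len(partition_b))   # cache sets available to each
print(partition_a & partition_b)            # sets both could fight over
```

In this model each color always lands in the same 64 sets, no matter how many pages of that color are used, and two different colors never share a set. A partition restricted to its own colors therefore cannot evict another partition's cache lines.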

However, cache coloring also has disadvantages: because a process no longer has the entire cache at its disposal, memory-hungry applications can be slowed down in the worst case. Cache coloring therefore makes most sense when resource requirements are known exactly in advance and no overly memory-hungry applications are running. This may be the case for systems with certain security requirements. In principle, it is still possible to assign several colors to one process, but here, too, careful system design is essential; otherwise errors are almost inevitable.
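The trade-off can be put into numbers. The sketch below assumes that every color corresponds to an equal slice of the shared cache; the 1 MiB cache size, the partition names, and both helper functions are hypothetical. It computes the cache share a partition receives and flags colors that have been handed to more than one partition, the kind of design error that undoes the separation.

```python
from collections import Counter


def cache_share(cache_bytes, num_colors, assigned_colors):
    """Bytes of shared cache usable by a partition holding the given colors."""
    return cache_bytes * len(set(assigned_colors)) // num_colors


def shared_colors(assignment):
    """Colors that appear in the color list of more than one partition."""
    counts = Counter(c for colors in assignment.values() for c in set(colors))
    return sorted(c for c, n in counts.items() if n > 1)


CACHE_BYTES = 1 << 20   # 1 MiB shared cache (assumed)
NUM_COLORS = 16

assignment = {"control": [0, 1], "logging": [1, 2, 3]}   # hypothetical
for name, colors in assignment.items():
    print(name, cache_share(CACHE_BYTES, NUM_COLORS, colors), "bytes")
print("colors used twice:", shared_colors(assignment))
```

With these assumed figures a single color is worth 64 KiB of cache, so a partition whose working set is much larger than its share will see more misses than it would with the whole cache to itself.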

Conclusion

Cache coloring makes sense for applications that require deterministic behavior and whose design can be worked out exactly in advance. Security, safety, and, in special designs, performance can then be increased. If in doubt, you can always rely on the robust functionality of PikeOS.

More information at www.sysgo.com/pikeos

