How to use efficient and safe Real-Time Scheduling

Jun 03, 2016 – PikeOS, Whitepapers, Railway

William Stallings' book "Operating Systems" is a good reference for learning about scheduling. In chapter 9.2 he summarizes the aspects that play a role when writing a scheduler for an operating system:

Turnaround Time: Time between the submission of a process and its completion
Response Time / Determinism: Time from the submission of a request until the response begins to be received
Deadline: The start or completion deadline of a process
Predictability: A given job should run in about the same amount of time, regardless of the load on the system
Throughput: The scheduler should maximize the number of processes completed per unit of time
Processor utilization: Percentage of time the processor is busy
Fairness: Processes should be treated the same, so that no process suffers starvation
Enforcing priorities: The scheduler should favour higher-priority processes
Balancing resources: The scheduler should keep the resources of the system busy
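
Several of these criteria can be measured directly from a schedule. The following is a minimal sketch, not taken from Stallings or from any particular OS: it assumes a single CPU, first-come-first-served ordering and abstract time units, and the names Job and fcfs_metrics are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    arrival: int  # time at which the job is submitted
    burst: int    # CPU time the job needs to complete


def fcfs_metrics(jobs):
    """Run jobs first-come-first-served on one CPU and report basic metrics.

    Assumes a non-empty job list; time is measured from 0 to the last finish.
    """
    clock = 0
    busy = 0
    turnaround = {}
    response = {}
    for job in sorted(jobs, key=lambda j: j.arrival):
        start = max(clock, job.arrival)              # CPU idles until the job arrives
        finish = start + job.burst
        response[job.name] = start - job.arrival     # submission until first service
        turnaround[job.name] = finish - job.arrival  # submission until completion
        busy += job.burst
        clock = finish
    return {
        "turnaround": turnaround,
        "response": response,
        "throughput": len(jobs) / clock,  # jobs completed per time unit
        "utilization": busy / clock,      # fraction of time the CPU was busy
    }


jobs = [Job("A", 0, 3), Job("B", 2, 6), Job("C", 12, 4)]
metrics = fcfs_metrics(jobs)
print(metrics)
```

In this example job B waits one time unit behind job A, and the CPU idles between time 9 and time 12, which lowers utilization without any job missing work. A real-time scheduler weighs these figures differently: a bounded response time matters more than average throughput.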

If the OS is a Real-Time Operating System (RTOS) for embedded systems, then determinism and response time become essential. Processor utilization should be optimized as well: an inefficient scheduler can force you to use a more powerful (and more expensive) processor in order to meet the required run-time behaviour. The complexity of the scheduler rises with a multi-core processor, as applications can be executed concurrently on the available cores.

Safety certification of a real-time application running on a multi-core system is the ultimate challenge for a scheduler. Applications running on different cores of a multi-core processor do not execute independently of each other. Even if there is no explicit data or control flow between these applications, a coupling exists at the processor platform level, since they implicitly share resources. A platform property that may cause interference between independent applications is called a hardware interference channel. These channels can be categorized as:

CPU-core interference: We assume that cores execute independently of each other as long as they do not share hardware. Inter-processor interrupts should be handled by the operating system
Cache Sharing: Usually second-level caches are shared amongst the cores
Cache Coherency: If applications run on several cores, the consistency of local caches connected to the shared resources has to be ensured
Memory bus: Usually the bandwidth of the memory bus is shared between the cores
Shared I/O: Concurrent access to a shared I/O device may cause a performance loss if the bus reaches its bandwidth limit or if the bus can only handle one request at a time
Shared Interrupts: A hardware interrupt is typically routed to one core. If multiple devices are attached to one interrupt line and the devices are not all served by the same core, the core that receives the interrupt must pass it on to the other core(s), forcing them to check the interrupt status of their devices

On the software level, interference may be caused by the multiprocessing model and its load balancing (e.g. SMP, AMP, BMP), which has to provide the means to execute an application on different cores or to move it from one core to another at run-time.

The above-mentioned software and hardware interference on a multi-core platform is a hurdle for the deterministic behaviour of an embedded safety system. Most safety standards (IEC 61508, EN 50128, ISO 26262, …) require a timing analysis and the determination of the Worst-Case Execution Time (WCET) for the safety system. This can be quite difficult on multi-core processors because of the HW and SW interference.
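
To illustrate why interference complicates the WCET, here is a deliberately simple sketch. It assumes an additive model in which every access to a shared resource can be delayed once by each other core (as with round-robin bus arbitration). This model, the function names and all numbers are assumptions for illustration only; they are not prescribed by any of the standards mentioned, and real analyses are considerably more involved.

```python
def wcet_with_interference(wcet_isolated_ns, shared_accesses, access_delay_ns, other_cores):
    """Upper bound on execution time under an assumed additive interference model.

    Every shared access is assumed to be delayed at most once by each other core.
    """
    return wcet_isolated_ns + shared_accesses * access_delay_ns * other_cores


def meets_deadline(wcet_ns, deadline_ns):
    """A task is schedulable in this sketch if its bound fits within its deadline."""
    return wcet_ns <= deadline_ns


# Hypothetical task: 400 us when running alone, 2000 shared memory accesses,
# up to 50 ns extra delay per access and per competing core.
alone = wcet_with_interference(400_000, 2000, 50, other_cores=0)
quad_core = wcet_with_interference(400_000, 2000, 50, other_cores=3)

print(alone, meets_deadline(alone, 500_000))          # 400000 True
print(quad_core, meets_deadline(quad_core, 500_000))  # 700000 False
```

The same task that comfortably meets a 500 µs deadline in isolation misses it once three other cores compete for the memory bus, even though its own code has not changed.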

An adaptive time partitioning scheduler is able to handle the HW/SW interference challenges on a multi-core platform and to provide optimized CPU usage with a guaranteed response time (WCET). Using a patented Time Partition Scheduler, SYSGO was awarded the world's first EN 50128 SIL 4 certificate for the PikeOS real-time hypervisor on a multi-core platform.
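
The general idea of time partitioning can be sketched as follows. This is a generic, simplified illustration and not SYSGO's patented scheduler or the PikeOS API: it assumes a single core, a repeating major frame of fixed windows, priority scheduling inside the active window, and a hypothetical background partition that may use any time the window owner leaves idle. All names are made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Thread:
    name: str
    partition: int
    priority: int   # larger number = more urgent
    remaining: int  # ticks of work left


BACKGROUND = 0  # hypothetical partition that is eligible in every window


def active_partition(schedule, tick):
    """Return the partition that owns `tick` in a repeating list of (partition, length) windows."""
    frame = sum(length for _, length in schedule)
    offset = tick % frame
    for partition, length in schedule:
        if offset < length:
            return partition
        offset -= length


def run(schedule, threads, ticks):
    """Simulate `ticks` time units; return the thread name run in each tick (None = idle)."""
    trace = []
    for tick in range(ticks):
        window = active_partition(schedule, tick)
        # Only the window owner and the background partition may run in this tick.
        ready = [t for t in threads
                 if t.remaining > 0 and t.partition in (window, BACKGROUND)]
        if not ready:
            trace.append(None)
            continue
        chosen = max(ready, key=lambda t: t.priority)
        chosen.remaining -= 1
        trace.append(chosen.name)
    return trace


schedule = [(1, 3), (2, 2)]  # major frame of 5 ticks: 3 for partition 1, 2 for partition 2
threads = [
    Thread("safety", 1, 10, 4),
    Thread("comms", 2, 5, 1),
    Thread("logger", BACKGROUND, 1, 3),
]
trace = run(schedule, threads, 10)
print(trace)
```

In this sketch the "safety" thread always gets its window no matter what the other partitions do, which is what makes its response time predictable, while the low-priority "logger" only consumes time that would otherwise be wasted, which keeps CPU utilization high.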