2010-03-10 12:00

Parallel real-time on multi-core systems with mainline Linux

By: Carsten Emde

Several tasks running simultaneously at real-time priority no longer interfere with each other!

The uni-processor experience

When learning how to use a uni-processor real-time system, people are often surprised to realize that it provides real-time capabilities for a single task only. The real-time task is the one with the highest priority - only this task enjoys the determinism of always responding to asynchronously arriving events in time, i.e. within the worst-case latency of the system.

Assigning the highest priority to two processes that have no control mechanism to ensure strictly sequential execution violates this prerequisite and is, therefore, considered bad design. The effective worst-case latency (Leff) of such a scenario is

Leff = Lsys + Texe

where Lsys is the worst-case latency of the system and Texe is the longest possible continuous execution time of either of the two processes. Since Texe may be orders of magnitude longer than Lsys, such a poorly designed system evidently may not fulfill the real-time expectation based on the worst-case latency of the system. Incidentally, the built-in latency histograms of the Linux kernel (CONFIG_WAKEUP_LATENCY_HIST) record the latencies of real-time tasks with a unique priority separately from those of tasks with a shared priority (see also this article). Only the wakeup latency of a real-time process that does not share its priority with another process is relevant.
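As a worked example of the formula above, the following minimal Python sketch computes Leff for two processes that share the highest priority. The function name and all figures (an Lsys of 50 µs and execution bursts of 5 ms and 1.2 ms) are hypothetical values chosen for illustration, not measurements.

```python
def effective_worst_case_latency_us(l_sys_us, t_exe_us):
    """Leff = Lsys + Texe, where Texe is the longest uninterrupted
    execution time among the processes that share the priority."""
    return l_sys_us + max(t_exe_us)


# Hypothetical figures, for illustration only (not measurements).
l_sys_us = 50            # assumed worst-case latency of the system, in µs
t_exe_us = [5000, 1200]  # assumed longest bursts of the two processes, in µs

l_eff_us = effective_worst_case_latency_us(l_sys_us, t_exe_us)
print(f"Leff = {l_eff_us} µs")  # prints "Leff = 5050 µs"
```

With these assumed figures, the effective worst-case latency is about a hundred times the worst-case latency of the system itself.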

Parallel real-time on a multi-core processor

With the advent of the PREEMPT_RT real-time patches for the mainline Linux kernel, deterministic execution is provided on a per-core basis. This is a major advantage, since it permits, for the first time, several processes with the same real-time priority to execute in parallel - as long as the number of such processes does not exceed the number of cores. The CPU affinities must, of course, be set in such a way that each process is pinned to a core of its own, e.g. using sched_setaffinity().
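The pinning described above might be sketched as follows, using Python's os module, which wraps the Linux sched_setaffinity() and sched_setscheduler() calls. The helper names assign_cores, pin_to_core and request_fifo, the task names and the priority value 80 are assumptions made for this illustration. Requesting SCHED_FIFO normally requires root or CAP_SYS_NICE, so the sketch only reports whether the request was granted.

```python
import os


def assign_cores(tasks, cores):
    """Give every task a core of its own; refuse if there are more tasks than cores."""
    cores = sorted(cores)
    if len(tasks) > len(cores):
        raise ValueError("more same-priority real-time tasks than cores")
    return dict(zip(tasks, cores))


def pin_to_core(core, pid=0):
    """Restrict a process (0 = the calling process) to a single core."""
    os.sched_setaffinity(pid, {core})


def request_fifo(priority, pid=0):
    """Ask for SCHED_FIFO at the given priority; return False if not permitted."""
    try:
        os.sched_setscheduler(pid, os.SCHED_FIFO, os.sched_param(priority))
        return True
    except PermissionError:
        return False


if __name__ == "__main__":
    # Hypothetical task names; cores are taken from the current affinity mask.
    plan = assign_cores(["control_loop"], os.sched_getaffinity(0))
    pin_to_core(plan["control_loop"])
    granted = request_fifo(80)  # 80 is an arbitrary example priority
    print("pinned to", os.sched_getaffinity(0), "- SCHED_FIFO granted:", granted)
```

In a real application every real-time process would call the pinning and scheduling helpers for itself, each with a different core from the plan.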

The above-mentioned built-in latency histograms of the Linux kernel do not treat processes with the same real-time priority as sharing a priority if they run on different cores. Therefore, in the context of multi-core real-time systems, the latency histograms may help to further fine-tune the system, since the specific worst-case latency of every core is evaluated and displayed separately. The differences between the cores and the individual shapes of their latency histograms are due to the various interrupt service routines that are assigned to a particular core. It is, thus, possible to select the core that is best suited to run a given real-time task. In the example wakeup latency histogram plot on the left, which was obtained on an Intel Nehalem i7 processor, core #3 has a considerably shorter worst-case latency (17 µs) than core #2 (37 µs). Note the individual shape of every core's histogram. The system uses a default IRQ configuration based on the standard IRQ load balancer. It is, however, quite conceivable that manual assignment of the IRQ threads to a limited number of cores will result in even better user space real-time capabilities. In addition, future kernel development may provide mechanisms to free a particular core completely from housekeeping, statistics and other system tasks; such an isolated core may then be able to run a real-time task with an as yet unseen low worst-case latency.
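The core selection described above can be sketched in a few lines of Python. The sketch assumes that the per-core histograms have already been read into dictionaries mapping a latency in µs to a sample count; the function names and the sample counts are invented for illustration, and only the two worst-case values (37 µs for core #2, 17 µs for core #3) are taken from the example in the text.

```python
def worst_case_latency_us(histogram):
    """Largest latency bin (in µs) that recorded at least one sample."""
    return max(latency for latency, count in histogram.items() if count > 0)


def best_core(histograms):
    """Core whose histogram shows the smallest worst-case latency."""
    return min(histograms, key=lambda core: worst_case_latency_us(histograms[core]))


# Sample counts are made up; only the maxima of 37 µs and 17 µs come from the text.
histograms = {
    2: {3: 900_000, 10: 1_200, 37: 1, 50: 0},
    3: {3: 950_000, 9: 800, 17: 2},
}

core = best_core(histograms)
print(f"core #{core}: {worst_case_latency_us(histograms[core])} µs")  # core #3: 17 µs
```

Bins with a count of zero are ignored, so an empty tail in the histogram does not inflate the worst-case value.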

Multi-core real-time systems for the future

It is expected that future performance gains of microprocessors will mainly be obtained through multi-processing based on multi-core topology. Computer boards with such processors are ideally suited for real-time applications, since they may provide true parallel real-time performance, if the operating system supports it.

For the time being, only Linux can do that.