Real Time Linux Workshops
1999 - 2000 - 2001 - 2002 - 2003 - 2004 - 2005 - 2006 - 2007 - 2008 - 2009 - 2010 - 2011 - 2012 - 2013 - 2014 - 2015
14th Real Time Linux Workshop, October 18 to 20, 2012 at the Department of Computer Science, University of North Carolina at Chapel Hill
Announcement - Call for papers (ASCII) - Hotels - Directions - Agenda - Paper Abstracts - Presentations - Registration - Abstract Submission - Sponsors - Gallery
The Case for Migratory Priority Inheritance in Linux: Bounded Priority Inversions on Multiprocessors
Björn Brandenburg, Max Planck Institute for Software Systems, Germany
Andrea Bastoni, SYSGO AG, Germany
The PREEMPT_RT patch has been instrumental in turning Linux into a viable platform for demanding real-time applications. Crucially, the PREEMPT_RT patch enables priority inheritance (PI) for virtually all in-kernel locks, which ensures that real-time tasks are never delayed due to locks accessed only by lower-priority tasks. As Linux consists of many complex subsystems (of which real-time tasks likely access only few), this isolation among real-time and non-real-time tasks is essential to providing suitably short response times in practice. More generally, the combination of fixed-priority scheduling and PI, as mandated by the POSIX real-time standard, has seen tremendous success in uniprocessor real-time systems because it is both analytically sound and robust with regard to real-world engineering concerns (e.g., it transparently tolerates unpredictable critical section lengths in non-real-time tasks).
Unfortunately, "classic" uniprocessor PI breaks down in multiprocessor systems (unless all tasks may execute on all processors, i.e., "classic" PI works only under global scheduling). As we explain in detail in the paper, the intuition that tasks are only delayed by local higher-priority tasks is not maintained, even when using PI. In fact, as we demonstrate with an example, in a multiprocessor system *with* PI, it is still possible for a task to suffer priority inversions that are as long as those incurred in a uniprocessor system *without* PI. Analytically speaking, this renders PI ineffective and unsound; practically speaking, this violates the intuition and "best practices" that real-time engineers have acquired over the last two decades. While this problem has received considerable academic attention, most solutions proposed in the literature make assumptions that are incompatible with the Linux kernel. For instance, most require "priority boosting" of critical sections, which exposes real-time tasks to delays from *any* critical section in the kernel – given Linux's large, complex, and rapidly changing code base, this is not a viable option.
In this position paper, we argue that a simple tweak to Linux's existing priority inheritance mechanism will restore both the analytical properties and the intuition of uniprocessor PI, without substantially altering the implementation. In a nutshell, inheritance should apply not only to scheduling priorities, but also to processor affinity masks. That is, we submit that Linux should implement "migratory priority inheritance" (m-PI), under which a task that blocks a real-time task may execute on any processor that the blocked real-time task is eligible to execute on. As we explain in detail, m-PI is a straightforward generalization of PI (m-PI reduces to PI on uniprocessors and under global scheduling) that prevents unbounded priority inversions in all cases. Further, "PREEMPT_RT with m-PI" would not violate the property that tasks are never delayed by locks accessed only by lower-priority tasks, both analytically speaking and from the point of view of a developer's intuition (which, for Linux, is arguably as important as analytical correctness).
The main contribution of this paper is a discussion of possible implementation pitfalls and tradeoffs of m-PI in the context of the Linux kernel, and a detailed review of PI and m-PI that relates their advantages and intuitive blocking behavior to the underlying analysis.