16th Real Time Linux Workshop, October 12 to 13, 2014 at the CCD Congress Center Dusseldorf collocated with LinuxCon Europe in Dusseldorf, Germany

Performance Isolation for Real-Time Applications on Multicore Platforms using PALLOC and MemGuard

Santosh Gondi, The University of Kansas
Siddhartha Biswas, The University of Kansas
Heechul Yun, The University of Kansas

Performance isolation is important for real-time applications such as real-time video conferencing software. On modern multi-core platforms, however, it is increasingly difficult to achieve because contention for shared hardware resources, including shared DRAM, the last-level cache (LLC), and the network, can cause large performance variations, which are highly undesirable for real-time applications.

The cpuset subsystem of Linux CGROUP is a well-known isolation mechanism that provides basic core-level partitioning by confining the applications in a given CGROUP partition to a subset of cores. Core-level partitioning, however, does not provide sufficient performance isolation when the shared memory resources (LLC capacity, DRAM banks, and memory bandwidth) become bottlenecks.
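As a rough illustration of core-level partitioning, the sketch below lists the writes that would place processes into two cpuset groups. It assumes a cgroup v1 cpuset hierarchy mounted at the conventional `/sys/fs/cgroup/cpuset` path; the group names, core split, and PIDs are hypothetical and are not taken from the paper.

```python
from pathlib import Path

# Assumption: cgroup v1 cpuset hierarchy at its conventional mount point.
CPUSET_ROOT = Path("/sys/fs/cgroup/cpuset")


def cpuset_plan(group, cpus, mems="0", pids=(), root=CPUSET_ROOT):
    """Return the (file, value) writes that confine `pids` to `cpus`."""
    base = Path(root) / group
    plan = [
        (base / "cpuset.cpus", cpus),  # cores the group may run on
        (base / "cpuset.mems", mems),  # memory nodes it may allocate from
    ]
    # Each PID is written separately to the group's `tasks` file.
    plan += [(base / "tasks", str(pid)) for pid in pids]
    return plan


def apply_plan(plan):
    """Perform the writes; needs root privileges and a mounted cpuset hierarchy."""
    for path, value in plan:
        path.parent.mkdir(parents=True, exist_ok=True)  # creates the group
        path.write_text(value)


# Hypothetical layout for a 4-core machine: two cores for the real-time
# side, two for the memory-intensive co-runners. PIDs are placeholders.
rt_plan = cpuset_plan("realtime", "0-1", pids=[1234])
be_plan = cpuset_plan("besteffort", "2-3", pids=[5678, 5679])
```

Building the plan separately from applying it lets the layout be inspected without touching the system. Note that nothing in this configuration restricts which DRAM banks or how much memory bandwidth each group uses, which is the gap described above.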

Recently, two new OS mechanisms, PALLOC [1] and MemGuard [2], have been proposed to provide better performance isolation for the shared memory resources on COTS multi-core platforms. PALLOC is a DRAM bank-aware memory allocator that makes it possible to partition banks among cores and thus avoid bank sharing. MemGuard is a memory bandwidth reservation system that provides a minimum bandwidth guarantee to each core.
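The idea of bank-aware allocation can be sketched as follows. This is a toy model, not PALLOC's implementation: the physical-address bits that select the bank, the page size, and the even split of 16 banks over 4 cores are all assumptions made for illustration.

```python
PAGE_SHIFT = 12               # assumption: 4 KiB pages
BANK_BITS = (13, 14, 15, 16)  # hypothetical address bits selecting 1 of 16 banks


def bank_of(phys_addr, bank_bits=BANK_BITS):
    """Assemble the bank index from the chosen physical-address bits."""
    bank = 0
    for i, bit in enumerate(bank_bits):
        bank |= ((phys_addr >> bit) & 1) << i
    return bank


def alloc_page(free_frames, allowed_banks):
    """Remove and return the first free page frame located in an allowed bank."""
    for idx, pfn in enumerate(free_frames):
        if bank_of(pfn << PAGE_SHIFT) in allowed_banks:
            return free_frames.pop(idx)
    return None  # no free page left in this partition


# Hypothetical partition: 16 banks, 4 cores, 4 private banks per core.
partition = {core: set(range(4 * core, 4 * core + 4)) for core in range(4)}

free_frames = list(range(64))                  # toy pool of free page frames
page = alloc_page(free_frames, partition[1])   # allocation on behalf of core 1
```

Because each core's pages come only from its own banks, tasks on different cores never compete for the same bank in this model. The partition can only be enforced at page granularity if the bank-selecting bits lie above the page offset, which the assumed bit positions satisfy.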

In this paper, we present a case study using WebRTC, a free, open-source real-time video conferencing software developed by Google, which provides real-time video communication between browsers without requiring any plug-ins. Our goal is to provide a high degree of performance isolation to WebRTC on a commodity multi-core platform in a multiprogrammed environment. We first investigate the performance variability of WebRTC in the presence of memory-intensive co-running applications. We then compare CPUSET (baseline), PALLOC, and MemGuard in terms of the performance isolation of the WebRTC application as well as the overall throughput of the entire system.

In our evaluation, we use an Intel Xeon platform with 4 cores, each running at 2.8 GHz, an 8MB 16-way shared L3 cache, and a 256KB private L2 cache. The platform has 4 GB of DDR DIMM memory with 16 banks and an integrated memory controller. The workload is composed of four tasks: a Chrome browser running WebRTC, Xserver, and two LBM benchmarks from SPEC2006. Our goal is to isolate the performance of WebRTC and Xserver from the co-running LBM benchmarks.

With the CPUSET mechanism alone, the performance of WebRTC, indirectly measured via achieved memory bandwidth, is reduced by 44%, leading to a 52% drop in the frame rate and a 14-fold increase in the RTT value. With PALLOC (in conjunction with CPUSET), we achieve a 16% increase in memory bandwidth, which leads to a 60% improvement in the RTT value compared to CPUSET isolation alone, while the co-running LBMs experience only a 15% performance reduction. With MemGuard, we are able to find a configuration that completely eliminates the interference to WebRTC and Xserver from co-running applications. However, this causes a 75% drop in the performance of each co-running application, leading to a 56% overall throughput reduction.
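Because the percentages above use different baselines, it can help to put them on one scale. The short calculation below does so under two assumptions that the abstract does not state explicitly: the CPUSET figures are relative to WebRTC running alone (with the RTT becoming 14 times its solo value), and the PALLOC figures are relative to the CPUSET-only run, with "60% improvement" read as a 60% lower RTT.

```python
# Figures quoted above, expressed relative to a solo WebRTC run (assumed baseline).
cpuset_bw = 1.0 - 0.44   # fraction of solo memory bandwidth left under CPUSET alone
cpuset_rtt = 14.0        # RTT as a multiple of the solo RTT under CPUSET alone

# Assumption: the PALLOC percentages are relative to the CPUSET-only run.
palloc_bw = cpuset_bw * (1.0 + 0.16)     # 16% more bandwidth than CPUSET alone
palloc_rtt = cpuset_rtt * (1.0 - 0.60)   # 60% lower RTT than CPUSET alone

print(f"bandwidth vs. solo: CPUSET {cpuset_bw:.2f}, PALLOC {palloc_bw:.2f}")
print(f"RTT vs. solo:       CPUSET {cpuset_rtt:.1f}x, PALLOC {palloc_rtt:.1f}x")
```

Under this reading, PALLOC recovers part of the lost bandwidth and RTT but still leaves WebRTC well short of its solo performance, which is consistent with a moderate improvement rather than full isolation.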

In this study, we found that PALLOC achieves moderate real-time performance improvements without significantly sacrificing the overall throughput. MemGuard, on the other hand, can be configured to achieve near-perfect performance isolation for real-time applications, albeit at the cost of significantly reduced throughput for co-runners.
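The isolation versus throughput trade-off of bandwidth reservation can be illustrated with a toy budget regulator. This is a simplified model rather than MemGuard's code: each core receives a per-period budget of memory accesses and is throttled for the rest of the period once the budget is spent. The demand and budget values are hypothetical, chosen only so that the co-runners lose 75% of their demand as in the reported result; they are not the authors' configuration.

```python
def regulate(demand, budget, periods):
    """Simulate per-period budget enforcement.

    demand: accesses each core would issue per period if left unthrottled.
    budget: accesses each core is allowed to issue per period.
    Returns the total number of accesses served per core.
    """
    served = {core: 0 for core in demand}
    for _ in range(periods):
        for core, want in demand.items():
            # The budget is replenished every period; a core that exhausts
            # it is throttled until the next period begins.
            served[core] += min(want, budget[core])
    return served


# Hypothetical per-period numbers (arbitrary units).
demand = {"webrtc": 300, "lbm0": 1000, "lbm1": 1000}
budget = {"webrtc": 400, "lbm0": 250, "lbm1": 250}

periods = 10
served = regulate(demand, budget, periods)
for core in demand:
    share = served[core] / (demand[core] * periods)
    print(f"{core}: served {share:.0%} of its demand")
```

In this model the real-time task stays below its budget and is never throttled, while each memory-intensive co-runner is capped well below its demand. Tightening the co-runners' budgets improves isolation and lowers their throughput, which mirrors the trade-off described above.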

[1] Heechul Yun, Renato, Zheng-Pei Wu, and Rodolfo Pellizzoni, "PALLOC: DRAM Bank-Aware Memory Allocator for Performance Isolation on Multicore Platforms," IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS), 2014.

[2] Heechul Yun, Yao Gang, Rodolfo Pellizzoni, Marco Caccamo, and Lui Sha, "MemGuard: Memory Bandwidth Reservation System for Efficient Performance Isolation in Multi-core Platforms," IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS), April 2013.