diff --git a/Documentation/RCU/Design/Expedited-Grace-Periods/Expedited-Grace-Periods.html b/Documentation/RCU/Design/Expedited-Grace-Periods/Expedited-Grace-Periods.html index 57300db4b5ff..31c99382994e 100644 --- a/Documentation/RCU/Design/Expedited-Grace-Periods/Expedited-Grace-Periods.html +++ b/Documentation/RCU/Design/Expedited-Grace-Periods/Expedited-Grace-Periods.html @@ -56,8 +56,8 @@ sections. RCU-preempt Expedited Grace Periods

-CONFIG_PREEMPT=y kernels implement RCU-preempt. -The overall flow of the handling of a given CPU by an RCU-preempt +CONFIG_PREEMPT=y and CONFIG_PREEMPT_RT=y kernels implement +RCU-preempt. The overall flow of the handling of a given CPU by an RCU-preempt expedited grace period is shown in the following diagram:

ExpRCUFlow.svg @@ -140,8 +140,8 @@ or offline, among other things. RCU-sched Expedited Grace Periods

-CONFIG_PREEMPT=n kernels implement RCU-sched. -The overall flow of the handling of a given CPU by an RCU-sched +CONFIG_PREEMPT=n and CONFIG_PREEMPT_RT=n kernels implement +RCU-sched. The overall flow of the handling of a given CPU by an RCU-sched expedited grace period is shown in the following diagram:

ExpSchedFlow.svg diff --git a/Documentation/RCU/Design/Requirements/Requirements.html b/Documentation/RCU/Design/Requirements/Requirements.html index 467251f7fef6..348c5db1ff2b 100644 --- a/Documentation/RCU/Design/Requirements/Requirements.html +++ b/Documentation/RCU/Design/Requirements/Requirements.html @@ -106,7 +106,7 @@ big RCU read-side critical section. Production-quality implementations of rcu_read_lock() and rcu_read_unlock() are extremely lightweight, and in fact have exactly zero overhead in Linux kernels built for production -use with CONFIG_PREEMPT=n. +use with CONFIG_PREEMPTION=n.

This guarantee allows ordering to be enforced with extremely low @@ -1499,7 +1499,7 @@ costs have plummeted. However, as I learned from Matt Mackall's bloatwatch efforts, memory footprint is critically important on single-CPU systems with -non-preemptible (CONFIG_PREEMPT=n) kernels, and thus +non-preemptible (CONFIG_PREEMPTION=n) kernels, and thus tiny RCU was born. Josh Triplett has since taken over the small-memory banner with his @@ -1887,7 +1887,7 @@ constructs, there are limitations.

Implementations of RCU for which rcu_read_lock() and rcu_read_unlock() generate no code, such as -Linux-kernel RCU when CONFIG_PREEMPT=n, can be +Linux-kernel RCU when CONFIG_PREEMPTION=n, can be nested arbitrarily deeply. After all, there is no overhead. Except that if all these instances of rcu_read_lock() @@ -2229,7 +2229,7 @@ be a no-op.

However, once the scheduler has spawned its first kthread, this early boot trick fails for synchronize_rcu() (as well as for -synchronize_rcu_expedited()) in CONFIG_PREEMPT=y +synchronize_rcu_expedited()) in CONFIG_PREEMPTION=y kernels. The reason is that an RCU read-side critical section might be preempted, which means that a subsequent synchronize_rcu() really does have @@ -2568,7 +2568,7 @@ the following:

If the compiler did make this transformation in a -CONFIG_PREEMPT=n kernel build, and if get_user() did +CONFIG_PREEMPTION=n kernel build, and if get_user() did page fault, the result would be a quiescent state in the middle of an RCU read-side critical section. This misplaced quiescent state could result in line 4 being @@ -2906,7 +2906,7 @@ in conjunction with the The real-time-latency response requirements are such that the traditional approach of disabling preemption across RCU read-side critical sections is inappropriate. -Kernels built with CONFIG_PREEMPT=y therefore +Kernels built with CONFIG_PREEMPTION=y therefore use an RCU implementation that allows RCU read-side critical sections to be preempted. This requirement made its presence known after users made it @@ -3064,7 +3064,7 @@ includes rcu_barrier_bh(), and rcu_read_lock_bh_held(). However, the update-side APIs are now simple wrappers for other RCU -flavors, namely RCU-sched in CONFIG_PREEMPT=n kernels and RCU-preempt +flavors, namely RCU-sched in CONFIG_PREEMPTION=n kernels and RCU-preempt otherwise.

Sched Flavor (Historical)

@@ -3088,12 +3088,12 @@ of an RCU read-side critical section can be a quiescent state. Therefore, RCU-sched was created, which follows “classic” RCU in that an RCU-sched grace period waits for for pre-existing interrupt and NMI handlers. -In kernels built with CONFIG_PREEMPT=n, the RCU and RCU-sched +In kernels built with CONFIG_PREEMPTION=n, the RCU and RCU-sched APIs have identical implementations, while kernels built with -CONFIG_PREEMPT=y provide a separate implementation for each. +CONFIG_PREEMPTION=y provide a separate implementation for each.

-Note well that in CONFIG_PREEMPT=y kernels, +Note well that in CONFIG_PREEMPTION=y kernels, rcu_read_lock_sched() and rcu_read_unlock_sched() disable and re-enable preemption, respectively. This means that if there was a preemption attempt during the @@ -3302,12 +3302,12 @@ The tasks-RCU API is quite compact, consisting only of call_rcu_tasks(), synchronize_rcu_tasks(), and rcu_barrier_tasks(). -In CONFIG_PREEMPT=n kernels, trampolines cannot be preempted, +In CONFIG_PREEMPTION=n kernels, trampolines cannot be preempted, so these APIs map to call_rcu(), synchronize_rcu(), and rcu_barrier(), respectively. -In CONFIG_PREEMPT=y kernels, trampolines can be preempted, +In CONFIG_PREEMPTION=y kernels, trampolines can be preempted, and these three APIs are therefore implemented by separate functions that check for voluntary context switches. diff --git a/Documentation/RCU/checklist.txt b/Documentation/RCU/checklist.txt index e98ff261a438..087dc6c22c37 100644 --- a/Documentation/RCU/checklist.txt +++ b/Documentation/RCU/checklist.txt @@ -210,8 +210,8 @@ over a rather long period of time, but improvements are always welcome! the rest of the system. 7. As of v4.20, a given kernel implements only one RCU flavor, - which is RCU-sched for PREEMPT=n and RCU-preempt for PREEMPT=y. - If the updater uses call_rcu() or synchronize_rcu(), + which is RCU-sched for PREEMPTION=n and RCU-preempt for + PREEMPTION=y. If the updater uses call_rcu() or synchronize_rcu(), then the corresponding readers my use rcu_read_lock() and rcu_read_unlock(), rcu_read_lock_bh() and rcu_read_unlock_bh(), or any pair of primitives that disables and re-enables preemption, diff --git a/Documentation/RCU/rcubarrier.txt b/Documentation/RCU/rcubarrier.txt index a2782df69732..5aa93c215af4 100644 --- a/Documentation/RCU/rcubarrier.txt +++ b/Documentation/RCU/rcubarrier.txt @@ -6,8 +6,8 @@ RCU (read-copy update) is a synchronization mechanism that can be thought of as a replacement for read-writer locking (among other things), but with very low-overhead readers that are immune to deadlock, priority inversion, and unbounded latency. RCU read-side critical sections are delimited -by rcu_read_lock() and rcu_read_unlock(), which, in non-CONFIG_PREEMPT -kernels, generate no code whatsoever. +by rcu_read_lock() and rcu_read_unlock(), which, in +non-CONFIG_PREEMPTION kernels, generate no code whatsoever. This means that RCU writers are unaware of the presence of concurrent readers, so that RCU updates to shared data must be undertaken quite @@ -303,10 +303,10 @@ Answer: This cannot happen. The reason is that on_each_cpu() has its last to smp_call_function() and further to smp_call_function_on_cpu(), causing this latter to spin until the cross-CPU invocation of rcu_barrier_func() has completed. This by itself would prevent - a grace period from completing on non-CONFIG_PREEMPT kernels, + a grace period from completing on non-CONFIG_PREEMPTION kernels, since each CPU must undergo a context switch (or other quiescent state) before the grace period can complete. However, this is - of no use in CONFIG_PREEMPT kernels. + of no use in CONFIG_PREEMPTION kernels. Therefore, on_each_cpu() disables preemption across its call to smp_call_function() and also across the local call to diff --git a/Documentation/RCU/stallwarn.txt b/Documentation/RCU/stallwarn.txt index f48f4621ccbc..bd510771b75e 100644 --- a/Documentation/RCU/stallwarn.txt +++ b/Documentation/RCU/stallwarn.txt @@ -20,7 +20,7 @@ o A CPU looping with preemption disabled. o A CPU looping with bottom halves disabled. -o For !CONFIG_PREEMPT kernels, a CPU looping anywhere in the kernel +o For !CONFIG_PREEMPTION kernels, a CPU looping anywhere in the kernel without invoking schedule(). If the looping in the kernel is really expected and desirable behavior, you might need to add some calls to cond_resched(). @@ -39,7 +39,7 @@ o Anything that prevents RCU's grace-period kthreads from running. result in the "rcu_.*kthread starved for" console-log message, which will include additional debugging information. -o A CPU-bound real-time task in a CONFIG_PREEMPT kernel, which might +o A CPU-bound real-time task in a CONFIG_PREEMPTION kernel, which might happen to preempt a low-priority task in the middle of an RCU read-side critical section. This is especially damaging if that low-priority task is not permitted to run on any other CPU, diff --git a/Documentation/RCU/whatisRCU.txt b/Documentation/RCU/whatisRCU.txt index 7e1a8721637a..7e03e8f80b29 100644 --- a/Documentation/RCU/whatisRCU.txt +++ b/Documentation/RCU/whatisRCU.txt @@ -648,9 +648,10 @@ Quick Quiz #1: Why is this argument naive? How could a deadlock This section presents a "toy" RCU implementation that is based on "classic RCU". It is also short on performance (but only for updates) and -on features such as hotplug CPU and the ability to run in CONFIG_PREEMPT -kernels. The definitions of rcu_dereference() and rcu_assign_pointer() -are the same as those shown in the preceding section, so they are omitted. +on features such as hotplug CPU and the ability to run in +CONFIG_PREEMPTION kernels. The definitions of rcu_dereference() and +rcu_assign_pointer() are the same as those shown in the preceding +section, so they are omitted. void rcu_read_lock(void) { } diff --git a/Documentation/admin-guide/sysctl/vm.rst b/Documentation/admin-guide/sysctl/vm.rst index 64aeee1009ca..0329a4d3fa9e 100644 --- a/Documentation/admin-guide/sysctl/vm.rst +++ b/Documentation/admin-guide/sysctl/vm.rst @@ -128,6 +128,9 @@ allowed to examine the unevictable lru (mlocked pages) for pages to compact. This should be used on systems where stalls for minor page faults are an acceptable trade for large contiguous free memory. Set to 0 to prevent compaction from moving pages that are unevictable. Default value is 1. +On CONFIG_PREEMPT_RT the default value is 0 in order to avoid a page fault, due +to compaction, which would block the task from becomming active until the fault +is resolved. dirty_background_bytes diff --git a/Documentation/printk-ringbuffer.txt b/Documentation/printk-ringbuffer.txt new file mode 100644 index 000000000000..6bde5dbd8545 --- /dev/null +++ b/Documentation/printk-ringbuffer.txt @@ -0,0 +1,377 @@ +struct printk_ringbuffer +------------------------ +John Ogness + +Overview +~~~~~~~~ +As the name suggests, this ring buffer was implemented specifically to serve +the needs of the printk() infrastructure. The ring buffer itself is not +specific to printk and could be used for other purposes. _However_, the +requirements and semantics of printk are rather unique. If you intend to use +this ring buffer for anything other than printk, you need to be very clear on +its features, behavior, and pitfalls. + +Features +^^^^^^^^ +The printk ring buffer has the following features: + +- single global buffer +- resides in initialized data section (available at early boot) +- lockless readers +- supports multiple writers +- supports multiple non-consuming readers +- safe from any context (including NMI) +- groups bytes into variable length blocks (referenced by entries) +- entries tagged with sequence numbers + +Behavior +^^^^^^^^ +Since the printk ring buffer readers are lockless, there exists no +synchronization between readers and writers. Basically writers are the tasks +in control and may overwrite any and all committed data at any time and from +any context. For this reason readers can miss entries if they are overwritten +before the reader was able to access the data. The reader API implementation +is such that reader access to entries is atomic, so there is no risk of +readers having to deal with partial or corrupt data. Also, entries are +tagged with sequence numbers so readers can recognize if entries were missed. + +Writing to the ring buffer consists of 2 steps. First a writer must reserve +an entry of desired size. After this step the writer has exclusive access +to the memory region. Once the data has been written to memory, it needs to +be committed to the ring buffer. After this step the entry has been inserted +into the ring buffer and assigned an appropriate sequence number. + +Once committed, a writer must no longer access the data directly. This is +because the data may have been overwritten and no longer exists. If a +writer must access the data, it should either keep a private copy before +committing the entry or use the reader API to gain access to the data. + +Because of how the data backend is implemented, entries that have been +reserved but not yet committed act as barriers, preventing future writers +from filling the ring buffer beyond the location of the reserved but not +yet committed entry region. For this reason it is *important* that writers +perform both reserve and commit as quickly as possible. Also, be aware that +preemption and local interrupts are disabled and writing to the ring buffer +is processor-reentrant locked during the reserve/commit window. Writers in +NMI contexts can still preempt any other writers, but as long as these +writers do not write a large amount of data with respect to the ring buffer +size, this should not become an issue. + +API +~~~ + +Declaration +^^^^^^^^^^^ +The printk ring buffer can be instantiated as a static structure: + + /* declare a static struct printk_ringbuffer */ + #define DECLARE_STATIC_PRINTKRB(name, szbits, cpulockptr) + +The value of szbits specifies the size of the ring buffer in bits. The +cpulockptr field is a pointer to a prb_cpulock struct that is used to +perform processor-reentrant spin locking for the writers. It is specified +externally because it may be used for multiple ring buffers (or other +code) to synchronize writers without risk of deadlock. + +Here is an example of a declaration of a printk ring buffer specifying a +32KB (2^15) ring buffer: + +.... +DECLARE_STATIC_PRINTKRB_CPULOCK(rb_cpulock); +DECLARE_STATIC_PRINTKRB(rb, 15, &rb_cpulock); +.... + +If writers will be using multiple ring buffers and the ordering of that usage +is not clear, the same prb_cpulock should be used for both ring buffers. + +Writer API +^^^^^^^^^^ +The writer API consists of 2 functions. The first is to reserve an entry in +the ring buffer, the second is to commit that data to the ring buffer. The +reserved entry information is stored within a provided `struct prb_handle`. + + /* reserve an entry */ + char *prb_reserve(struct prb_handle *h, struct printk_ringbuffer *rb, + unsigned int size); + + /* commit a reserved entry to the ring buffer */ + void prb_commit(struct prb_handle *h); + +Here is an example of a function to write data to a ring buffer: + +.... +int write_data(struct printk_ringbuffer *rb, char *data, int size) +{ + struct prb_handle h; + char *buf; + + buf = prb_reserve(&h, rb, size); + if (!buf) + return -1; + memcpy(buf, data, size); + prb_commit(&h); + + return 0; +} +.... + +Pitfalls +++++++++ +Be aware that prb_reserve() can fail. A retry might be successful, but it +depends entirely on whether or not the next part of the ring buffer to +overwrite belongs to reserved but not yet committed entries of other writers. +Writers can use the prb_inc_lost() function to allow readers to notice that a +message was lost. + +Reader API +^^^^^^^^^^ +The reader API utilizes a `struct prb_iterator` to track the reader's +position in the ring buffer. + + /* declare a pre-initialized static iterator for a ring buffer */ + #define DECLARE_STATIC_PRINTKRB_ITER(name, rbaddr) + + /* initialize iterator for a ring buffer (if static macro NOT used) */ + void prb_iter_init(struct prb_iterator *iter, + struct printk_ringbuffer *rb, u64 *seq); + + /* make a deep copy of an iterator */ + void prb_iter_copy(struct prb_iterator *dest, + struct prb_iterator *src); + + /* non-blocking, advance to next entry (and read the data) */ + int prb_iter_next(struct prb_iterator *iter, char *buf, + int size, u64 *seq); + + /* blocking, advance to next entry (and read the data) */ + int prb_iter_wait_next(struct prb_iterator *iter, char *buf, + int size, u64 *seq); + + /* position iterator at the entry seq */ + int prb_iter_seek(struct prb_iterator *iter, u64 seq); + + /* read data at current position */ + int prb_iter_data(struct prb_iterator *iter, char *buf, + int size, u64 *seq); + +Typically prb_iter_data() is not needed because the data can be retrieved +directly with prb_iter_next(). + +Here is an example of a non-blocking function that will read all the data in +a ring buffer: + +.... +void read_all_data(struct printk_ringbuffer *rb, char *buf, int size) +{ + struct prb_iterator iter; + u64 prev_seq = 0; + u64 seq; + int ret; + + prb_iter_init(&iter, rb, NULL); + + for (;;) { + ret = prb_iter_next(&iter, buf, size, &seq); + if (ret > 0) { + if (seq != ++prev_seq) { + /* "seq - prev_seq" entries missed */ + prev_seq = seq; + } + /* process buf here */ + } else if (ret == 0) { + /* hit the end, done */ + break; + } else if (ret < 0) { + /* + * iterator is invalid, a writer overtook us, reset the + * iterator and keep going, entries were missed + */ + prb_iter_init(&iter, rb, NULL); + } + } +} +.... + +Pitfalls +++++++++ +The reader's iterator can become invalid at any time because the reader was +overtaken by a writer. Typically the reader should reset the iterator back +to the current oldest entry (which will be newer than the entry the reader +was at) and continue, noting the number of entries that were missed. + +Utility API +^^^^^^^^^^^ +Several functions are available as convenience for external code. + + /* query the size of the data buffer */ + int prb_buffer_size(struct printk_ringbuffer *rb); + + /* skip a seq number to signify a lost record */ + void prb_inc_lost(struct printk_ringbuffer *rb); + + /* processor-reentrant spin lock */ + void prb_lock(struct prb_cpulock *cpu_lock, unsigned int *cpu_store); + + /* processor-reentrant spin unlock */ + void prb_lock(struct prb_cpulock *cpu_lock, unsigned int *cpu_store); + +Pitfalls +++++++++ +Although the value returned by prb_buffer_size() does represent an absolute +upper bound, the amount of data that can be stored within the ring buffer +is actually less because of the additional storage space of a header for each +entry. + +The prb_lock() and prb_unlock() functions can be used to synchronize between +ring buffer writers and other external activities. The function of a +processor-reentrant spin lock is to disable preemption and local interrupts +and synchronize against other processors. It does *not* protect against +multiple contexts of a single processor, i.e NMI. + +Implementation +~~~~~~~~~~~~~~ +This section describes several of the implementation concepts and details to +help developers better understand the code. + +Entries +^^^^^^^ +All ring buffer data is stored within a single static byte array. The reason +for this is to ensure that any pointers to the data (past and present) will +always point to valid memory. This is important because the lockless readers +may be preempted for long periods of time and when they resume may be working +with expired pointers. + +Entries are identified by start index and size. (The start index plus size +is the start index of the next entry.) The start index is not simply an +offset into the byte array, but rather a logical position (lpos) that maps +directly to byte array offsets. + +For example, for a byte array of 1000, an entry may have have a start index +of 100. Another entry may have a start index of 1100. And yet another 2100. +All of these entry are pointing to the same memory region, but only the most +recent entry is valid. The other entries are pointing to valid memory, but +represent entries that have been overwritten. + +Note that due to overflowing, the most recent entry is not necessarily the one +with the highest lpos value. Indeed, the printk ring buffer initializes its +data such that an overflow happens relatively quickly in order to validate the +handling of this situation. The implementation assumes that an lpos (unsigned +long) will never completely wrap while a reader is preempted. If this were to +become an issue, the seq number (which never wraps) could be used to increase +the robustness of handling this situation. + +Buffer Wrapping +^^^^^^^^^^^^^^^ +If an entry starts near the end of the byte array but would extend beyond it, +a special terminating entry (size = -1) is inserted into the byte array and +the real entry is placed at the beginning of the byte array. This can waste +space at the end of the byte array, but simplifies the implementation by +allowing writers to always work with contiguous buffers. + +Note that the size field is the first 4 bytes of the entry header. Also note +that calc_next() always ensures that there are at least 4 bytes left at the +end of the byte array to allow room for a terminating entry. + +Ring Buffer Pointers +^^^^^^^^^^^^^^^^^^^^ +Three pointers (lpos values) are used to manage the ring buffer: + + - _tail_: points to the oldest entry + - _head_: points to where the next new committed entry will be + - _reserve_: points to where the next new reserved entry will be + +These pointers always maintain a logical ordering: + + tail <= head <= reserve + +The reserve pointer moves forward when a writer reserves a new entry. The +head pointer moves forward when a writer commits a new entry. + +The reserve pointer cannot overwrite the tail pointer in a wrap situation. In +such a situation, the tail pointer must be "pushed forward", thus +invalidating that oldest entry. Readers identify if they are accessing a +valid entry by ensuring their entry pointer is `>= tail && < head`. + +If the tail pointer is equal to the head pointer, it cannot be pushed and any +reserve operation will fail. The only resolution is for writers to commit +their reserved entries. + +Processor-Reentrant Locking +^^^^^^^^^^^^^^^^^^^^^^^^^^^ +The purpose of the processor-reentrant locking is to limit the interruption +scenarios of writers to 2 contexts. This allows for a simplified +implementation where: + +- The reserve/commit window only exists on 1 processor at a time. A reserve + can never fail due to uncommitted entries of other processors. + +- When committing entries, it is trivial to handle the situation when + subsequent entries have already been committed, i.e. managing the head + pointer. + +Performance +~~~~~~~~~~~ +Some basic tests were performed on a quad Intel(R) Xeon(R) CPU E5-2697 v4 at +2.30GHz (36 cores / 72 threads). All tests involved writing a total of +32,000,000 records at an average of 33 bytes each. Each writer was pinned to +its own CPU and would write as fast as it could until a total of 32,000,000 +records were written. All tests involved 2 readers that were both pinned +together to another CPU. Each reader would read as fast as it could and track +how many of the 32,000,000 records it could read. All tests used a ring buffer +of 16KB in size, which holds around 350 records (header + data for each +entry). + +The only difference between the tests is the number of writers (and thus also +the number of records per writer). As more writers are added, the time to +write a record increases. This is because data pointers, modified via cmpxchg, +and global data access in general become more contended. + +1 writer +^^^^^^^^ + runtime: 0m 18s + reader1: 16219900/32000000 (50%) records + reader2: 16141582/32000000 (50%) records + +2 writers +^^^^^^^^^ + runtime: 0m 32s + reader1: 16327957/32000000 (51%) records + reader2: 16313988/32000000 (50%) records + +4 writers +^^^^^^^^^ + runtime: 0m 42s + reader1: 16421642/32000000 (51%) records + reader2: 16417224/32000000 (51%) records + +8 writers +^^^^^^^^^ + runtime: 0m 43s + reader1: 16418300/32000000 (51%) records + reader2: 16432222/32000000 (51%) records + +16 writers +^^^^^^^^^^ + runtime: 0m 54s + reader1: 16539189/32000000 (51%) records + reader2: 16542711/32000000 (51%) records + +32 writers +^^^^^^^^^^ + runtime: 1m 13s + reader1: 16731808/32000000 (52%) records + reader2: 16735119/32000000 (52%) records + +Comments +^^^^^^^^ +It is particularly interesting to compare/contrast the 1-writer and 32-writer +tests. Despite the writing of the 32,000,000 records taking over 4 times +longer, the readers (which perform no cmpxchg) were still unable to keep up. +This shows that the memory contention between the increasing number of CPUs +also has a dramatic effect on readers. + +It should also be noted that in all cases each reader was able to read >=50% +of the records. This means that a single reader would have been able to keep +up with the writer(s) in all cases, becoming slightly easier as more writers +are added. This was the purpose of pinning 2 readers to 1 CPU: to observe how +maximum reader performance changes. diff --git a/Documentation/trace/ftrace-uses.rst b/Documentation/trace/ftrace-uses.rst index 1fbc69894eed..1e0020b0bc74 100644 --- a/Documentation/trace/ftrace-uses.rst +++ b/Documentation/trace/ftrace-uses.rst @@ -146,7 +146,7 @@ FTRACE_OPS_FL_RECURSION_SAFE itself or any nested functions that those functions call. If this flag is set, it is possible that the callback will also - be called with preemption enabled (when CONFIG_PREEMPT is set), + be called with preemption enabled (when CONFIG_PREEMPTION is set), but this is not guaranteed. FTRACE_OPS_FL_IPMODIFY diff --git a/arch/Kconfig b/arch/Kconfig index 238dccfa7691..a886cbe86efc 100644 --- a/arch/Kconfig +++ b/arch/Kconfig @@ -31,6 +31,7 @@ config OPROFILE tristate "OProfile system profiling" depends on PROFILING depends on HAVE_OPROFILE + depends on !PREEMPT_RT select RING_BUFFER select RING_BUFFER_ALLOW_SWAP help diff --git a/arch/alpha/include/asm/spinlock_types.h b/arch/alpha/include/asm/spinlock_types.h index 1d5716bc060b..6883bc952d22 100644 --- a/arch/alpha/include/asm/spinlock_types.h +++ b/arch/alpha/include/asm/spinlock_types.h @@ -2,10 +2,6 @@ #ifndef _ALPHA_SPINLOCK_TYPES_H #define _ALPHA_SPINLOCK_TYPES_H -#ifndef __LINUX_SPINLOCK_TYPES_H -# error "please don't include this file directly" -#endif - typedef struct { volatile unsigned int lock; } arch_spinlock_t; diff --git a/arch/arc/kernel/entry.S b/arch/arc/kernel/entry.S index ea74a1eee5d9..29a45c2f0b17 100644 --- a/arch/arc/kernel/entry.S +++ b/arch/arc/kernel/entry.S @@ -331,11 +331,11 @@ resume_user_mode_begin: resume_kernel_mode: ; Disable Interrupts from this point on - ; CONFIG_PREEMPT: This is a must for preempt_schedule_irq() - ; !CONFIG_PREEMPT: To ensure restore_regs is intr safe + ; CONFIG_PREEMPTION: This is a must for preempt_schedule_irq() + ; !CONFIG_PREEMPTION: To ensure restore_regs is intr safe IRQ_DISABLE r9 -#ifdef CONFIG_PREEMPT +#ifdef CONFIG_PREEMPTION ; Can't preempt if preemption disabled GET_CURR_THR_INFO_FROM_SP r10 diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig index 05c9bbfe444d..d3f02889760b 100644 --- a/arch/arm/Kconfig +++ b/arch/arm/Kconfig @@ -32,6 +32,7 @@ config ARM select ARCH_OPTIONAL_KERNEL_RWX if ARCH_HAS_STRICT_KERNEL_RWX select ARCH_OPTIONAL_KERNEL_RWX_DEFAULT if CPU_V7 select ARCH_SUPPORTS_ATOMIC_RMW + select ARCH_SUPPORTS_RT select ARCH_USE_BUILTIN_BSWAP select ARCH_USE_CMPXCHG_LOCKREF select ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT if MMU @@ -64,7 +65,7 @@ config ARM select HARDIRQS_SW_RESEND select HAVE_ARCH_AUDITSYSCALL if AEABI && !OABI_COMPAT select HAVE_ARCH_BITREVERSE if (CPU_32v7M || CPU_32v7) && !CPU_32v6 - select HAVE_ARCH_JUMP_LABEL if !XIP_KERNEL && !CPU_ENDIAN_BE32 && MMU + select HAVE_ARCH_JUMP_LABEL if !XIP_KERNEL && !CPU_ENDIAN_BE32 && MMU && !PREEMPT_RT select HAVE_ARCH_KGDB if !CPU_ENDIAN_BE32 && MMU select HAVE_ARCH_MMAP_RND_BITS if MMU select HAVE_ARCH_SECCOMP_FILTER if AEABI && !OABI_COMPAT @@ -103,6 +104,7 @@ config ARM select HAVE_PERF_EVENTS select HAVE_PERF_REGS select HAVE_PERF_USER_STACK_DUMP + select HAVE_PREEMPT_LAZY select HAVE_RCU_TABLE_FREE if SMP && ARM_LPAE select HAVE_REGS_AND_STACK_ACCESS_API select HAVE_RSEQ diff --git a/arch/arm/include/asm/irq.h b/arch/arm/include/asm/irq.h index 46d41140df27..c421b5b81946 100644 --- a/arch/arm/include/asm/irq.h +++ b/arch/arm/include/asm/irq.h @@ -23,6 +23,8 @@ #endif #ifndef __ASSEMBLY__ +#include + struct irqaction; struct pt_regs; diff --git a/arch/arm/include/asm/spinlock_types.h b/arch/arm/include/asm/spinlock_types.h index 5976958647fe..a37c0803954b 100644 --- a/arch/arm/include/asm/spinlock_types.h +++ b/arch/arm/include/asm/spinlock_types.h @@ -2,10 +2,6 @@ #ifndef __ASM_SPINLOCK_TYPES_H #define __ASM_SPINLOCK_TYPES_H -#ifndef __LINUX_SPINLOCK_TYPES_H -# error "please don't include this file directly" -#endif - #define TICKET_SHIFT 16 typedef struct { diff --git a/arch/arm/include/asm/switch_to.h b/arch/arm/include/asm/switch_to.h index d3e937dcee4d..285e6248454f 100644 --- a/arch/arm/include/asm/switch_to.h +++ b/arch/arm/include/asm/switch_to.h @@ -4,13 +4,20 @@ #include +#if defined CONFIG_PREEMPT_RT && defined CONFIG_HIGHMEM +void switch_kmaps(struct task_struct *prev_p, struct task_struct *next_p); +#else +static inline void +switch_kmaps(struct task_struct *prev_p, struct task_struct *next_p) { } +#endif + /* * For v7 SMP cores running a preemptible kernel we may be pre-empted * during a TLB maintenance operation, so execute an inner-shareable dsb * to ensure that the maintenance completes in case we migrate to another * CPU. */ -#if defined(CONFIG_PREEMPT) && defined(CONFIG_SMP) && defined(CONFIG_CPU_V7) +#if defined(CONFIG_PREEMPTION) && defined(CONFIG_SMP) && defined(CONFIG_CPU_V7) #define __complete_pending_tlbi() dsb(ish) #else #define __complete_pending_tlbi() @@ -26,6 +33,7 @@ extern struct task_struct *__switch_to(struct task_struct *, struct thread_info #define switch_to(prev,next,last) \ do { \ __complete_pending_tlbi(); \ + switch_kmaps(prev, next); \ last = __switch_to(prev,task_thread_info(prev), task_thread_info(next)); \ } while (0) diff --git a/arch/arm/include/asm/thread_info.h b/arch/arm/include/asm/thread_info.h index 0d0d5178e2c3..13be9c2137e2 100644 --- a/arch/arm/include/asm/thread_info.h +++ b/arch/arm/include/asm/thread_info.h @@ -46,6 +46,7 @@ struct cpu_context_save { struct thread_info { unsigned long flags; /* low level flags */ int preempt_count; /* 0 => preemptable, <0 => bug */ + int preempt_lazy_count; /* 0 => preemptable, <0 => bug */ mm_segment_t addr_limit; /* address limit */ struct task_struct *task; /* main task structure */ __u32 cpu; /* cpu */ @@ -139,7 +140,8 @@ extern int vfp_restore_user_hwstate(struct user_vfp *, #define TIF_SYSCALL_TRACE 4 /* syscall trace active */ #define TIF_SYSCALL_AUDIT 5 /* syscall auditing active */ #define TIF_SYSCALL_TRACEPOINT 6 /* syscall tracepoint instrumentation */ -#define TIF_SECCOMP 7 /* seccomp syscall filtering active */ +#define TIF_SECCOMP 8 /* seccomp syscall filtering active */ +#define TIF_NEED_RESCHED_LAZY 7 #define TIF_NOHZ 12 /* in adaptive nohz mode */ #define TIF_USING_IWMMXT 17 @@ -149,6 +151,7 @@ extern int vfp_restore_user_hwstate(struct user_vfp *, #define _TIF_SIGPENDING (1 << TIF_SIGPENDING) #define _TIF_NEED_RESCHED (1 << TIF_NEED_RESCHED) #define _TIF_NOTIFY_RESUME (1 << TIF_NOTIFY_RESUME) +#define _TIF_NEED_RESCHED_LAZY (1 << TIF_NEED_RESCHED_LAZY) #define _TIF_UPROBE (1 << TIF_UPROBE) #define _TIF_SYSCALL_TRACE (1 << TIF_SYSCALL_TRACE) #define _TIF_SYSCALL_AUDIT (1 << TIF_SYSCALL_AUDIT) @@ -164,7 +167,8 @@ extern int vfp_restore_user_hwstate(struct user_vfp *, * Change these and you break ASM code in entry-common.S */ #define _TIF_WORK_MASK (_TIF_NEED_RESCHED | _TIF_SIGPENDING | \ - _TIF_NOTIFY_RESUME | _TIF_UPROBE) + _TIF_NOTIFY_RESUME | _TIF_UPROBE | \ + _TIF_NEED_RESCHED_LAZY) #endif /* __KERNEL__ */ #endif /* __ASM_ARM_THREAD_INFO_H */ diff --git a/arch/arm/kernel/asm-offsets.c b/arch/arm/kernel/asm-offsets.c index c773b829ee8e..f3a0e1cd1f04 100644 --- a/arch/arm/kernel/asm-offsets.c +++ b/arch/arm/kernel/asm-offsets.c @@ -53,6 +53,7 @@ int main(void) BLANK(); DEFINE(TI_FLAGS, offsetof(struct thread_info, flags)); DEFINE(TI_PREEMPT, offsetof(struct thread_info, preempt_count)); + DEFINE(TI_PREEMPT_LAZY, offsetof(struct thread_info, preempt_lazy_count)); DEFINE(TI_ADDR_LIMIT, offsetof(struct thread_info, addr_limit)); DEFINE(TI_TASK, offsetof(struct thread_info, task)); DEFINE(TI_CPU, offsetof(struct thread_info, cpu)); diff --git a/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S index a874b753397e..1e689a727cb9 100644 --- a/arch/arm/kernel/entry-armv.S +++ b/arch/arm/kernel/entry-armv.S @@ -204,13 +204,20 @@ __irq_svc: svc_entry irq_handler -#ifdef CONFIG_PREEMPT +#ifdef CONFIG_PREEMPTION ldr r8, [tsk, #TI_PREEMPT] @ get preempt count - ldr r0, [tsk, #TI_FLAGS] @ get flags teq r8, #0 @ if preempt count != 0 + bne 1f @ return from exeption + ldr r0, [tsk, #TI_FLAGS] @ get flags + tst r0, #_TIF_NEED_RESCHED @ if NEED_RESCHED is set + blne svc_preempt @ preempt! + + ldr r8, [tsk, #TI_PREEMPT_LAZY] @ get preempt lazy count + teq r8, #0 @ if preempt lazy count != 0 movne r0, #0 @ force flags to 0 - tst r0, #_TIF_NEED_RESCHED + tst r0, #_TIF_NEED_RESCHED_LAZY blne svc_preempt +1: #endif svc_exit r5, irq = 1 @ return from exception @@ -219,14 +226,20 @@ ENDPROC(__irq_svc) .ltorg -#ifdef CONFIG_PREEMPT +#ifdef CONFIG_PREEMPTION svc_preempt: mov r8, lr 1: bl preempt_schedule_irq @ irq en/disable is done inside ldr r0, [tsk, #TI_FLAGS] @ get new tasks TI_FLAGS tst r0, #_TIF_NEED_RESCHED + bne 1b + tst r0, #_TIF_NEED_RESCHED_LAZY reteq r8 @ go again - b 1b + ldr r0, [tsk, #TI_PREEMPT_LAZY] @ get preempt lazy count + teq r0, #0 @ if preempt lazy count != 0 + beq 1b + ret r8 @ go again + #endif __und_fault: diff --git a/arch/arm/kernel/entry-common.S b/arch/arm/kernel/entry-common.S index 271cb8a1eba1..fd039b1b3731 100644 --- a/arch/arm/kernel/entry-common.S +++ b/arch/arm/kernel/entry-common.S @@ -53,7 +53,9 @@ __ret_fast_syscall: cmp r2, #TASK_SIZE blne addr_limit_check_failed ldr r1, [tsk, #TI_FLAGS] @ re-check for syscall tracing - tst r1, #_TIF_SYSCALL_WORK | _TIF_WORK_MASK + tst r1, #((_TIF_SYSCALL_WORK | _TIF_WORK_MASK) & ~_TIF_SECCOMP) + bne fast_work_pending + tst r1, #_TIF_SECCOMP bne fast_work_pending @@ -90,8 +92,11 @@ __ret_fast_syscall: cmp r2, #TASK_SIZE blne addr_limit_check_failed ldr r1, [tsk, #TI_FLAGS] @ re-check for syscall tracing - tst r1, #_TIF_SYSCALL_WORK | _TIF_WORK_MASK + tst r1, #((_TIF_SYSCALL_WORK | _TIF_WORK_MASK) & ~_TIF_SECCOMP) + bne do_slower_path + tst r1, #_TIF_SECCOMP beq no_work_pending +do_slower_path: UNWIND(.fnend ) ENDPROC(ret_fast_syscall) diff --git a/arch/arm/kernel/signal.c b/arch/arm/kernel/signal.c index ab2568996ddb..50d43ce823bf 100644 --- a/arch/arm/kernel/signal.c +++ b/arch/arm/kernel/signal.c @@ -649,7 +649,8 @@ do_work_pending(struct pt_regs *regs, unsigned int thread_flags, int syscall) */ trace_hardirqs_off(); do { - if (likely(thread_flags & _TIF_NEED_RESCHED)) { + if (likely(thread_flags & (_TIF_NEED_RESCHED | + _TIF_NEED_RESCHED_LAZY))) { schedule(); } else { if (unlikely(!user_mode(regs))) diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c index 46e1be9e57a8..581caf6493a9 100644 --- a/arch/arm/kernel/smp.c +++ b/arch/arm/kernel/smp.c @@ -682,11 +682,9 @@ void handle_IPI(int ipinr, struct pt_regs *regs) break; case IPI_CPU_BACKTRACE: - printk_nmi_enter(); irq_enter(); nmi_cpu_backtrace(regs); irq_exit(); - printk_nmi_exit(); break; default: diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c index 97a512551b21..1e70e7227f0f 100644 --- a/arch/arm/kernel/traps.c +++ b/arch/arm/kernel/traps.c @@ -250,6 +250,8 @@ void show_stack(struct task_struct *tsk, unsigned long *sp) #ifdef CONFIG_PREEMPT #define S_PREEMPT " PREEMPT" +#elif defined(CONFIG_PREEMPT_RT) +#define S_PREEMPT " PREEMPT_RT" #else #define S_PREEMPT "" #endif diff --git a/arch/arm/mm/cache-v7.S b/arch/arm/mm/cache-v7.S index 0ee8fc4b4672..dc8f152f3556 100644 --- a/arch/arm/mm/cache-v7.S +++ b/arch/arm/mm/cache-v7.S @@ -135,13 +135,13 @@ flush_levels: and r1, r1, #7 @ mask of the bits for current cache only cmp r1, #2 @ see what cache we have at this level blt skip @ skip if no cache, or just i-cache -#ifdef CONFIG_PREEMPT +#ifdef CONFIG_PREEMPTION save_and_disable_irqs_notrace r9 @ make cssr&csidr read atomic #endif mcr p15, 2, r10, c0, c0, 0 @ select current cache level in cssr isb @ isb to sych the new cssr&csidr mrc p15, 1, r1, c0, c0, 0 @ read the new csidr -#ifdef CONFIG_PREEMPT +#ifdef CONFIG_PREEMPTION restore_irqs_notrace r9 #endif and r2, r1, #7 @ extract the length of the cache lines diff --git a/arch/arm/mm/cache-v7m.S b/arch/arm/mm/cache-v7m.S index a0035c426ce6..1bc3a0a50753 100644 --- a/arch/arm/mm/cache-v7m.S +++ b/arch/arm/mm/cache-v7m.S @@ -183,13 +183,13 @@ flush_levels: and r1, r1, #7 @ mask of the bits for current cache only cmp r1, #2 @ see what cache we have at this level blt skip @ skip if no cache, or just i-cache -#ifdef CONFIG_PREEMPT +#ifdef CONFIG_PREEMPTION save_and_disable_irqs_notrace r9 @ make cssr&csidr read atomic #endif write_csselr r10, r1 @ set current cache level isb @ isb to sych the new cssr&csidr read_ccsidr r1 @ read the new csidr -#ifdef CONFIG_PREEMPT +#ifdef CONFIG_PREEMPTION restore_irqs_notrace r9 #endif and r2, r1, #7 @ extract the length of the cache lines diff --git a/arch/arm/mm/fault.c b/arch/arm/mm/fault.c index bd0f4821f7e1..865fd8fbfe59 100644 --- a/arch/arm/mm/fault.c +++ b/arch/arm/mm/fault.c @@ -414,6 +414,9 @@ do_translation_fault(unsigned long addr, unsigned int fsr, if (addr < TASK_SIZE) return do_page_fault(addr, fsr, regs); + if (interrupts_enabled(regs)) + local_irq_enable(); + if (user_mode(regs)) goto bad_area; @@ -481,6 +484,9 @@ do_translation_fault(unsigned long addr, unsigned int fsr, static int do_sect_fault(unsigned long addr, unsigned int fsr, struct pt_regs *regs) { + if (interrupts_enabled(regs)) + local_irq_enable(); + do_bad_area(addr, fsr, regs); return 0; } diff --git a/arch/arm/mm/highmem.c b/arch/arm/mm/highmem.c index a76f8ace9ce6..eb4a361d47d7 100644 --- a/arch/arm/mm/highmem.c +++ b/arch/arm/mm/highmem.c @@ -31,6 +31,11 @@ static inline pte_t get_fixmap_pte(unsigned long vaddr) return *ptep; } +static unsigned int fixmap_idx(int type) +{ + return FIX_KMAP_BEGIN + type + KM_TYPE_NR * smp_processor_id(); +} + void *kmap(struct page *page) { might_sleep(); @@ -51,12 +56,13 @@ EXPORT_SYMBOL(kunmap); void *kmap_atomic(struct page *page) { + pte_t pte = mk_pte(page, kmap_prot); unsigned int idx; unsigned long vaddr; void *kmap; int type; - preempt_disable(); + preempt_disable_nort(); pagefault_disable(); if (!PageHighMem(page)) return page_address(page); @@ -76,7 +82,7 @@ void *kmap_atomic(struct page *page) type = kmap_atomic_idx_push(); - idx = FIX_KMAP_BEGIN + type + KM_TYPE_NR * smp_processor_id(); + idx = fixmap_idx(type); vaddr = __fix_to_virt(idx); #ifdef CONFIG_DEBUG_HIGHMEM /* @@ -90,7 +96,10 @@ void *kmap_atomic(struct page *page) * in place, so the contained TLB flush ensures the TLB is updated * with the new mapping. */ - set_fixmap_pte(idx, mk_pte(page, kmap_prot)); +#ifdef CONFIG_PREEMPT_RT + current->kmap_pte[type] = pte; +#endif + set_fixmap_pte(idx, pte); return (void *)vaddr; } @@ -103,44 +112,75 @@ void __kunmap_atomic(void *kvaddr) if (kvaddr >= (void *)FIXADDR_START) { type = kmap_atomic_idx(); - idx = FIX_KMAP_BEGIN + type + KM_TYPE_NR * smp_processor_id(); + idx = fixmap_idx(type); if (cache_is_vivt()) __cpuc_flush_dcache_area((void *)vaddr, PAGE_SIZE); +#ifdef CONFIG_PREEMPT_RT + current->kmap_pte[type] = __pte(0); +#endif #ifdef CONFIG_DEBUG_HIGHMEM BUG_ON(vaddr != __fix_to_virt(idx)); - set_fixmap_pte(idx, __pte(0)); #else (void) idx; /* to kill a warning */ #endif + set_fixmap_pte(idx, __pte(0)); kmap_atomic_idx_pop(); } else if (vaddr >= PKMAP_ADDR(0) && vaddr < PKMAP_ADDR(LAST_PKMAP)) { /* this address was obtained through kmap_high_get() */ kunmap_high(pte_page(pkmap_page_table[PKMAP_NR(vaddr)])); } pagefault_enable(); - preempt_enable(); + preempt_enable_nort(); } EXPORT_SYMBOL(__kunmap_atomic); void *kmap_atomic_pfn(unsigned long pfn) { + pte_t pte = pfn_pte(pfn, kmap_prot); unsigned long vaddr; int idx, type; struct page *page = pfn_to_page(pfn); - preempt_disable(); + preempt_disable_nort(); pagefault_disable(); if (!PageHighMem(page)) return page_address(page); type = kmap_atomic_idx_push(); - idx = FIX_KMAP_BEGIN + type + KM_TYPE_NR * smp_processor_id(); + idx = fixmap_idx(type); vaddr = __fix_to_virt(idx); #ifdef CONFIG_DEBUG_HIGHMEM BUG_ON(!pte_none(get_fixmap_pte(vaddr))); #endif - set_fixmap_pte(idx, pfn_pte(pfn, kmap_prot)); +#ifdef CONFIG_PREEMPT_RT + current->kmap_pte[type] = pte; +#endif + set_fixmap_pte(idx, pte); return (void *)vaddr; } +#if defined CONFIG_PREEMPT_RT +void switch_kmaps(struct task_struct *prev_p, struct task_struct *next_p) +{ + int i; + + /* + * Clear @prev's kmap_atomic mappings + */ + for (i = 0; i < prev_p->kmap_idx; i++) { + int idx = fixmap_idx(i); + + set_fixmap_pte(idx, __pte(0)); + } + /* + * Restore @next_p's kmap_atomic mappings + */ + for (i = 0; i < next_p->kmap_idx; i++) { + int idx = fixmap_idx(i); + + if (!pte_none(next_p->kmap_pte[i])) + set_fixmap_pte(idx, next_p->kmap_pte[i]); + } +} +#endif diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index a0bc9bbb92f3..0ba73283452c 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -35,32 +35,32 @@ config ARM64 select ARCH_HAS_TEARDOWN_DMA_OPS if IOMMU_SUPPORT select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST select ARCH_HAVE_NMI_SAFE_CMPXCHG - select ARCH_INLINE_READ_LOCK if !PREEMPT - select ARCH_INLINE_READ_LOCK_BH if !PREEMPT - select ARCH_INLINE_READ_LOCK_IRQ if !PREEMPT - select ARCH_INLINE_READ_LOCK_IRQSAVE if !PREEMPT - select ARCH_INLINE_READ_UNLOCK if !PREEMPT - select ARCH_INLINE_READ_UNLOCK_BH if !PREEMPT - select ARCH_INLINE_READ_UNLOCK_IRQ if !PREEMPT - select ARCH_INLINE_READ_UNLOCK_IRQRESTORE if !PREEMPT - select ARCH_INLINE_WRITE_LOCK if !PREEMPT - select ARCH_INLINE_WRITE_LOCK_BH if !PREEMPT - select ARCH_INLINE_WRITE_LOCK_IRQ if !PREEMPT - select ARCH_INLINE_WRITE_LOCK_IRQSAVE if !PREEMPT - select ARCH_INLINE_WRITE_UNLOCK if !PREEMPT - select ARCH_INLINE_WRITE_UNLOCK_BH if !PREEMPT - select ARCH_INLINE_WRITE_UNLOCK_IRQ if !PREEMPT - select ARCH_INLINE_WRITE_UNLOCK_IRQRESTORE if !PREEMPT - select ARCH_INLINE_SPIN_TRYLOCK if !PREEMPT - select ARCH_INLINE_SPIN_TRYLOCK_BH if !PREEMPT - select ARCH_INLINE_SPIN_LOCK if !PREEMPT - select ARCH_INLINE_SPIN_LOCK_BH if !PREEMPT - select ARCH_INLINE_SPIN_LOCK_IRQ if !PREEMPT - select ARCH_INLINE_SPIN_LOCK_IRQSAVE if !PREEMPT - select ARCH_INLINE_SPIN_UNLOCK if !PREEMPT - select ARCH_INLINE_SPIN_UNLOCK_BH if !PREEMPT - select ARCH_INLINE_SPIN_UNLOCK_IRQ if !PREEMPT - select ARCH_INLINE_SPIN_UNLOCK_IRQRESTORE if !PREEMPT + select ARCH_INLINE_READ_LOCK if !PREEMPTION + select ARCH_INLINE_READ_LOCK_BH if !PREEMPTION + select ARCH_INLINE_READ_LOCK_IRQ if !PREEMPTION + select ARCH_INLINE_READ_LOCK_IRQSAVE if !PREEMPTION + select ARCH_INLINE_READ_UNLOCK if !PREEMPTION + select ARCH_INLINE_READ_UNLOCK_BH if !PREEMPTION + select ARCH_INLINE_READ_UNLOCK_IRQ if !PREEMPTION + select ARCH_INLINE_READ_UNLOCK_IRQRESTORE if !PREEMPTION + select ARCH_INLINE_WRITE_LOCK if !PREEMPTION + select ARCH_INLINE_WRITE_LOCK_BH if !PREEMPTION + select ARCH_INLINE_WRITE_LOCK_IRQ if !PREEMPTION + select ARCH_INLINE_WRITE_LOCK_IRQSAVE if !PREEMPTION + select ARCH_INLINE_WRITE_UNLOCK if !PREEMPTION + select ARCH_INLINE_WRITE_UNLOCK_BH if !PREEMPTION + select ARCH_INLINE_WRITE_UNLOCK_IRQ if !PREEMPTION + select ARCH_INLINE_WRITE_UNLOCK_IRQRESTORE if !PREEMPTION + select ARCH_INLINE_SPIN_TRYLOCK if !PREEMPTION + select ARCH_INLINE_SPIN_TRYLOCK_BH if !PREEMPTION + select ARCH_INLINE_SPIN_LOCK if !PREEMPTION + select ARCH_INLINE_SPIN_LOCK_BH if !PREEMPTION + select ARCH_INLINE_SPIN_LOCK_IRQ if !PREEMPTION + select ARCH_INLINE_SPIN_LOCK_IRQSAVE if !PREEMPTION + select ARCH_INLINE_SPIN_UNLOCK if !PREEMPTION + select ARCH_INLINE_SPIN_UNLOCK_BH if !PREEMPTION + select ARCH_INLINE_SPIN_UNLOCK_IRQ if !PREEMPTION + select ARCH_INLINE_SPIN_UNLOCK_IRQRESTORE if !PREEMPTION select ARCH_KEEP_MEMBLOCK select ARCH_USE_CMPXCHG_LOCKREF select ARCH_USE_QUEUED_RWLOCKS @@ -69,6 +69,7 @@ config ARM64 select ARCH_SUPPORTS_ATOMIC_RMW select ARCH_SUPPORTS_INT128 if GCC_VERSION >= 50000 || CC_IS_CLANG select ARCH_SUPPORTS_NUMA_BALANCING + select ARCH_SUPPORTS_RT select ARCH_WANT_COMPAT_IPC_PARSE_VERSION if COMPAT select ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT select ARCH_WANT_FRAME_POINTERS @@ -159,6 +160,7 @@ config ARM64 select HAVE_PERF_EVENTS select HAVE_PERF_REGS select HAVE_PERF_USER_STACK_DUMP + select HAVE_PREEMPT_LAZY select HAVE_REGS_AND_STACK_ACCESS_API select HAVE_FUNCTION_ARG_ACCESS_API select HAVE_RCU_TABLE_FREE diff --git a/arch/arm64/crypto/sha256-glue.c b/arch/arm64/crypto/sha256-glue.c index e273faca924f..999da59f03a9 100644 --- a/arch/arm64/crypto/sha256-glue.c +++ b/arch/arm64/crypto/sha256-glue.c @@ -97,7 +97,7 @@ static int sha256_update_neon(struct shash_desc *desc, const u8 *data, * input when running on a preemptible kernel, but process the * data block by block instead. */ - if (IS_ENABLED(CONFIG_PREEMPT) && + if (IS_ENABLED(CONFIG_PREEMPTION) && chunk + sctx->count % SHA256_BLOCK_SIZE > SHA256_BLOCK_SIZE) chunk = SHA256_BLOCK_SIZE - sctx->count % SHA256_BLOCK_SIZE; diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h index b8cf7c85ffa2..2cc0dd8bd9f7 100644 --- a/arch/arm64/include/asm/assembler.h +++ b/arch/arm64/include/asm/assembler.h @@ -699,8 +699,8 @@ USER(\label, ic ivau, \tmp2) // invalidate I line PoU * where