With sane audit rules, audit logging would only be triggered
infrequently. Keeping this in mind, annotate audit_in_mask() as
unlikely() to allow the compiler to pessimize the call to
audit_filter_rules().
This allows GCC to invert the branch direction for the audit_filter_rules()
basic block in this loop:
list_for_each_entry_rcu(e, &audit_filter_list[AUDIT_FILTER_EXIT], list) {
if (audit_in_mask(&e->rule, major) &&
audit_filter_rules(tsk, &e->rule, ctx, NULL,
&state, false)) {
...
such that it executes the common case in a straight line fashion.
On a Skylakex system change in getpid() latency (all results
aggregated across 12 boot cycles):
Min Mean Median Max pstdev
(ns) (ns) (ns) (ns)
- 196.63 207.86 206.60 230.98 (+- 3.92%)
+ 173.11 182.51 179.65 202.09 (+- 4.34%)
Performance counter stats for 'bin/getpid' (3 runs) go from:
cycles 805.58 ( +- 4.11% )
instructions 1654.11 ( +- .05% )
IPC 2.06 ( +- 3.39% )
branches 430.02 ( +- .05% )
branch-misses 1.55 ( +- 7.09% )
L1-dcache-loads 440.01 ( +- .09% )
L1-dcache-load-misses 9.05 ( +- 74.03% )
to:
cycles 706.13 ( +- 4.13% )
instructions 1654.70 ( +- .06% )
IPC 2.35 ( +- 4.25% )
branches 430.99 ( +- .06% )
branch-misses 0.50 ( +- 2.00% )
L1-dcache-loads 440.02 ( +- .07% )
L1-dcache-load-misses 5.22 ( +- 82.75% )
(Both aggregated over 12 boot cycles.)
cycles: performance improves on average by ~100 cycles/call. IPC
improves commensurately. Two reasons for this improvement:
* one fewer branch mispred: no obvious reason for this
branch-miss reduction. There is no significant change in
basic-block structure (apart from the branch inversion.)
* the direction of the branch for the call is now inverted, so it
chooses the not-taken direction more often. The issue-latency
for not-taken branches is often cheaper.
Signed-off-by: Ankur Arora <ankur.a.arora(a)oracle.com>
---
kernel/auditsc.c | 15 ++++++++-------
1 file changed, 8 insertions(+), 7 deletions(-)
diff --git a/kernel/auditsc.c b/kernel/auditsc.c
index 533b087c3c02..bf26f47b5226 100644
--- a/kernel/auditsc.c
+++ b/kernel/auditsc.c
@@ -789,7 +789,7 @@ static enum audit_state audit_filter_task(struct task_struct *tsk,
char **key)
return AUDIT_STATE_BUILD;
}
-static int audit_in_mask(const struct audit_krule *rule, unsigned long val)
+static bool audit_in_mask(const struct audit_krule *rule, unsigned long val)
{
int word, bit;
@@ -850,12 +850,13 @@ static void audit_filter_syscall(struct task_struct *tsk,
rcu_read_lock();
list_for_each_entry_rcu(e, &audit_filter_list[AUDIT_FILTER_EXIT], list) {
- if (audit_in_mask(&e->rule, major) &&
- audit_filter_rules(tsk, &e->rule, ctx, NULL,
- &state, false)) {
- rcu_read_unlock();
- ctx->current_state = state;
- return;
+ if (unlikely(audit_in_mask(&e->rule, major))) {
+ if (audit_filter_rules(tsk, &e->rule, ctx, NULL,
+ &state, false)) {
+ rcu_read_unlock();
+ ctx->current_state = state;
+ return;
+ }
}
}
rcu_read_unlock();
--
2.31.1