Re: Lost events during boot

Monday, 20 March 2017

On Monday, March 20, 2017 10:55:43 AM EDT Paul Moore wrote:
...
 On Mon, Mar 20, 2017 at 10:44 AM, Paul Moore <paul(a)paul-moore.com> wrote:
 > On Mon, Mar 20, 2017 at 8:08 AM, Paul Moore <paul(a)paul-moore.com> wrote:
 >> On Sun, Mar 19, 2017 at 9:46 PM, Steve Grubb <sgrubb(a)redhat.com> wrote:
 >>> Hello Richard and Paul,
 >>> 
 >>> I was going to do a blog write up about booting the system with
 >>> audit_backlog_limit=8192 for STIG users and have stumbled on to a
 >>> mystery. The kernel initializes the variable to 64 at power on. During
 >>> boot, if audit == 1, then it holds events in the hopes that an audit
 >>> daemon will show up later and drain all the events. Anything over 64
 >>> events should fall off the end and increment the lost counter and put a
 >>> notice in syslog.
 >>> 
 >>> However, when booting with audit_backlog_limit=8192, as soon as I log in
 >>> and run "auditctl -s" I can see I've lost 73 events. Then I run "aureport
 >>> --start boot" and I see 644 total events. This is nowhere near the 8192
 >>> limit that I asked for. So, why am I losing events?
 >>> 
 >>> Additionally, I checked the logs and there is absolutely no message in
 >>> syslog showing that I've lost events. This is with failure mode set to
 >>> 1 - which is default at power on. And this is in spite of the fact
 >>> that the source code seems to show that it should have printk'ed
 >>> something.
 >>> 
 >>> Any ideas? Can you replicate this finding?
 >> 
 >> It's funny, I just noticed this for the first time on Friday (the
 >> exact same lost count too), although it was a development kernel build
 >> with a *heavily* modified audit subsystem so I just assumed I had
 >> broken something with the queuing, the lost counter, or both.  It's
 >> possible I still may have broken something in the v4.10 queue rework,
 >> or something broke a long time ago and we are just noticing it now.
 >> 
 >> First off, can you create a GitHub issue for this and include your
 >> kernel build (e.g. 'uname -r')?  Second, if you are seeing this on a
 >> +v4.10 kernel, do you see the same results with a +v4.9 kernel?
 > 
 > Quick follow-up, and completely untested, but it would appear that the
 > problem lies in kauditd_hold_skb()/kauditd_print_skb();
 > kauditd_print_skb() registers a false lost record when the printk
 > ratelimit is tripped.  The fix is rather simple, and I'll include that
 > in an upcoming patchset.

 ... and a quick question, if the kernel is booted without "audit=1" do
 we want to count lost records in the case where the backlog overflows?

If audit == 0, then we should not care because auditing may never be enabled.
If for some reason audit == 2, then I suppose we should care.
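
To make that rule concrete, here is a minimal sketch in C. The enum and function names are hypothetical and chosen only for illustration; they are not the kernel's identifiers, and this is not the actual fix.

```c
#include <stdbool.h>

/* Hypothetical names for illustration only; not the kernel's identifiers. */
enum audit_boot_state {
    BOOT_AUDIT_OFF    = 0,  /* audit == 0: auditing may never be enabled */
    BOOT_AUDIT_ON     = 1,  /* audit == 1: events are held for a later daemon */
    BOOT_AUDIT_LOCKED = 2   /* audit == 2: assumed to matter like audit == 1 */
};

/*
 * Decide whether a backlog overflow should increment the lost counter.
 * Only the audit == 0 case is ignored, per the reasoning above.
 */
static bool should_count_lost(int audit_state)
{
    return audit_state != BOOT_AUDIT_OFF;
}
```

With this sketch, should_count_lost(0) is false, while should_count_lost(1) and should_count_lost(2) are both true.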

-Steve
