On Mon, Mar 20, 2017 at 11:05 AM, Steve Grubb <sgrubb(a)redhat.com> wrote:
On Monday, March 20, 2017 8:08:27 AM EDT Paul Moore wrote:
> On Sun, Mar 19, 2017 at 9:46 PM, Steve Grubb <sgrubb(a)redhat.com> wrote:
> > Hello Richard and Paul,
> >
> > I was going to do a blog write up about booting the system with
> > audit_backlog_limit=8192 for STIG users and have stumbled on to a mystery.
> > The kernel initializes the variable to 64 at power on. During boot, if
> > audit == 1, then it holds events in the hopes that an audit daemon will
> > show up later and drain all the events. Anything over 64 events should
> > fall off the end and increment the lost counter and put a notice in
> > syslog.
> >
> > However, when booting with audit_backlog_limit=8192, as soon as I log in and
> > run "auditctl -s", I can see I've lost 73 events. Then I run "aureport
> > --start boot" and I see 644 total events. This is nowhere near the 8192
> > limit that I asked for. So, why am I losing events?
> >
> > Additionally, I checked the logs and there is absolutely no message in
> > syslog showing that I've lost events. This is with failure mode set to 1
> > - which is default at power on. And this is in spite of the fact that
> > the source code seems to show that it should have printk'ed something.
> >
> > Any ideas? Can you replicate this finding?
>
> It's funny, I just noticed this for the first time on Friday (the
> exact same lost count too), although it was a development kernel build
> with a *heavily* modified audit subsystem so I just assumed I had
> broken something with the queuing, the lost counter, or both. It's
> possible I still may have broken something in the v4.10 queue rework,
> or something broke a long time ago and we are just noticing it now.
>
> First off, can you create a GitHub issue for this
Lost events during boot #38.
See it, thanks.
> and include your kernel build (e.g. 'uname -r')?
# uname -r
4.9.13-101.fc24.x86_64
Well, at least I can say I didn't break it with the queue rework ;)
> Second, if you are seeing this on a +v4.10 kernel, do you see the same
> results with a +v4.9 kernel?
Yes, and I tried a 4.8.10 and see it there as well.
I then checked a 3.10 RHEL 7 kernel and don't see any lost events, even though
it has the default backlog_limit of 64.
I then found a system with a 4.5.5 kernel and it also was losing events.
It looks like it has been broken for a while. Since it was related to
this mega-patch I'm currently testing which fixes netns/locking/queue
problems, I hope to post it to the list within the next day or two and
I'm going to mark it as stable for v4.10+ so the latest kernels will
get the fix, but I'm not going to worry about kernels earlier than
that since it isn't something I would consider worthy of -stable by
itself.
--
paul moore
www.paul-moore.com