On Tuesday 18 August 2009 11:09:58 am LC Bruzenak wrote:
On Tue, 2009-08-18 at 09:02 -0400, David Flatley wrote:
> When I do "service auditd rotate" I am getting in
> the /var/log/messages the following:
>
> Error receiving audit netlink packet (No buffer space available)
> Error sending signal_info request (No buffer space available)
>
> At the same time I am running a regression test that is generating 20
> meg audit logs every six to eight minutes.
>
> Is this a concern?
It sounds like you have a system that is auditing a lot of data. Since you are
doing regression testing, I would not be too concerned. But in general, you
can increase the priority boost for auditd and see if it gets more time slots
to drain the queue, make the log files larger but keep fewer of them so that
rotate is faster, increase the backlog buffer some more, or adjust what you
are auditing.
What I believe is happening is that you are generating an abnormal
amount of audit data in your regression test. That's OK, but I think
that when you do the rotate, auditd suspends disk writes while it waits
for the rotate to complete.
IIRC, the rotate starts with the highest-numbered log and rolls it to the
next higher number. Then it decrements the counter and repeats. So
log.13->log.14, then log.12->log.13, etc., and eventually moves
audit.log to audit.log.1. Then a new audit.log is created and the flow
resumes.
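
A minimal Python sketch of the rename sequence described above, for
illustration only. It is not auditd's actual code: the function name
rotate_logs and its parameters are hypothetical, and it ignores any limit
on the number of logs kept. It returns the number of renames performed,
which grows with the number of existing log files.

```python
import os
import tempfile


def rotate_logs(log_dir, base="audit.log"):
    """Shift base.N -> base.N+1 from the highest N down, then base -> base.1.

    Hypothetical illustration of the described sequence, not auditd's code.
    Returns the number of rename operations performed.
    """
    prefix = base + "."
    # Collect the numeric suffixes of existing rotated logs, highest first.
    numbers = sorted(
        (int(name[len(prefix):])
         for name in os.listdir(log_dir)
         if name.startswith(prefix) and name[len(prefix):].isdigit()),
        reverse=True,
    )
    renames = 0
    for n in numbers:
        # Highest first, so a file is never renamed onto an existing one.
        os.rename(os.path.join(log_dir, f"{base}.{n}"),
                  os.path.join(log_dir, f"{base}.{n + 1}"))
        renames += 1
    current = os.path.join(log_dir, base)
    if os.path.exists(current):
        os.rename(current, os.path.join(log_dir, f"{base}.1"))
        renames += 1
    # Create a fresh, empty current log so writing can resume.
    open(current, "w").close()
    return renames


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        for name in ("audit.log", "audit.log.1", "audit.log.2"):
            with open(os.path.join(d, name), "w") as f:
                f.write(name)
        print(rotate_logs(d))        # number of renames performed
        print(sorted(os.listdir(d)))
```

Each rotate performs one rename per existing file, so with a few hundred
log files every rotate issues a few hundred renames before a new audit.log
is created.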
While this happens, you are stacking up events from the kernel and
eventually run out of buffer space. On some machines where the log files are in
the hundreds (I had around 300) I have seen the rotate take an
appreciable amount of time.
This is true.
But having looked at the audit requirements and the given/suggested rules,
they are badly in need of correction. I would say those audit rules are the
root cause of the problem.
-Steve