On Mon, 2013-11-04 at 08:24 -0500, Steve Grubb wrote:
Thanks Steve.
I did a little experimentation today.
On a system that generates around 7500 audit events every five minutes I
changed, without success, the following:
In auditd.conf
- changed num_logs from 9 to 5 although I didn't expect a change as I
move out the rolled over (audit.log.?) log files as part of the
processing so there shouldn't be a big file rename impost
- changed priority_boost from 4 to 8
In audit.rules
- changed backlog from 32K to 64K to 96K to 128K
- changed rules to reduce the recorded events per 5 minute interval from
7500 to 500-600 for the same period.
This particular system is running audit-1.8.2-el5 but I see a similar
problem on a RHEL 6.4 box which I believe is running audit-2.2-2.el6.
I did note that if I executed the sync(1) command before signaling
auditd to roll over (ie execute /bin/kill -s USR1 pid) the error
SOMETIMES did not appear.
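
For reference, the sync-then-signal sequence can be sketched in Python as below. This is only an illustration of the two steps, not a fix. The pid file path /var/run/auditd.pid and both function names are assumptions; adjust them to your system.

```python
import os
import signal


def read_pid(pidfile="/var/run/auditd.pid"):
    """Read the auditd pid (the default path is an assumption)."""
    with open(pidfile) as f:
        return int(f.read().strip())


def rotate_audit_log(auditd_pid):
    """Flush dirty pages to disk, then ask auditd to roll its log over."""
    os.sync()                            # same effect as running sync(1)
    os.kill(auditd_pid, signal.SIGUSR1)  # same as /bin/kill -s USR1 pid
```

Run as root, it would be called as rotate_audit_log(read_pid()).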
So I am a little bit lost.
I believe that the actual effect is just
- the cost of two additional lines in /var/log/messages
- the loss of a few logs
My actual process is to
a. roll over the log file
b. run an ausearch --interpret like command
Perhaps my alternative is to modify my ausearch-like command to be
stateful and have it process only new events, as per a patch I made to
ausearch some time back
Subject: [PATCH] ausearch: Add checkpoint capability and have
incomplete logs carry forward when processing multiple audit.log
files
Date: 05/11/2013 03:59:34 PM
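
The "process only new events" idea can be sketched roughly as follows. This is a simplified Python illustration, not the logic of the patch itself: it assumes every record carries the usual msg=audit(seconds.millis:serial) stamp, treats that triple as an ordered checkpoint, and ignores incomplete events and log file identity. The names event_id and new_events are made up for the example.

```python
import re

# Matches the event stamp in a raw audit record, e.g. msg=audit(1383571458.123:456)
EVENT_ID = re.compile(r"msg=audit\((\d+)\.(\d+):(\d+)\)")


def event_id(line):
    """Return (seconds, millis, serial) for a record, or None if absent."""
    m = EVENT_ID.search(line)
    return tuple(int(g) for g in m.groups()) if m else None


def new_events(lines, checkpoint=None):
    """Return the records newer than checkpoint, plus the new checkpoint."""
    fresh = []
    latest = checkpoint
    for line in lines:
        eid = event_id(line)
        if eid is None:
            continue  # not an audit record
        if checkpoint is not None and eid <= checkpoint:
            continue  # already handled on a previous run
        fresh.append(line)
        if latest is None or eid > latest:
            latest = eid
    return fresh, latest
```

The returned checkpoint would be saved between runs (for example in a small state file) and passed back in on the next run, so the log would not need to be rolled over before each pass.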
Am open to any suggestions ... I think the key issue is that I reduced
the events written to audit.log from 7500 to 600 per five minute
interval but I still see the error.
Rgds
On Monday, November 04, 2013 07:46:18 PM Burn Alting wrote:
> Hi,
>
> I have some quite busy hosts that emit the following errors when I
> request that the audit log file be rolled over (via a kill -s USR1
> auditdpid).
>
> Error receiving audit netlink packet(No buffer space available)
> Error sending signal_info request (No buffer space available)
>
> From reading earlier posts (circa 2009) it would appear my options are
>
> a. Increase backlog buffer (currently 32768)
> b. Increase priority_boost (currently 4)
> c. Reduce the number of log files (currently 9)
Another corollary to this is that you can increase the file size and decrease
the total files which would help on rotation.
> Does anyone have a feel for which of the above should offer the best
> return?
There are 2 more options:
1) Review the rules to make sure you are not getting events that you really do
not need. If you have a lot of false positives, then you might add some
arguments that better narrow the results. For example, perhaps you have this
rule:
-a always,exit -F arch=b64 -S clock_settime -k time-change
This can give a lot of false positives. The one that really matters is when a
program sets CLOCK_REALTIME (the wall clock). So, the rule can be re-written
as:
-a always,exit -F arch=b64 -S clock_settime -F a0=0 -k time-change
which narrows its scope.
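
To illustrate the effect of the extra field: a0 is the first syscall argument, which for clock_settime is the clock id, and CLOCK_REALTIME is 0 on Linux. A small Python sketch follows, with made-up sample clock ids and a hypothetical helper name, showing which calls the narrowed rule would still record:

```python
# Clock id of CLOCK_REALTIME on Linux; this is what -F a0=0 compares against.
CLOCK_REALTIME_ID = 0


def recorded_by_narrowed_rule(clock_id):
    """Mimic -F a0=0: keep only clock_settime calls on the wall clock."""
    return clock_id == CLOCK_REALTIME_ID


# Hypothetical clock ids passed to clock_settime by various programs.
sample_calls = [0, 1, 1, 4, 0, 1]
kept = [c for c in sample_calls if recorded_by_narrowed_rule(c)]
```

With the original rule all six sample calls would generate events; with the narrowed rule only the two wall-clock calls do.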
2) You might experiment with cgroups.
> Are there other configuration parameters I could adjust (aside from
> changing my ruleset in audit.rules)?
There might be general disk tuning parameters in sysctl that could help as
well. Choice of file system also has performance impacts. I haven't done any
experimenting on the performance side, but I know there are people here that
also have very busy systems.
-Steve