Hi Steve -
Thanks a lot for your thorough reply. I've moved forward with building the
dispatcher plugin mentioned below, taking your suggestions into account. Things
were working out nicely until I put the system under load and began to notice
problems keeping up under high file system loads. I'm wondering if you have any
suggestions as to what I may be doing wrong. Here's what I'm doing and seeing...
Using a single simple rule like this (at your suggestion):
-a always,exit -F dir=/dir/test/ -F perm=wa
I want to receive events in my dispatcher for all file creates, deletes, etc.
in that folder.
I generated a fairly heavy load with a test program that simply creates a
number of files, writes some data to them, then deletes them, all in a loop. I
started seeing problems almost immediately with the kernel backlog being
exceeded. I originally suspected that my plugin wasn't reading fast enough from
audispd, so I removed all plugins from the dispatcher (except the built-in
af_unix one, which isn't being used) and still see the overloads described
below. Syslog shows this pattern each time...
audispd: audispd initialized with q_depth=10000 and 1 active plugins
...
auditd[24881]: Init complete, auditd 1.7.7 listening for events (startup state enable)
kernel: audit: audit_backlog=4097 > audit_backlog_limit=4096
kernel: audit: audit_lost=697992 audit_rate_limit=0 audit_backlog_limit=4096
kernel: audit: backlog limit exceeded
...
kernel: __ratelimit: 79338 callbacks suppressed
...
out of memory [24881] (note - this is the pid of auditd)
...
kernel: audit: *NO* daemon at audit_pid=0
<some random audit events written to the log by the kernel...>
<auditd and audispd no longer running in ps -ef>
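For reference, the load test described above can be approximated as follows. This is a minimal Python sketch, not the actual test program; the function name, file names, counts, and payload size are all illustrative assumptions.

```python
import os
import tempfile


def churn_files(directory, files_per_pass=100, passes=10, payload=b"x" * 1024):
    """Create, write, and delete files in a loop.

    Returns the total number of files created and then removed.
    All parameter names and defaults are hypothetical.
    """
    touched = 0
    for _ in range(passes):
        paths = []
        for i in range(files_per_pass):
            path = os.path.join(directory, f"load_{i}.tmp")
            with open(path, "wb") as fh:  # create the file and write data
                fh.write(payload)
            paths.append(path)
        for path in paths:  # delete everything created in this pass
            os.unlink(path)
        touched += len(paths)
    return touched


if __name__ == "__main__":
    # A scratch directory is used here so the sketch is safe to run anywhere;
    # pointing it at the audited directory is what would generate audit events.
    with tempfile.TemporaryDirectory() as scratch:
        print(churn_files(scratch, files_per_pass=50, passes=4))
```

Each pass produces a burst of create, write, and unlink activity under the watched tree, which is the kind of load that filled the backlog here.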
The configuration settings of interest I have in effect are:
(from auditd.conf...)
log_format = NOLOG (disabled log file writing altogether)
priority_boost = 4
disp_qos = lossless
backlog = 4096 (set in audit.rules file)
q_depth = 10000 (audispd.conf)
priority_boost = 4 (audispd.conf)
This is a SUSE 11 x86 box running a 2.6.27.19-5-pae kernel, audit version 1.7.7.
It looks to me for all the world like auditd just can't keep up with the kernel
with this rule in place and a heavy file system load, never mind the dispatcher
or any plugins.
Is this something you've seen before? Are there some other configuration
settings I may have missed that would improve the performance somehow? Keep in
mind that I come from recently developing a similar product for AIX, an
operating system that pretty much refuses under any circumstance to drop a
single audit event, so my perspective may be somewhat skewed :)
I'm hoping that there may be a way to make this implementation work.
Any insight or advice is greatly appreciated.
Thanks,
- Andy
-----Original Message-----
From: Steve Grubb [mailto:sgrubb@redhat.com]
Sent: Tuesday, October 05, 2010 9:51 AM
To: linux-audit(a)redhat.com
Cc: Andy Fanton
Subject: Re: Audit for Filesystem monitoring tool
On Tuesday, October 05, 2010 11:27:37 am Andy Fanton wrote:
> I have a current working implementation that uses an LKM to hook the
> syscall table and intercept calls to things like creat(), open(), unlink(),
> rename(), mkdir(), rmdir(), and worst of all write().
FWIW, the audit system hooks syscall entry and therefore all syscalls.
> 1) Data Format - I wrote a simple audit dispatcher plugin to get a look at
> the data stream. With "string" format set in the configuration, it
> appears as if the data is one formatted line of string data, just like you
> see in the normal audit log. With "binary" format set, does it use the
> audit_dispatcher_header structure followed by the event data? If so, what
> format is the data and how do I extract values from it? (could be faster
> than parsing the strings from the string version).
The kernel writes the event as a string. The binary format means that audispd
plugins get the same data that the kernel hands the audit daemon itself.
Either format will require parsing strings if you need to dig something out of
the event. As for "faster" I would be happy to take performance improvement
patches and am currently looking at performance improvements.
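Since either format ends in string parsing, pulling fields out of a record looks roughly like the sketch below. It is a minimal Python illustration, not code from the audit package; the sample record is made up for the example, and `parse_record` is a hypothetical helper name. Values that the kernel hex-encodes are left undecoded.

```python
import re

# One field is NAME=VALUE, where VALUE is either a double-quoted string
# or a run of non-whitespace characters.
_FIELD = re.compile(r'(\w+)=("[^"]*"|\S+)')


def parse_record(line):
    """Split one audit record string into a dict of field name -> value."""
    fields = {}
    for name, value in _FIELD.findall(line):
        # Strip the surrounding quotes from quoted values.
        if len(value) >= 2 and value.startswith('"') and value.endswith('"'):
            value = value[1:-1]
        fields[name] = value
    return fields


# Illustrative record only; the timestamp, serial, and path are invented.
SAMPLE = ('type=PATH msg=audit(1286286657.123:456): item=0 '
          'name="/dir/test/a.txt" inode=1234 mode=0100644')

if __name__ == "__main__":
    rec = parse_record(SAMPLE)
    print(rec["type"], rec["name"])
```

A regular expression is used here for brevity; for anything beyond a quick look at the stream, the auparse library that ships with the audit package is the more robust route.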
> 2) Watches - At first glance watches look like a good way to do this,
> except for one big problem. I need to monitor arbitrarily large sections
> of the file system without necessarily knowing the paths to specific
> objects up front. I suppose I could recursively enumerate a parent
> directory, insert watches on every child directory and all files within
> those directories, but it seems to me that could take a really long time
> to do on huge filesystems, and potentially consume huge amounts of kernel
> memory (not sure how the watches are implemented).
A watch is really an alias for:
-a always,exit -F path=/dir/file -F perm=wa
(Note the missing syscall specification.) Once you are using this format, you
can use -F dir instead of -F path. This will watch a whole directory tree. I
think mount points under the tree may cause a problem, but there is the "-q"
directive to tell the kernel they are equivalent.
> 3) Filtering - In order to accomplish what I am trying to do, I need to
> audit a large number of syscalls.
By using the rule as specified above, it selects the syscalls for you.
> Obviously there would be a significant performance impact on the system by
> doing so.
Using the format above where it selects the syscalls for you, the overhead is
about on par with CPU cache noise when benchmarking.
> Even if the customer was willing to live with that impact, there is an
> additional problem with the fact that adding all of those syscall audit
> rules means that they will end up being logged to the log file by auditd (if
> that logging is turned on) and dispatched to any other audispd plugins that
> might be in operation.
True.
> I've thought of some workarounds for this like (a) have the users turn off
> the auditd logging and perform filtered logging from my plugin instead,
> (b) replace the audispd dispatcher completely so I can filter out
> extraneous records from other downstream plugins, (c) some combination of
> a and b, or various other schemes. Anybody have any better ideas on this?
> (AIX has a cool feature that allows audit listeners to specify a "class"
> (subset) of audit event types that they want to receive.
We use keys to label events. To do anything like this would require parsing
the whole event. That would impact performance and, until I have things running
faster, I would not think about it.
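As a rough illustration of what key-based filtering inside a plugin would involve, here is a minimal Python sketch. It assumes the rule was loaded with a key (for example by appending -k with a name such as "fswatch", which is a made-up key) and that records arrive as text lines; `event_key` and `filter_by_key` are hypothetical helper names, and the sample records in the demo are invented.

```python
import io
import re

# Matches key="value" or an unquoted key=value token.
_KEY = re.compile(r'\bkey=(?:"([^"]*)"|(\S+))')


def event_key(line):
    """Return the key carried by a record line, or None if it has no key field."""
    match = _KEY.search(line)
    if match is None:
        return None
    quoted, bare = match.group(1), match.group(2)
    return quoted if quoted is not None else bare


def filter_by_key(stream, wanted):
    """Yield only the record lines whose key equals `wanted`."""
    for line in stream:
        if event_key(line) == wanted:
            yield line.rstrip("\n")


if __name__ == "__main__":
    # Invented records standing in for the plugin's input stream.
    demo = io.StringIO(
        'type=SYSCALL msg=audit(1.0:1): syscall=5 key="fswatch"\n'
        'type=SYSCALL msg=audit(1.0:2): syscall=5 key="other"\n'
    )
    for record in filter_by_key(demo, "fswatch"):
        print(record)
```

Even this simple version scans every record for the key field, and records of the same event that do not carry the key themselves would have to be grouped by event serial number to be kept, which is the parsing cost mentioned above.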
> That way any application interested in receiving audit events can define
> which events it wants without affecting which events other listeners receive.
> Sort of a built-in filtering mechanism. It would be cool to see that in the
> Linux audit subsystem someday)
Yes, someday after performance tuning of libauparse.
-Steve