On Tuesday, October 05, 2010 11:27:37 am Andy Fanton wrote:
I have a current working implementation that uses an LKM to hook the
syscall table and intercept calls to things like creat(), open(), unlink(),
rename(), mkdir(), rmdir(), and worst of all write().
FWIW, the audit system hooks syscall entry and therefore all syscalls.
1) Data Format - I wrote a simple audit dispatcher plugin to get a look at
the data stream. With "string" format set in the configuration, it appears
as if the data is one formatted line of string data just like you see
in the normal audit log. With "binary" format set, does it use the
audit_dispatcher_header structure followed by the event data? If so, what
format is the data and how do I extract values from it? (could be faster
than parsing the strings from the string version).
The kernel writes the event as a string. The binary format means that audispd
plugins get the same data that the kernel hands the audit daemon itself.
Either format will require parsing strings if you need to dig something out of
the event. As for "faster" I would be happy to take performance improvement
patches and am currently looking at performance improvements.
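
Since either format ends in string parsing, here is a minimal sketch of pulling fields out of one record line. The sample record, the field values, and the helper name parse_record are made up for illustration. Real records can also carry hex-encoded values, which this does not decode, and libauparse is the supported way to do this properly.

```python
import re

# A field is name=value, where the value is either a double-quoted string
# or a run of non-space characters.
FIELD = re.compile(r'(\w+)=("[^"]*"|\S+)')


def parse_record(line):
    """Return a dict of the name=value fields found in one audit record line."""
    fields = {}
    for name, value in FIELD.findall(line):
        fields[name] = value.strip('"')  # drop surrounding quotes, if any
    return fields


# Illustrative record only; not copied from a real log.
sample = ('type=SYSCALL msg=audit(1286292457.123:456): arch=c000003e syscall=2 '
          'success=yes exit=3 comm="cat" exe="/bin/cat" key="fswatch"')

record = parse_record(sample)
print(record["type"], record["syscall"], record["exe"], record["key"])
```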
2) Watches - At first glance watches look like a good way to do this,
except for one big problem. I need to monitor arbitrarily large sections
of the file system without necessarily knowing the paths to specific
objects up front. I suppose I could recursively enumerate a parent
directory, insert watches on every child directory and all files within
those directories, but it seems to me that could take a really long time
to do on huge filesystems, and potentially consume huge amounts of kernel
memory (not sure how the watches are implemented).
A watch is really an alias for:
-a always,exit -F path=/dir/file -F perm=wa
(Note the missing syscall specification) Once you are using this format, you
can use -F dir instead of -F path. This will watch a whole directory tree. I
think mount points under the tree may cause a problem, but there is the "-q"
directive to tell the kernel they are equivalent.
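
To make the single-file versus whole-tree distinction concrete, here is a small sketch that builds the auditctl arguments for either case. It uses the "path" and "dir" field names and the "-k" key option from the auditctl man page. The helper name watch_rule, the /srv/data path, and the "fswatch" key are hypothetical, and nothing here actually invokes auditctl.

```python
def watch_rule(target, recursive=False, perms="wa", key=None):
    """Build auditctl arguments for a file watch or a recursive directory watch."""
    # "dir" watches the directory and its whole subtree; "path" watches one object.
    field = "dir" if recursive else "path"
    args = ["-a", "always,exit", "-F", f"{field}={target}", "-F", f"perm={perms}"]
    if key:
        args += ["-k", key]  # label matching events so they can be picked out later
    return args


# Print the command line rather than running it.
print("auditctl " + " ".join(watch_rule("/srv/data", recursive=True, key="fswatch")))
```

No syscall list (-S) is given, which matches the note above: the kernel picks the relevant syscalls from the perm field.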
3) Filtering - In order to accomplish what I am trying to do, I need to
audit a large number of syscalls.
By using the rule as specified above, it selects the syscalls for you.
Obviously there would be a significant performance impact on the system by
doing so.
Using the format above where it selects the syscalls for you, the overhead is
about the same as CPU cache noise when benchmarking.
Even if the customer was willing to live with that impact, there is an
additional problem with the fact that adding all of those syscall audit
rules means that they will end up being logged to the log file by auditd (if
that logging is turned on) and dispatched to any other audispd plugins that
might be in operation.
True.
I've thought of some workarounds for this like (a) have the users turn off
the auditd logging and perform filtered logging from my plugin instead,
(b) replace the audispd dispatcher completely so I can filter out
extraneous records from other downstream plugins, (c) some combination of
a and b, or various other schemes. Anybody have any better ideas on this?
(AIX has a cool feature that allows audit listeners to specify a "class"
(subset) of audit event types that they want to receive.
We use keys to label events. To do anything like this would require parsing
the whole event. That would impact performance, and until I have things running
faster, I would not think about it.
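
For what it is worth, a plugin can do this kind of key-based selection itself today, at the cost of the parsing mentioned above. The following is a rough sketch only. It assumes string-format records, that the key appears as key="..." on at least one record of an event, and that all records of one event share the same msg=audit(timestamp:serial) ID. The function name select_events and the sample records are made up.

```python
import re

EVENT_ID = re.compile(r'msg=audit\((\d+\.\d+:\d+)\)')
KEY = re.compile(r'\bkey="([^"]*)"')


def select_events(records, wanted_keys):
    """Keep every record belonging to an event labelled with one of wanted_keys."""
    wanted_ids = set()
    # First pass: find the IDs of events that carry a wanted key.
    for rec in records:
        key = KEY.search(rec)
        event = EVENT_ID.search(rec)
        if key and event and key.group(1) in wanted_keys:
            wanted_ids.add(event.group(1))
    # Second pass: keep all records (SYSCALL, PATH, ...) of those events.
    kept = []
    for rec in records:
        event = EVENT_ID.search(rec)
        if event and event.group(1) in wanted_ids:
            kept.append(rec)
    return kept


# Illustrative records only.
records = [
    'type=SYSCALL msg=audit(1286292457.123:456): syscall=2 key="fswatch"',
    'type=PATH msg=audit(1286292457.123:456): name="/srv/data/a.txt"',
    'type=SYSCALL msg=audit(1286292458.001:457): syscall=59 key="execs"',
]
for rec in select_events(records, {"fswatch"}):
    print(rec)
```

This only affects what the one plugin acts on; auditd and the other plugins still see everything.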
That way any application interested in receiving audit events can define
which events it wants without affecting which events other listeners receive.
Sort of a built-in filtering mechanism. It would be cool to see that in the
Linux audit subsystem someday)
Yes, someday after performance tuning of libauparse.
-Steve