On Thursday 14 August 2008 03:14:07 Kay Hayen wrote:
I would like to present our plan for using audit briefly. We have made a prototype implementation, and discovered some things along the way.
Nice. I'll skip straight to the parts that I think I can comment on.
Now one issue I see is that the times that we get from auditd through the socket from its child daemon may not match the start_date exactly.
All time hacks in the audit logs come from the kernel at the instant the
record is created. They all start by calling audit_log_start, and right here
is where time is written:
http://lxr.linux.no/linux+v2.6.26.2/kernel/audit.c#L1194
The source that is used is current_kernel_time();
I think they could. Actually we would prefer to receive the tick at which a process started,
The audit system has millisecond resolution. This was considered adequate because system ticks are < 1000 Hz. current_kernel_time() returns a broken-down time struct similar to pselect's. This is how it's used:
audit_log_format(ab, "audit(%lu.%03lu:%u): ",
                 t.tv_sec, t.tv_nsec/1000000, serial);
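As a sketch (not the kernel's code), the millisecond truncation this format performs can be reproduced in user space; the helper name and timestamp values below are illustrative:

```c
#include <stdio.h>
#include <time.h>

/* Sketch: reproduce the truncation audit_log_start() performs when it
 * formats a timestamp. Nanoseconds are cut down to milliseconds, so two
 * records within the same millisecond print the same time and differ
 * only by serial number. */
void format_audit_stamp(char *buf, size_t len, struct timespec t,
                        unsigned serial)
{
    snprintf(buf, len, "audit(%lu.%03lu:%u)",
             (unsigned long)t.tv_sec,
             (unsigned long)(t.tv_nsec / 1000000), serial);
}
```

For example, tv_nsec = 123456789 formats as .123, so anything finer than a millisecond is lost at this point.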
Currently we feel we should apply a delta around the times to match them, and that's somehow unstable methinks. We would prefer delta to be 0. Otherwise we may e.g. run into pid number overruns much easier.
I'm thinking the audit resolution is higher than the scheduler's ticks. If you take the absolute ticks and turn them into a <sys/time.h>

struct timespec {
    long tv_sec;   /* seconds */
    long tv_nsec;  /* nanoseconds */
};
Would they match?
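One way to make the comparison concrete (a sketch; times_match and the tick width are my own, not part of the audit API): convert both times to milliseconds and accept a skew of up to one scheduler tick.

```c
/* Sketch: decide whether a process start time and an audit record time
 * refer to the same instant. Audit truncates to milliseconds and the
 * scheduler tick (HZ <= 1000) is no finer, so allow one tick of skew.
 * All arguments are in milliseconds. */
int times_match(long long start_ms, long long audit_ms, long long tick_ms)
{
    long long delta = start_ms > audit_ms ? start_ms - audit_ms
                                          : audit_ms - start_ms;
    return delta <= tick_ms;
}
```

With HZ=250 the tick is 4 ms, so a 3 ms difference still matches, while a larger one is treated as a different event.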
The other thing is sequence numbers. We see in the output sequence numbers for each audit event. Very nice. But can you confirm where these sequence numbers are created? Are they done in the kernel, in auditd, or in its child daemon?
They are done in the kernel and are incremented for each audit_log_start, so that no two audit events within the same millisecond have the same serial number. Their source is here:
http://lxr.linux.no/linux+v2.6.26.2/kernel/audit.c#L1085
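Since serials increase by one per event (and records belonging to the same multipart event share a serial), a consumer can watch for gaps as a loss indicator. A sketch, where serial_gap is my own helper and not a libaudit call:

```c
/* Sketch: flag a gap in the audit serial sequence. A repeated serial is
 * fine (one event can span several records); a jump of more than one
 * suggests a dropped event. */
int serial_gap(unsigned *last, unsigned cur)
{
    int gap = (*last != 0 && cur != *last && cur != *last + 1);
    *last = cur;
    return gap;
}
```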
The underlying question is, how safe can we be that we didn't miss anything when the sequence numbers don't suggest so. We would like to use the lossless mode of auditd. Does that simply mean that auditd may get behind in the worst case?
Yes. You would want to do a couple of things: increase the kernel backlog, increase auditd's priority, and increase audispd's internal queue.
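Those three knobs live in different files; the values below are illustrative, not recommendations:

```
# /etc/audit/audit.rules -- larger kernel backlog (CAPP setups use 8192)
-b 8192

# /etc/audit/auditd.conf -- run auditd at a higher priority
priority_boost = 4

# /etc/audit/audisp/audispd.conf -- deepen audispd's internal queue
q_depth = 1024
```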
Can you confirm that a type=EOE delimits every event? (Is "audit trace" even the correct term to use? What is it called?)
It delimits every multipart event. You can use something like this to determine if you have a complete event:

if (r->type == AUDIT_EOE || r->type < AUDIT_FIRST_EVENT ||
    r->type >= AUDIT_FIRST_ANOM_MSG) {
    /* have full event... */
}
We can't build the rpm due to dependency problems?
If you are on RHEL 5, just edit the spec file to remove --with-prelude, and delete any packaging of egg-info files.
…so I was using the hard way, ./configure --prefix=/opt/auditd-1.7, and that works fine on our RHEL 5.2, it seems. What's not so clear to me is which kernel dependency there really is. Were there interface changes at all?
The best bet is to take the last RHEL5 audit srpm and install it. Modify that to use the new tar file. Then remove some of the patches. I have not built current for RHEL5, so I can't say much except: remove one patch, rpmbuild -bp, and see if that is OK; then delete another if so. You do not need to do an rpmbuild -ba.
The changelog didn't suggest so.
There are likely dependency issues for the selinux policy used for the
zos-remote plugin.
BTW: Release-wise, will RHEL 5.3 include the latest auditd?
That is the plan. But there will be a point where audit development continues
and bugfixes are backported rather than new version. At a minimum,
audit-1.7.5 will be in RHEL5.3. Maybe 1.7.6 if we have another quick release.
One thing I observed with 1.7.4-1 from Debian Testing amd64 is that we won't ever see any clone events on the socket (and no forks, but we only know of cron doing these anyway), but all execs and exit_groups.
That may be distro dependent. And you should use strace to confirm what you are looking for. On x86_64, note there are two clone syscalls, and you should have both -F arch=b64 and -F arch=b32 for each rule.
The rules we use are:
# First rule - delete all
-D
# Increase the buffers to survive stress events.
# Make this bigger for busy systems
-b 320
Bump this up, maybe to 8192. That's what we use for CAPP.
# Feel free to add below this line. See auditctl man page
-a entry,always -S clone -S fork -S vfork
If you are on amd64, I would suggest:
-a entry,always -F arch=b32 -S clone -S fork -S vfork
-a entry,always -F arch=b64 -S clone -S fork -S vfork
and similar for other syscall rules.
-a entry,always -S execve
-a entry,always -S exit_group -S exit
Very strange. It works fine with self-compiled on RHEL 5.2. I understand that you are not Debian guys; I just wanted to ask you briefly if you were aware of anything that could cause that. I am going to report it as a bug (to them) otherwise.
There might be tunables that different distros use with glibc. strace is your friend... and have both 32/64-bit rules if amd64 is the target platform.
With our rules file, we have grouped only similar-purpose syscalls that we care about. The goal we have is to track all newly created processes, their exits, and the code they run. If you are aware of anything we miss, please point it out.
This is a really tricky area. They could mmap a file and execute it. They can pass file descriptors between processes and execve /proc/<pid>/fd/4. Or maybe take advantage of a hole in a program and overlay memory with another program so that /proc shows one thing but it's really another. It's really hard to make airtight. SE Linux is your best bet to make sure people stay within the bounds that you intend, which means that the real processes are auditable.
Also, is it true (I read that yesterday) that every syscall is slowed down for every new rule?
Yes, if they are syscall rules. It's best to group as many together as possible.
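To illustrate the grouping (the same syscalls as in the rules above, just combined):

```
# Three rules: three filter entries are walked on every syscall
-a entry,always -F arch=b64 -S clone
-a entry,always -F arch=b64 -S fork
-a entry,always -F arch=b64 -S vfork

# One grouped rule: a single entry covers all three
-a entry,always -F arch=b64 -S clone -S fork -S vfork
```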
That means we are making a mistake by not having only one line?
I wouldn't say a mistake. It's just that there will be a performance difference, and it may not be enough to worry about. You would have to benchmark it.
And is open() performance really affected by this?
Yes.
Does audit not (yet?) use other tracing interfaces like SystemTap, etc., where people try to have zero cost for inactive traces?
They have a cost. :) Also, SystemTap, while good for some things, is not good for auditing. For one, SystemTap recompiles the kernel to make new modules. You may not want that in your environment. It also has not been tested for CAPP/LSPP compliance.
Also, on a general basis: do you recommend using the sub-daemon for the job, or should we rather use libaudit for the task instead? Any insight is welcome here.
It really depends on what your environment allows. Do you need an audit trail?
With search tools? And reporting tools? Do you need the system to halt if
auditing problems occur? Do you need any certifications?
What we would like to achieve is:
1. Monitor every created process if it (was) relevant to something. We don't want to miss a process, however briefly it ran.
This is hard, but can be achieved with help from SE Linux.
2. We don't want to poll periodically, but rather only wake up (and then with minimal latency) when something interesting happened. We would, however, want a periodic check that forks are still being reported, so we would detect a loss of service from audit.
You might write an audispd plugin for this.
-Steve