Steve,
One of the issues with ausearch's checkpoint code is how to recover from
failures. A classic failure is to take a checkpoint on a busy system and
then wait too long before the next invocation of ausearch. Because of
the delay, the checkpointed event can no longer be found in the files in
/var/log/audit. There are other failures, such as re-use of inodes.
For those of you who haven't noticed the ausearch --checkpoint change, it
records the details of the last complete audit event it processed or
printed in a checkpoint file. It records not only the event time, but
also the event node, serial and type, and the device and inode of the
log file. Thus, when you next invoke ausearch with this option,
processing resumes at the first complete event after the one recorded.
Should an error occur when attempting to find the next complete event to
process, ausearch will exit. At this point, I believe the best recovery
action is to extract only the event time from the checkpoint file and
ask for all complete events after that time (as opposed to the usual
action of comparing time, event id, type, log file details etc.).
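
To make the difference between the two lookups concrete, here is a rough
sketch in Python. It is not the ausearch code: the Event class and the
function names are made up for illustration, the log file device and
inode are left out, and treating "after that time" as "at or after" is
my assumption (it may re-report the checkpointed event, but should not
lose one).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Event:
    """Hypothetical stand-in for the fields a checkpoint records."""
    sec: int
    milli: int
    serial: int
    node: Optional[str] = None
    type: Optional[str] = None


def resume_exact(events: List[Event], cp: Event) -> List[Event]:
    """Usual path: find the checkpointed event, return what follows it."""
    for i, ev in enumerate(events):
        if ev == cp:
            return events[i + 1:]
    raise LookupError("checkpointed event not found in the logs")


def resume_time_only(events: List[Event], cp: Event) -> List[Event]:
    """Recovery path: compare only the time, ignore serial, node, type."""
    return [ev for ev in events if (ev.sec, ev.milli) >= (cp.sec, cp.milli)]


def resume(events: List[Event], cp: Event) -> List[Event]:
    """Try the exact match first and fall back to the time-only match."""
    try:
        return resume_exact(events, cp)
    except LookupError:
        return resume_time_only(events, cp)
```

The fallback only triggers when the exact match fails, so the normal
case keeps its precision.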
There are at least two solutions:
a. We can patch ausearch to take a --checkpoint-time-only flag, which
makes ausearch look for all events since the time in the checkpoint
file.
b. We extract the timestamp from the checkpoint file, convert it to a
date and time and use ausearch's --start option to find all events since
the time in the checkpoint file.
The first provides greater granularity in time as it goes to msecs.
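
For what it's worth, option (b) could be scripted roughly as below. This
is only a sketch under assumptions: I assume the checkpoint file holds
the event id in the usual sec.msec:serial form, and that ausearch is run
in a locale where --start takes MM/DD/YYYY (the accepted date format
follows LC_TIME). The function names are made up. Note that the
milliseconds are dropped, so events from the same second may be
reported again.

```python
import re
from datetime import datetime

# Audit event ids look like 1364481363.243:24287 (sec.msec:serial).
EVENT_ID = re.compile(r"\b(\d+)\.(\d{3}):(\d+)\b")


def checkpoint_time(text):
    """Return (seconds, milliseconds) of the first event id in the text."""
    m = EVENT_ID.search(text)
    if m is None:
        raise ValueError("no event timestamp found in checkpoint data")
    return int(m.group(1)), int(m.group(2))


def start_args(sec, tz=None):
    """Build the --start arguments; tz=None means local time."""
    when = datetime.fromtimestamp(sec, tz)
    # Assumed en_US style date; adjust for the locale ausearch runs in.
    return ["--start", when.strftime("%m/%d/%Y"), when.strftime("%H:%M:%S")]
```

The resulting list can be appended to an ausearch command line after
reading the checkpoint file contents into a string.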
Steve,
I can provide a patch. Do you want it?
Rgds
Burn