On Thu, 2008-08-28 at 14:47 -0400, DJ Delorie wrote:
Third in a series.
(http://www.redhat.com/archives/linux-audit/2008-August/msg00118.html)
The goal of this patch is to robustify the error handling in the
client end of the remote protocol. The following changes are made
by this patch:
* Failure to send a record to the aggregator results in a series of
retry attempts, tunable by the administrator.
* Overall network failure (after retries) and server-indicated error
conditions now have admin-specified actions associated with them.
* Miscellaneous additional error handling for reads and writes.
Comments?
I regret not having had time to review the patches properly, but I just
read the discussion from around Aug. 14th about the possibility of
queuing messages. (I remember having a similar discussion with Steve a
couple of months ago.) I was wondering whether it would be possible to
stop reading from stdin once we detect a temporary network outage (this
could be extended to other error conditions).
Given the serial of the last event that was successfully sent, once the
error condition is gone we could use auparse calls to read the logs,
starting exactly at (event serial + 1) and processing all events until
we 'meet again' at the last event seen on stdin (stdin events are
discarded immediately while we process the backlog), and then resume
from there (back to reading stdin).
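
To make the idea concrete, here is a rough sketch of the replay step, in
Python for brevity. It is only an illustration under assumptions: events
are modeled as (serial, record) pairs already ordered by serial, and the
names catch_up, last_sent, last_seen and send are hypothetical. A real
plugin would get its events from auparse rather than from a list, and
would have to cope with serials restarting, which this sketch ignores.

```python
from typing import Callable, Iterable, Tuple

Event = Tuple[int, str]  # (serial, record); a simplification for this sketch


def catch_up(log_events: Iterable[Event],
             last_sent: int,
             last_seen: int,
             send: Callable[[int, str], None]) -> int:
    """Replay logged events missed during an outage.

    last_sent: serial of the last event delivered before the outage.
    last_seen: serial of the last event read (and discarded) from stdin.
    Returns the serial of the last event delivered by the replay.
    Assumes log_events is ordered by increasing serial (an assumption).
    """
    for serial, record in log_events:
        if serial <= last_sent:
            continue  # already delivered before the outage
        if serial > last_seen:
            break     # stdin has not shown these yet; it will deliver them
        send(serial, record)
        last_sent = serial
    return last_sent


# Usage: events 1..10 are in the log, 3 was the last one delivered, and
# stdin had reached 7 by the time the network came back.
sent = []
log = [(n, "event %d" % n) for n in range(1, 11)]
resume_at = catch_up(log, last_sent=3, last_seen=7,
                     send=lambda serial, record: sent.append(serial))
print(sent, resume_at)
```

After the replay, normal processing resumes with the next event read
from stdin, so nothing has to be stored twice.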
I don't know the details of AMQP, but it possibly involves secondary
storage on disk. I wonder if we could achieve the same robustness
without duplicating the data (audit logs and queue).
On another subject, I liked how you can return error conditions from the
server and then associate an action with each of them. If I have the
time, I'll see what I can do to mimic this behavior in the zos-remote
plugin.
Thanks!
-Klaus
--
Klaus Heinrich Kiwi <klausk(a)linux.vnet.ibm.com>
Linux Security Development, IBM Linux Technology Center