Hey Steve,
I am reaching out about the same issue: syslog still contains "auditd dispatch error (pipe full) event lost" messages.
After excluding the default events (LOGIN, USER_START, etc.) mentioned in our previous chat, the log volume dropped significantly, so I expected these error messages to go away.
Unfortunately, even after increasing the dispatcher queue size (q_depth) and changing disp_qos to lossless, I am still seeing these pipe full errors in my syslog.
The surprising thing is that when I look at the events/keys around the time of the errors, there do not seem to be enough events to explain any messages being dropped.
For example, running "aureport --summary -ts <start time of the dropped messages reported in syslog> -te <end time of the dropped messages reported in syslog> -i (-x/-u/--key)" shows around 2000 events in total for that period. The dispatcher queue size is close to 25,000, so I am not sure why the dispatcher cannot handle these messages: the queue is large enough to hold more than 10x the events being seen.
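In case it helps to compare against the aureport totals, here is a minimal sketch of how the pipe full lines in syslog could be tallied per minute. The message text, the classic syslog timestamp format, and the sample lines are all assumptions for illustration, not output from my system.

```python
import re
from collections import Counter

# Assumed message fragment; adjust to match what syslog actually shows.
PIPE_FULL = re.compile(r"pipe full", re.IGNORECASE)
# Assumed classic syslog timestamp prefix, e.g. "Mar  3 14:05:09".
STAMP = re.compile(r"^(\w{3}\s+\d+\s+\d{2}:\d{2}):\d{2}")

def count_pipe_full_per_minute(lines):
    """Return a Counter mapping 'Mon D HH:MM' to the number of pipe-full lines."""
    counts = Counter()
    for line in lines:
        if not PIPE_FULL.search(line):
            continue
        match = STAMP.match(line)
        if match:
            # Collapse the padded day field ("Mar  3") to single spaces.
            counts[" ".join(match.group(1).split())] += 1
    return counts

# Made-up sample lines for illustration only.
sample = [
    "Mar  3 14:05:09 host auditd[812]: dispatch err (pipe full) event lost",
    "Mar  3 14:05:41 host auditd[812]: dispatch err (pipe full) event lost",
    "Mar  3 14:06:02 host auditd[812]: dispatch err (pipe full) event lost",
    "Mar  3 14:06:10 host sshd[999]: Accepted publickey for user",
]

for minute, n in sorted(count_pipe_full_per_minute(sample).items()):
    print(minute, n)
```

This counts log lines rather than events, so it would only be a lower bound if syslog rate-limits or collapses repeated messages.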
Some other theoretical questions I had surrounding this are:
- The audit daemon picks events from the kernel buffer and sends them to the dispatcher buffer. Who writes these logs to /var/log/audit.log: the daemon or the dispatcher? Also, are the total events recorded in /var/log/audit.log inclusive or exclusive of the dropped events reported in syslog? That is, is it possible that all the events were recorded in audit.log, but syslog cannot keep up with them, since it is the only plugin used by the dispatcher?
- Is there a way to find out the total number of events dropped by the dispatcher?
- In auditd v3+, the daemon itself handles dispatching. So what does q_depth refer to in that case?
- The man pages for disp_qos across different distros share the statement "There is a 128k buffer between the audit daemon and dispatcher." Yet the default value of q_depth differs between distros, ranging from 80 to 1200. How can these numbers vary while the size of the buffer remains 128k?
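
To put numbers on that last question, here is a quick back-of-the-envelope sketch. It assumes (possibly wrongly) that the 128k buffer is what q_depth divides into slots, and that "128k" means 128 KiB; the function name is just for illustration.

```python
# Assumption: the man page's "128k buffer" is 128 KiB.
BUFFER_BYTES = 128 * 1024

def bytes_per_slot(q_depth):
    """Bytes each slot would get if the buffer were split evenly across q_depth slots."""
    return BUFFER_BYTES / q_depth

# The two default q_depth values mentioned above.
for q_depth in (80, 1200):
    print(q_depth, round(bytes_per_slot(q_depth)))
```

Under that assumption a slot would be about 1638 bytes at q_depth 80 but only about 109 bytes at q_depth 1200, which makes it look as though the two settings describe separate things, though that is only a guess.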