On Thursday, September 15, 2011 02:32:59 AM Vipin Rathor wrote:
One strange thing I'm seeing in /var/log/messages w.r.t. auditd
restart.
2011-09-14T11:49:14.541661-07:00 audisp-remote: audisp-remote is exiting on stop request
2011-09-14T11:49:18.741166-07:00 kernel: audit: *NO* daemon at audit_pid=1652525
2011-09-14T11:49:18.741190-07:00 kernel: __ratelimit: 366 callbacks suppressed
2011-09-14T11:49:18.745558-07:00 auditd[1654362]: Started dispatcher: /sbin/audispd pid: 1654364
2011-09-14T11:49:18.746081-07:00 audispd: max_restarts_parser called with: 10
2011-09-14T11:49:18.746099-07:00 audispd: priority_boost_parser called with: 10
2011-09-14T11:49:18.746666-07:00 audispd: audispd initialized with q_depth=90000 and 1 active plugins
2011-09-14T11:49:18.747047-07:00 audisp-remote: Connected to <remote_audit_logging_server_IP>
2011-09-14T11:49:18.750761-07:00 kernel: audit: audit_lost=3823 audit_rate_limit=0 audit_backlog_limit=20480
2011-09-14T11:49:18.750773-07:00 kernel: audit: auditd dissapeared <========= why this message?
2011-09-14T11:49:18.750777-07:00 kernel:
This comes from the following code:
http://lxr.linux.no/#linux+v3.0.4/kernel/audit.c#L401
It sort of follows this:
446 if (audit_pid)
447 kauditd_send_skb(skb);
Then
401 err = netlink_unicast(audit_sock, skb, audit_nlk_pid, 0);
402 if (err < 0) {
404 printk(KERN_ERR "audit: *NO* daemon at audit_pid=%d\n", audit_pid);
405 audit_log_lost("auditd disappeared\n");
So what appears to have happened is this: you have a busy system and an event was queued to
be sent to user space. audit_pid was still set, so the kernel started the send, but the daemon
had exited by the time the call was made, so the netlink layer couldn't find the pid and the
send failed.
Eric, is there anything that can be done about this race?
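To make the window concrete, here is a minimal user-space sketch of the check-then-send
sequence described above. It is only a model: struct model, model_unicast(), model_send()
and model_event() are hypothetical names invented for illustration, not kernel APIs, and
the model uses a single pid where the kernel code passes audit_nlk_pid to netlink_unicast().

```c
#include <stdio.h>

/* Hypothetical user-space model of the race; none of these names exist
 * in the kernel. */
struct model {
    int audit_pid;     /* pid the "kernel" believes auditd has; 0 = none */
    int listener_pid;  /* pid actually listening on the socket; 0 = none */
    int lost;          /* number of events that could not be delivered */
};

/* Stand-in for netlink_unicast(): fails when nobody listens at that pid. */
static int model_unicast(const struct model *m, int pid)
{
    return (pid != 0 && m->listener_pid == pid) ? 0 : -1;
}

/* Stand-in for the send path: report the failure and count a lost event. */
static int model_send(struct model *m)
{
    if (model_unicast(m, m->audit_pid) < 0) {
        printf("audit: *NO* daemon at audit_pid=%d\n", m->audit_pid);
        m->lost++;
        return -1;
    }
    return 0;
}

/* Pass one event through the model. When daemon_exits_in_window is
 * nonzero, the daemon goes away after the audit_pid check but before
 * the send, which is the window discussed above.
 * Returns 0 if delivered, -1 if lost, 1 if no daemon was registered. */
static int model_event(struct model *m, int daemon_exits_in_window)
{
    if (!m->audit_pid)
        return 1;                /* check: nothing to send to */
    if (daemon_exits_in_window)
        m->listener_pid = 0;     /* daemon exits right after the check */
    return model_send(m);        /* act: the send now fails */
}
```

Calling model_event() with the flag clear delivers the event. Calling it with the flag set
goes through the failure branch and bumps the lost counter, even though the audit_pid check
passed a moment earlier.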
Whenever I restart auditd with the 'service auditd restart' command,
auditd does get restarted. But the very next moment I get the
"kernel: audit: auditd dissapeared" message and auditing stops
(actually, it falls back to syslog). I have to run 'service auditd
restart' again to get auditing back, so it takes two restart
operations to do the job. This behavior is consistent and I can
recreate it at will.
This is strange too, but it sounds like it could be another race of some kind.
-Steve