Re: [PATCH][RFC] audit: set wait time to zero when audit failed

Monday, 16 September 2019

...
 -----Original Message-----
 From: Paul Moore [mailto:paul@paul-moore.com]
 Sent: September 17, 2019 6:52
 To: Li,Rongqing <lirongqing(a)baidu.com>
 Cc: Eric Paris <eparis(a)redhat.com>; linux-audit(a)redhat.com
 Subject: Re: [PATCH][RFC] audit: set wait time to zero when audit failed

 On Sun, Sep 15, 2019 at 10:55 PM Li,Rongqing <lirongqing(a)baidu.com> wrote:
 > > > If audit_log_start() fails because the queue is full, kauditd is
 > > > waiting for the receiving queue to empty. With no receiver, a task
 > > > is forced to wait 60 seconds for each audited syscall, so it hangs
 > > > for a very long time.
 > > >
 > > > In this condition, set the wait time to zero to reduce the wait,
 > > > and restore the wait time when audit works again.
 > > >
 > > > This partially restores the behavior from before commit
 > > > 3197542482df ("audit: rework audit_log_start()")
 > > >
 > > > Signed-off-by: Li RongQing <lirongqing(a)baidu.com>
 > > > Signed-off-by: Liang ZhiCheng <liangzhicheng(a)baidu.com>
 > > > ---
 > > > Reboot takes a very long time on my machine (CentOS 6u4 + kernel
 > > > 5.3) because TIF_SYSCALL_AUDIT is set by default, and on reboot the
 > > > userspace process that receives audit messages is killed, which
 > > > leaves no one to drain the audit queue.
 > > >
 > > > git bisect shows it is caused by 3197542482df ("audit: rework
 > > > audit_log_start()")
 > > >
 > > >  kernel/audit.c | 9 +++++++--
 > > >  1 file changed, 7 insertions(+), 2 deletions(-)
 > >
 > > This is typically solved by increasing the backlog using the
 > > "audit_backlog_limit" kernel parameter (link to the docs below).
 >
 > It should be able to avoid my issue, but the default behavior does not
 > work for me. Also, not everyone knows enough about audit; they may
 > spend a lot of effort finding the root cause and estimating how large
 > "audit_backlog_limit" should be.

 The pause/sleep behavior is desired behavior and is intended to help
 kauditd/auditd process the audit backlog on a busy system.  If we didn't sleep
 the current process and give kauditd/auditd a chance to flush the backlog when
 it was full, a lot of bad things could happen with respect to audit.  We
 generally select the backlog limit so that this is not a problem for most systems,
 although there will always be edge cases where the default does not work well;
 it is impossible to pick defaults that work well for every case.
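
For readers unfamiliar with the tunable discussed above, here is a minimal sketch of inspecting and raising the backlog limit at runtime with auditctl. The value 8192 is only an assumed example, not a recommendation from this thread, and the helper names (run, show_audit_status, raise_backlog_limit) are hypothetical. By default the helper only prints the command; set APPLY=1 as root to actually run it.

```shell
# Hypothetical helper: print a command, or run it when APPLY=1 (needs root).
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# Show the current audit status, which includes backlog_limit and backlog.
show_audit_status() { run auditctl -s; }

# Raise the backlog limit at runtime; 8192 is an assumed example value.
raise_backlog_limit() { run auditctl -b "${1:-8192}"; }

show_audit_status
raise_backlog_limit 8192
```

The same limit can be set at boot with the audit_backlog_limit= kernel parameter mentioned earlier in the thread.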

I just want it to behave as it did before 3197542482df ("audit: rework audit_log_start()"):
wait 60 seconds once if auditd/readahead-collector has some problem
draining the audit backlog.

And once auditd/readahead-collector recovers, restore the wait time to 60 seconds.
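
The difference between the two behaviors can be illustrated with a toy model. This is only a sketch under stated assumptions, not kernel code: total_wait, the policy names "always" and "once", and the F/R event encoding are all hypothetical, and the real backlog queue is not modeled. "always" waits the full 60 seconds on every audited syscall that finds the queue full; "once" drops the wait time to zero after the first timed-out wait and restores it when the receiver recovers, as requested above.

```shell
WAIT_DEFAULT=60   # seconds; the 60-second wait described in this thread

# total_wait POLICY EVENTS
#   POLICY: "always" (wait on every full-queue hit) or
#           "once"   (after one timed-out wait, stop waiting until recovery)
#   EVENTS: string of F (audited syscall finds the queue full, nobody draining)
#           and R (receiver recovered and drained the queue)
# Prints the total number of seconds the task spent waiting.
total_wait() {
    policy=$1
    events=$2
    wait_time=$WAIT_DEFAULT
    total=0
    while [ -n "$events" ]; do
        rest=${events#?}          # everything after the first character
        ev=${events%"$rest"}      # the first character
        events=$rest
        case $ev in
            F)
                total=$((total + wait_time))
                # Requested behavior: a timed-out wait zeroes the wait time.
                if [ "$policy" = "once" ]; then
                    wait_time=0
                fi
                ;;
            R)
                # Receiver is back: restore the full wait time.
                wait_time=$WAIT_DEFAULT
                ;;
        esac
    done
    echo "$total"
}

total_wait always FFF    # prints 180
total_wait once FFF      # prints 60
total_wait once FFRF     # prints 120
```

In this model, three audited syscalls with no receiver cost 180 seconds under "always" but 60 seconds under "once", and the 60-second wait applies again after a recovery.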

...
 If you are not using audit, you can always disable it via the kernel command
 line, or at runtime (look at what Fedora does).
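
As a hedged illustration of the runtime option: auditctl -e 0 disables auditing until it is re-enabled, and booting with audit=0 on the kernel command line disables it from boot. The helper names below (run, disable_audit_runtime) are hypothetical, and the helper only prints the command unless APPLY=1 is set. Whether this matches what Fedora does is not stated in this thread.

```shell
# Hypothetical helper: print a command, or run it when APPLY=1 (needs root).
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# Disable auditing at runtime (enabled flag set to 0).
disable_audit_runtime() { run auditctl -e 0; }

# Boot-time alternative: add audit=0 to the kernel command line.
disable_audit_runtime
```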

 > > You might also want to investigate
 > > what is generating so many audit records prior to starting the
 > > audit daemon.
 >
 > It is /sbin/readahead-collector; in fact, we stop auditd. We are doing a
 > reboot test, which reboots the machine continuously to test hardware/software.
 >
 > It is the same as the following:
 > auditctl -a always,exit -S all -F pid='xxx'
 > kill -s 19 `pidof auditd`
 >
 > Then the audited task hangs.

 So you are seeing this problem only when you run a test, or did you provide this
 as a reproducer?

auditctl -a always,exit -S all -F ppid=`pidof sshd`
kill -s 19 `pidof auditd`
ssh root@127.0.0.1

Then ssh hangs forever.

-Li RongQing

...
 --
 paul moore
 www.paul-moore.com 
