Re: Flush the hold queue fall into an infinite loop.

Thursday, 13 January 2022

On Thu, Jan 13, 2022 at 6:57 AM cuigaosheng <cuigaosheng1(a)huawei.com> wrote:
...

 When we add "audit=1" to the cmdline, kauditd will take up 100% of
 the CPU under the following conditions:

 configurations:
 auditctl -b 64
 auditctl --backlog_wait_time 60000
 auditctl -r 0
 auditctl -w /root/aaa  -p wrx
 shell script:
 #!/bin/bash
 i=0
 while [ $i -le 66 ]
 do
    touch /root/aaa
    let i++
 done
 mandatory conditions:

 add "audit=1" to the cmdline, and run kill -19 pid_number (the pid of /sbin/auditd).

  As long as audit_hold_queue stays non-empty, flushing the hold queue falls into
  an infinite loop.

  713 static int kauditd_send_queue(struct sock *sk, u32 portid,
  714                               struct sk_buff_head *queue,
  715                               unsigned int retry_limit,
  716                               void (*skb_hook)(struct sk_buff *skb),
  717                               void (*err_hook)(struct sk_buff *skb))
  718 {
  719         int rc = 0;
  720         struct sk_buff *skb;
  721         unsigned int failed = 0;
  722
  723         /* NOTE: kauditd_thread takes care of all our locking, we just use
  724          *       the netlink info passed to us (e.g. sk and portid) */
  725
  726         while ((skb = skb_dequeue(queue))) {
  727                 /* call the skb_hook for each skb we touch */
  728                 if (skb_hook)
  729                         (*skb_hook)(skb);
  730
  731                 /* can we send to anyone via unicast? */
  732                 if (!sk) {
  733                         if (err_hook)
  734                                 (*err_hook)(skb);
  735                         continue;
  736                 }
  737
  738 retry:
  739                 /* grab an extra skb reference in case of error */
  740                 skb_get(skb);
  741                 rc = netlink_unicast(sk, skb, portid, 0);
  742                 if (rc < 0) {
  743                         /* send failed - try a few times unless fatal error */
  744                         if (++failed >= retry_limit ||
  745                             rc == -ECONNREFUSED || rc == -EPERM) {
  746                                 sk = NULL;
  747                                 if (err_hook)
  748                                         (*err_hook)(skb);
  749                                 if (rc == -EAGAIN)
  750                                         rc = 0;
  751                                 /* continue to drain the queue */
  752                                 continue;
  753                         } else
  754                                 goto retry;
  755                 } else {
  756                         /* skb sent - drop the extra reference and continue */
  757                         consume_skb(skb);
  758                         failed = 0;
  759                 }
  760         }
  761
  762         return (rc >= 0 ? 0 : rc);
  763 }

 When kauditd attempts to flush the hold queue, the queue parameter is
 &audit_hold_queue. If netlink_unicast() (line 741) returns -EAGAIN, sk is
 set to NULL (line 746) and err_hook (kauditd_rehold_skb) is called. After
 the continue, skb_dequeue() (line 726) and err_hook (kauditd_rehold_skb,
 line 733) fall into an infinite loop.

 I don't really understand the value of audit_hold_queue. Can we remove it,
 or stop dropping the logs into kauditd_rehold_skb when auditd is abnormal?

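
The loop described in the report can be illustrated with a small user-space
sketch. It is not kernel code: struct node, struct queue, dequeue(),
queue_head() and simulate() are hypothetical stand-ins for struct sk_buff,
struct sk_buff_head, skb_dequeue(), skb_queue_head() and the "!sk" branch of
kauditd_send_queue(). It assumes, as the report implies, that the rehold hook
puts each record back on the same queue that is being drained. The max_iter
cap exists only so that the demonstration terminates.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct sk_buff. */
struct node {
    struct node *next;
};

/* Hypothetical stand-in for struct sk_buff_head. */
struct queue {
    struct node *head;
};

/* Models skb_dequeue(): remove and return the first record, or NULL. */
static struct node *dequeue(struct queue *q)
{
    struct node *n = q->head;

    if (n) {
        q->head = n->next;
        n->next = NULL;
    }
    return n;
}

/* Models skb_queue_head(): put a record at the front of the queue. */
static void queue_head(struct queue *q, struct node *n)
{
    assert(n != NULL);
    n->next = q->head;
    q->head = n;
}

/*
 * Drain a queue of nrecords entries the way the "!sk" branch does:
 * dequeue, call the error hook, continue. With rehold != 0 the hook
 * puts the record back on the same queue (the assumed rehold behaviour);
 * with rehold == 0 the hook simply drops it.
 * Returns the number of loop iterations, capped at max_iter.
 */
unsigned long simulate(int nrecords, unsigned long max_iter, int rehold)
{
    struct node nodes[16];
    struct queue q = { NULL };
    struct node *n;
    unsigned long iter = 0;
    int i;

    assert(nrecords >= 0 && nrecords <= 16);
    for (i = nrecords - 1; i >= 0; i--) {
        nodes[i].next = NULL;
        queue_head(&q, &nodes[i]);
    }

    while (iter < max_iter && (n = dequeue(&q))) {
        if (rehold)
            queue_head(&q, n);  /* record goes back; queue never empties */
        iter++;
    }
    return iter;
}
```

With rehold enabled, simulate() runs until it reaches max_iter no matter how
few records are queued. With rehold disabled, it stops after nrecords
iterations. Under the stated assumption, this is the difference between the
spinning and the terminating case.
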
Thanks Gaosheng for the bug report, I'm able to reproduce this and I'm
looking into it now.  I'll report back when I have a better idea of
the problem and a potential fix.

-- 
paul moore
www.paul-moore.com
