The log is as follows:

[  257.972293] CPU: 79 PID: 550 Comm: kauditd Kdump: loaded Tainted: G           OE    --------- -t - 4.18.0-147.5.2.5.h781.eulerosv2r10.x86_64 #1
[  257.972294] Hardware name: Huawei CH121 V5/IT11SPCA1, BIOS 7.93 01/14/2021
[  257.972295] Call Trace:
[  257.972297]  <IRQ>
[  257.972307]  dump_stack+0x6f/0xab
[  257.972314]  watchdog_timer_fn+0x222/0x2e0
[  257.972316]  ? watchdog+0x50/0x50
[  257.972322]  __hrtimer_run_queues+0x125/0x2f0
[  257.972326]  ? recalibrate_cpu_khz+0x10/0x10
[  257.972329]  hrtimer_interrupt+0xe5/0x240
[  257.972331]  ? sched_clock+0x5/0x10
[  257.972334]  smp_apic_timer_interrupt+0x6a/0x130
[  257.972336]  apic_timer_interrupt+0xf/0x20
[  257.972337]  </IRQ>
[  257.972341] RIP: 0010:_raw_spin_unlock_irqrestore+0x11/0x20
[  257.972343] Code: ff ff 7f 5b 44 89 e8 5d 41 5c 41 5d c3 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 c6 07 00 0f 1f 40 00 48 89 f7 57 9d <0f> 1f 44 00 00 c3 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 c6 07
[  257.972344] RSP: 0018:ffffb7d90e2d3e38 EFLAGS: 00000286 ORIG_RAX: ffffffffffffff13
[  257.972347] RAX: 0000000000000286 RBX: ffff9bb017d18b00 RCX: ffff9bb017d19900
[  257.972347] RDX: ffffffff8fb8fef0 RSI: 0000000000000286 RDI: 0000000000000286
[  257.972348] RBP: ffffffff8fb8fef0 R08: 000000000002b3a0 R09: ffffffff8e7829a2
[  257.972349] R10: ffffd9126778fa00 R11: 00000000000f4240 R12: ffffffff8fb8ff04
[  257.972350] R13: 0000000000000000 R14: ffff9bb017d18bf4 R15: ffff9bb017d18b00
[  257.972356]  ? netlink_attachskb+0xb2/0x1d0
[  257.972362]  skb_dequeue+0x57/0x70
[  257.972367]  kauditd_send_queue+0x37/0x100
[  257.972369]  ? kauditd_retry_skb+0x20/0x20
[  257.972370]  ? kauditd_send_multicast_skb+0x90/0x90
[  257.972372]  kauditd_thread+0xa5/0x230
[  257.972377]  ? finish_wait+0x80/0x80
[  257.972378]  ? auditd_reset+0x90/0x90
[  257.972381]  kthread+0x10d/0x130
[  257.972383]  ? kthread_flush_work_fn+0x10/0x10
[  257.972385]  ret_from_fork+0x35/0x40
[  269.972020] Sample cputime: 3999999736 ns(HZ: 1000)
[  269.972022] Sample cpurate: 0 us, 3984966800 sy, 0 ni, 0 id, 0 wa, 15034536 hi, 0 si, 0 st
[  269.972023] Sample softirq:
[  269.972023] Sample hardirq:
[  269.972232]         no hard irqs found.
[  269.972233] watchdog: BUG: soft lockup - CPU#79 stuck for 22s! [kauditd:550]

Thanks.


On 2022/1/13 19:56, cuigaosheng wrote:
When we add "audit=1" to the cmdline, kauditd can take up 100% of
a CPU. To reproduce:
configurations:
	auditctl -b 64
	auditctl --backlog_wait_time 60000
	auditctl -r 0
	auditctl -w /root/aaa  -p wrx
shell scripts:
	#!/bin/bash
	i=0
	while [ $i -le 66 ]
	do
	    touch /root/aaa
	    let i++
	done
mandatory conditions:
add "audit=1" to the cmdline, and stop auditd with kill -19 pid_number
(SIGSTOP, where pid_number is the pid of /sbin/auditd).

As long as the audit_hold_queue stays non-empty, flushing the hold queue falls into
an infinite loop.

 713 static int kauditd_send_queue(struct sock *sk, u32 portid,
 714                               struct sk_buff_head *queue,
 715                               unsigned int retry_limit,
 716                               void (*skb_hook)(struct sk_buff *skb),
 717                               void (*err_hook)(struct sk_buff *skb))
 718 {
 719         int rc = 0;
 720         struct sk_buff *skb;
 721         unsigned int failed = 0;
 722
 723         /* NOTE: kauditd_thread takes care of all our locking, we just use
 724          *       the netlink info passed to us (e.g. sk and portid) */
 725
 726         while ((skb = skb_dequeue(queue))) {
 727                 /* call the skb_hook for each skb we touch */
 728                 if (skb_hook)
 729                         (*skb_hook)(skb);
 730
 731                 /* can we send to anyone via unicast? */
 732                 if (!sk) {
 733                         if (err_hook)
 734                                 (*err_hook)(skb);
 735                         continue;
 736                 }
 737
 738 retry:
 739                 /* grab an extra skb reference in case of error */
 740                 skb_get(skb);
 741                 rc = netlink_unicast(sk, skb, portid, 0);
 742                 if (rc < 0) {
 743                         /* send failed - try a few times unless fatal error */
 744                         if (++failed >= retry_limit ||
 745                             rc == -ECONNREFUSED || rc == -EPERM) {
 746                                 sk = NULL;
 747                                 if (err_hook)
 748                                         (*err_hook)(skb);
 749                                 if (rc == -EAGAIN)
 750                                         rc = 0;
 751                                 /* continue to drain the queue */
 752                                 continue;
 753                         } else
 754                                 goto retry;
 755                 } else {
 756                         /* skb sent - drop the extra reference and continue */
 757                         consume_skb(skb);
 758                         failed = 0;
 759                 }
 760         }
 761
 762         return (rc >= 0 ? 0 : rc);
 763 }
When kauditd attempts to flush the hold queue, the queue parameter is &audit_hold_queue.
If netlink_unicast (line 741) keeps returning -EAGAIN until the retry limit is reached, sk is
set to NULL (line 746) and err_hook (kauditd_rehold_skb) is called, which puts the skb back on
the hold queue. After the continue, every skb_dequeue (line 726) is followed by err_hook
(kauditd_rehold_skb, line 733), which requeues the skb, so the loop never terminates.
I don't really understand the value of audit_hold_queue. Can we remove it, or stop dropping the
logs into kauditd_rehold_skb when auditd is not working normally?
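
To make the loop described above easier to see outside the kernel, here is a
minimal userspace sketch. It is not kernel code: toy_queue, toy_enqueue,
toy_dequeue and drain are hypothetical names invented for this illustration,
and the queue holds plain ints instead of skbs. It only models the pattern
"dequeue, then an error hook puts the item back on the same queue", with an
iteration cap added so the demo can return.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_CAP 64

/* Toy FIFO standing in for the hold queue (hypothetical, not struct sk_buff_head). */
struct toy_queue {
	int items[TOY_CAP];
	size_t head, tail, len;
};

static int toy_enqueue(struct toy_queue *q, int v)
{
	if (q->len == TOY_CAP)
		return -1;
	q->items[q->tail] = v;
	q->tail = (q->tail + 1) % TOY_CAP;
	q->len++;
	return 0;
}

static int toy_dequeue(struct toy_queue *q, int *v)
{
	if (q->len == 0)
		return -1;
	*v = q->items[q->head];
	q->head = (q->head + 1) % TOY_CAP;
	q->len--;
	return 0;
}

/*
 * Drain a queue holding `initial` items, as if every send had failed.
 * If `rehold` is non-zero, the error path puts each item back on the
 * same queue (the behaviour described above); otherwise it is dropped.
 * Returns the number of loop iterations, capped at max_iters so the
 * demo terminates even though the modelled loop would not.
 */
unsigned long drain(size_t initial, int rehold, unsigned long max_iters)
{
	struct toy_queue q = {0};
	unsigned long iters = 0;
	int v;

	assert(initial <= TOY_CAP);
	for (size_t i = 0; i < initial; i++)
		toy_enqueue(&q, (int)i);

	while (iters < max_iters && toy_dequeue(&q, &v) == 0) {
		iters++;
		if (rehold)
			toy_enqueue(&q, v);	/* error hook requeues onto the queue being drained */
	}
	return iters;
}
```

With rehold enabled and a non-empty queue, drain() stops only because of the
max_iters cap. With rehold disabled it stops after one pass over the items.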

Looking forward to your reply. Thank you very much.
Gaosheng.