Re: why I have lost messages on boot even with very big backlog while I hunting only 2 syscalls?

Saturday, 30 September 2017

On Saturday, September 30, 2017 8:48:23 AM EDT you wrote:
...
 Re: why I have lost messages on boot even with very big backlog while I
 hunting only 2 syscalls?
 From:	Lev Olshvang <levonshe(a)yandex.com>
 To:	Me
 CC:	"linux-audit(a)redhat.com" <linux-audit(a)redhat.com>
 Date:	9/30/17 8:48 AM

 28.09.2017, 17:02, "Steve Grubb" <sgrubb(a)redhat.com>:
 > Hello,
 > 
 > On Thursday, September 28, 2017 4:51:38 AM EDT Lev Olshvang wrote:
 >>  28.09.2017, 00:32, "Steve Grubb" <sgrubb(a)redhat.com>:
 >>  > On Wednesday, September 27, 2017 4:41:29 PM EDT Lev Olshvang wrote:
 >>  >> Hello list !
 >>  >> 
 >>  >> A very technical question.
 >>  >> I have Ubuntu 16.10 in VirtualBox, auditd 2.7.8.
 >>  >> I have the audit=1 parameter in grub.cfg.
 >>  >> I see that /proc/cmdline indeed shows it.
 >>  >> 
 >>  >> I see that auditd is started with PID 564
 >>  >> 
 >>  >> root 312 2 0 23:12 ? 00:00:00 [kauditd]
 >>  >> root 564 1 0 23:12 ? 00:00:00 /sbin/auditd
 >>  >> 
 >>  >> And I have 15 lost messages ???
 >>  >> auditctl -s
 >>  >> enabled 1
 >>  >> failure 1
 >>  >> pid 564
 >>  >> rate_limit 0
 >>  >> backlog_limit 16384
 >>  >> lost 15
 >>  >> backlog 0
 >>  >> backlog_wait_time 30
 >>  >> loginuid_immutable 0 unlocked
 >>  >> 
 >>  >> auditctl -l
 >>  >> -a always,exit -F arch=b64 -S execve,execveat -F key=exec
 >>  >> 
 >>  >> Do I understand correctly that auditd is indeed started by systemd
 >>  >> before other services, except the 2 that are listed in the
 >>  >> auditd.service dependencies: local-fs and some temp setup of systemd?
 >>  > 
 >>  > Yes, it is started before most services. However, systemd-journal for
 >>  > some reason feels obligated to enable auditing. And sometimes people put
 >>  > audit=1 on the kernel command line. Either way, auditing is on way
 >>  > before auditd starts. The audit logs have a 64 entry buffer by default.
 >>  > So, as the system boots, events pile up and eventually overflow the
 >>  > 64 entry limit.
 >>  > 
 >>  > The fix is to add another boot command option, audit_backlog_limit=8192
 >>  > or some other suitable number. The test to check for this is to boot your
 >>  > system, log in, and run auditctl -s. If you have just booted and lost
 >>  > events during boot, this should fix it.
 >>  > 
 >>  > -Steve
 >>  
 >>  Hi Steve
 >>  
 >>  Thank you for your answer.
 >>  I added the backlog parameter as you advised, but it did not solve the
 >>  problem.
 >>  
 >>  cat /proc/cmdline
 >>  BOOT_IMAGE=/vmlinuz-4.8.0-59-generic root=/dev/mapper/kubuntu--vg-root ro net.ifnames=0 biosdevname=0 audit=1 audit_backlog_limit=8192 debug splash
 >>  auditctl -s
 >>  enabled 1
 >>  failure 1
 >>  pid 672
 >>  rate_limit 0
 >>  backlog_limit 16384
 >>  lost 16
 >>  backlog 10
 >>  backlog_wait_time 30
 >>  loginuid_immutable 0 unlocked
 >>  
 >>  Perhaps something else in the configuration?
 > 
 > You have a backlog of 10. That should normally be 0 unless the system is
 > very busy. What do you have for the flush and freq settings in
 > auditd.conf?
 > 
 > -Steve

 Hi Steve,

 I overlooked your mail yesterday, sorry for the delay.

 Here is the auditd.conf:

 local_events = yes
 write_logs = yes
 log_format = RAW
 log_file = /var/log/audit/audit.log
 log_group = root
 priority_boost = 16
 flush = INCREMENTAL_ASYNC
 freq = 20
 num_logs = 5
 disp_qos = lossy

 I increased priority_boost from 4 to 16 in the hope of solving the lost
 messages problem. I observed other values of backlog; it was sometimes 6,
 sometimes 7.

 Today I made the backlog very big; here are the results:
 enabled 1
 failure 1
 pid 663
 rate_limit 0
 backlog_limit 32768
 lost 15
 backlog 0
 backlog_wait_time 15000

 Still 15 lost, no events in backlog.
 Perhaps I need to add some tracer to the lost messages code in the kernel
 to debug it.

Maybe adjust your freq from 20 to 50. Other than that, I don't know of 
any other user space tricks to improve the flow rate. Maybe Paul or Richard 
have ideas. I see you have a 4.8 kernel. I think I remember there being some 
netlink communication issues prior to 4.12.
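
To check whether a given machine falls on the older side of that version, the release string can be compared numerically. This is a minimal sketch: the helper name `kernel_older_than` is made up for illustration, it assumes the release string starts with `major.minor` (as in `4.8.0-59-generic` from the command line quoted above), and the 4.12 threshold is only the recollection stated above, not a confirmed boundary.

```python
import re


def kernel_older_than(release, major, minor):
    """Return True if a release string like '4.8.0-59-generic' is older than major.minor.

    Hypothetical helper for illustration; only the leading 'major.minor' is compared.
    """
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return (int(match.group(1)), int(match.group(2))) < (major, minor)


# Release taken from the BOOT_IMAGE entry quoted earlier in this thread.
print(kernel_older_than("4.8.0-59-generic", 4, 12))  # True
```

On a live system the release string could come from `platform.release()` in the Python standard library instead of a literal.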

-Steve
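
For comparing runs across reboots, the `auditctl -s` output and the kernel command line quoted in this thread can be turned into dictionaries with a few lines of Python. This is only a sketch: the function names are hypothetical, the sample text is copied from the messages above, and it assumes `auditctl -s` prints one `key value` pair per line as shown there.

```python
def parse_audit_status(text):
    """Parse 'auditctl -s' style output (one 'key value' per line) into a dict."""
    status = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        key, value = parts[0], parts[1]
        status[key] = int(value) if value.isdigit() else value
    return status


def cmdline_options(cmdline):
    """Return the key=value options of a kernel command line as a dict of strings."""
    options = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if sep:  # bare flags such as 'ro' or 'debug' are ignored
            options[key] = value
    return options


# Sample data copied from the messages in this thread.
SAMPLE_STATUS = """\
enabled 1
failure 1
pid 672
rate_limit 0
backlog_limit 16384
lost 16
backlog 10
backlog_wait_time 30
loginuid_immutable 0 unlocked
"""
SAMPLE_CMDLINE = (
    "BOOT_IMAGE=/vmlinuz-4.8.0-59-generic root=/dev/mapper/kubuntu--vg-root ro "
    "net.ifnames=0 biosdevname=0 audit=1 audit_backlog_limit=8192 debug splash"
)

status = parse_audit_status(SAMPLE_STATUS)
options = cmdline_options(SAMPLE_CMDLINE)
print("lost:", status["lost"], "backlog:", status["backlog"])
print("boot-time backlog limit:", options.get("audit_backlog_limit"))
print("runtime backlog limit:", status["backlog_limit"])
```

On a live system the two inputs would come from running `auditctl -s` as root and reading `/proc/cmdline`. Printing the boot-time and runtime limits side by side makes it easy to see that they differ in the output quoted here (8192 versus 16384).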
