On Wed, Dec 1, 2021 at 9:25 PM zhaozixuan (C) <zhaozixuan2@huawei.com> wrote:
On Mon, Nov 29, 2021 at 2:35 AM zhaozixuan (C) <zhaozixuan2@huawei.com> wrote:
On Tue, Nov 23, 2021 at 2:50 AM Zixuan Zhao <zhaozixuan2@huawei.com> wrote:

We used lat_syscall of lmbench3 to test the performance impact of
this patch. We changed the number of rules and ran lat_syscall with
1000 repetitions at each test. Syscalls measured by lat_syscall are
not monitored by rules.

Before this optimization:

               null      read     write      stat     fstat      open
  0 rules    1.87ms    2.74ms    2.56ms   26.31ms    4.13ms   69.66ms
 10 rules    2.15ms    3.13ms    3.32ms   26.99ms    4.16ms   74.70ms
 20 rules    2.45ms    3.97ms    3.82ms   27.05ms    4.60ms   76.35ms
 30 rules    2.64ms    4.52ms    3.95ms   30.30ms    4.94ms   78.94ms
 40 rules    2.83ms    4.97ms    4.23ms   32.16ms    5.40ms   81.88ms
 50 rules    3.00ms    5.30ms    4.84ms   33.49ms    5.79ms   83.20ms
100 rules    4.24ms    9.75ms    7.42ms   37.68ms    6.55ms   93.70ms
160 rules    5.50ms   16.89ms   12.18ms   51.53ms   17.45ms  155.40ms

After this optimization:

               null      read     write      stat     fstat      open
  0 rules    1.81ms    2.84ms    2.42ms   27.70ms    4.15ms   69.10ms
 10 rules    1.97ms    2.83ms    2.69ms   27.70ms    4.15ms   69.30ms
 20 rules    1.72ms    2.91ms    2.41ms   26.49ms    3.91ms   71.19ms
 30 rules    1.85ms    2.94ms    2.48ms   26.27ms    3.97ms   71.43ms
 40 rules    1.88ms    2.94ms    2.78ms   26.85ms    4.08ms   69.79ms
 50 rules    1.86ms    3.17ms    3.08ms   26.25ms    4.03ms   72.32ms
100 rules    1.84ms    3.00ms    2.81ms   26.25ms    3.98ms   70.25ms
160 rules    1.92ms    3.32ms    3.06ms   26.81ms    4.57ms   71.41ms

As the results above show, without the patch the syscall latencies
increase as the number of rules increases, while with the patch the
latencies remain stable. This could help when a user adds many audit
rules for purposes such as attack tracing or process behavior
recording but suffers from low performance.

I have general concerns about trading memory and complexity for
performance gains, but beyond that the numbers you posted above don't
yet make sense to me.

Thanks for your reply.
The memory cost of this patch is less than 4KB (1820 bytes on x64 and
3640 bytes on compatible x86_64) which is trivial in many
cases. Besides, syscalls are called frequently on a system, so a small
optimization could bring a good return.

The tradeoff still exists, even though you feel it is worthwhile.

Why are the latency increases due to rule count not similar across the
different syscalls? For example, I would think that if the increase in
syscall latency was directly attributed to the audit rule processing
then the increase on the "open" syscall should be similar to that of
the "null" syscall. In other words, if we can process 160 rules in
~4ms in the "null" case, why does it take us ~86ms in the "open" case?

As to the test result, we did some investigation and identified two
reasons:
1. The chosen rule sets were not very suitable. Though they were not
hit by the syscalls being measured, some of them were hit by other
processes, which reduced system performance and affected the test
result;
2. The routine of lat_syscall is much more complicated than we
thought. It calls many other syscalls during the test, which may
cause the result not to be linear.

For these reasons, we did another test. We modified the audit rule
sets and made sure they wouldn't be hit at runtime. Then we added
ktime_get_real_ts64 to auditsc.c to record the time spent executing
__audit_syscall_exit. We ran the "stat" syscall 10000 times for each
rule set and recorded the time interval.
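
[Editor's sketch] The measurement described above was taken inside the kernel, around __audit_syscall_exit. As a rough user-space analogue only, the repeated-"stat" loop could look like the Python below. This is an assumption-laden illustration, not the instrumentation used in the test: it times the whole stat round trip (including interpreter and loop overhead) rather than the audit exit path alone, and the function name mean_stat_latency_ns and its parameters are hypothetical.

```python
import os
import time


def mean_stat_latency_ns(path=".", repetitions=10_000):
    """Return the mean wall-clock time of one os.stat() call, in nanoseconds.

    Hypothetical helper: measures from user space, so the figure includes
    Python overhead and the full syscall path, not just audit processing.
    """
    start = time.perf_counter_ns()
    for _ in range(repetitions):
        os.stat(path)  # issues a stat-family syscall on the given path
    elapsed = time.perf_counter_ns() - start
    return elapsed / repetitions


if __name__ == "__main__":
    # Run once per audit rule set and compare the reported means.
    print(f"stat: {mean_stat_latency_ns():.2f} ns per call")
```

Because the overhead outside the audit code is included, only the difference between runs with different rule sets is meaningful, not the absolute value.
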
The result is shown below:

Before this optimization:

 rule set         time
  0 rules    3843.96ns
  1 rules   13119.08ns
 10 rules   14003.13ns
 20 rules   15420.18ns
 30 rules   17284.84ns
 40 rules   19010.67ns
 50 rules   21112.63ns
100 rules   25815.02ns
130 rules   29447.09ns

After this optimization:

 rule set         time
  0 rules    3597.78ns
  1 rules   13498.73ns
 10 rules   13122.57ns
 20 rules   12874.88ns
 30 rules   14351.99ns
 40 rules   14181.07ns
 50 rules   13806.45ns
100 rules   13890.85ns
130 rules   14441.45ns

As the result shows, the interval increases linearly with the rule
count before the optimization, while it remains stable after the
optimization. Note that audit skips some operations if there are no
rules, so there is a gap between the 0-rule and 1-rule sets.

It looks like a single rule like the one below could effectively
disable this optimization, is that correct?

  % auditctl -a exit,always -F uid=1001
  % auditctl -l
  -a always,exit -S all -F uid=1001

Yes, rules like this one, which monitor all syscalls, could disable the
optimization. The number of the global array could exponentially increase
if we want to handle more audit fields. However, we don't think that kind
of rule is practical because it might generate a great number of logs and
even lead to log loss.

Before we merge something like this I think we need a better
understanding of typical audit filter rules used across the different
audit use cases. This patch is too much of a band-aid to merge
without a really good promise that it will help most of the real world
audit deployments.
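
[Editor's sketch] To make the "-S all" concern above concrete, here is a minimal Python model of a per-syscall lookup table. It is a sketch of the general idea only and does not reproduce the patch's actual data structure: the class RuleIndex, its methods, the table size of 8, and the syscall numbers used are all hypothetical.

```python
class RuleIndex:
    """Hypothetical per-syscall rule counter (illustration only)."""

    def __init__(self, nr_syscalls):
        self.nr_syscalls = nr_syscalls
        # One counter per syscall number: how many rules can match it.
        self.counts = [0] * nr_syscalls

    def add_rule(self, syscalls=None):
        """Register a rule; syscalls=None models a rule with '-S all'."""
        targets = range(self.nr_syscalls) if syscalls is None else syscalls
        for nr in targets:
            self.counts[nr] += 1

    def needs_filtering(self, nr):
        """True if at least one rule could match syscall number nr."""
        return self.counts[nr] > 0


if __name__ == "__main__":
    index = RuleIndex(nr_syscalls=8)   # tiny table, illustrative size
    index.add_rule(syscalls=[2])       # rule watching one syscall only
    print(index.needs_filtering(4))    # False: fast path can skip the rules
    index.add_rule()                   # rule covering every syscall
    print(index.needs_filtering(4))    # True: fast path no longer applies
```

In this model a rule that names specific syscalls leaves every other syscall on the fast path, while one rule covering all syscalls marks every entry and removes the benefit, which is the behavior confirmed in the exchange above.
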