On 13/11/2023 21:49, Steve Grubb wrote:
The root of the problem is the kernel flooding its buffers when the ACK
option is given. It really should reserve space if that socket option is 
active. I see Paul merged the patch, so that should work itself out 
eventually.

I suppose the same thing can happen when auditd needs signal information. I 
think the fix was generic enough to fix this use case also.
Yes, the kernel patch should help and might be sufficient in most cases, but it's not perfect.
I think we should look at improving what we currently have in the
meantime - what do you think about the idea of tolerating ENOBUFS?
If we ignore this error, then I think there may be other problems because we 
do not really know if the setup is correct. And what if we get ENOBUFS when 
sending an audit event to the kernel? We have uncertainty that the event was 
sent because under some requirements the application must exit if it can't 
record who is doing something. (The code setting the pid and sending events
uses the same core functions.)
I agree it's risky to ignore ENOBUFS in general. But it looks to me like audit_send() returns -errno, so we could detect and allow it specifically in audit_set_pid(). I'm not aware of anything that could cause ENOBUFS at that point in time except the kernel flooding the socket with messages, which would imply the AUDIT_SET call was successful (or else we would receive nothing / a single error-ACK).
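
To make that concrete, here is a rough, untested sketch of the idea rather than a patch against libaudit. The names set_pid_tolerating_enobufs, send_fn and the fake_send_* helpers are hypothetical stand-ins. The only behaviour assumed from the real code is that the send path returns a positive value on success and -errno on failure.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for the send path used when setting the pid.
 * Assumed contract: > 0 on success, -errno on failure. */
typedef int (*send_fn)(int fd, uint32_t pid);

/* Register the pid, tolerating ENOBUFS only at this call site.
 * Returns 0 if registration can be treated as successful, otherwise
 * the negative errno reported by the send path. */
int set_pid_tolerating_enobufs(send_fn send, int fd, uint32_t pid)
{
    int rc = send(fd, pid);

    if (rc > 0)
        return 0;

    /* Assumption from the reasoning above: ENOBUFS here means the
     * kernel is already flooding the socket with messages, which it
     * would only do if the AUDIT_SET request had been accepted. */
    if (rc == -ENOBUFS)
        return 0;

    /* Every other error is still reported to the caller. */
    return rc;
}

/* Fake send paths, purely for exercising the logic above. */
int fake_send_ok(int fd, uint32_t pid)      { (void)fd; (void)pid; return 1; }
int fake_send_enobufs(int fd, uint32_t pid) { (void)fd; (void)pid; return -ENOBUFS; }
int fake_send_eperm(int fd, uint32_t pid)   { (void)fd; (void)pid; return -EPERM; }
```

The point of keeping the check in the pid-setting wrapper is that the shared send path stays strict, so an ENOBUFS while sending an audit event would still be reported as a failure.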

Tangentially, did you have a chance to look at the wmode=WAIT_YES oddity I pointed out in my original email?

- Chris