On Wednesday 05 November 2008 15:56:30 Lucas C. Villa Real wrote:
>> Looking at the code in lib/netlink.c, I saw that audit_send() doesn't
>> handle -ENOBUFS. Would it be possible to replace the condition from
>> "while (retval < 0 && errno == EINTR)" to
>> "while (retval < 0 && (errno == EINTR || errno == ENOBUFS))" to fix the
>> problem when sending packets from userspace to kernel?
>
> Have you tried that? Does it fix the problem or just hang the utility?

So far it didn't hang. However, just in case, I added a maximum number
of retries (currently set to 64). I'm about to launch a new batch to
stress the system once again, and then I'll be able to see if it works
as expected.

If it works out, send a patch to the list and I'll pull it into the next
release.

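For reference, the bounded retry described above could look roughly like
the sketch below. This is not the actual audit_send() code from
lib/netlink.c; the helper name send_with_retry() and the macro
MAX_SEND_RETRIES are made up for illustration, and only the loop
condition and the limit of 64 come from this thread.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Retry cap mentioned in the thread; the macro name is hypothetical. */
#define MAX_SEND_RETRIES 64

/*
 * Hypothetical helper, not the real audit_send(): send one datagram and
 * retry while the failure is EINTR or ENOBUFS, giving up after
 * MAX_SEND_RETRIES retries so a persistent ENOBUFS cannot hang the caller.
 */
static ssize_t send_with_retry(int fd, const void *buf, size_t len,
                               const struct sockaddr *addr, socklen_t addrlen)
{
    ssize_t retval;
    int retries = 0;

    do {
        retval = sendto(fd, buf, len, 0, addr, addrlen);
    } while (retval < 0 &&
             (errno == EINTR || errno == ENOBUFS) &&
             retries++ < MAX_SEND_RETRIES);

    /* On failure, errno still holds the error from the last attempt. */
    return retval;
}
```

Any other errno ends the loop on the first failure, the same as the
existing EINTR-only condition.
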
>> One interesting thing which I noticed is that 'auditctl -s' doesn't
>> report that messages were lost,
>
> They weren't lost by the audit system so it doesn't know they didn't
> arrive.

Do you think it would make sense to add an extra member to struct
sk_buff (a pointer to a callback function) and then have
skb_queue_tail() signal if it failed to send a message? That would
allow audit to keep track of such losses, as well as any other
subsystem using netlink for communicating with userspace.

The network developers generally frown on anything being added to sk_buff,
since that affects overall system performance. You would probably want to
take this issue up with them on the netdev mailing list. I would be
supportive of anything that adds reliability, but they control the code
base for that part of the kernel.

-Steve