On Wednesday 06 April 2005 20:33, Chris Wright wrote:
> What is Kris's test program? I was simply using something like:
>
> 	while :
> 	do
> 		< /dev/null
> 	done
>
> with an audit rule to match that open. This causes congestion
> immediately.
That's good enough.
> > I decided to leave 3 openings in the backlog in hopes of allowing
> > something to be enqueued that may trigger audit_log_drain.
> It shouldn't matter. The act of dropping should re-schedule a drain.
> I'd rather not see magic numbers (esp. if they are somewhat arbitrary).
OK, I found the path of code that does that.

For my test, it at least always waits until the backlog is full before
giving the too full/busy message. My experience has been that once
congested there's little to no recovery, so in that sense the change
borders a bit on academic.
I think we should get to the root cause of this. What I'm seeing is the
audit daemon sleeping because its time slice ends. Audit records pile up
while the test program has the time slice; the audit daemon wakes up,
drains many records, loses its time slice, and records pile up again.

It seems the audit daemon should either be woken up and given the time
slice, or be given a temporary boost in priority.
> This differs from yours only in that I drop the 3, and change to
> requeueing at the head. Does it still work for you?
It'll be late today before I find out. The changes are minor so I don't expect
a regression.
One thing I haven't investigated, though, is how the skb reference
counting works out. We call skb_get; does netlink_unicast always
decrement the count, even on errors?
-Steve