On Wed, 15 Mar 2006 12:27:17 EST, Valdis.Kletnieks(a)vt.edu said:
On Sun, 05 Mar 2006 19:38:28 EST, Valdis.Kletnieks(a)vt.edu said:
> Yet another leak - this time it seems to be the audit_socketcall()
> in sys_socketcall().
> 199154 c02e72b2 <sys_socketcall+0x4d/0x186>
Still present in 2.6.16-rc6-mm1:
% grep socket /proc/slab_allocators
size-1024: 3 yenta_probe+0x40/0x49f [yenta_socket]
size-64: 354 audit_socketcall+0x3e/0x84
size-32: 108634 audit_socketcall+0x3e/0x84
Uptime of just over an hour. However, this one might not be audit's fault -
looks like there's something else leaking, and this is just along for the ride:
This leak was still present in 2.6.16-mm1, and I had a long free evening, so
I chased it down. Unfortunately, the cause isn't very pretty at all, but may
explain all the multiple problem's I've seen...
Here's what's happening:
strace of /usr/bin/Xorg shows *lots* of these:
getsockopt(20, SOL_SOCKET, SO_PEERCRED, "8\n\0\0\307\3\0\0\307\3\0\0", [12]) =
0
and each of these hits audit_socketcall(), which does a kalloc(). And it's
doing this multiple times a second, so we eventually have several hundred
thousand aux entries created by audit_socketcall(),
Unfortunately, we never actually call audit_free_aux() on these suckers in a
timely fashion - eventually, we go nuts and free zillions of entries:
Mar 25 01:03:01 turing-police kernel: [ 1862.036000] audit: freed 18870 aux entries
Mar 25 01:03:01 turing-police kernel: [ 1862.036000] <c0103aeb> show_trace+0xd/0xf
<c0103b8c> dump_stack+0x17/0x19
Mar 25 01:03:01 turing-police kernel: [ 1862.036000] <c0130646>
audit_syscall_exit+0x38f/0x3c7 <c010505c> do_syscall_trace+0x60/0x16e
Mar 25 01:03:01 turing-police kernel: [ 1862.040000] <c01027d6>
syscall_exit_work+0x12/0x18
Mar 25 01:12:59 turing-police kernel: [ 2460.180000] audit: freed 6785 aux entries
Mar 25 01:12:59 turing-police kernel: [ 2460.180000] <c0103aeb> show_trace+0xd/0xf
<c0103b8c> dump_stack+0x17/0x19
Mar 25 01:12:59 turing-police kernel: [ 2460.180000] <c0130fb8>
audit_free+0x204/0x238 <c011220a> __put_task_struct+0x8c/0xf
Mar 25 01:12:59 turing-police kernel: [ 2460.180000] <c011442b>
delayed_put_task_struct+0x3b/0x42 <c0122878> __rcu_process_callbacks+0x121/0x177
Mar 25 01:12:59 turing-police kernel: [ 2460.180000] <c01228e0>
rcu_process_callbacks+0x12/0x23 <c01179a5> tasklet_action+0x3a/0x60
Mar 25 01:12:59 turing-police kernel: [ 2460.180000] <c011791f>
__do_softirq+0x43/0x8f <c01047ad> do_softirq+0x47/0x52
Mar 25 01:12:59 turing-police kernel: [ 2460.180000] =======================
Mar 25 01:12:59 turing-police kernel: [ 2460.180000] <c0117c19> ksoftirqd+0x74/0xe3
<c012461c> kthread+0x97/0xc4
Mar 25 01:12:59 turing-police kernel: [ 2460.180000] <c0100ac9>
kernel_thread_helper+0x5/0xb
Mar 25 01:13:00 turing-police kernel: [ 2460.880000] audit: freed 48848 aux entries
Mar 25 01:13:00 turing-police kernel: [ 2460.880000] <c0103aeb> show_trace+0xd/0xf
<c0103b8c> dump_stack+0x17/0x19
Mar 25 01:13:00 turing-police kernel: [ 2460.880000] <c0130fb8>
audit_free+0x204/0x238 <c011220a> __put_task_struct+0x8c/0xf1
Mar 25 01:13:00 turing-police kernel: [ 2460.884000] <c011442b>
delayed_put_task_struct+0x3b/0x42 <c0122878> __rcu_process_callbacks+0x121/0x177
Mar 25 01:13:00 turing-police kernel: [ 2460.884000] <c01228e0>
rcu_process_callbacks+0x12/0x23 <c01179a5> tasklet_action+0x3a/0x60
Mar 25 01:13:00 turing-police kernel: [ 2460.884000] <c011791f>
__do_softirq+0x43/0x8f <c01047ad> do_softirq+0x47/0x52
Mar 25 01:13:00 turing-police kernel: [ 2460.884000] =======================
Mar 25 01:13:00 turing-police kernel: [ 2460.884000] <c0117a3f> irq_exit+0x30/0x3c
<c0104759> do_IRQ+0x7a/0x87
Mar 25 01:13:00 turing-police kernel: [ 2460.888000] <c01030ae>
common_interrupt+0x1a/0x20
(I'm not sure what precipitated the kfree of 18K entries. The 6K and 48K ones
came out of restarting the X server...)
I think the problem is that we get into audit_syscall_exit() and hit this:
/* Not having a context here is ok, since the parent may have
* called __put_task_struct. */
if (likely(!context))
goto out;
and we bail, leaving the memory pending till some later time. If that goto doesn't
get
us, then we hit this:
if (context->auditable)
audit_log_exit(context, GFP_KERNEL);
which may be the source of the audit message flood I've seen - we suddenly dump
out not only *this* syscall's log entry, but also the previous 45,948 or so aux
entries that we haven't gotten around to freeing. Looks like on my box, these
only get freed when a process exits, or we generate an AVC.
And since I have selinux in "permissive" mode at the moment, we only generate
an avc the first time we hit something - and as a result, after the first 15-20
minutes of uptime, we generate avc's very rarely, so those aux entries pile up
at the rate of about 2K/minute.
I strongly suspect the dentry references kept by the AUDIT_AVC_PATH code that
finally get freed by the dput() in audit_free_aux are why I was also seeing the
leaking sock_inode_cache and selinux_inode_alloc_security slab allocations...
Unfortunately, I'm not sufficiently confident on either my analysis or what the
code is *supposed* to be doing to actually propose a patch. However, I'm
*hoping* I'm right, because it would explain 100% of the issues I'm still
seeeing with the audit code...
Comments, anybody?
Here's the debugging patch I was using:
--- linux-2.6.16-mm1/kernel/auditsc.c.lkml2 2006-03-24 23:00:08.000000000 -0500
+++ linux-2.6.16-mm1/kernel/auditsc.c 2006-03-24 23:34:34.000000000 -0500
@@ -415,6 +415,7 @@ static inline void audit_free_names(stru
static inline void audit_free_aux(struct audit_context *context)
{
struct audit_aux_data *aux;
+ int count = 0;
while ((aux = context->aux)) {
if (aux->type == AUDIT_AVC_PATH) {
@@ -430,6 +431,11 @@ static inline void audit_free_aux(struct
context->aux = aux->next;
kfree(aux);
+ count++;
+ }
+ if (count >= 50) {
+ printk(KERN_INFO "audit: freed %d aux entries\n",count);
+ dump_stack();
}
}