Re: 2.6.16-rc5-mm2 - yet another memory leak...

Saturday, 25 March 2006

On Wed, 15 Mar 2006 12:27:17 EST, Valdis.Kletnieks(a)vt.edu said:
...
 On Sun, 05 Mar 2006 19:38:28 EST, Valdis.Kletnieks(a)vt.edu said:
 
 > Yet another leak - this time it seems to be the audit_socketcall()
 > in sys_socketcall().
 
 >  199154  c02e72b2 <sys_socketcall+0x4d/0x186>
 
 Still present in 2.6.16-rc6-mm1:
 
 % grep socket /proc/slab_allocators 
 size-1024: 3 yenta_probe+0x40/0x49f [yenta_socket]
 size-64: 354 audit_socketcall+0x3e/0x84
 size-32: 108634 audit_socketcall+0x3e/0x84
 
 Uptime of just over an hour.  However, this one might not be audit's fault -
 looks like there's something else leaking, and this is just along for the ride:

This leak was still present in 2.6.16-mm1, and I had a long free evening, so
I chased it down.  Unfortunately, the cause isn't very pretty at all, but may
explain all the multiple problem's I've seen...

Here's what's happening:

strace of /usr/bin/Xorg shows *lots* of these:

getsockopt(20, SOL_SOCKET, SO_PEERCRED, "8\n\0\0\307\3\0\0\307\3\0\0", [12]) =
0

and each of these hits audit_socketcall(), which does a kalloc().  And it's
doing this multiple times a second, so we eventually have several hundred
thousand aux entries created by audit_socketcall(), 

Unfortunately, we never actually call audit_free_aux() on these suckers in a
timely fashion - eventually, we go nuts and free zillions of entries:

Mar 25 01:03:01 turing-police kernel: [ 1862.036000] audit: freed 18870 aux entries
Mar 25 01:03:01 turing-police kernel: [ 1862.036000]  <c0103aeb> show_trace+0xd/0xf 
 <c0103b8c> dump_stack+0x17/0x19
Mar 25 01:03:01 turing-police kernel: [ 1862.036000]  <c0130646>
audit_syscall_exit+0x38f/0x3c7   <c010505c> do_syscall_trace+0x60/0x16e
Mar 25 01:03:01 turing-police kernel: [ 1862.040000]  <c01027d6>
syscall_exit_work+0x12/0x18
Mar 25 01:12:59 turing-police kernel: [ 2460.180000] audit: freed 6785 aux entries
Mar 25 01:12:59 turing-police kernel: [ 2460.180000]  <c0103aeb> show_trace+0xd/0xf 
 <c0103b8c> dump_stack+0x17/0x19
Mar 25 01:12:59 turing-police kernel: [ 2460.180000]  <c0130fb8>
audit_free+0x204/0x238   <c011220a> __put_task_struct+0x8c/0xf
Mar 25 01:12:59 turing-police kernel: [ 2460.180000]  <c011442b>
delayed_put_task_struct+0x3b/0x42   <c0122878> __rcu_process_callbacks+0x121/0x177
Mar 25 01:12:59 turing-police kernel: [ 2460.180000]  <c01228e0>
rcu_process_callbacks+0x12/0x23   <c01179a5> tasklet_action+0x3a/0x60
Mar 25 01:12:59 turing-police kernel: [ 2460.180000]  <c011791f>
__do_softirq+0x43/0x8f   <c01047ad> do_softirq+0x47/0x52
Mar 25 01:12:59 turing-police kernel: [ 2460.180000]  =======================
Mar 25 01:12:59 turing-police kernel: [ 2460.180000]  <c0117c19> ksoftirqd+0x74/0xe3
  <c012461c> kthread+0x97/0xc4
Mar 25 01:12:59 turing-police kernel: [ 2460.180000]  <c0100ac9>
kernel_thread_helper+0x5/0xb
Mar 25 01:13:00 turing-police kernel: [ 2460.880000] audit: freed 48848 aux entries
Mar 25 01:13:00 turing-police kernel: [ 2460.880000]  <c0103aeb> show_trace+0xd/0xf 
 <c0103b8c> dump_stack+0x17/0x19
Mar 25 01:13:00 turing-police kernel: [ 2460.880000]  <c0130fb8>
audit_free+0x204/0x238   <c011220a> __put_task_struct+0x8c/0xf1
Mar 25 01:13:00 turing-police kernel: [ 2460.884000]  <c011442b>
delayed_put_task_struct+0x3b/0x42   <c0122878> __rcu_process_callbacks+0x121/0x177
Mar 25 01:13:00 turing-police kernel: [ 2460.884000]  <c01228e0>
rcu_process_callbacks+0x12/0x23   <c01179a5> tasklet_action+0x3a/0x60
Mar 25 01:13:00 turing-police kernel: [ 2460.884000]  <c011791f>
__do_softirq+0x43/0x8f   <c01047ad> do_softirq+0x47/0x52
Mar 25 01:13:00 turing-police kernel: [ 2460.884000]  =======================
Mar 25 01:13:00 turing-police kernel: [ 2460.884000]  <c0117a3f> irq_exit+0x30/0x3c 
 <c0104759> do_IRQ+0x7a/0x87
Mar 25 01:13:00 turing-police kernel: [ 2460.888000]  <c01030ae>
common_interrupt+0x1a/0x20

(I'm not sure what precipitated the kfree of 18K entries.  The 6K and 48K ones
came out of restarting the X server...)

I think the problem is that we get into audit_syscall_exit() and hit this:

        /* Not having a context here is ok, since the parent may have
         * called __put_task_struct. */
        if (likely(!context))
                goto out;

and we bail, leaving the memory pending till some later time. If that goto doesn't
get
us, then we hit this:

        if (context->auditable)
                audit_log_exit(context, GFP_KERNEL);

which may be the source of the audit message flood I've seen - we suddenly dump
out not only *this* syscall's log entry, but also the previous 45,948 or so aux
entries that we haven't gotten around to freeing.  Looks like on my box, these
only get freed when a process exits, or we generate an AVC.

And since I have selinux in "permissive" mode at the moment, we only generate
an avc the first time we hit something - and as a result, after the first 15-20
minutes of uptime, we generate avc's very rarely, so those aux entries pile up
at the rate of about 2K/minute.

I strongly suspect the dentry references kept by the AUDIT_AVC_PATH code that
finally get freed by the dput() in audit_free_aux are why I was also seeing the
leaking sock_inode_cache and selinux_inode_alloc_security slab allocations...

Unfortunately, I'm not sufficiently confident on either my analysis or what the
code is *supposed* to be doing to actually propose a patch.  However, I'm
*hoping* I'm right, because it would explain 100% of the issues I'm still
seeeing with the audit code...

Comments, anybody?

Here's the debugging patch I was using:

--- linux-2.6.16-mm1/kernel/auditsc.c.lkml2	2006-03-24 23:00:08.000000000 -0500
+++ linux-2.6.16-mm1/kernel/auditsc.c	2006-03-24 23:34:34.000000000 -0500
@@ -415,6 +415,7 @@ static inline void audit_free_names(stru
 static inline void audit_free_aux(struct audit_context *context)
 {
 	struct audit_aux_data *aux;
+	int	count = 0;
 
 	while ((aux = context->aux)) {
 		if (aux->type == AUDIT_AVC_PATH) {
@@ -430,6 +431,11 @@ static inline void audit_free_aux(struct
 
 		context->aux = aux->next;
 		kfree(aux);
+		count++;
+	}
+	if (count >= 50) {
+		printk(KERN_INFO "audit: freed %d aux entries\n",count);
+		dump_stack();
 	}
 }
 





    

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

2009

2008

2007

2006

2005

2004

Re: 2.6.16-rc5-mm2 - yet another memory leak...