I took a look at the source code and ran some tests. It seems to be a
problem with the reference count of the fsnotify_mark structure.

This error occurs because the fsnotify_mark_destroy function
(which runs in a separate kthread) is trying to iterate over a mark
that has already been freed.

Looking at the fsnotify_destroy_mark function (not to be confused with
fsnotify_mark_destroy), which adds a mark to destroy_list to be freed
later by fsnotify_mark_destroy, I noticed that it does not increment
the reference count for the reference added to destroy_list, and
callers usually drop the references they hold right after calling
fsnotify_destroy_mark.

The patch below increments the reference count of a mark when it is
added to the destroy list. It seems to solve the issue and doesn't
seem to cause any memory leak. Could you please test it in your
environments and let me know if there is any problem with this patch?

Regarding the synchronize_srcu call, I don't think it's causing this
error. It probably just makes it more frequent because it forces all
the CPUs to schedule, giving someone else the chance to free the
mark.
---
fs/notify/mark.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/fs/notify/mark.c b/fs/notify/mark.c
index f104d56..2985fff 100644
--- a/fs/notify/mark.c
+++ b/fs/notify/mark.c
@@ -150,6 +150,7 @@ void fsnotify_destroy_mark(struct fsnotify_mark *mark)
 	spin_unlock(&group->mark_lock);
 	spin_unlock(&mark->lock);
 
+	fsnotify_get_mark(mark);
 	spin_lock(&destroy_lock);
 	list_add(&mark->destroy_list, &destroy_list);
 	spin_unlock(&destroy_lock);
--
1.7.9.4
On Tue, 17 Apr 2012 14:54:29 -0700
Peter Moody <pmoody(a)google.com> wrote:
Last thing. moving synchronize_srcu(&fsnotify_mark_srcu) out of the
for(;;) loop in fs/notify/mark.c appears to solve the stability issues
for me. I don't know enough about kernel internals to determine if
this is doing lots of other bad things to my system or not.
Cheers,
peter
On Tue, Apr 17, 2012 at 11:24 AM, Peter Moody <pmoody(a)google.com>
wrote:
> and my config.gz
>
> On Tue, Apr 17, 2012 at 10:56 AM, Peter Moody <pmoody(a)google.com>
> wrote:
>> Here's a trace with debugging turned way up plus a few extra
>> printk's added to fs/notify/mark.c. I'm looping through
>> private_destroy_list before and after the call to synchronize_srcu.
>>
>> I can reproduce this reliably with kvm with 2 virtual processors:
>> Linux desktop 3.4.0-rc3-oops1+ #1 SMP Tue Apr 17 09:59:44 PDT 2012
>> x86_64 GNU/Linux
>>
>> Cheers,
>> peter
>>
>> On Thu, Apr 5, 2012 at 2:07 PM, Eric Paris <eparis(a)redhat.com>
>> wrote:
>>> please please please keep on list. Everything you say might help
>>> track it down!
>>>
>>> On Thu, 2012-04-05 at 14:03 -0700, Peter Moody wrote:
>>>> (please let me know if I should take this off-list)
>>>>
>>>> One other thing (again, maybe already known), but this seems to
>>>> be exacerbated by SMP. On my machine, I can't reproduce the
>>>> crash if I boot with maxcpus=1.
>>>>
>>>> Still hunting.
>>>>
>>>> Cheers,
>>>> peter
>>>>
>>>> On Tue, Apr 3, 2012 at 9:15 AM, Peter Moody <pmoody(a)google.com>
>>>> wrote:
>>>> > This may already be known, but the issue seems to be limited
>>>> > to watch rules. With any watch rules, I can reliably crash my
>>>> > machine while freeing a watch rule after only
>>>> > starting/stopping auditd a few times. With no watch rules, I
>>>> > have no issues.
>>>> >
>>>> > Cheers,
>>>> > peter
>>>> >
>>>> > On Wed, Mar 28, 2012 at 11:44 PM, Valentin Avram
>>>> > <aval13(a)gmail.com> wrote:
>>>> >> Yes, i know that patch. It made it into kernel 3.2.2. I
>>>> >> tested it successfully (oops in 3.2.1, no oops in 3.2.9), but
>>>> >> this oops i'm seeing is also in 3.2.9.
>>>> >>
>>>> >> I monitored changelogs since 3.2.1 to 3.2.12 but there were
>>>> >> no fixes either in audit subsystem or in fsnotify. I'll try
>>>> >> to reproduce in latest 3.2.13 and repost the oops, but i'm
>>>> >> 99% confident it will be the same.
>>>> >>
>>>> >> Sadly nobody except you seems to pay attention to this
>>>> >> problem, probably because it requires special conditions to
>>>> >> reproduce (really, who starts and stops auditd every 5
>>>> >> seconds on a production server?). We only ran into it because
>>>> >> one of our servers would randomly oops and then freeze about
>>>> >> each month after stopping and then starting auditd every
>>>> >> morning (and the stop-start sequence was needed to
>>>> >> work around a bug somewhere that would hang a gzip running
>>>> >> on a file outside a watched folder).
>>>> >>
>>>> >> Anyway, as a last note, i have a feeling that the oops is not
>>>> >> exactly random, there is a pattern, just that i haven't
>>>> >> figured it out completely yet.
>>>> >>
>>>> >> Will keep you up to date with the things i find out.
>>>> >>
>>>> >> V.
>>>> >>
>>>> >> On Mar 29, 2012 4:14 AM, "Eric Paris" <eparis(a)redhat.com>
>>>> >> wrote:
>>>> >>>
>>>> >>> That patch fixes a BUG(). The report has a NULL ptr deref
>>>> >>> and some apparent list corruption.... Sadly they aren't
>>>> >>> the same....
>>>> >>>
>>>> >>> On Wed, 2012-03-28 at 15:42 -0700, Peter Moody wrote:
>>>> >>> > fyi: this patch [1] seems to fix the issue for me. The
>>>> >>> > explanation in the subject would reliably oops my machine.
>>>> >>> >
>>>> >>> > [1]
>>>> >>> > http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit...
>>>> >>> >
>>>> >>> > On Wed, Mar 28, 2012 at 1:51 PM, Peter Moody
>>>> >>> > <pmoody(a)google.com> wrote:
>>>> >>> > > Are you still able to reliably reproduce this oops? I'm
>>>> >>> > > trying to track this down because this bug (or a very
>>>> >>> > > similar bug) is causing some significant headaches here
>>>> >>> > > at work, but I haven't had a lot of luck. I'm using
>>>> >>> > > usermode linux, though, so that might be interfering
>>>> >>> > > with things.
>>>> >>> > >
>>>> >>> > > On Mon, Mar 5, 2012 at 12:35 AM, Valentin Avram
>>>> >>> > > <aval13(a)gmail.com> wrote:
>>>> >>> > >> Finally i found some time and spare server to retest
>>>> >>> > >> the oops and list_add corruptions i was getting with
>>>> >>> > >> the 3.x kernels and auditd 2.1.3.
>>>> >>> > >>
>>>> >>> > >> I tested now with gentoo's latest stable
>>>> >>> > >> 3.2.1-gentoo-r2 and kernel.org's 3.2.9.
>>>> >>> > >>
>>>> >>> > >> Both get the oops/BUG in the same way and after that,
>>>> >>> > >> they keep pouring list_add corruptions with
>>>> >>> > >> audit_prune_tre(truncated?) and auditctl as comms.
>>>> >>> > >>
>>>> >>> > >> Since this is not about Gentoo's kernel only, i'll post
>>>> >>> > >> here the oops in 3.2.9 and also attach some list_add
>>>> >>> > >> corruptions.
>>>> >>> > >>
>>>> >>> > >> 3.2.9 BUG:
>>>> >>> > >>
>>>> >>> > >> kernel: [ 301.240011] BUG: unable to handle kernel NULL pointer dereference at (null)
>>>> >>> > >> kernel: [ 301.240305] IP: [<c1238dd0>] __list_del_entry+0x20/0xe0
>>>> >>> > >> kernel: [ 301.240481] *pdpt = 0000000000000000 *pde = f000ddc8f000ddc8
>>>> >>> > >> kernel: [ 301.240698] Oops: 0000 [#1] SMP
>>>> >>> > >> kernel: [ 301.240910]
>>>> >>> > >> kernel: [ 301.241030] Pid: 642, comm: fsnotify_mark Not tainted 3.2.9-drbd-version3 #1 Dell Inc. PowerEdge 2950/0CX396
>>>> >>> > >> kernel: [ 301.241370] EIP: 0060:[<c1238dd0>] EFLAGS: 00010287 CPU: 6
>>>> >>> > >> kernel: [ 301.241498] EIP is at __list_del_entry+0x20/0xe0
>>>> >>> > >> kernel: [ 301.241623] EAX: f4fae544 EBX: f47cffa4 ECX: ffffffff EDX: 00000000
>>>> >>> > >> kernel: [ 301.241751] ESI: f4fae544 EDI: f4fae508 EBP: f47cff7c ESP: f47cff64
>>>> >>> > >> kernel: [ 301.241879] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
>>>> >>> > >> kernel: [ 301.242005] Process fsnotify_mark (pid: 642, ti=f47ce000 task=f4f47c00 task.ti=f47ce000)
>>>> >>> > >> kernel: [ 301.242207] Stack:
>>>> >>> > >> kernel: [ 301.242327] c10813c0 f47cffa4 f4f47c00 f4e70888 f47cff7c f47cffa4 f47cffb8 c10f6976
>>>> >>> > >> kernel: [ 301.242882] ffffffc3 f4f47c00 f4f47c00 00000000 f4f47c00 c10530c0 f47cff9c f47cff9c
>>>> >>> > >> kernel: [ 301.243438] f4fae544 f4fae544 f4c47f58 00000000 c10f68f0 f47cffe4 c1052834 00000000
>>>> >>> > >> kernel: [ 301.243995] Call Trace:
>>>> >>> > >> kernel: [ 301.244119] [<c10813c0>] ? rcu_check_callbacks+0x110/0x110
>>>> >>> > >> kernel: [ 301.244248] [<c10f6976>] fsnotify_mark_destroy+0x86/0x120
>>>> >>> > >> kernel: [ 301.244377] [<c10530c0>] ? abort_exclusive_wait+0x80/0x80
>>>> >>> > >> kernel: [ 301.244504] [<c10f68f0>] ? fsnotify_put_mark+0x30/0x30
>>>> >>> > >> kernel: [ 301.244631] [<c1052834>] kthread+0x74/0x80
>>>> >>> > >> kernel: [ 301.244756] [<c10527c0>] ? kthread_flush_work_fn+0x10/0x10
>>>> >>> > >> kernel: [ 301.244885] [<c1582ab6>] kernel_thread_helper+0x6/0xd
>>>> >>> > >> kernel: [ 301.245011] Code: 55 f4 8b 45 f8 e9 75 ff ff ff 90 55 89 e5 53 83 ec 14 8b 08 8b 50 04 81 f9 00 01 10 00 74 24 81 fa 00 02 20 00 0f 84 8e 00 00 00 <8b> 1a 39 d8 75 62 8b 59 04 39 d8 75 35 89 51 04 89 0a 83 c4 14
>>>> >>> > >> kernel: [ 301.248195] EIP: [<c1238dd0>] __list_del_entry+0x20/0xe0 SS:ESP 0068:f47cff64
>>>> >>> > >> kernel: [ 301.248414] CR2: 0000000000000000
>>>> >>> > >> kernel: [ 301.248538] ---[ end trace 15082dbfb353f84c ]---
>>>> >>> > >>
>>>> >>> > >> The kernel was compiled with the following DEBUG
>>>> >>> > >> support (the bolded ones were requested by Gentoo's
>>>> >>> > >> Dev):
>>>> >>> > >> CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y
>>>> >>> > >> CONFIG_SLUB_DEBUG=y
>>>> >>> > >> CONFIG_HAVE_DMA_API_DEBUG=y
>>>> >>> > >> CONFIG_X86_DEBUGCTLMSR=y
>>>> >>> > >> CONFIG_PNP_DEBUG_MESSAGES=y
>>>> >>> > >> CONFIG_AIC94XX_DEBUG=y
>>>> >>> > >> CONFIG_USB_DEBUG=y
>>>> >>> > >> CONFIG_DEBUG_KERNEL=y
>>>> >>> > >> CONFIG_SCHED_DEBUG=y
>>>> >>> > >> CONFIG_DEBUG_RT_MUTEXES=y
>>>> >>> > >> CONFIG_DEBUG_PI_LIST=y
>>>> >>> > >> CONFIG_DEBUG_BUGVERBOSE=y
>>>> >>> > >> CONFIG_DEBUG_INFO=y
>>>> >>> > >> CONFIG_DEBUG_MEMORY_INIT=y
>>>> >>> > >> CONFIG_DEBUG_LIST=y
>>>> >>> > >> CONFIG_DEBUG_STACKOVERFLOW=y
>>>> >>> > >> CONFIG_DEBUG_RODATA=y
>>>> >>> > >> CONFIG_DEBUG_RODATA_TEST=y
>>>> >>> > >>
>>>> >>> > >> I attached the kernel config i used for 3.2.9 to
>>>> >>> > >> generate this oops and warnings.
>>>> >>> > >>
>>>> >>> > >> From the list_add warnings that come after, out of 805
>>>> >>> > >> warnings i processed, after masking with XXXXX the PID
>>>> >>> > >> and next= values that kept changing in every one, i got
>>>> >>> > >> 26 types of MD5. I also attached the relevant files as
>>>> >>> > >> an archive to this email.
>>>> >>> > >>
>>>> >>> > >> The Gentoo bug i opened is sleeping, it seems nobody
>>>> >>> > >> has the time to at least test to confirm or not the
>>>> >>> > >> problems i'm seeing (or everybody's thinking that nobody
>>>> >>> > >> would restart auditd so often, so the bug is not that
>>>> >>> > >> serious).
>>>> >>> > >>
>>>> >>> > >>
>>>> >>> > >> Thank you for your time.
>>>> >>> > >>
>>>> >>> > >> On Wed, Feb 8, 2012 at 6:11 PM, Valentin Avram
>>>> >>> > >> <aval13(a)gmail.com> wrote:
>>>> >>> > >>
>>>> >>> > >>
>>>> >>> > >> --
>>>> >>> > >> Linux-audit mailing list
>>>> >>> > >> Linux-audit(a)redhat.com
>>>> >>> > >> https://www.redhat.com/mailman/listinfo/linux-audit
>>>> >>> > >
>>>> >>> > >
>>>> >>> > >
>>>> >>> > > --
>>>> >>> > > Peter Moody Google 1.650.253.7306
>>>> >>> > > Security Engineer pgp:0xC3410038