On Mon, Mar 6, 2017 at 11:03 PM, Richard Guy Briggs <rgb(a)redhat.com> wrote:
On 2017-03-06 10:10, Cong Wang wrote:
> On Mon, Mar 6, 2017 at 2:54 AM, Dmitry Vyukov <dvyukov(a)google.com> wrote:
> > Hello,
> >
> > I've got the following crash while running syzkaller fuzzer on
> > net-next/8d70eeb84ab277377c017af6a21d0a337025dede:
> >
> > kasan: GPF could be caused by NULL-ptr deref or user memory access
> > general protection fault: 0000 [#1] SMP KASAN
> > Dumping ftrace buffer:
> > (ftrace buffer empty)
> > Modules linked in:
> > CPU: 0 PID: 883 Comm: kauditd Not tainted 4.10.0+ #6
> > Hardware name: Google Google Compute Engine/Google Compute Engine,
> > BIOS Google 01/01/2011
> > task: ffff8801d79f0240 task.stack: ffff8801d7a20000
> > RIP: 0010:sock_sndtimeo include/net/sock.h:2162 [inline]
> > RIP: 0010:netlink_unicast+0xdd/0x730 net/netlink/af_netlink.c:1249
> > RSP: 0018:ffff8801d7a27c38 EFLAGS: 00010206
> > RAX: 0000000000000056 RBX: ffff8801d7a27cd0 RCX: 0000000000000000
> > RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00000000000002b0
> > RBP: ffff8801d7a27cf8 R08: ffffed00385cf286 R09: ffffed00385cf286
> > R10: 0000000000000006 R11: ffffed00385cf285 R12: 0000000000000000
> > R13: dffffc0000000000 R14: ffff8801c2fc3c80 R15: 00000000014000c0
> > FS: 0000000000000000(0000) GS:ffff8801dbe00000(0000) knlGS:0000000000000000
> > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > CR2: 0000000020cfd000 CR3: 00000001c758f000 CR4: 00000000001406f0
> > Call Trace:
> > kauditd_send_unicast_skb+0x3c/0x70 kernel/audit.c:482
> > kauditd_thread+0x174/0xb00 kernel/audit.c:599
> > kthread+0x326/0x3f0 kernel/kthread.c:229
> > ret_from_fork+0x31/0x40 arch/x86/entry/entry_64.S:430
> > Code: 44 89 fe e8 56 15 ff ff 8b 8d 70 ff ff ff 49 89 c6 31 c0 85 c9
> > 75 27 e8 b2 b2 f4 fd 49 8d bc 24 b0 02 00 00 48 89 f8 48 c1 e8 03 <42>
> > 80 3c 28 00 0f 85 37 06 00 00 49 8b 84 24 b0 02 00 00 4c 8d
> > RIP: sock_sndtimeo include/net/sock.h:2162 [inline] RSP: ffff8801d7a27c38
> > RIP: netlink_unicast+0xdd/0x730 net/netlink/af_netlink.c:1249 RSP:
> > ffff8801d7a27c38
> > ---[ end trace ad1bba9d457430b6 ]---
> > Kernel panic - not syncing: Fatal exception
> >
> >
> > This is not reproducible and seems to be caused by an elusive race.
> > However, looking at the code I don't see any proper protection of
> > audit_sock (other than the if (!audit_pid) which is obviously not
> > enough to protect against races).
>
> audit_cmd_mutex is supposed to protect it, I think.
> But kauditd_send_unicast_skb() does not seem to hold this mutex.

Hmmmm, I wonder if it makes sense to wrap most of the contents of the
outer while loop in kauditd_thread in the audit_cmd_mutex, or around the
first two inner while loops and the "if (auditd)" condition after the
"quick_loop:" label. The condition on auditd is supposed to catch that
case. We don't want it locked while playing with the scheduler at the
bottom of that function.

Let me look into this and play around with a few things. I suspected
there might be a problem here, so I've got thoughts on how we might
resolve it; I just need to code them up and see which option sucks
the least.

FWIW Richard, yes, wrapping most of kauditd_thread *should* resolve
this, but it's pretty heavy-handed and not my first choice.
--
paul moore
www.paul-moore.com
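
For illustration only, here is a minimal userspace sketch of the locking
question discussed above. It is not kernel code and not a proposed patch.
It assumes pthreads in place of kernel mutexes, and every name in it
(fake_sock, cmd_mutex, shared_sock, set_sock, get_sock, put_sock,
send_one) is hypothetical, standing in loosely for struct sock,
audit_cmd_mutex, and audit_sock. It shows one possible alternative to
holding the mutex across the whole send loop: hold it only long enough to
snapshot the pointer and take a reference, so that the NULL check and the
later dereference see the same object.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct sock. */
struct fake_sock {
    long sndtimeo;  /* the field the crashing code was reading */
    int refcnt;     /* protected by cmd_mutex in this sketch */
};

/* Hypothetical stand-ins for audit_cmd_mutex and audit_sock. */
static pthread_mutex_t cmd_mutex = PTHREAD_MUTEX_INITIALIZER;
static struct fake_sock *shared_sock;

/* Writer side: install or clear the socket under the mutex. */
static void set_sock(struct fake_sock *sk)
{
    pthread_mutex_lock(&cmd_mutex);
    shared_sock = sk;
    pthread_mutex_unlock(&cmd_mutex);
}

/* Reader side: snapshot the pointer and take a reference atomically. */
static struct fake_sock *get_sock(void)
{
    struct fake_sock *sk;

    pthread_mutex_lock(&cmd_mutex);
    sk = shared_sock;
    if (sk)
        sk->refcnt++;
    pthread_mutex_unlock(&cmd_mutex);
    return sk;
}

/* Drop the reference taken by get_sock(). */
static void put_sock(struct fake_sock *sk)
{
    pthread_mutex_lock(&cmd_mutex);
    assert(sk->refcnt > 0);
    sk->refcnt--;
    pthread_mutex_unlock(&cmd_mutex);
}

/* Returns the send timeout, or -1 if no socket is registered. */
static long send_one(void)
{
    struct fake_sock *sk = get_sock();
    long timeo;

    if (!sk)
        return -1;
    timeo = sk->sndtimeo;  /* uses the local snapshot, not shared_sock */
    put_sock(sk);
    return timeo;
}
```

In this sketch the sender never rereads shared_sock after the check, so a
concurrent set_sock(NULL) cannot turn a passed check into a NULL
dereference. A real fix would also have to decide who frees the object
once the last reference is dropped, which this sketch leaves out.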