Re: netlink: GPF in netlink_unicast

Tuesday, 7 March 2017

On Tue, Mar 7, 2017 at 1:44 PM, Paul Moore <paul(a)paul-moore.com> wrote:
...
 On Tue, Mar 7, 2017 at 10:55 AM, Richard Guy Briggs <rgb(a)redhat.com> wrote:
> On 2017-03-07 09:29, Paul Moore wrote:
>> On Mon, Mar 6, 2017 at 11:03 PM, Richard Guy Briggs <rgb(a)redhat.com> wrote:
>> > On 2017-03-06 10:10, Cong Wang wrote:
>> >> On Mon, Mar 6, 2017 at 2:54 AM, Dmitry Vyukov <dvyukov(a)google.com> wrote:
>> >> > Hello,
>> >> >
>> >> > I've got the following crash while running syzkaller fuzzer on
>> >> > net-next/8d70eeb84ab277377c017af6a21d0a337025dede:
>> >> >
>> >> > kasan: GPF could be caused by NULL-ptr deref or user memory access
>> >> > general protection fault: 0000 [#1] SMP KASAN
>> >> > Dumping ftrace buffer:
>> >> >    (ftrace buffer empty)
>> >> > Modules linked in:
>> >> > CPU: 0 PID: 883 Comm: kauditd Not tainted 4.10.0+ #6
>> >> > Hardware name: Google Google Compute Engine/Google Compute Engine,
>> >> > BIOS Google 01/01/2011
>> >> > task: ffff8801d79f0240 task.stack: ffff8801d7a20000
>> >> > RIP: 0010:sock_sndtimeo include/net/sock.h:2162 [inline]
>> >> > RIP: 0010:netlink_unicast+0xdd/0x730 net/netlink/af_netlink.c:1249
>> >> > RSP: 0018:ffff8801d7a27c38 EFLAGS: 00010206
>> >> > RAX: 0000000000000056 RBX: ffff8801d7a27cd0 RCX: 0000000000000000
>> >> > RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00000000000002b0
>> >> > RBP: ffff8801d7a27cf8 R08: ffffed00385cf286 R09: ffffed00385cf286
>> >> > R10: 0000000000000006 R11: ffffed00385cf285 R12: 0000000000000000
>> >> > R13: dffffc0000000000 R14: ffff8801c2fc3c80 R15: 00000000014000c0
>> >> > FS:  0000000000000000(0000) GS:ffff8801dbe00000(0000) knlGS:0000000000000000
>> >> > CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> >> > CR2: 0000000020cfd000 CR3: 00000001c758f000 CR4: 00000000001406f0
>> >> > Call Trace:
>> >> >  kauditd_send_unicast_skb+0x3c/0x70 kernel/audit.c:482
>> >> >  kauditd_thread+0x174/0xb00 kernel/audit.c:599
>> >> >  kthread+0x326/0x3f0 kernel/kthread.c:229
>> >> >  ret_from_fork+0x31/0x40 arch/x86/entry/entry_64.S:430
>> >> > Code: 44 89 fe e8 56 15 ff ff 8b 8d 70 ff ff ff 49 89 c6 31 c0 85 c9
>> >> > 75 27 e8 b2 b2 f4 fd 49 8d bc 24 b0 02 00 00 48 89 f8 48 c1 e8 03 <42>
>> >> > 80 3c 28 00 0f 85 37 06 00 00 49 8b 84 24 b0 02 00 00 4c 8d
>> >> > RIP: sock_sndtimeo include/net/sock.h:2162 [inline] RSP: ffff8801d7a27c38
>> >> > RIP: netlink_unicast+0xdd/0x730 net/netlink/af_netlink.c:1249 RSP: ffff8801d7a27c38
>> >> > ---[ end trace ad1bba9d457430b6 ]---
>> >> > Kernel panic - not syncing: Fatal exception
>> >> >
>> >> >
>> >> > This is not reproducible and seems to be caused by an elusive race.
>> >> > However, looking at the code I don't see any proper protection of
>> >> > audit_sock (other than the if (!audit_pid) which is obviously not
>> >> > enough to protect against races).
>> >>
>> >> audit_cmd_mutex is supposed to protect it, I think.
>> >> But kauditd_send_unicast_skb() does not seem to hold this mutex.
>> >
>> > Hmmmm, I wonder if it makes sense to wrap most of the contents of the
>> > outer while loop in kauditd_thread in the audit_cmd_mutex, or around the
>> > first two inner while loops and the "if (auditd)" condition after the
>> > "quick_loop:" label.  The condition on auditd is supposed to catch that
>> > case.  We don't want it locked while playing with the scheduler at the
>> > bottom of that function.
>>
>> Let me look into this and play around with a few things.  I suspected
>> there might be a problem here, so I've got thoughts on how we might
>> resolve it; I just need to code them up and see which option sucks
>> the least.
>>
>> FWIW Richard, yes wrapping most of kauditd_thread *should* resolve
>> this but it's pretty heavy handed and not my first choice.
>
> That's why the inner loops made a bit more sense since it wasn't really
> necessary and ran afoul of the scheduler anyways.

 One of my preferred options was to get us away from protecting
 everything with the audit_cmd_mutex by creating a new locking approach
 for the auditd connection state (using RCU/spinlocks since it rarely
 changes in practice) and leaving the audit_cmd_mutex for its
 traditional role.  This should minimize the performance impact of the
 lock and clean things up a bit.  I'm also moving all the auditd
 connection state into a single struct (instead of several variables
 associated only by convention) which moves us oh so slightly closer to
 allowing multiple auditd connections (hey, it's something).
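
The idea in the paragraph above (one struct for the auditd connection
state, with its own lightweight lock) can be sketched in userspace C.
This is only an illustration under loud assumptions, not the kernel
patch: struct auditd_conn, conn_get(), conn_put(), conn_set() and
send_record() are hypothetical names, a pthread mutex stands in for the
RCU/spinlock protection, and an int stands in for the netlink socket.

```c
/* Userspace sketch only; not kernel code. All names are hypothetical. */
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

/* All auditd connection state lives in one struct, not loose globals. */
struct auditd_conn {
	int pid;              /* PID of the registered auditd */
	unsigned int portid;  /* netlink port ID of auditd */
	int sock;             /* stand-in for the netlink socket */
	int refs;             /* reference count, protected by conn_lock */
};

/* Small dedicated lock (stand-in for the RCU/spinlock in the proposal). */
static pthread_mutex_t conn_lock = PTHREAD_MUTEX_INITIALIZER;
static struct auditd_conn *conn;  /* NULL while no auditd is registered */

/* Reader side: take a reference so the state cannot vanish mid-send. */
static struct auditd_conn *conn_get(void)
{
	struct auditd_conn *c;

	pthread_mutex_lock(&conn_lock);
	c = conn;
	if (c)
		c->refs++;
	pthread_mutex_unlock(&conn_lock);
	return c;
}

/* Drop a reference; the last holder frees the struct. */
static void conn_put(struct auditd_conn *c)
{
	int last;

	if (!c)
		return;
	pthread_mutex_lock(&conn_lock);
	assert(c->refs > 0);
	last = (--c->refs == 0);
	pthread_mutex_unlock(&conn_lock);
	if (last)
		free(c);
}

/* Writer side: publish a new connection, or pid == 0 to drop it. */
static int conn_set(int pid, unsigned int portid, int sock)
{
	struct auditd_conn *new_conn = NULL, *old;

	if (pid) {
		new_conn = malloc(sizeof(*new_conn));
		if (!new_conn)
			return -1;
		new_conn->pid = pid;
		new_conn->portid = portid;
		new_conn->sock = sock;
		new_conn->refs = 1;  /* reference held by the global pointer */
	}
	pthread_mutex_lock(&conn_lock);
	old = conn;
	conn = new_conn;
	pthread_mutex_unlock(&conn_lock);
	conn_put(old);  /* readers still holding a reference keep it alive */
	return 0;
}

/* Sender: works on a pinned snapshot, never on the bare global. */
static int send_record(const char *msg)
{
	struct auditd_conn *c = conn_get();

	if (!c)
		return -1;  /* no auditd; a real sender would queue the record */
	printf("send \"%s\" to pid %d portid %u via sock %d\n",
	       msg, c->pid, c->portid, c->sock);
	conn_put(c);
	return 0;
}
```

With this shape the sender never dereferences a pointer that a
concurrent conn_set(0, 0, 0) could have cleared after the check: it
either gets a pinned struct or gets NULL. Whether the real code should
pin with a refcount or rely on RCU grace periods is a design choice the
sketch does not settle.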

 It's taking a bit longer than expected as I'm dealing with a bit of a
 head cold (or something) and my mind is far less than 100% at the
 moment ...

Ooof.  I just noticed something, and maybe this is the fever talking,
but why do we ever NULL out audit_sock and why are we bothering with
those holds/puts?  We create the audit netlink socket in
audit_net_init() and it should remain valid until we kill it in
audit_net_exit(); we sorta cheat on this now because we track the
socket both in the per-netns audit_net struct as well as audit_sock,
but that doesn't make our audit_sock manipulations right ...

Man I hate this code.  I *really* hate this code.

-- 
paul moore
www.paul-moore.com
