On Fri, Aug 17, 2018 at 5:24 PM Preston Bennes
<preston.bennes(a)snowflake.net> wrote:
Hey Paul,
My expectations are aligned to what you've pointed out already (not-vanilla old
kernel, user space client other than auditd).
Primarily what I'm looking for is a sanity check on the null pointer deref on current
vanilla kernel code because I think the rest of the issues stem from that. My C isn't
sharp enough to try my hand at a reproduction case, though I would hope someone with the
requisite expertise could put one together relatively quickly. specifically this message:
> audit: type=1305 audit(1534170894.852:117649234): auid=501 ses=102936
op="add_rule" key=(null) list=4 res=0
Unfortunately without digging into Amazon's source for that kernel
build it is difficult to come to any strong conclusion.
The audit message you are seeing above is an AUDIT_CONFIG_CHANGE (type
== 1305), and based on the 'op="add_rule"' field it appears as though
the system is adding an exit (list == 4) filter rule to the kernel.
The operation failed (res == 0), which shouldn't be surprising given
the kernel oops.
I'm not very familiar with osquery, does it load it's own audit filter
rules? Are you loading some of your own?
On Fri, Aug 17, 2018 at 2:12 PM Paul Moore
<paul(a)paul-moore.com> wrote:
>
> On Fri, Aug 17, 2018 at 4:26 PM Preston Bennes
> <preston.bennes(a)snowflake.net> wrote:
> > Hey Audit team,
>
> Hey Audit user!
>
> > I've got a kernel null pointer deref oops on Amazon Linux kernel
4.9.51-10.52.amzn1.x86_64. After the oops all new cron processes that spawned were stuck
in D wait on some audit related syscall. This exhausted system file handles and the box
ended up needing to be rebooted out-of-bounds. The audit handle was held by
https://github.com/facebook/osquery/.
> >
> > It looks like there was an issue with audit sending to osquery (netlink_unicast
sending to pid 21233, error -111) followed by something (presumably also osquery)
attempting to 'op="add_rule" key=(null)'.
>
> The first thing I need to comment on is that since this is a distro
> specific kernel (4.9.51-10.52.amzn1.x86_64) and not an upstream kernel
> there is a limit in the amount of help we can provide. Generally the
> upstream list focuses on the upstream code and we leave support for
> the distro specific builds/packages up to the distro. If you can
> reproduce your problem on an upstream kernel, with a strong preference
> for a *recent* upstream kernel, we can probably be of more help. I
> simply don't know what patches Amazon merges into it's amzn1 kernels.
>
> With that out of the way, I will add that v4.9.x is a rather old
> kernel, and we've made a number of relevant improvements since the
> v4.9 release and it is unlikely all of them have been backported to
> your kernel. If possible, I would suggest upgrading to a newer kernel
> as it may resolve your problem.
>
> There is also a possible issue of osquery as an auditd replacement;
> it's unlikely this is being caused by osquery, but we don't test with
> osquery to that save level as we do with Steve's audit userspace
> (auditd).
>
> > Here's from /var/log/messages.
> >
> >> Aug 13 14:34:54 packer_default-10-180-21-24 kernel: [4749520.133904] audit:
netlink_unicast sending to audit_pid=21233 returned error: -111
> >> Aug 13 14:34:54 packer_default-10-180-21-24 kernel: [4749520.137982]
audit_log_lost: 12 callbacks suppressed
> >> Aug 13 14:34:54 packer_default-10-180-21-24 kernel: [4749520.137983] audit:
audit_lost=36081041 audit_rate_limit=8192 audit_backlog_limit=4096
> >> Aug 13 14:34:54 packer_default-10-180-21-24 kernel: [4749520.137985] audit:
type=1305 audit(1534170894.852:117649229): audit_pid=21233 old=21233 auid=501 ses=102936
res=0
> >> Aug 13 14:34:54 packer_default-10-180-21-24 kernel: [4749520.137986] audit:
type=1305 audit(1534170894.852:117649230): audit_enabled=1 old=1 auid=501 ses=102936
res=1
> >> Aug 13 14:34:54 packer_default-10-180-21-24 kernel: [4749520.137987] audit:
type=1305 audit(1534170894.852:117649231): audit_backlog_wait_time=1 old=1 auid=501
ses=102936 res=1
> >> Aug 13 14:34:54 packer_default-10-180-21-24 kernel: [4749520.137989] audit:
type=1305 audit(1534170894.852:117649232): audit_backlog_limit=4096 old=4096 auid=501
ses=102936 res=1
> >> Aug 13 14:34:54 packer_default-10-180-21-24 kernel: [4749520.137990] audit:
type=1305 audit(1534170894.852:117649233): audit_failure=0 old=0 auid=501 ses=102936
res=1
> >> Aug 13 14:34:54 packer_default-10-180-21-24 kernel: [4749520.137991] audit:
type=1305 audit(1534170894.852:117649234): auid=501 ses=102936 op="add_rule"
key=(null) list=4 res=0
> >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.138007] BUG:
unable to handle kernel NULL pointer dereference at 00000000000001e0
> >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.143872] IP:
[<ffffffff814a76da>] netlink_unicast+0x4a/0x230
> >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.148336] PGD 0
> >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.149822]
> >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.151230] Oops:
0000 [#1] SMP
> >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.153736] Modules
linked in: iptable_filter(E) ip_tables(E) x_tables(E) udp_diag(E) nfnetlink_queue(E)
nfnetlink_log(E) nfnetlink(E) tcp_diag(E) inet_diag(E) isofs(E) ipv6(E) crc_
> >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.184036] CPU: 0
PID: 21372 Comm: osqueryd Tainted: G E 4.9.51-10.52.amzn1.x86_64 #1
> >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.190620]
Hardware name: Xen HVM domU, BIOS 4.2.amazon 08/24/2006
> >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.195513] task:
ffff8807fa1c3b00 task.stack: ffffc900079a4000
> >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.200179] RIP:
0010:[<ffffffff814a76da>] [<ffffffff814a76da>] netlink_unicast+0x4a/0x230
> >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.206822] RSP:
0018:ffffc900079a7c38 EFLAGS: 00010246
> >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.211065] RAX:
0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
> >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.216779] RDX:
00001ffffffffffe RSI: 00000000024000c0 RDI: 0000000000000014
> >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.222473] RBP:
ffffc900079a7c68 R08: 0000000000000004 R09: ffff88062a375414
> >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.228167] R10:
ffff8808004032c0 R11: ffff88062a375400 R12: 0000000000000000
> >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.233738] R13:
ffff8804200a0100 R14: 00000000000052f1 R15: 0000000000016d42
> >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.238999] FS:
00007f3d40cc4700(0000) GS:ffff880800a00000(0000) knlGS:0000000000000000
> >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.244796] CS:
0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.249555] CR2:
00000000000001e0 CR3: 000000059e63d000 CR4: 00000000001406f0
> >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.255401] DR0:
0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.260725] DR3:
0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.266038] Stack:
> >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.267680]
00000000079a7c58 00000000000052f1 00000000000052f1 00000000ffffffa1
> >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.273540]
ffff88062a376800 ffff8804200a1d00 ffffc900079a7cf8 ffffffff811107d2
> >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.279379]
ffffc90000000004 ffffffff811e2f85 ffffc900079a7ca8 ffffffff8145a08c
> >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.285201] Call
Trace:
> >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.287137]
[<ffffffff811107d2>] audit_receive_msg+0x992/0xc20
> >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.291561]
[<ffffffff811e2f85>] ? __kmalloc_node_track_caller+0x35/0x260
> >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.296642]
[<ffffffff8145a08c>] ? release_sock+0x8c/0xa0
> >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.300718]
[<ffffffff81110ab2>] audit_receive+0x52/0xa0
> >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.304713]
[<ffffffff814a77ef>] netlink_unicast+0x15f/0x230
> >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.308943]
[<ffffffff814a7bdc>] netlink_sendmsg+0x31c/0x390
> >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.313236]
[<ffffffff81455818>] sock_sendmsg+0x38/0x50
> >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.317219]
[<ffffffff81455c4f>] SYSC_sendto+0xef/0x170
> >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.321142]
[<ffffffff811156ab>] ? __audit_syscall_entry+0xeb/0x100
> >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.325853]
[<ffffffff81003437>] ? syscall_trace_enter+0x1b7/0x290
> >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.330464]
[<ffffffff811158b3>] ? __audit_syscall_exit+0x1f3/0x280
> >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.335189]
[<ffffffff8145673e>] SyS_sendto+0xe/0x10
> >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.338952]
[<ffffffff81003854>] do_syscall_64+0x54/0xc0
> >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.342995]
[<ffffffff8153d32b>] entry_SYSCALL64_slow_path+0x25/0x25
> >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.347840] Code:
65 8b 05 a2 5b b6 7e 25 00 ff 00 00 83 f8 01 19 f6 81 e6 a0 00 38 00 81 c6 20 00 08 02 e8
af cf ff ff 49 89 c5 31 c0 85 db 75 08 <49> 8b 84 24 e0 01 00 00 48 89 45
> >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.368625] RIP
[<ffffffff814a76da>] netlink_unicast+0x4a/0x230
> >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.373253] RSP
<ffffc900079a7c38>
> >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.376013] CR2:
00000000000001e0
> >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.379307] ---[
end trace 6a41b2274729ba2a ]---
> >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.663029] audit:
type=1300 audit(1534170895.379:117649235): arch=c000003e syscall=59 success=yes exit=0
a0=271cce0 a1=2741dd0 a2=26dc3b0 a3=7ffdd7f1ea80 items=2 ppid=4556 pid=2407
> >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.682741] audit:
type=1309 audit(1534170895.379:117649235): argc=2 a0="date" a1="+%s"
> >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.688585] audit:
type=1307 audit(1534170895.379:117649235): cwd="/"
> >> Aug 13 14:34:55 packer_default-10-180-21-24 abrt-dump-oops: Reported 1
kernel oopses to Abrt
> >
> > --
> > Linux-audit mailing list
> > Linux-audit(a)redhat.com
> >
https://www.redhat.com/mailman/listinfo/linux-audit
>
> --
> paul moore
>
www.paul-moore.com
--
paul moore
www.paul-moore.com