Re: [PATCH 4.10 070/111] audit: fix auditd/kernel connection state tracking

Wednesday, 21 February 2018

* Paul Moore <paul(a)paul-moore.com> wrote:

...
 On Tue, Feb 20, 2018 at 10:18 AM, Peter Zijlstra <peterz(a)infradead.org> wrote:
 > On Tue, Feb 20, 2018 at 09:51:08AM -0500, Paul Moore wrote:
 >> On Tue, Feb 20, 2018 at 9:06 AM, Peter Zijlstra <peterz(a)infradead.org> wrote:
 >
 >> > It's not at all clear to me what that code does, I just stumbled upon
 >> > __mutex_owner() outside of the mutex code itself and went WTF.
 >>
 >> If you don't want people to use __mutex_owner() outside of the mutex
 >> code I might suggest adding a rather serious comment at the top of the
 >> function, because right now I don't see anything suggesting that
 >> function shouldn't be used.  Yes, there is the double underscore
 >> prefix, but that can mean a few different things these days.
 >
 > Find below.
 >
 >> > The comment (aside from having the most horrible style) ...
 >>
 >> Yeah, your dog is ugly too.  Notice how neither comment is constructive?
 >
 > I'm sure you've seen this one:
 >
 >   https://lkml.org/lkml/2016/7/8/625

 Yep.  I stand behind my earlier comment in this thread.

 >> > Maybe if you could explain how that code is supposed to work and why it
 >> > doesn't know if it holds a lock I could make a suggestion...
 >>
 >> I just spent a few minutes looking back over the bits available in
 >> include/linux/mutex.h and I'm not seeing anything beyond
 >> __mutex_owner() which would allow us to determine the mutex owning
 >> task.  It's probably easiest for us to just track ownership ourselves.
 >> I'll put together a patch later today.
 >
 > Note that up until recently the mutex implementation didn't even have a
 > consistent owner field. And the thing is, it's very easy to use wrong,
 > only today I've seen a patch do: "__mutex_owner() == task", where task
 > was allowed to be !current, which is just wrong.

 Arguably all the more reason why a strongly worded warning is
 important (which I see you've included below, feel free to include my
 Reviewed-by).

 > Looking through kernel/audit.c I'm not even sure I see how you would end
 > up in audit_log_start() with audit_cmd_mutex held.
 >
 > Can you give me a few code paths that trigger this? Simple git-grep is
 > failing me.

 Basically look at the code in audit_receive_msg(), but I wasn't asking
 your opinion on how we should rewrite the audit subsystem, I was just
 asking how one could determine if the current task was holding a given
 mutex in a way that was acceptable to you.  Based on your comments,
 and some further inspection of the mutex code, it appears that is/was
 not something that the core mutex code wants to support/make-visible.
 Which is perfectly fine, I just wanted to make sure I wasn't missing
 something before I went ahead and wrote a wrapper around the mutex
 code for use by audit.

 FWIW, I just put together the following patch which removes the
 __mutex_owner() call from audit and doesn't appear to break anything
 on the audit side (you're CC'd on the patch).  It has only been
 lightly tested, but I'm going to bang on it for a day or so and if I
 hear no objections I'll merge it into audit/next.

 * https://www.redhat.com/archives/linux-audit/2018-February/msg00066.html

Could you please explain the audit_ctl_lock()/unlock() primitive you are
introducing there? You seem to be implementing some sort of recursive locking
primitive, but in a strange way.

AFAICS the primary problem appears to be this code path:

  audit_receive() -> audit_receive_msg() -> AUDIT_TTY_SET ->
    audit_log_common_recv_msg() -> audit_log_start()

where we can arrive already holding the lock.

I.e. recursive mutex, kinda.

What's the thinking there? Neither the changelog nor the code explains this.

Thanks,

	Ingo
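
For readers following the thread, the idea of "tracking ownership ourselves"
can be sketched in userspace C. This is only an illustration under stated
assumptions, not the code from the linked patch: it uses pthreads and C11
atomics in place of the kernel mutex API, and every name in it (owned_lock,
ctl_lock, ctl_unlock, ctl_held_by_me, log_start_would_wait) is hypothetical.
The wrapper records the owner right after taking the lock and clears it right
before releasing it, and it only ever answers "does the calling thread hold
this lock?", which avoids the "task != current" misuse mentioned above.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names throughout; a userspace analogue, not the audit patch. */

/*
 * The address of a thread-local object is unique per live thread, so it
 * can serve as an identity token, playing the role of "current".
 */
static _Thread_local char self_token;

struct owned_lock {
	pthread_mutex_t lock;
	_Atomic(void *) owner;	/* NULL while nobody holds the lock */
};

/* owner is zero-initialized, i.e. starts out as NULL */
static struct owned_lock ctl = { .lock = PTHREAD_MUTEX_INITIALIZER };

/* True only if the *calling* thread holds the lock. */
static bool ctl_held_by_me(void)
{
	return atomic_load(&ctl.owner) == (void *)&self_token;
}

static void ctl_lock(void)
{
	pthread_mutex_lock(&ctl.lock);
	/* Record ownership only once the lock is actually held. */
	atomic_store(&ctl.owner, (void *)&self_token);
}

static void ctl_unlock(void)
{
	assert(ctl_held_by_me());
	/* Clear ownership before releasing, so no stale owner is visible. */
	atomic_store(&ctl.owner, NULL);
	pthread_mutex_unlock(&ctl.lock);
}

/*
 * Stand-in for a function that can be reached both with and without the
 * lock held: a holder must not wait on work that needs its own lock.
 */
static bool log_start_would_wait(void)
{
	return !ctl_held_by_me();
}
```

Note that this is not a recursive mutex: a second ctl_lock() from the holder
would still deadlock. The owner field only lets a callee notice that it was
entered with the lock already held and adjust its behaviour accordingly.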
