On 06/11/2013 05:24 AM, Serge E. Hallyn wrote:
Quoting Gao feng (gaofeng(a)cn.fujitsu.com):
> On 06/07/2013 06:47 AM, Serge Hallyn wrote:
>> Quoting Serge Hallyn (serge.hallyn(a)ubuntu.com):
>>> Quoting Gao feng (gaofeng(a)cn.fujitsu.com):
>>>> On 05/07/2013 10:20 AM, Gao feng wrote:
>>>>> This patchset try to add namespace support for audit.
>>>>>
>>>>> I choose to assign audit to the user namespace.
>>>>> Right now,there are six kinds of namespaces, such as
>>>>> net, mount, ipc, pid, uts and user. the first five
>>>>> namespaces have special usage. the audit isn't suitable to
>>>>> belong to these five namespaces, so the user namespace
>>>>> may be the best choice.
>>>>>
>>>>> Through I decide to make audit related resources per user
>>>>> namespace, but audit uses netlink to communicate between kernel
>>>>> space and user space, and the netlink is a private resource
>>>>> of per net namespace. So we need the capability to allow the
>>>>> netlink sockets to communicate with each other in the same user
>>>>> namespace even they are in different net namespace. [PATCH 2/48]
>>>>> does this job, it adds a new function "compare" for per
netlink
>>>>> table to compare two sockets. it means the netlink protocols can
>>>>> has its own compare fuction, For other protocols, two netlink
>>>>> sockets are different if they belong to the different net namespace.
>>>>> For audit protocol, two sockets can be the same even they in
different
>>>>> net namespace,we use user namespace not net namespace to make the
>>>>> decision.
>>>>>
>>>>> There is one point that some people may dislike,in [PATCH 1/48],
>>>>> the kernel side audit netlink socket is created only when we create
>>>>> the first netns for the userns, and this userns will hold the netns
>>>>> until we destroy this userns.
>>>>>
>>>>> The other patches just make the audit related resources per
>>>>> user namespace.
>>>>>
>>>>> This patchset is sent as an RFC,any comments are welcome.
>>>
>>> Hi,
>>>
>>> thanks for sending this. I think you need to ping the selinux folks
>>> for comment though. It appears to me that, after this patchset, the
>>> kernel with CONFIG_USER_NS=y could not be LSPP-compliant, because
>>> the selinux-generated audit messages do not always go to init_user_ns.
>>>
>>> Additionally, the only type of namespacing selinux wants is where it
>>> is enforced by policy compiler and installer using typenames - i.e.
>>> 'container1.user_t' vs 'user_t'. Selinux does not want user
namespaces
>>> to affect selinux enforcement at all. (at least last I knew, several
>>> years ago at a mini-summit, I believe this was from Stephen Smalley).
>>
>> That sort of sounds like I'm distancing myself from that, which I
>> don't mean to do. I agree with the decison: MAC (selinux, apparmor
>> and smack) should not be confuddled by user namespaces. (posix caps
>> are, as always, a bit different).
>
>
> Thanks for your comments!
>
> Very useful information, it sounds reasonable.
>
> Let's just drop those patches.
>
Hi Gao,
proceeding then,
The netfilter related changes I think make sense. They log to the userns
which owns the netns in question, which seems right.
However looking at Audit-tty-translate-audit_log_start-to-audit_log_sta.patch,
it appears to log to the userns of the task which is doing the operation.
Keeping in mind that an unprivileged user can create a new user namespace,
this doesn't seem right.
Also, you are introducing per-userns syscall filter. It looks like I
can then create a new userns to escape my existing syscall filter, since
the filters up the user_ns parent chain are not being applied. Is that
correct?
Hi Serge,
I admit that the global resources related audit message should be logged to
parent and ancestor. but this is more complex than the way I implemented.
Because we should send message to all ancestor and we should consider not
to exceed the rate_limit of all ancestor.
I prefer to don't make these filters/rules per user namespace right now.
Did you have a particular rationale written out for what precisely you're
wanting to make per-userns? That would be helpful in trying to figure
out which bits are appropriate. Again I so far haven't seen a single
problem with the code itself, it's just a question of which bits we
actually want (and are safe).
In my option, the audit rules(inode, tree_list, filter) , some of audit
controller related resources(enabled,pid,portid...) and skb queue, audit
netlink sockets,kauditd thread should be per-userns. The audit user message
which generated by the user in container should be per-userns too.
Since netns is not implemented as a hierarchy, and the network related
resources are not global. so network related audit message should be per-userns too.
The security related audit message should be send to init user namespace
as we discussed before. Maybe tty related audit message should be send
to init user namespace too, I have no idea now.
The next step, I will post a new patchset which only make the audit user
message and the basic audit resource per userns. I think this patchset
will easy to be reviewed and accepted, And will not influence the host.
This patchset contains the below patches:
Gao feng (21):
Audit: make audit kernel side netlink sock per userns
netlink: Add compare function for netlink_table
Audit: implement audit self-defined compare function
Audit: make audit_skb_queue per user namespace
Audit: make audit_skb_hold_queue per user namespace
Audit: make kauditd_task per user namespace
Audit: make audit_pid per user namespace
Audit: make audit_nlk_portid per user namesapce
Audit: make audit_enabled per user namespace
Audit: make audit_ever_enabled per user namespace
Audit: make audit_initialized per user namespace
Audit: only allow init user namespace to change rate limit
Audit: only allow init user namespace to change audit_failure
Audit: allow to send netlink message to auditd in uninit user
namespace
Audit: make kauditd_wait per user namespace
Audit: make audit_backlog_wait per user namespace
Audit: introduce new audit logging interface for user namespace
Audit: pass proper user namespace to audit_log_common_recv_msg
Audit: Log audit config change in uninit user namespace
Audit: send reply message to the auditd in proper user namespace
Audit: Allow GET,SET,USER MSG operations in uninit user namespace
include/linux/audit.h | 39 +++-
include/linux/netlink.h | 1 +
include/linux/user_namespace.h | 33 +++-
kernel/audit.c | 422 ++++++++++++++++++++++++++---------------
kernel/audit.h | 5 +-
kernel/auditsc.c | 11 +-
kernel/user_namespace.c | 3 +
net/netlink/af_netlink.c | 32 +++-
net/netlink/af_netlink.h | 1 +
9 files changed, 369 insertions(+), 178 deletions(-)
Do you have any comments or advice to this plan? After the above patchs
been accepted, I think it's easy to push other audit namespace related
patches into upstream.
Thanks,
Gao
-serge
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at
http://vger.kernel.org/majordomo-info.html
Please read the FAQ at
http://www.tux.org/lkml/