On 02/07/2017 06:43 PM, Kees Cook wrote:
On Tue, Feb 7, 2017 at 4:25 PM, Tyler Hicks
<tyhicks(a)canonical.com> wrote:
> On 02/07/2017 06:03 PM, Kees Cook wrote:
>> On Thu, Feb 2, 2017 at 9:37 PM, Tyler Hicks <tyhicks(a)canonical.com> wrote:
>>> This patch creates a read-only sysctl containing an ordered list of
>>> seccomp actions that the kernel supports. The ordering, from left to
>>> right, is the lowest action value (kill) to the highest action value
>>> (allow). Currently, a read of the sysctl file would return "kill trap
>>> errno trace allow". The contents of this sysctl file can be useful for
>>> userspace code as well as the system administrator.
>>>
>>> The path to the sysctl is:
>>>
>>> /proc/sys/kernel/seccomp/actions_avail
>>>
>>> libseccomp and other userspace code can easily determine which actions
>>> the current kernel supports. The set of actions supported by the current
>>> kernel may be different than the set of action macros found in kernel
>>> headers that were installed where the userspace code was built.
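
A minimal userspace sketch of the check described above, assuming the sysctl path and the space-separated format given in this patch description. The helper names action_list_contains() and read_actions_avail() are hypothetical and are not part of libseccomp or the kernel.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Path taken from the patch description above. */
#define ACTIONS_AVAIL_PATH "/proc/sys/kernel/seccomp/actions_avail"

/*
 * Return true if `name` appears as a whole whitespace-separated token
 * in `list` (for example "kill trap errno trace allow\n").
 */
bool action_list_contains(const char *list, const char *name)
{
	size_t name_len = strlen(name);
	const char *p = list;

	if (name_len == 0)
		return false;

	while (*p) {
		size_t tok_len;

		p += strspn(p, " \t\n");        /* skip separators */
		tok_len = strcspn(p, " \t\n");  /* length of the next token */
		if (tok_len == name_len && strncmp(p, name, name_len) == 0)
			return true;
		p += tok_len;
	}
	return false;
}

/*
 * Read the first line of the sysctl file into `buf`.
 * Returns 0 on success, -1 if the file is missing or unreadable
 * (for example on a kernel without this patch).
 */
int read_actions_avail(char *buf, size_t len)
{
	FILE *f = fopen(ACTIONS_AVAIL_PATH, "r");

	if (!f)
		return -1;
	if (!fgets(buf, (int)len, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	return 0;
}
```

A caller would treat a failure from read_actions_avail() as "sysctl not present" and fall back to whatever detection it used before, rather than assuming that no actions are supported.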
>>
>> This is certainly good: having a discoverable way to detect filter
>> capabilities. I do wonder if it'd still be easier to just expose the
>> max_log sysctl as a numeric value, since the SECCOMP_RET_* values are
>> all part of uapi, so we can't escape their values...
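
As a rough illustration of the numeric view, the string list can be mapped back to action values in userspace. This is only a sketch: the constants are copied from the SECCOMP_RET_* definitions in the uapi header so that it builds without kernel headers, and the names action_names, action_value() and highest_action() are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct action_name {
	const char *name;
	uint32_t value;
};

/* Local copies of the uapi SECCOMP_RET_* values for the five actions. */
const struct action_name action_names[] = {
	{ "kill",  0x00000000U },	/* SECCOMP_RET_KILL */
	{ "trap",  0x00030000U },	/* SECCOMP_RET_TRAP */
	{ "errno", 0x00050000U },	/* SECCOMP_RET_ERRNO */
	{ "trace", 0x7ff00000U },	/* SECCOMP_RET_TRACE */
	{ "allow", 0x7fff0000U },	/* SECCOMP_RET_ALLOW */
};

/* Look up one token of length `len`; 0 on success, -1 if unknown. */
int action_value(const char *tok, size_t len, uint32_t *value)
{
	size_t i;

	for (i = 0; i < sizeof(action_names) / sizeof(action_names[0]); i++) {
		if (strlen(action_names[i].name) == len &&
		    strncmp(action_names[i].name, tok, len) == 0) {
			*value = action_names[i].value;
			return 0;
		}
	}
	return -1;
}

/*
 * Store the highest known action value found in `list` in *max.
 * Returns 0 if at least one known name was found, -1 otherwise.
 * Unknown names are skipped, so a newer kernel's list still parses.
 */
int highest_action(const char *list, uint32_t *max)
{
	const char *p = list;
	uint32_t value;
	int found = -1;

	while (*p) {
		size_t len;

		p += strspn(p, " \t\n");
		len = strcspn(p, " \t\n");
		if (len && action_value(p, len, &value) == 0) {
			if (found != 0 || value > *max)
				*max = value;
			found = 0;
		}
		p += len;
	}
	return found;
}
```

Skipping unknown names is a deliberate choice here: userspace built against older headers should not fail just because the running kernel lists an action it has never heard of.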
>
> I was very torn on whether to use a numeric or string representation
> here. The reason I decided on string representation is because I think
> these sysctls are mostly aimed at admins and numeric representations
> wouldn't be easy to use. I considered adding a utility to libseccomp but,
> since the kernel code to do a string representation was so simple, I
> went with doing it in the kernel.
Yeah, I think I like it just because it gives a way to discover the
UAPI "level"... I will think more about this. For v3, let's keep the
string stuff.

> Another possibility is exposing the SECCOMP_RET_*_NAME macros as part of
> the uapi.

I like keeping the UAPI minimal. ;)

>>> +static int __init seccomp_sysctl_init(void)
>>> +{
>>> + struct ctl_table_header *hdr;
>>> +
>>> + hdr = register_sysctl_paths(seccomp_sysctl_path, seccomp_sysctl_table);
>>> + kmemleak_not_leak(hdr);
>>
>> Will kmemleak complain about this if hdr is saved to a global (or not
>> saved at all)? Also, something should be reported in the failure
>> case...
>
> I have to admit to blindly following the example set by sysctl_init() in
> kernel/sysctl.c. I can test what kmemleak will/won't complain about and
> report back (tomorrow at the earliest).
Cool, no rush. I'm backlogged on reviews anyway. :)

kmemleak doesn't complain if we save it to a global. That makes sense
because it means that we have a persistent reference to the allocated
memory.

However, kmemleak doesn't complain about this allocation as-is (meaning
that I simply removed the call to kmemleak_not_leak()). From what I can
tell, this is because a reference to the allocated ctl_table_header
struct is saved when __register_sysctl_table() calls init_header(). I
think kmemleak is seeing this reference when doing scans and
(incorrectly) thinking that there's no leak.

I think the safest/cleanest thing to do is leave the call to
kmemleak_not_leak(). Let me know if you disagree.

Tyler