On Fri, 2014-07-11 at 12:32 -0400, Paul Moore wrote:
On Friday, July 11, 2014 12:23:33 PM Eric Paris wrote:
> On Fri, 2014-07-11 at 12:21 -0400, Paul Moore wrote:
> > On Friday, July 11, 2014 12:16:47 PM Eric Paris wrote:
> > > On Fri, 2014-07-11 at 12:11 -0400, Paul Moore wrote:
> > > > On Thursday, July 10, 2014 09:06:02 PM H. Peter Anvin wrote:
> > > > > Incidentally: do seccomp users know that on an x86-64 system you can
> > > > > receive system calls from any of the x86 architectures, regardless of
> > > > > how the program is invoked? (This is unusual, so normally denying those
> > > > > "alien" calls is the right thing to do.)
> > > >
> > > > I obviously can't speak for all seccomp users, but libseccomp handles this
> > > > by checking the seccomp_data->arch value at the start of the filter and
> > > > killing (by default) any non-native architectures. If you want, you can
> > > > change this default behavior or add support for other architectures (e.g.
> > > > create a filter that allows both x86-64 and x32 but disallows x86, or any
> > > > combination of the three for that matter).
> > >
> > > Maybe libseccomp does some HORRIFIC contortions under the hood, but the
> > > interface is crap... Since seccomp_data->arch can't distinguish between
> > > X32 and X86_64. If I write a seccomp filter which says
> > >
> > > KILL arch != x86_64
> > > KILL init_module
> > > ALLOW everything else
> > >
> > > I can still call init_module, I just have to use the X32 variant.
> > >
> > > If libseccomp is translating:
> > >
> > > KILL arch != x86_64
> > >
> > > into:
> > >
> > > KILL arch != x86_64
> > > KILL syscall_nr >= 2000
> > >
> > > That's just showing how dumb the kernel interface is... Good for you
> > > guys, but the kernel is just being dumb :)
> >
> > You're not going to hear me ever say that I like how the x32 ABI was done,
> > it is a real mess from a seccomp filter point of view and we have to do
> > some nasty stuff in libseccomp to make it all work correctly (see my
> > comments on the libseccomp-devel list regarding my severe displeasure
> > over x32), but what's done is done.
> >
> > I think it's too late to change the x32 seccomp filter ABI.
>
> So we have a security interface that is damn near impossible to get
> right. Perfect.
What? Having to do two comparisons instead of one is "damn near impossible"?
I think that might be a bit of an overreaction don't you think?

Actually no. How can a normal userspace application coder POSSIBLY know
this? Find this thread on an e-mail list, by accident?
> I think this explains exactly why I support this idea. Make X32 look
> like everyone else ...
You do realize that this patch set makes x32 the odd man out by having
syscall_get_nr() return a different syscall number than what was used to make
the syscall? I don't understand how that makes "x32 look like everyone
else".

Ok, I buy the __X32_SYSCALL_BIT argument. It can be dealt with in
audit. No problem. We don't need to strip it in syscall_get_nr().
I'll gladly concede that part of the patch series.
But given an x86_64 kernel a seccomp filter writer has to know about X32
and how to write rules to block the X32 ABI. And I stick with my
assessment that x32 + seccomp is darn near impossible for a normal
developer to handle.
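
To make the gap concrete, here is a minimal sketch in plain C that models the two filters discussed in this thread. It is not a real BPF program and not what libseccomp generates. The constants are redefined locally (they are meant to mirror the kernel's AUDIT_ARCH_X86_64 and __X32_SYSCALL_BIT), and 175 is assumed to be the init_module number shared by the x86-64 and x32 tables. All MODEL_* names, struct model_call, and both functions are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local copies of kernel constants so the sketch builds anywhere.
 * Assumption: these mirror AUDIT_ARCH_X86_64 and __X32_SYSCALL_BIT. */
#define MODEL_AUDIT_ARCH_X86_64  0xC000003Eu
#define MODEL_X32_SYSCALL_BIT    0x40000000u
/* Assumption: init_module is 175 on x86-64 and x32 reuses that number. */
#define MODEL_NR_INIT_MODULE     175u

/* The two seccomp_data fields that matter for this discussion. */
struct model_call {
    uint32_t arch; /* what seccomp_data->arch would hold */
    uint32_t nr;   /* what seccomp_data->nr would hold   */
};

/* KILL arch != x86_64; KILL init_module; ALLOW everything else. */
static bool naive_filter_allows(struct model_call c)
{
    if (c.arch != MODEL_AUDIT_ARCH_X86_64)
        return false;
    if (c.nr == MODEL_NR_INIT_MODULE)
        return false;
    return true;
}

/* Same filter plus the second comparison: reject any syscall number
 * that carries the x32 bit before looking at individual syscalls. */
static bool x32_aware_filter_allows(struct model_call c)
{
    if (c.arch != MODEL_AUDIT_ARCH_X86_64)
        return false;
    if (c.nr >= MODEL_X32_SYSCALL_BIT)
        return false;
    if (c.nr == MODEL_NR_INIT_MODULE)
        return false;
    return true;
}
```

An x32 caller reports the x86-64 arch value with nr = 175 | MODEL_X32_SYSCALL_BIT. The naive filter lets that call through because neither comparison matches; the x32-aware one rejects it.
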
Heck, even chromium took months to realize that x32 was a weird beast.
And they got it wrong on their first try. Their original implementation
didn't handle __X32_SYSCALL_BIT quite right. Looking at their code I'm
still not sure it does the right thing. And they are the EXPERTS. They
wrote seccomp!
> Honestly, how many people are using seccomp on X32 and would be horribly
> pissed if we just fixed it?
Okay, please stop suggesting we break the x32 kernel/user interface to
work around a flaw in audit. I get that it sucks for audit, I really do, but
this is audit's problem.

No one is asking to break X32 to fix audit. Audit can handle itself. I
don't want anything in the kernel to pretend that X32 is X86_64. It
isn't. It has its own syscall table. Its own syscalls. Its own ABI.
I'm suggesting to fix how seccomp exposes X32 information because it is
a HORRIBLE interface that even the experts have gotten wrong, over and
over and over.
I suggest we accept it as breakage and just return AUDIT_ARCH_X32.
(Leaving the __X32_SYSCALL_BIT exposed as it is today)
But I'd love to hear some thoughts on how that is a bad thing. If no
one is using the x32 seccomp ABI, let's fix it. If someone is, let's see
what the fallout from fixing it will be.
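
As a rough illustration of what that change would mean for a filter writer, the sketch below models the arch value a filter sees today next to the value it would see under the proposal. It only simulates the idea in plain C. SKETCH_AUDIT_ARCH_X32 uses a made-up value chosen purely for illustration, and every SKETCH_* name and function here is hypothetical rather than kernel code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: mirrors the kernel's AUDIT_ARCH_X86_64 and __X32_SYSCALL_BIT. */
#define SKETCH_AUDIT_ARCH_X86_64  0xC000003Eu
#define SKETCH_X32_SYSCALL_BIT    0x40000000u
/* HYPOTHETICAL: made-up value standing in for the proposed constant. */
#define SKETCH_AUDIT_ARCH_X32     0x4000003Eu

/* Today: x32 and x86-64 syscalls both report the x86-64 arch value. */
static uint32_t arch_reported_today(uint32_t nr)
{
    (void)nr; /* the syscall number does not influence the arch field */
    return SKETCH_AUDIT_ARCH_X86_64;
}

/* Proposal: x32 syscalls report their own arch value; the x32 bit
 * stays visible in the syscall number as it is today. */
static uint32_t arch_reported_proposed(uint32_t nr)
{
    return (nr & SKETCH_X32_SYSCALL_BIT) ? SKETCH_AUDIT_ARCH_X32
                                         : SKETCH_AUDIT_ARCH_X86_64;
}

/* The single comparison most filter writers already have:
 * KILL arch != x86_64. */
static bool arch_only_filter_allows(uint32_t arch)
{
    return arch == SKETCH_AUDIT_ARCH_X86_64;
}
```

With the modeled proposal, the one arch comparison is enough to reject an x32 call, whereas with today's modeled behavior the same comparison accepts it.
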
-Eric