On 05/23/2011 09:19 PM, H. Peter Anvin wrote:
On 05/23/2011 06:04 PM, Eric Paris wrote:
> 1) The audit_syscall_exit function expects a long. But if you chop off
> the upper 32 bits you can't tell positive from negative. Thus when it
> prints to userspace using %ld we have a problem:
> Aka printf("%ld", (long)(u32)(-13)) = "4294967283"
> vs printf("%ld", (long)(-13)) = "-13"
This seems like the fundamental design error.
Possibly so (I'm not convinced), but not a fixable problem given the bug
for bug compatibility requirements of the kernel. The syscall return
value (either rax or eax) is passed to audit_syscall_exit() which
believes it is a long (aka s64). It builds a string buffer using
sprintf("%ld") and then exports that buffer to userspace via a netlink
socket. That buffer gets dumped as a raw string into a file. Some
tools may later process the strings. Getting the right string into the
netlink socket is what I consider the unchangeable ABI. Prior to
5cbf1565f29eb57a this was all handled by normal 64bit C code which did
exactly what I'm describing here. It never needlessly truncated the
return code to 32 bits on ia32exit. Solving that regression is what I'm
fixing.
You're missing something fundamental: if userspace is 32 bits,
those
bits don't even exist. If userspace is 64 bits (and it is possible for
a 64-bit process to call the 32-bit entry point) those bits could at
least theoretically contain bad information.
This is at syscall exit, in 64bit mode, so rax is going to contain a
64bit version of the return code. I'm not trying to hand 64 bit values
back to a 32 bit process. The code converts the return value using %ld
and then dumps it as a string to auditd. Even if auditd was 32bit, it's
not processing the string, just writing it to a file.
It sounds like this code is broken in some very fundamental ways,
and
that you're trying to paper it over.
Obviously we agree there is a second problem not addressed in this patch
(that many arches uses +/- instead of >=-MAX_ERRNO) but the fact that we
have a regression in which the assembly removes the sign and then passes
the now truncated value to a function expecting a long is the problem of
this patch.
I could paper over the problem in the audit code, doing my own sign
craziness based on the arch, but I think the real problem is in the
assembly dropping information needlessly.....
-Eric