On Tuesday 29 January 2008 17:56:36 John Dennis wrote:
The format of audit messages from the kernel is a mess.
<humor> John, the code was hard to write, so it should be hard to parse!
The bottom line is one cannot parse the audit messages without special
case knowledge of each audit message because the data formatting does
not follow any regular rules.
Hence the audit parsing library. The idea is to abstract this away so that
anyone wanting to write a tool does not need to study all the messages and
figure out the parsing rules.
The way forward has to be the audit parsing library. At this point, there are
tools developed around these messages and making wholesale changes will break
them. This is the reason that the SELinux messages are such a mess. I've
raised my concern with the developers on the selinux mail list and they
essentially told me they are not willing to make any changes. But they did
agree to some keywords for the fields you mention so that we could go ahead
and code tools.
I don't know how it got this way, but it really needs to be fixed.
Any fix will break someone's tool somewhere unless that tool is coded to the
audit parsing library.
Most of these problems can easily be fixed if there is exactly one
central place to format an audit field value.
This is what we've done with user space. As for the kernel, essentially there
is no maintainer or anyone interested in doing audit work. I pretty much have
to force people to touch it. So, good luck getting kernel work done.
At this point, I don't think we want to do too much to the kernel. If you know
of any cases in the kernel that have result= instead of res=, send a patch. We
should probably fix those.
Auparse is not the answer:
--------------------------
Auparse is not the answer to irregular kernel audit message
formatting. First of all it forces auparse to have special case logic
which is not 100% robust and is tied to the kernel source code
version.
This is the answer in so many ways. In order to make any change, you have to
decouple applications from the actual data structure. You cannot normalize
the data without breaking somebody somewhere.
For example, suppose we all agreed the data structure is an abomination and
had to be fixed. We get all the code into the 2.6.26 kernel. Meanwhile, Fedora 9
is released on the 2.6.24 kernel. We get the user space pieces fixed up to be
released at the same time as 2.6.26. Then Fedora steps up to the 2.6.25 kernel
and then ultimately 2.6.26. The userspace in Fedora 9 was never intended to
work with the new format. We can't keep the kernel team from doing what's
right for everyone that wants new device drivers. We're stuck.
The only answer that is sane is to convert tools to auparse. When that is
done, make some small changes over time and evolve the data slowly so any new
quirks can be adjusted over time. We can introduce the notion of aliases to
keep old tools working. A big change will be a big disruption.
auparse_get_field_str() returns the field value in its encoded form,
I would choose the words "raw form".
this is almost never of value to the caller. The caller wants the
field value to be unencoded so it can operate on it.
Sometimes. It depends on the situation.
If you want the field value to be unencoded you have to call
auparse_interpret_field().
Correct.
But auparse_interpret_field() performs two distinctly different operations,
It does only one thing: it translates the data from raw to interpreted
form.
it both decodes AND performs contextual substitution. Contextual
substitution only has meaning when applied on the same host and at
approximately the same time as when the audit record was generated.
Correct. You are talking about something the library does not handle today.
The reason is that there is no designed method to aggregate logs. So, when
that work is done, auparse will be fixed up to handle the situation.
This rant kinda sounds like you are volunteering to help when we get there.
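For what it is worth, the distinction being argued about can be sketched like this. This is Python for illustration only, not the auparse API. The function names, the list of hex-encoded fields and the uid tables are assumptions made for the sketch.

```python
# Assumption for this sketch: these fields may arrive either quoted or
# hex-encoded. The caller has to know that, because a bare value such as
# 1000 looks like valid hex too.
ENCODED_FIELDS = ("comm", "exe", "name")


def decode_value(name, raw):
    """Decode only: undo quoting or hex encoding. Needs no host knowledge."""
    if len(raw) >= 2 and raw[0] == '"' and raw[-1] == '"':
        return raw[1:-1]
    if name in ENCODED_FIELDS:
        try:
            return bytes.fromhex(raw).decode("utf-8")
        except ValueError:  # not hex, or not valid UTF-8: leave it alone
            return raw
    return raw


def substitute_uid(value, uid_table):
    """Contextual substitution: map a numeric uid to a user name.

    The answer depends on uid_table, which stands in for the account
    database of whichever host (and moment) the lookup is run on.
    """
    try:
        return uid_table.get(int(value), value)
    except ValueError:
        return value


# The same raw uid resolves differently on two hypothetical hosts.
on_origin_host = substitute_uid("500", {500: "alice"})
on_log_server = substitute_uid("500", {500: "bob"})
```

Decoding gives the same answer anywhere. The substitution does not, which is why it only makes sense close to where and when the record was generated, or with a table carried along with the logs.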
If we do fix the format of audit messages we might as well fix some
other inconsistencies at the same time.
1) The initial part of AVC messages does not follow the standard
name=value formatting used everywhere else in audit.
Right, they said bugger off, we like syslog better anyways.
a) It includes the string "avc:" which is redundant with the audit
record type (e.g. type=AVC). The string "avc:" should be removed;
it serves no purpose and only makes parsing much harder because of
the inconsistency.
b) denied|granted are bare words without a field name, it should be
seresult="denied", once again to avoid special case parsing.
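Points a) and b) can be shown with a small sketch. It is Python for illustration, both message bodies are made up, and the second one is only a guess at what the proposed format might look like. The brace-delimited permission list is left alone here.

```python
import re

# Made-up AVC message body in the current style.
CURRENT_STYLE = 'avc:  denied  { read } for  pid=2001 comm="httpd" tclass=file'
# Hypothetical body with "avc:" dropped and the bare word given a name.
PROPOSED_STYLE = 'seresult="denied" { read } for pid=2001 comm="httpd" tclass=file'


def generic_fields(body):
    """Collect name=value tokens only; bare words are silently dropped."""
    fields = {}
    for token in body.split():
        name, sep, value = token.partition("=")
        if sep:
            fields[name] = value.strip('"')
    return fields


# Special case needed only because of the current AVC prefix.
AVC_PREFIX = re.compile(r"^avc:\s+(denied|granted)\b")


def avc_fields(body):
    """Generic parse plus the extra rule for 'avc: denied|granted'."""
    fields = generic_fields(body)
    match = AVC_PREFIX.match(body)
    if match:
        fields["seresult"] = match.group(1)
    return fields
```

With the current style, generic_fields() never sees the result and avc_fields() is required to recover it. With the proposed style, generic_fields() alone is enough.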
What's worse...I'm writing IDS software. Sometimes all you have is an AVC
record and no syscall record. You don't know if the system was in permissive
or enforcing mode at the time of the syscall. (Sure you can test when you see
the record, but that's at a different time than the event.) So, you have no
way of assessing the impact of the AVC. Was the action really denied or not?
You know what it wants to do, but not if it did it. There is no success or
result field. Anyways, instead of whining about it here, I will eventually
write to the selinux mail list where that kind of discussion belongs.
So, John, if you want selinux format changes, complain on their mail list.
I've already done that and lost. :)
-Steve