Steve Grubb wrote:
On Tuesday 29 January 2008 17:56:36 John Dennis wrote:
Hence the audit parsing library. The idea is to abstract this away so that
anyone wanting to write a tool does not need to study all the messages and
figure out the parsing rules.
The way forward has to be the audit parsing library.
The problem is auparse is just as screwed as anybody else. Unparseable
output is just plain wrong and inexcusable. You're suggesting auparse
embed all sorts of hacks and heuristics to unravel a problem which
should never exist in the first place. It's a house of cards which in
time will collapse. You also haven't explained how auparse is going to
deal with log data generated by different kernel versions, especially
when logs are aggregated.
tools developed around these messages and making wholesale changes will
break them.
Break what is already fundamentally broken? That's not an answer ;-)
Any fix will break someone's tool somewhere unless they are coded to the
audit parsing library.
auparse is going to break too. The current situation is that you can't
determine whether a field is encoded by reading the output; you also
have to know the kernel source code. That's wrong.
> Auparse is not the answer to irregular kernel audit message
This is the answer in so many ways. In order to make any change, you have to
decouple applications from the actual data structure. You cannot normalize
the data without breaking somebody somewhere.
Which is why making the output parseable independent of the kernel
version is an essential requirement.
For example, suppose we all agreed the data structure is an abomination and
had to be fixed. We get all the code into 2.6.26 kernel. Meanwhile Fedora 9
is released on the 2.6.24 kernel. We get the user space pieces fixed up to be
released at the same time as 2.6.26. Then Fedora steps up to 2.6.25 kernel
and then ultimately 2.6.26. The userspace in Fedora 9 was never intended to
work with the new format. We can't keep the kernel team from doing what's
right for everyone that wants new device drivers. We're stuck.
You're only stuck if the output can only be parsed by one version. If
the output were regular the problem would go away. Isn't that the desired
result?
> auparse_get_field_str() returns the field value in its encoded form,
I would choose the words "raw form".
Yes, raw is a better term. Some raw values are encoded, some aren't,
that's the problem.
> this is almost never of value to the caller. The caller wants the
> field value to be unencoded so it can operate on it.
Sometimes. It depends on the situation.
Very rarely. As an analogy 99.99% of the time you want your email client
to decode the contents from the transfer encoding it arrived in,
otherwise it's just gibberish. Raw form is really only useful when
debugging the encoding/decoding.
> If you want the field value to be unencoded you have to call
> auparse_interpret_field().
Correct.
> But auparse_interpret_field() performs two distinctly different operations,
It does only one thing, that is translate the data from raw to interpreted
form.
Wrong :-) It does two entirely different things and those operations
cannot be separated. The two operations are:
1) decoding (e.g. decoding a field value encoded in hexadecimal form
back into its original string)
2) interpretation (e.g. translating a uid field into a username). I call
this interpretation "contextual substitution" because it's taking a
field value and substituting in another value, often in a different
format. You cannot interpret a field value until it has been decoded.
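To make the distinction concrete, here is a minimal Python sketch of the two operations kept apart. It is not how auparse is implemented; decode_hex(), interpret_uid() and the UID_NAMES table are hypothetical names invented for this illustration, and the uid mapping is made-up data standing in for a real passwd lookup.

```python
# Hypothetical uid table standing in for a passwd lookup (made-up data).
UID_NAMES = {0: "root", 500: "alice"}


def decode_hex(raw: str) -> str:
    """Operation 1, decoding: recover the original string from its hex form."""
    return bytes.fromhex(raw).decode("utf-8", errors="replace")


def interpret_uid(value: str) -> str:
    """Operation 2, interpretation: substitute a user name for a numeric uid."""
    return UID_NAMES.get(int(value), f"unknown({value})")


if __name__ == "__main__":
    # Decoding alone gives back the value that was logged, nothing more.
    print(decode_hex("2F746D702F6120622E747874"))
    # Interpretation replaces the value with something else entirely.
    print(interpret_uid("0"))
```

With the two steps separate, a caller who only wants the real field value can stop after decoding, which is the option that is missing when both are fused into one call.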
What if I don't want auparse to change the field value and instead
simply return the field value? Currently you can't simply get the field
value! Why? Because some fields are encoded, so you either get the raw
encoded value (which is meaningless 99.99% of the time, if it had been
encoded) or you get something which is completely munged.
So, John, if you want selinux format changes, complain on their mail
list.
I've already done that and lost. :)
FWIW, I can live with not changing the message contents. But no one can
live with a situation where the data can't be parsed, it is simply
wrong. Just to be clear, the problem is that you can't determine while parsing
whether a field value is encoded, which means you can't decide whether it
has to be decoded.
Here is an example from the real world, an audit message has this field
comm=df
So is the value the string "df" (e.g. disk free) or is this the
hexadecimal encoded byte value 223? The only way to know is by looking
at the kernel source code and knowing that the "comm" field in a
specific audit record is generated by calling
audit_log_untrustedstring(). What if it doesn't call that in a
different kernel version? What if a new field is added in a new kernel
version, how will the parser know which function the kernel used to
generate the string? What if in one kernel version the string was output
with audit_log_untrustedstring() but in another kernel version it wasn't?
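The ambiguity is easy to reproduce. The following Python sketch (possible_readings() is a hypothetical helper, not part of auparse or the kernel) lists every way a bare field value could be read when the parser has no out-of-band knowledge of how the field was written.

```python
def possible_readings(raw: str) -> dict:
    """Return each plausible reading of a raw field value.

    "literal" treats the text as the string itself; "hex" is added only
    when the text also happens to be valid hexadecimal.
    """
    readings = {"literal": raw.encode("utf-8")}
    try:
        readings["hex"] = bytes.fromhex(raw)
    except ValueError:
        # Not valid hex, so only the literal reading is possible.
        pass
    return readings


if __name__ == "__main__":
    for value in ("df", "ls"):
        print(value, possible_readings(value))
```

For "df" both readings come back, the two-character string and the single byte 0xdf (decimal 223), and nothing in the value itself says which one was meant. For "ls" only the literal reading exists, so the ambiguity shows up only for values that happen to look like hex.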
--
John Dennis <jdennis(a)redhat.com>