On Tuesday 12 August 2008 14:16:07 John Dennis wrote:
> Unfortunately string handling in audit is seriously broken and has been
> for a long time. The audit code does not know how to handle strings with
> embedded spaces, quotes, etc. The fundamental problem is the format for
> string encoding was never defined.

John, the format is well defined. It was decided on by a group of people led
by an influential kernel developer. He has long since quit working on the
audit code, but it was his suggestion that this is the only thing that would
fly upstream. There was a suggestion to use XDR encoding/decoding, but that
was nixed. The main concern at the time was that they didn't want any fancy
scheme inside the kernel for a user space problem. They wanted compactness in
the algorithm.
The encoding scheme is simple. If any character in a field meets this test:
    if (*p == '"' || *p < 0x21 || *p > 0x7f)
you have to encode. It's that simple. This scheme is Unicode friendly as well
as taking care of spaces. I didn't invent this, but I have to ensure that
everything continues to work.
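
As a rough illustration of the test above, here is a minimal sketch in C. The function names field_needs_encoding() and format_field() are hypothetical and are not taken from the audit code. The sketch also assumes two things this message does not spell out: that an encoded field is written as uppercase hex, two digits per byte, and that a field needing no encoding is written inside double quotes.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 if any byte in the field meets the test quoted above. */
static int field_needs_encoding(const unsigned char *p, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (p[i] == '"' || p[i] < 0x21 || p[i] > 0x7f)
            return 1;
    }
    return 0;
}

/*
 * Writes the field value into out as a NUL-terminated string.
 * ASSUMPTION: encoded fields are uppercase hex, two digits per byte;
 * fields that pass the test unchanged are wrapped in double quotes.
 * Returns 0 on success, -1 if out is too small.
 */
static int format_field(const char *value, char *out, size_t outsz)
{
    static const char hex[] = "0123456789ABCDEF";
    const unsigned char *p = (const unsigned char *)value;
    size_t len = strlen(value);
    size_t pos = 0;

    if (field_needs_encoding(p, len)) {
        if (outsz < 2 * len + 1)
            return -1;
        for (size_t i = 0; i < len; i++) {
            out[pos++] = hex[p[i] >> 4];   /* high nibble */
            out[pos++] = hex[p[i] & 0x0f]; /* low nibble */
        }
        out[pos] = '\0';
        return 0;
    }

    /* Two quote characters plus the terminating NUL. */
    if (outsz < len + 3)
        return -1;
    snprintf(out, outsz, "\"%s\"", value);
    return 0;
}
```

Under those assumptions, format_field("/bin/ls", ...) yields "/bin/ls" with the quotes included, while format_field("my file", ...) yields 6D792066696C65 because of the embedded space. Bytes are read as unsigned char so that the bytes of a multi-byte UTF-8 sequence are caught by the > 0x7f comparison.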
If somebody has a better idea/code in hand when we start the 2.0 code, I'd
like to consider it. The prerequisites are that it has to be backward
compatible, it has to handle Unicode, and it has to handle fields with odd
characters.
-Steve