On Thu, 2009-03-05 at 11:48 -0500, John Dennis wrote:
> LC Bruzenak wrote:
> ...
> > environment. I also have to ensure that the participating systems do
> > not reuse old UIDs or remove expired ones from their password file.
> >
> I think this approach is problematic. First of all, the set of uids from
> a given machine is ephemeral. The contents of the passwd file need to
> be matched to a time interval matching the log data. Secondly, the
> passwd file is sensitive data; you probably do not want to be shipping
> that over the network and storing it without a lot of safeguards.
> Thirdly, it's not just uids that have local context (machine local
> and time local): gids, ip-addr -> hostname mappings, kernel version
> dependent data encodings, etc. are just some of the data you won't be
> able to interpret correctly at some point in the future on some
> disassociated remote machine.
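John's locality point can be made concrete with a toy example: the same raw
numeric uid names a different user depending on which host (and which passwd
snapshot) produced the record, so the mapping has to travel with the log. The
host names and uid tables below are made up purely for illustration:

```python
# Hypothetical uid->name snapshots taken from two hosts' passwd files.
# The identical raw audit field resolves to a different user per origin.
passwd_host_a = {1000: "alice"}
passwd_host_b = {1000: "bob"}

def interpret(passwd, field):
    """Resolve a numeric id field against one host's passwd snapshot."""
    key, value = field
    return "%s=%s" % (key, passwd.get(value, "unknown(%d)" % value))

raw_field = ("uid", 1000)
print(interpret(passwd_host_a, raw_field))  # uid=alice
print(interpret(passwd_host_b, raw_field))  # uid=bob
```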
Good points John; thanks. Most of the concerns that apply to the "normal"
use cases you describe don't apply to me, so I realize that to most people
what I need looks silly.
My UIDs are, by fiat, not ephemeral, and neither are GIDs. If a user's role
changes, they get a new username with the associated GID.
IP<-->hostname mapping isn't via DHCP and won't be in the foreseeable
future. I'm not sure about "kernel version dependent data encodings," but I
guess you mean future kernel audit event structures. That one is larger
than me anyway, I think.
As I said, good points, and I also agree that it is problematic. However,
given the timeline I'm working on, I might have to accept some inadequacy
in order to meet schedule.
> Because there are many problems with taking the audit log in its
> current form and storing it remotely for later analysis, in IPA we are
> not going to collect the audit log in its default form. Instead we're
> going to resolve (i.e. interpret, in auparse terminology) all the data
> on the local machine *before* we collect and store it. We plan on doing
> this by writing an audispd plugin which resolves all the ambiguous local
> data as it's generated (and hence before it's ever collected). This
> solves both the locality-of-time and locality-of-host issues and
> tremendously simplifies post-mortem analysis, because the log data will
> have been stripped of ambiguities and replaced with data that can simply
> be read and used with no extra processing.
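The resolve-at-source idea can be sketched roughly like this. To be clear,
this is not the actual IPA plugin: a real audispd plugin would consume events
on stdin and use auparse's interpretation support, whereas here the record
format, the regex, and the restriction to uid/auid fields are simplifying
assumptions, just to show the local-resolution step happening before the
record leaves the host:

```python
import pwd
import re

def resolve_uids(record):
    """Replace numeric uid=/auid= values in a raw audit record with
    names from the local passwd database, so the record remains
    unambiguous once shipped off-host. Other id fields (euid, gid,
    etc.) are left alone in this sketch."""
    def repl(match):
        key, uid = match.group(1), int(match.group(2))
        try:
            name = pwd.getpwuid(uid).pw_name
        except KeyError:
            # uid no longer (or never) in passwd; keep it visible
            name = "unknown(%d)" % uid
        return "%s=%s" % (key, name)
    return re.sub(r"\b(a?uid)=(\d+)", repl, record)

# On a typical Linux box uid 0 resolves to root:
print(resolve_uids('type=SYSCALL uid=0 auid=0 comm="cat"'))
```

The key property is that resolution happens while the host's passwd database
is still the correct one for the event's timestamp, which is exactly the
locality problem raised above.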
I think this is an excellent approach, and when it's ready for action I'll
look to incorporate it. So you are normalizing at the source. This is
essentially what I figured was a bigger effort than my suggested
kludge.
Thanks for the info; I really appreciate it!
LCB.
--
LC (Lenny) Bruzenak
lenny(a)magitekltd.com