On Wed, Feb 10, 2021 at 1:07 PM LC Bruzenak <lenny@magitekltd.com> wrote:
On Mon, Feb 8, 2021 at 7:44 PM Steve Grubb <sgrubb@redhat.com> wrote:
Hello,

I have recently checked in to the audit tree 2 experimental plugins. You can
enable them by passing --enable-experimental to configure. One of the new
plugins is aimed at providing audit metrics to a statsd server. The idea
being that you can use this to relay the metrics to influxdb, prometheus or
some other collector. Then you can use Grafana to visualize and alert.

Currently, it supports the following metrics:

kernel.audit.lost
kernel.audit.backlog
auditd.free_space
auditd.plugin_current_depth
auditd.plugin_max_depth
audit_events.total_count
audit_events.total_failed
audit_events.avc_count
audit_events.fanotify_count
audit_events.logins_failed
audit_events.logins_success
audit_events.anomaly_count
audit_events.response_count
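
For readers who have not used statsd before, here is a rough sketch of what metrics like these look like on the wire. It uses the plain statsd text protocol (`<name>:<value>|<type>`) over UDP and is not the plugin's actual code. The gauge/counter type chosen for each metric, the sample values, and the helper names `statsd_line` and `send_metrics` are assumptions made for illustration; port 8125 is only the conventional statsd default.

```python
import socket


def statsd_line(name: str, value: int, kind: str = "g") -> bytes:
    """Format one metric in the statsd text protocol: <name>:<value>|<type>."""
    if kind not in ("g", "c"):
        raise ValueError("kind must be 'g' (gauge) or 'c' (counter)")
    return f"{name}:{value}|{kind}".encode("ascii")


def send_metrics(metrics, host="127.0.0.1", port=8125):
    """Send each metric as its own UDP datagram (fire-and-forget)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        for name, value, kind in metrics:
            sock.sendto(statsd_line(name, value, kind), (host, port))


# Hypothetical sample values; the metric types are assumed, not confirmed.
sample = [
    ("kernel.audit.lost", 0, "g"),
    ("kernel.audit.backlog", 3, "g"),
    ("audit_events.total_count", 1, "c"),
]
```

Calling `send_metrics(sample)` would emit three datagrams to a local statsd listener, which can then forward them to whatever collector sits behind it.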

I'd be interested in hearing if this would be useful, and if these are the
right metrics that people are interested in. Should something else be
measured? Should an example Grafana dashboard be included?

Let me know what you think.

-Steve


Steve,

I think this could be awesome; hoping to give it a try soon. An example dashboard would be very helpful if you could include that.
The stats you already point out are a good start.

I'd also like to have a way to parse the per-machine kernel-assigned event IDs for missing ones. Would that need a separate plugin, or could something be done within this setup?
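
To make the missing-ID idea concrete, here is a minimal sketch that works on raw log lines from one machine. It relies on the `msg=audit(<seconds>.<millis>:<serial>)` stamp that audit records carry and reports serial numbers absent between the lowest and highest one seen. The function name `missing_serials` is hypothetical, and the sketch deliberately ignores reboots and counter wraparound, so a reported gap is only a hint that something may be missing.

```python
import re

# Audit records carry a stamp of the form msg=audit(<seconds>.<millis>:<serial>).
AUDIT_ID = re.compile(r"msg=audit\((\d+)\.(\d+):(\d+)\)")


def missing_serials(lines):
    """Return the serial numbers not seen between the min and max observed."""
    seen = set()
    for line in lines:
        match = AUDIT_ID.search(line)
        if match:
            # Several records can share one serial (one event, many records).
            seen.add(int(match.group(3)))
    if not seen:
        return []
    return [s for s in range(min(seen), max(seen) + 1) if s not in seen]
```

Fed the lines of a single host's log, this returns a sorted list of absent serials; an empty list means no gaps were found in the range covered.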
I'm pretty sure there are more metrics that would be desired, as well as some derived ones; e.g. take a per-user login/logoff set to identify time spent on a particular machine (screenlocks notwithstanding, but maybe eventually). Or perhaps, if clients send events+heartbeats, when are they up/down? These are some of the questions I've heard from security overseers.
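
As a rough illustration of the derived login/logoff metric, the sketch below pairs login and logout events per user and sums the time in between. The input shape (time-sorted `(timestamp, user, kind)` tuples) and the name `session_seconds` are assumptions for the example, not anything the plugin provides; screenlocks and overlapping sessions are ignored, and a login with no matching logout is not counted.

```python
from collections import defaultdict


def session_seconds(events):
    """Sum logged-in time per user.

    events: iterable of (timestamp, user, kind) sorted by timestamp,
    where kind is "login" or "logout".
    """
    open_since = {}
    totals = defaultdict(float)
    for ts, user, kind in events:
        if kind == "login":
            # Keep the earliest open login; nested logins are ignored.
            open_since.setdefault(user, ts)
        elif kind == "logout" and user in open_since:
            totals[user] += ts - open_since.pop(user)
    return dict(totals)
```

The per-user totals could then be pushed out as gauges alongside the other metrics.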

And while some of these may not be inspected directly by the end users, in the case of trouble calls or questions they might be the exact thing I'd ask them to relay to me in order to diagnose a problem or answer a question remotely.

Thx,
LCB


... and I forgot to ask - can you include a README there which specifies the minimum kernel/userspace level of code required?

LCB

--

LC (Lenny) Bruzenak
lenny@magitekltd.com