On Wed, 2014-10-01 at 14:44 -0400, Steve Grubb wrote:
Hello,
On Sunday, September 28, 2014 02:52:53 PM Burn Alting wrote:
> Had a play around with it. I am not sure about its value in its
> current form.
This is why I chose to make it separate for now. It's a strawman for people to
poke at and see what's wrong before committing to something that will be
supported long term.
> Rather than specifying the keys to print, it would be better to print
> everything in the event and only 'override' the standard formatting if
> there is an 'snode' for a key.
Sure, perhaps that is a command line option on how to use the format string.
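To make the "print everything, override only where asked" idea concrete, here is a rough Python sketch. It is not auformat or libauparse code; the function name and the shape of the override table are made up for illustration.

```python
# Hypothetical sketch: not part of auformat or libauparse.
def format_record(fields, overrides=None):
    """Print every key=value pair in record order.

    fields    -- list of (key, value) tuples in the order they appear
    overrides -- optional dict mapping a key to a function that returns
                 the replacement text for that field
    """
    overrides = overrides or {}
    out = []
    for key, value in fields:
        formatter = overrides.get(key)
        if formatter is not None:
            # Only keys named in the override table get special formatting.
            out.append(formatter(key, value))
        else:
            out.append(f"{key}={value}")
    return " ".join(out)


record = [("type", "ADD_USER"), ("pid", "13455"), ("auid", "500"), ("res", "success")]
line = format_record(record, {"auid": lambda k, v: f"{k.upper()}={v}"})
print(line)
```

With no override table, every field comes out in its standard form, so a format string would only need to name the exceptions.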
> Further, it has a couple of immediate issues given it's using
> libauparse.
>
> - it is "lossy" in that it won't parse poorly formed audit events (see
> the op key value pair below)
> [burn@swtf auformat]$ cat add_user.txt
> node=swtf.swtf.dyndns.org type=ADD_USER
> msg=audit(1411871714.393:47872): user pid=13455 uid=0 auid=500
> ses=11
> subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
> msg='op=adding home directory id=502 exe="/usr/sbin/useradd"
> hostname=? addr=? terminal=pts/2 res=success'
> [burn@swtf auformat]$ ./auformat "%node %date %time %milli
> %serial: type=%TYPE msg=%msg op=%op auid=%auid pid=%pid
> path=%path exe=%exe subj=%subj hostname=%hostname terminal=%terminal
> res=%res\n" add_user.txt
> swtf.swtf.dyndns.org 09/28/2014 12:35:14 393 47872:
> type=ADD_USER msg= op=adding auid=500 pid=13455 path=
> exe="/usr/sbin/useradd"
> subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
> hostname=? terminal=pts/2 res=success
> [burn@swtf auformat]$
>
> We lose the strings
> - 'user' before the pid key
Which is meaningless in this case.
> - op='adding home directory' becomes op=adding
> This is particularly important for incorrectly formatted application
> level audit sent via auditd.
This is a problem in the shadow-utils package. It is the one that I'm
currently having to re-do for this reason and many more. Upstream seems to
have taken a stab at redoing the audit events and pretty much used it like
syslog.
I suppose my concern is that until we have fixed all the incorrectly
formatted key values, auparse is going to lose information.
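As an illustration of the difference, here is a rough Python sketch of a strict tokenizer next to a more tolerant one. This is only a heuristic written for this thread; it is not what auparse or ausearch actually does, and both function names are made up.

```python
# Hypothetical sketch: a heuristic, not the auparse or ausearch algorithm.
def parse_strict(text):
    """Keep only well-formed key=value tokens; anything else is dropped."""
    fields = []
    for token in text.split():
        if "=" in token:
            key, value = token.split("=", 1)
            fields.append((key, value))
    return fields


def parse_tolerant(text):
    """Attach a token without '=' to the value of the preceding key."""
    fields = []
    for token in text.split():
        if "=" in token:
            key, value = token.split("=", 1)
            fields.append([key, value])
        elif fields:
            fields[-1][1] += " " + token
        # A dangling token before any key is still dropped here.
    return [tuple(f) for f in fields]


sample = 'op=adding home directory id=502 exe="/usr/sbin/useradd" res=success'
print(parse_strict(sample)[0])
print(parse_tolerant(sample)[0])
```

The tolerant version keeps "home directory" attached to op, at the price of guessing: it cannot tell a badly formed value from free text that was never meant to belong to the previous key.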
> - 'rewinding' the event's cursor for each possible key, the call to
> auparse_first_record() in print_item(), is probably not what one would
> want - but then again, auformat is just a mock up at the moment.
Well, if you want your fields in a specific order and it's not the order in the
event, then we have no choice. Note that the event is already parsed at this
point so we are just literally changing the position in a linked list. The
cost is a series of strcmp calls.
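To put a rough number on that cost, here is a small Python sketch that models the rewinding lookup as a linear scan and counts key comparisons (standing in for the strcmp calls), next to a lookup through an index built once. It is a model of the idea only, not libauparse code, and the function names are invented.

```python
# Hypothetical model of the lookup cost; not libauparse code.
def find_by_rewinding(fields, wanted):
    """For each wanted key, scan the field list from the start.

    Returns the values in the requested order and the number of key
    comparisons made, which stands in for the strcmp calls.
    """
    values, comparisons = [], 0
    for name in wanted:
        found = None
        for key, value in fields:
            comparisons += 1
            if key == name:
                found = value
                break
        values.append(found)
    return values, comparisons


def find_by_index(fields, wanted):
    """Build a key -> value index once, then look each key up directly."""
    index = {}
    for key, value in fields:
        index.setdefault(key, value)  # keep the first occurrence, like the scan
    return [index.get(name) for name in wanted]


fields = [("pid", "13455"), ("uid", "0"), ("auid", "500"), ("ses", "11")]
wanted = ["auid", "pid", "path"]
print(find_by_rewinding(fields, wanted))
print(find_by_index(fields, wanted))
```

For short records the scan is cheap. The comparisons grow with the number of requested keys times the number of fields, and a key that is absent (path here) always costs a full pass.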
> - one loses the parsing 'fix-up' that ausearch does in
> src/ausearch-report.c:output_interpreted_node()
Not sure what "fix-up" we are talking about. The intention is that auparse
completely mimics ausearch's interpretation ability (ausearch was
switched over to use auparse a few releases back).
By 'fix-up' I meant the code like
// Some user messages have msg='uid=500 in this case
// skip the msg= piece since the real stuff is the uid=
...
// Value side has commas and another field exists
// Known: LABEL_LEVEL_CHANGE banners=none,none
// Known: ROLE_ASSIGN new-role=r,r
// Known: any MAC LABEL can potentially have commas
etc
> - to build a complete event, having addressed the 'rewinding' issue,
> would make the format look very messy - you would need to include every
> possible key to print all key/values.
If you wanted that, yeah. But I am thinking of cases where one may not want
every field. For example, you might do something like this to check file access:
# ausearch --start today -m path --raw |
auformat 'auid=%AUID res=%SUCCESS name=%NAME\n'
> - one should add event separation so that further tools could process
> the data more easily.
I am thinking of 1 event per line. This is kind of a requirement of Map
Reduce.
So you expect the complete event of my tailing audit.log
node=swtf.swtf.dyndns.org type=SYSCALL
msg=audit(1412198543.190:141570): arch=c000003e syscall=59
success=yes exit=0 a0=1a2d530 a1=1a2d350 a2=1a06f10 a3=20
items=2 ppid=19529 pid=32647 auid=500 uid=0 gid=0 euid=0 suid=0
fsuid=0 egid=0 sgid=0 fsgid=0 tty=pts0 ses=1 comm="tail"
exe="/usr/bin/tail"
subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
key="cmds"
node=swtf.swtf.dyndns.org type=EXECVE
msg=audit(1412198543.190:141570): argc=3 a0="tail" a1="-f"
a2="/var/log/audit/audit.log"
node=swtf.swtf.dyndns.org type=CWD
msg=audit(1412198543.190:141570): cwd="/home/burn"
node=swtf.swtf.dyndns.org type=PATH
msg=audit(1412198543.190:141570): item=0 name="/usr/bin/tail"
inode=2135830 dev=fd:00 mode=0100755 ouid=0 ogid=0 rdev=00:00
obj=system_u:object_r:bin_t:s0 nametype=NORMAL
node=swtf.swtf.dyndns.org type=PATH
msg=audit(1412198543.190:141570): item=1 name=(null)
inode=524293 dev=fd:00 mode=0100755 ouid=0 ogid=0 rdev=00:00
obj=system_u:object_r:ld_so_t:s0 nametype=NORMAL
to generate one line of output?
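For what it's worth, one way to read "1 event per line" is to group the raw records on their audit(timestamp:serial) id and join them. The Python sketch below assumes each record arrives as one complete string (as in audit.log itself, not as wrapped by the mail client); the function name and the " | " separator are arbitrary choices made for the example.

```python
import re

# Matches the event id shared by all records of one event,
# e.g. msg=audit(1412198543.190:141570):
EVENT_ID = re.compile(r"msg=audit\((\d+\.\d+):(\d+)\):")


def one_line_per_event(records, separator=" | "):
    """Group raw records by their audit(timestamp:serial) id.

    records -- iterable of strings, one complete record per string
    Returns a list of strings, one per event, in first-seen order.
    """
    events = {}
    for record in records:
        match = EVENT_ID.search(record)
        if match is None:
            continue  # not an audit record; skipped in this sketch
        events.setdefault(match.groups(), []).append(record.strip())
    return [separator.join(parts) for parts in events.values()]


records = [
    'type=SYSCALL msg=audit(1412198543.190:141570): syscall=59 comm="tail"',
    'type=EXECVE msg=audit(1412198543.190:141570): argc=3 a0="tail"',
    'type=CWD msg=audit(1412198543.190:141570): cwd="/home/burn"',
    'type=USER_START msg=audit(1412198544.001:141571): pid=1 uid=0',
]
for line in one_line_per_event(records):
    print(line)
```

When records from several hosts are mixed, the node would presumably need to be part of the grouping key as well, since serial numbers are only meaningful per host.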
> At the moment, the only tool I'm aware of that 'correctly' parses a log
> file is ausearch.
If there are omissions in auparse, I really want to know. It must be able to
correctly parse events.
By correctly, I meant completely. ausearch currently, in
output_interpreted_node(), handles incorrectly formed key values like
op=adding home directory
as per
[burn@swtf auformat]$ /sbin/ausearch -i -if add_user.txt
----
node=swtf.swtf.dyndns.org type=ADD_USER msg=audit(09/28/2014
12:35:14.393:47872) : user pid=13455 uid=root auid=burn ses=11
subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
msg='op=adding home directory id=freddo exe=/usr/sbin/useradd hostname=?
addr=? terminal=pts/2 res=success'
[burn@swtf auformat]$
> Perhaps we would be better served by adding another
> output option to ausearch to print events in a much more parse-able
> format (e.g. XML, JSON)
I am sort of going that way. I am thinking about logstash/elastic search and
Map reduce and how one might use the audit system when you have say 10,000
systems.
Which is my use case.
From my standpoint, I need each host to
- enrich the data e.g.
uid=500 to become uid=500(burn) (I want both the
id and interpreted name for checking id mismatches in the enterprise),
syscall=59 to become syscall=execve, etc
- not lose important data (op=adding home directory)
- turn single and multi-line events into well defined and formatted
events (xml/json),
- send the data to an aggregation point within the enterprise.
At the aggregation point I can apply capability such as logstash/elastic
search/map reduce and analyse the data.
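As a rough picture of the per-host step, here is a Python sketch that enriches one record and writes it as a single line of JSON. The lookup tables are hard-coded stand-ins (a real tool would ask the system), the set of uid-like keys is a guess, and none of this is ausearch code.

```python
import json

# Hypothetical lookup tables; a real tool would query the system
# (for example the passwd database) instead of hard-coding them.
USERS = {"500": "burn", "0": "root"}
SYSCALLS = {"59": "execve"}  # x86_64 numbering
UID_KEYS = {"uid", "auid", "euid", "suid", "fsuid", "ouid"}


def enrich(fields):
    """Return a dict with the interpreted form of known values."""
    out = {}
    for key, value in fields:
        if key in UID_KEYS and value in USERS:
            # Keep both the id and the name, e.g. 500(burn).
            out[key] = f"{value}({USERS[value]})"
        elif key == "syscall" and value in SYSCALLS:
            out[key] = SYSCALLS[value]
        else:
            out[key] = value
    return out


def to_json_line(fields):
    """Serialise one enriched record as a single line of JSON."""
    return json.dumps(enrich(fields), sort_keys=True)


print(to_json_line([("syscall", "59"), ("auid", "500"), ("uid", "0"), ("ses", "1")]))
```

A multi-record event would need more structure than a flat dict, since keys such as a0 appear in both the SYSCALL and EXECVE records.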
Ideally I'd extend ausearch-report.c:output_record() to output events in
a well defined format (xml/json) - probably refactoring
output_interpreted_node() to generate its current format or xml/json
depending on a flag so we only have one 'parser' to maintain.
> I am happy to work on this (either extending ausearch or working on
> auformat).
A couple of needs at the moment are around writing a test suite to 1)
identify new fields that suddenly show up in a record, 2) locate dangling
values so they can be fixed.
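A first cut at such a check might look like the Python sketch below. It reports keys that are not in a known set and tokens that have no '=' at all. It is a naive whitespace tokenizer, so a quoted value containing spaces would be reported as dangling too; the function name and the known-key set are placeholders.

```python
# Hypothetical sketch of a test-suite check; not part of the audit package.
def audit_check(text, known_keys):
    """Report unknown keys and dangling tokens in one record's text.

    Returns (new_keys, dangling), where dangling is a list of
    (preceding_key, token) pairs; preceding_key is None if the token
    comes before any key.
    """
    new_keys, dangling = [], []
    last_key = None
    for token in text.split():
        if "=" in token:
            last_key = token.split("=", 1)[0]
            if last_key not in known_keys and last_key not in new_keys:
                new_keys.append(last_key)
        else:
            dangling.append((last_key, token))
    return new_keys, dangling


known = {"pid", "op", "id"}
text = "user pid=13455 op=adding home directory id=502 foo=1"
print(audit_check(text, known))
```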
Also, we need some performance testing and improvements of auparse. Does
switching to jemalloc make any difference? Is a linked list the best way to do
it? Can the field searching be smarter?
The auformat work is for now a prototype. I have near term plans to assign
specific meaning to each event so that events are more understandable. When I
have this working, then I think we can look at how we want to output the
event.
-Steve
Burn