On Tuesday 30 September 2008 09:34:00 Matthew Booth wrote:
> To measure the overhead of libauparse on austream I initialised auparse
> as AUSOURCE_FEED, fed each received record into it, and spat them out
> unmodified on receiving the AUPARSE_CB_EVENT_READY event.
This is not an apples-to-apples comparison. libauparse fully parses each
field, so it is doing significantly more work. You could add a strtok_r loop
to your code to come closer to a direct comparison.
> This added more than an order of magnitude to the time austream spends
> in userspace. A brief look at this overhead shows that about 40% is
> spent in malloc()/free(),
Yep. The main idea was really to just get it working and then optimize in a
future release. The memory management is the low-hanging fruit. I've been
thinking of fixing it so that each field is not malloc'ed individually;
instead, string pointers and lengths would be stored and used internally.
> and 25% is spent in strlen, strdup, memcpy, memmove and friends. I
> suspect that very substantial gains could be made in the performance of
> libauparse by reworking the way it uses memory, and passing the length
> of strings around with the strings. Unfortunately, I suspect this would
> amount to a substantial rewrite.
Possibly. But I wasn't planning to do this until after solving the interlaced
record problem. I'd rather it be the current speed and correct than faster
and still wrong.
> Is this something anybody else is interested in? I guess performance
> isn't so important if you're just scanning log files in non-real time.
Yes, after the next release. In the meantime it might not hurt to add some
tests to the auparse_test programs so that any rewrite-induced regression
has a chance of being found.
> [1] What I'd really like is a well-defined binary format from the
> kernel.
Not likely to ever happen.
-Steve