performance tests
by Chris Wright
I was hoping to capture some data re: the issue with enabling then later
disabling the audit subsystem. My data is inconclusive, and doesn't
clearly point to any issue. So perhaps someone who is able to show the
performance hit can run with this. Attached is the script I used to
run the tests, and at the bottom are URLs with raw data.
I ran 3 rounds of tests back-to-back on two separate kernels. The first
kernel is an unpatched mainline kernel, the second is patched with the
TIF_SYSCALL_AUDIT patch that Steve and I were tossing about last week.
On each kernel I boot without audit enabled and run the first round of
tests, enable audit and run the second round, then disable audit and
run the third round. Each round simply runs lmbench, then builds a
kernel. Each stage gets a fresh oprofile data capture.
Here are some basic highlights:
LMBench
Processor, Processes - times in microseconds - smaller is better
------------------------------------------------------------------------------
Host                  OS  Mhz null null      open slct  sig  sig fork exec   sh
                              call  I/O stat clos  TCP inst hndl proc proc proc
------------------------ ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ----
unpatched audit disabled 1994 0.09 0.26 16.1 18.6 9.26 0.22 2.31 554. 1018 2508
unpatched audit enabled  1994 0.33 0.60 16.6 19.3 19.7 0.48 2.87 601. 1083 2772
unpatched audit disabled 1994 0.09 0.25 16.0 18.4 11.8 0.22 2.47 577. 1055 2668
patched audit disabled   1994 0.09 0.23 16.4 18.4 12.9 0.22 2.39 551. 1007 2510
patched audit enabled    1994 0.42 0.60 16.8 19.3 8.61 0.47 2.91 573. 1075 2719
patched audit disabled   1994 0.09 0.30 16.5 18.9 12.6 0.22 2.49 562. 1036 2575
This excerpt of lmbench suggests fork/exec/sh tests, for example,
are sensitive and show effects that are possibly improved by the
TIF_SYSCALL_AUDIT patch (for the sh test, the slowdown left over after
disabling audit drops from ~6.4% to ~2.6%).
However, the full lmbench run generates profile data which suggests the
penalty is in the noise (granted, I've got some debugging enabled that
may hide other subtle effects).
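
As a cross-check on the percentages quoted above, the leftover overhead can be recomputed from the first and third row of each kernel in the lmbench excerpt. This is only a sketch of the arithmetic; the helper names overhead_pct and print_overheads are mine and are not part of lmbench.

```c
#include <stdio.h>

/*
 * Percent slowdown of `after` (audit disabled after having been enabled)
 * relative to `before` (audit never enabled). Hypothetical helper.
 */
double overhead_pct(double before, double after)
{
	return (after - before) / before * 100.0;
}

/* Print the leftover overhead for the fork/exec/sh columns of the table. */
void print_overheads(void)
{
	/* Values copied from the lmbench excerpt, in microseconds. */
	static const struct {
		const char *test;
		double unpatched_before, unpatched_after;
		double patched_before, patched_after;
	} rows[] = {
		{ "fork",  554.0,  577.0,  551.0,  562.0 },
		{ "exec", 1018.0, 1055.0, 1007.0, 1036.0 },
		{ "sh",   2508.0, 2668.0, 2510.0, 2575.0 },
	};
	size_t i;

	for (i = 0; i < sizeof(rows) / sizeof(rows[0]); i++)
		printf("%-4s unpatched %4.1f%%  patched %4.1f%%\n",
		       rows[i].test,
		       overhead_pct(rows[i].unpatched_before,
				    rows[i].unpatched_after),
		       overhead_pct(rows[i].patched_before,
				    rows[i].patched_after));
}
```

Calling print_overheads() from a small main() reproduces the ~6.4% and ~2.6% figures for the sh column; the fork and exec columns show smaller gaps in the same direction.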
samples       % symbol name    | samples       % symbol name    | samples       % symbol name
1513465 22.3901 copy_user_g    | 1338383 19.8526 copy_user_g    | 1336757 20.2344 copy_user_g
1259839 18.6379 check_poiso    | 1171051 17.3705 check_poiso    | 1248209 18.8940 check_poiso
 269874  3.9925 memset         |  323399  4.7971 memset         |  274733  4.1586 memset
 160374  2.3726 clear_page     |  188499  2.7961 sub_preemp     |  165920  2.5115 clear_page
 136696  2.0223 sub_preemp     |  167853  2.4898 clear_page     |  144600  2.1888 sub_preemp
 127345  1.8839 add_preemp     |  155483  2.3063 add_preemp     |  139349  2.1093 add_preemp
 124938  1.8483 schedule       |  116873  1.7336 schedule       |  126961  1.9218 schedule
  87354  1.2923 getrusage      |   87297  1.2949 audit_sysc_exi |   91651  1.3873 find_get_pa
...
  61105  0.9040 try_to_wake    |   61626  0.9141 audit_filt_sys |   61749  0.9347 try_to_wake
...
  41928  0.6203 acpi_proces    |   47810  0.7092 audit_sysc_ent |   44019  0.6663 do_select
...
   7179  0.1062 tcp_select_    |    7749  0.1149 audit_serial   |    7124  0.1078 tcp_v4_do_r
...
      6 8.9e-05 audit_allo     |       7 1.0e-04 __free_pages_  |       7 1.1e-04 sprintf
...
      6 8.9e-05 dx_probe       |       7 1.0e-04 cdrom_decode   |       6 9.1e-05 audit_alloc
...
kernbench
                Unpatched               Patched
audit disabled (never enabled)
        real    5m47.271s       real    5m44.002s
        user    5m1.911s        user    5m3.067s
        sys     0m51.271s       sys     0m50.835s
audit enabled
        real    5m45.999s       real    5m45.162s
        user    5m2.647s        user    5m3.443s
        sys     0m52.499s       sys     0m52.203s
audit disabled (after enabled)
        real    5m46.010s       real    5m48.170s
        user    5m1.371s        user    5m2.247s
        sys     0m51.871s       sys     0m51.719s
The data here show that, within the margin of error, disabling audit
after enabling it has no cost. The profile data similarly shows no effect.
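
To put a number on the kernbench differences, the time(1)-style strings can be converted to seconds and compared. The sketch below assumes the "NmS.SSSs" format shown in the table; parse_elapsed and pct_change are hypothetical helper names, not part of kernbench or the test script.

```c
#include <stdio.h>

/*
 * Convert a time(1)-style string such as "5m47.271s" to seconds.
 * Returns a negative value if the string does not parse.
 */
double parse_elapsed(const char *s)
{
	int minutes;
	double seconds;

	if (sscanf(s, "%dm%lfs", &minutes, &seconds) != 2)
		return -1.0;
	return minutes * 60.0 + seconds;
}

/* Percent change from `base` to `other`; positive means `other` is slower. */
double pct_change(const char *base, const char *other)
{
	double b = parse_elapsed(base);
	double o = parse_elapsed(other);

	return (o - b) / b * 100.0;
}
```

For the real times, pct_change("5m47.271s", "5m46.010s") gives roughly -0.4% on the unpatched kernel and pct_change("5m44.002s", "5m48.170s") roughly +1.2% on the patched one, comparing the never-enabled run against the disabled-after-enabled run.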
Raw data is here:
http://developer.osdl.org/chrisw/audit-perf-test/
http://developer.osdl.org/chrisw/audit-perf-test/lmbench.out (self-explanatory)
http://developer.osdl.org/chrisw/audit-perf-test/perftest-noperf2.out (unpatched)
http://developer.osdl.org/chrisw/audit-perf-test/perftest.ti4140/ (profile data for unpatched run)
http://developer.osdl.org/chrisw/audit-perf-test/perftest-perf2.out (patched)
http://developer.osdl.org/chrisw/audit-perf-test/perftest.Wq4156 (profile data for patched run)
http://developer.osdl.org/chrisw/audit-perf-test/audit-performance.patch (patch)
http://developer.osdl.org/chrisw/audit-perf-test/perftest.sh (script)
19 years, 3 months
[PATCH] kerneldoc for kernel/audit*.c
by randy_dunlap
From: Randy Dunlap <rdunlap(a)xenotime.net>
for kernel/audit*.c:
- add kerneldoc for non-static functions;
- don't init static data to 0;
- limit lines to < 80 columns;
- fix long-format style;
- delete whitespace at end of some lines;
- break a for loop into 2 lines;
Signed-off-by: Randy Dunlap <rdunlap(a)xenotime.net>
diffstat:
kernel/audit.c | 133 ++++++++++++++++++++++++++++++++++++-----------
kernel/auditsc.c | 153 ++++++++++++++++++++++++++++++++++++++++++++++++-------
2 files changed, 236 insertions(+), 50 deletions(-)
diff -Naurp linux-2613-rc1-git5/kernel/audit.c~kdoc_kernel_audit linux-2613-rc1-git5/kernel/audit.c
--- linux-2613-rc1-git5/kernel/audit.c~kdoc_kernel_audit 2005-07-03 20:38:57.000000000 -0700
+++ linux-2613-rc1-git5/kernel/audit.c 2005-07-07 11:46:21.000000000 -0700
@@ -72,7 +72,7 @@ static int audit_failure = AUDIT_FAIL_PR
* contains the (non-zero) pid. */
int audit_pid;
-/* If audit_limit is non-zero, limit the rate of sending audit records
+/* If audit_rate_limit is non-zero, limit the rate of sending audit records
* to that number per second. This prevents DoS attacks, but results in
* audit records being dropped. */
static int audit_rate_limit;
@@ -100,7 +100,7 @@ static struct sock *audit_sock;
* than AUDIT_MAXFREE are in use, the audit buffer is freed instead of
* being placed on the freelist). */
static DEFINE_SPINLOCK(audit_freelist_lock);
-static int audit_freelist_count = 0;
+static int audit_freelist_count;
static LIST_HEAD(audit_freelist);
static struct sk_buff_head audit_skb_queue;
@@ -194,8 +194,14 @@ static inline int audit_rate_check(void)
return retval;
}
-/* Emit at least 1 message per second, even if audit_rate_check is
- * throttling. */
+/**
+ * audit_log_lost - conditionally log an audit message
+ * @message: the message to be logged
+ *
+ * Emit at least 1 message per second, even if audit_rate_check is
+ * throttling.
+ * Always increment the lost messages counter.
+ */
void audit_log_lost(const char *message)
{
static unsigned long last_msg = 0;
@@ -226,7 +232,6 @@ void audit_log_lost(const char *message)
audit_backlog_limit);
audit_panic(message);
}
-
}
static int audit_set_rate_limit(int limit, uid_t loginuid)
@@ -307,6 +312,19 @@ int kauditd_thread(void *dummy)
}
}
+/**
+ * audit_send_reply - send an audit reply message via netlink
+ * @pid: process id of the listener
+ * @seq: sequence number
+ * @type: audit message type
+ * @done: done (last) flag
+ * @multi: multi-part message flag
+ * @payload: payload data
+ * @size: payload size
+ *
+ * Allocates an skb, builds the netlink message, and sends it to the pid.
+ * No failure notifications.
+ */
void audit_send_reply(int pid, int seq, int type, int done, int multi,
void *payload, int size)
{
@@ -381,7 +399,8 @@ static int audit_receive_msg(struct sk_b
if (err)
return err;
- /* As soon as there's any sign of userspace auditd, start kauditd to talk to it */
+ /* As soon as there's any sign of userspace auditd,
+ * start kauditd to talk to it */
if (!kauditd_task)
kauditd_task = kthread_run(kauditd_thread, NULL, "kauditd");
if (IS_ERR(kauditd_task)) {
@@ -468,9 +487,11 @@ static int audit_receive_msg(struct sk_b
return err < 0 ? err : 0;
}
-/* Get message from skb (based on rtnetlink_rcv_skb). Each message is
+/*
+ * Get message from skb (based on rtnetlink_rcv_skb). Each message is
* processed by audit_receive_msg. Malformed skbs with wrong length are
- * discarded silently. */
+ * discarded silently.
+ */
static void audit_receive_skb(struct sk_buff *skb)
{
int err;
@@ -597,7 +618,10 @@ err:
return NULL;
}
-/* Compute a serial number for the audit record. Audit records are
+/**
+ * audit_serial - compute a serial number for the audit record
+ *
+ * Compute a serial number for the audit record. Audit records are
* written to user-space as soon as they are generated, so a complete
* audit record may be written in several pieces. The timestamp of the
* record and this serial number are used by the user-space tools to
@@ -611,7 +635,8 @@ err:
* audit context (for those records that have a context), and emit them
* all at syscall exit. However, this could delay the reporting of
* significant errors until syscall exit (or never, if the system
- * halts). */
+ * halts).
+ */
unsigned int audit_serial(void)
{
static atomic_t serial = ATOMIC_INIT(0xffffff);
@@ -638,12 +663,20 @@ static inline void audit_get_stamp(struc
}
}
-/* Obtain an audit buffer. This routine does locking to obtain the
+/**
+ * audit_log_start - obtain an audit buffer
+ * @ctx: audit_context (may be NULL)
+ * @type: audit netlink message type
+ *
+ * Returns audit_buffer pointer on success or NULL on error.
+ *
+ * Obtain an audit buffer. This routine does locking to obtain the
* audit buffer, but then no locking is required for calls to
- * audit_log_*format. If the tsk is a task that is currently in a
+ * audit_log_*format. If the task (ctx) is a task that is currently in a
* syscall, then the syscall is marked as auditable and an audit record
- * will be written at syscall exit. If there is no associated task, tsk
- * should be NULL. */
+ * will be written at syscall exit. If there is no associated task, then
+ * task context (ctx) should be NULL.
+ */
struct audit_buffer *audit_log_start(struct audit_context *ctx, int type)
{
struct audit_buffer *ab = NULL;
@@ -681,6 +714,7 @@ struct audit_buffer *audit_log_start(str
/**
* audit_expand - expand skb in the audit buffer
* @ab: audit_buffer
+ * @extra: space to add at tail of the skb
*
* Returns 0 (no space) on failed expansion, or available space if
* successful.
@@ -697,10 +731,12 @@ static inline int audit_expand(struct au
return skb_tailroom(skb);
}
-/* Format an audit message into the audit buffer. If there isn't enough
+/*
+ * Format an audit message into the audit buffer. If there isn't enough
* room in the audit buffer, more room will be allocated and vsnprint
* will be called a second time. Currently, we assume that a printk
- * can't format message larger than 1024 bytes, so we don't either. */
+ * can't format message larger than 1024 bytes, so we don't either.
+ */
static void audit_log_vformat(struct audit_buffer *ab, const char *fmt,
va_list args)
{
@@ -725,7 +761,8 @@ static void audit_log_vformat(struct aud
/* The printk buffer is 1024 bytes long, so if we get
* here and AUDIT_BUFSIZ is at least 1024, then we can
* log everything that printk could have logged. */
- avail = audit_expand(ab, max_t(unsigned, AUDIT_BUFSIZ, 1+len-avail));
+ avail = audit_expand(ab,
+ max_t(unsigned, AUDIT_BUFSIZ, 1+len-avail));
if (!avail)
goto out;
len = vsnprintf(skb->tail, avail, fmt, args2);
@@ -736,8 +773,14 @@ out:
return;
}
-/* Format a message into the audit buffer. All the work is done in
- * audit_log_vformat. */
+/**
+ * audit_log_format - format a message into the audit buffer.
+ * @ab: audit_buffer
+ * @fmt: format string
+ * @...: optional parameters matching @fmt string
+ *
+ * All the work is done in audit_log_vformat.
+ */
void audit_log_format(struct audit_buffer *ab, const char *fmt, ...)
{
va_list args;
@@ -749,9 +792,18 @@ void audit_log_format(struct audit_buffe
va_end(args);
}
-/* This function will take the passed buf and convert it into a string of
- * ascii hex digits. The new string is placed onto the skb. */
-void audit_log_hex(struct audit_buffer *ab, const unsigned char *buf,
+/**
+ * audit_log_hex - convert a buffer to hex and append it to the audit skb
+ * @ab: the audit_buffer
+ * @buf: buffer to convert to hex
+ * @len: length of @buf to be converted
+ *
+ * No return value; failure to expand is silently ignored.
+ *
+ * This function will take the passed buf and convert it into a string of
+ * ascii hex digits. The new string is placed onto the skb.
+ */
+void audit_log_hex(struct audit_buffer *ab, const unsigned char *buf,
size_t len)
{
int i, avail, new_len;
@@ -780,10 +832,16 @@ void audit_log_hex(struct audit_buffer *
skb_put(skb, len << 1); /* new string is twice the old string */
}
-/* This code will escape a string that is passed to it if the string
- * contains a control character, unprintable character, double quote mark,
+/**
+ * audit_log_untrustedstring - log a string that may contain random characters
+ * @ab: audit_buffer
+ * @string: string to be logged
+ *
+ * This code will escape a string that is passed to it if the string
+ * contains a control character, unprintable character, double quote mark,
* or a space. Unescaped strings will start and end with a double quote mark.
- * Strings that are escaped are printed in hex (2 digits per char). */
+ * Strings that are escaped are printed in hex (2 digits per char).
+ */
void audit_log_untrustedstring(struct audit_buffer *ab, const char *string)
{
const unsigned char *p = string;
@@ -822,10 +880,15 @@ void audit_log_d_path(struct audit_buffe
kfree(path);
}
-/* The netlink_* functions cannot be called inside an irq context, so
- * the audit buffer is places on a queue and a tasklet is scheduled to
+/**
+ * audit_log_end - end one audit record
+ * @ab: the audit_buffer
+ *
+ * The netlink_* functions cannot be called inside an irq context, so
+ * the audit buffer is placed on a queue and a tasklet is scheduled to
* remove them from the queue outside the irq context. May be called in
- * any context. */
+ * any context.
+ */
void audit_log_end(struct audit_buffer *ab)
{
if (!ab)
@@ -846,9 +909,17 @@ void audit_log_end(struct audit_buffer *
audit_buffer_free(ab);
}
-/* Log an audit record. This is a convenience function that calls
- * audit_log_start, audit_log_vformat, and audit_log_end. It may be
- * called in any context. */
+/**
+ * audit_log - log an audit record
+ * @ctx: the audit_context
+ * @type: the audit message type
+ * @fmt: format string to use
+ * @...: variable parameters matching the format string
+ *
+ * This is a convenience function that calls audit_log_start,
+ * audit_log_vformat, and audit_log_end. It may be
+ * called in any context.
+ */
void audit_log(struct audit_context *ctx, int type, const char *fmt, ...)
{
struct audit_buffer *ab;
diff -Naurp linux-2613-rc1-git5/kernel/auditsc.c~kdoc_kernel_audit linux-2613-rc1-git5/kernel/auditsc.c
--- linux-2613-rc1-git5/kernel/auditsc.c~kdoc_kernel_audit 2005-07-03 20:38:57.000000000 -0700
+++ linux-2613-rc1-git5/kernel/auditsc.c 2005-07-07 10:57:44.000000000 -0700
@@ -268,10 +268,21 @@ static int audit_copy_rule(struct audit_
d->fields[i] = s->fields[i];
d->values[i] = s->values[i];
}
- for (i = 0; i < AUDIT_BITMASK_SIZE; i++) d->mask[i] = s->mask[i];
+ for (i = 0; i < AUDIT_BITMASK_SIZE; i++)
+ d->mask[i] = s->mask[i];
return 0;
}
+/**
+ * audit_receive_filter - apply all rules to the specified message type
+ * @type: audit message type
+ * @pid: target pid for netlink audit messages
+ * @uid: target uid for netlink audit messages
+ * @seq: netlink audit message sequence (serial) number
+ * @data: payload data
+ * @loginuid: loginuid of sender
+ *
+ */
int audit_receive_filter(int type, int pid, int uid, int seq, void *data,
uid_t loginuid)
{
@@ -467,7 +478,7 @@ static enum audit_state audit_filter_tas
/* At syscall entry and exit time, this filter is called if the
* audit_state is not low enough that auditing cannot take place, but is
* also not high enough that we already know we have to write an audit
- * record (i.e., the state is AUDIT_SETUP_CONTEXT or AUDIT_BUILD_CONTEXT).
+ * record (i.e., the state is AUDIT_SETUP_CONTEXT or AUDIT_BUILD_CONTEXT).
*/
static enum audit_state audit_filter_syscall(struct task_struct *tsk,
struct audit_context *ctx,
@@ -597,10 +608,15 @@ static inline struct audit_context *audi
return context;
}
-/* Filter on the task information and allocate a per-task audit context
+/**
+ * audit_alloc - allocate an audit context block for a task
+ * @tsk: task
+ *
+ * Filter on the task information and allocate a per-task audit context
* if necessary. Doing so turns on system call auditing for the
* specified task. This is called from copy_process, so no lock is
- * needed. */
+ * needed.
+ */
int audit_alloc(struct task_struct *tsk)
{
struct audit_context *context;
@@ -785,8 +801,12 @@ static void audit_log_exit(struct audit_
}
}
-/* Free a per-task audit context. Called from copy_process and
- * __put_task_struct. */
+/**
+ * audit_free - free a per-task audit context
+ * @tsk: task whose audit context block to free
+ *
+ * Called from copy_process and __put_task_struct.
+ */
void audit_free(struct task_struct *tsk)
{
struct audit_context *context;
@@ -806,13 +826,24 @@ void audit_free(struct task_struct *tsk)
audit_free_context(context);
}
-/* Fill in audit context at syscall entry. This only happens if the
+/**
+ * audit_syscall_entry - fill in an audit record at syscall entry
+ * @tsk: task being audited
+ * @arch: architecture type
+ * @major: major syscall type (function)
+ * @a1: additional syscall register 1
+ * @a2: additional syscall register 2
+ * @a3: additional syscall register 3
+ * @a4: additional syscall register 4
+ *
+ * Fill in audit context at syscall entry. This only happens if the
* audit context was created when the task was created and the state or
* filters demand the audit context be built. If the state from the
* per-task filter or from the per-syscall filter is AUDIT_RECORD_CONTEXT,
* then the record will be written at syscall exit time (otherwise, it
* will only be written if another part of the kernel requests that it
- * be written). */
+ * be written).
+ */
void audit_syscall_entry(struct task_struct *tsk, int arch, int major,
unsigned long a1, unsigned long a2,
unsigned long a3, unsigned long a4)
@@ -822,7 +853,8 @@ void audit_syscall_entry(struct task_str
BUG_ON(!context);
- /* This happens only on certain architectures that make system
+ /*
+ * This happens only on certain architectures that make system
* calls in kernel_thread via the entry.S interface, instead of
* with direct calls. (If you are porting to a new
* architecture, hitting this condition can indicate that you
@@ -886,11 +918,18 @@ void audit_syscall_entry(struct task_str
context->auditable = !!(state == AUDIT_RECORD_CONTEXT);
}
-/* Tear down after system call. If the audit context has been marked as
+/**
+ * audit_syscall_exit - deallocate audit context after a system call
+ * @tsk: task being audited
+ * @valid: success/failure flag
+ * @return_code: syscall return value
+ *
+ * Tear down after system call. If the audit context has been marked as
* auditable (either because of the AUDIT_RECORD_CONTEXT state from
* filtering, or because some other part of the kernel write an audit
* message), then write out the syscall information. In call cases,
- * free the names stored from getname(). */
+ * free the names stored from getname().
+ */
void audit_syscall_exit(struct task_struct *tsk, int valid, long return_code)
{
struct audit_context *context;
@@ -925,7 +964,13 @@ void audit_syscall_exit(struct task_stru
put_task_struct(tsk);
}
-/* Add a name to the list. Called from fs/namei.c:getname(). */
+/**
+ * audit_getname - add a name to the list
+ * @name: name to add
+ *
+ * Add a name to the list of audit names for this context.
+ * Called from fs/namei.c:getname().
+ */
void audit_getname(const char *name)
{
struct audit_context *context = current->audit_context;
@@ -954,10 +999,13 @@ void audit_getname(const char *name)
}
-/* Intercept a putname request. Called from
- * include/linux/fs.h:putname(). If we have stored the name from
- * getname in the audit context, then we delay the putname until syscall
- * exit. */
+/* audit_putname - intercept a putname request
+ * @name: name to intercept and delay for putname
+ *
+ * If we have stored the name from getname in the audit context,
+ * then we delay the putname until syscall exit.
+ * Called from include/linux/fs.h:putname().
+ */
void audit_putname(const char *name)
{
struct audit_context *context = current->audit_context;
@@ -994,8 +1042,13 @@ void audit_putname(const char *name)
#endif
}
-/* Store the inode and device from a lookup. Called from
- * fs/namei.c:path_lookup(). */
+/**
+ * audit_inode - store the inode and device from a lookup
+ * @name: name being audited
+ * @inode: inode being audited
+ *
+ * Called from fs/namei.c:path_lookup().
+ */
void audit_inode(const char *name, const struct inode *inode)
{
int idx;
@@ -1030,6 +1083,14 @@ void audit_inode(const char *name, const
context->names[idx].rdev = inode->i_rdev;
}
+/**
+ * auditsc_get_stamp - get local copies of audit_context values
+ * @ctx: audit_context for the task
+ * @t: timespec to store time recorded in the audit_context
+ * @serial: serial value that is recorded in the audit_context
+ *
+ * Also sets the context as auditable.
+ */
void auditsc_get_stamp(struct audit_context *ctx,
struct timespec *t, unsigned int *serial)
{
@@ -1039,6 +1100,15 @@ void auditsc_get_stamp(struct audit_cont
ctx->auditable = 1;
}
+/**
+ * audit_set_loginuid - set a task's audit_context loginuid
+ * @task: task whose audit context is being modified
+ * @loginuid: loginuid value
+ *
+ * Returns 0.
+ *
+ * Called (set) from fs/proc/base.c::proc_loginuid_write().
+ */
int audit_set_loginuid(struct task_struct *task, uid_t loginuid)
{
if (task->audit_context) {
@@ -1057,11 +1127,26 @@ int audit_set_loginuid(struct task_struc
return 0;
}
+/**
+ * audit_get_loginuid - get the loginuid for an audit_context
+ * @ctx: the audit_context
+ *
+ * Returns the context's loginuid or -1 if @ctx is NULL.
+ */
uid_t audit_get_loginuid(struct audit_context *ctx)
{
return ctx ? ctx->loginuid : -1;
}
+/**
+ * audit_ipc_perms - record audit data for ipc
+ * @qbytes: msgq bytes
+ * @uid: msgq user id
+ * @gid: msgq group id
+ * @mode: msgq mode (permissions)
+ *
+ * Returns 0 for success or NULL context or < 0 on error.
+ */
int audit_ipc_perms(unsigned long qbytes, uid_t uid, gid_t gid, mode_t mode)
{
struct audit_aux_data_ipcctl *ax;
@@ -1085,6 +1170,13 @@ int audit_ipc_perms(unsigned long qbytes
return 0;
}
+/**
+ * audit_socketcall - record audit data for sys_socketcall
+ * @nargs: number of args
+ * @args: args array
+ *
+ * Returns 0 for success or NULL context or < 0 on error.
+ */
int audit_socketcall(int nargs, unsigned long *args)
{
struct audit_aux_data_socketcall *ax;
@@ -1106,6 +1198,13 @@ int audit_socketcall(int nargs, unsigned
return 0;
}
+/**
+ * audit_sockaddr - record audit data for sys_bind, sys_connect, sys_sendto
+ * @len: data length in user space
+ * @a: data address in kernel space
+ *
+ * Returns 0 for success or NULL context or < 0 on error.
+ */
int audit_sockaddr(int len, void *a)
{
struct audit_aux_data_sockaddr *ax;
@@ -1127,6 +1226,15 @@ int audit_sockaddr(int len, void *a)
return 0;
}
+/**
+ * audit_avc_path - record the granting or denial of permissions
+ * @dentry: dentry to record
+ * @mnt: mnt to record
+ *
+ * Returns 0 for success or NULL context or < 0 on error.
+ *
+ * Called from security/selinux/avc.c::avc_audit()
+ */
int audit_avc_path(struct dentry *dentry, struct vfsmount *mnt)
{
struct audit_aux_data_path *ax;
@@ -1148,6 +1256,14 @@ int audit_avc_path(struct dentry *dentry
return 0;
}
+/**
+ * audit_signal_info - record signal info for shutting down audit subsystem
+ * @sig: signal value
+ * @t: task being signaled
+ *
+ * If the audit subsystem is being terminated, record the task (pid)
+ * and uid that is doing that.
+ */
void audit_signal_info(int sig, struct task_struct *t)
{
extern pid_t audit_sig_pid;
@@ -1164,4 +1280,3 @@ void audit_signal_info(int sig, struct t
}
}
}
-
---
19 years, 3 months
VFS hooks analysis (pass 1)
by Amy Griffis
Hello,
I've been investigating audit's needs for VFS hook placement, to
determine whether audit would require additional events or additional
information about events, other than what is currently provided by
Inotify.
I've written up my findings in hopes that some of you who may be more
familiar with the situation than I can correct my mistakes, and that
we can all be on the same page with regard to audit's needs.
At this point, I haven't addressed the issue of race conditions. For
the purpose of the first pass, I make the assumption that the current
fsnotify hook placement is sufficient. I'll address the possibility
of race conditions in a second pass.
Background:
The purpose of the VFS hooks is to capture information about an
object's identity. In this implementation, identity means inode and
name. Identity information is needed for filtering and for log record
detail.
To narrow the field, I limited the set of syscalls to those which are
defined by CAPP as security-relevant and are also filesystem-related.
I believe this is the correct list:
sys_access
sys_chdir
sys_chmod
sys_chown
sys_creat
sys_execve
sys_fchmod
sys_fchown
sys_fremovexattr
sys_fsetxattr
sys_lchown
sys_link
sys_lremovexattr
sys_lsetxattr
sys_mkdir
sys_mknod
sys_mount
sys_open
sys_removexattr
sys_rename
sys_rmdir
sys_setxattr
sys_swapon
sys_symlink
sys_truncate
sys_unlink
sys_utime(s)
Available Object Identity Info:
The upstream audit code uses getname() and path_lookup() hooks to
collect object identity information during syscall processing. This
is sufficient for the following syscalls:
sys_access
sys_chdir
sys_chmod
sys_chown
sys_execve
sys_lchown
sys_link
sys_lremovexattr
sys_lsetxattr
sys_removexattr
sys_setxattr
sys_swapon
sys_truncate
sys_utime(s)
Lacking Object Identity Info:
The information audit needs isn't available when path_lookup is called
with LOOKUP_PARENT, or when getname/path_lookup are not called at all.
In these situations, audit needs more VFS hooks to get the necessary
information.
Attribute Changes:
The getname/path_lookup hooks are never called for these syscalls:
sys_fchmod
sys_fchown
sys_fremovexattr
sys_fsetxattr
The fsnotify_change() and fsnotify_xattr() hooks capture the relevant
dentry, so the dentry's name & inode could be passed to the event
callback.
Mounts:
For sys_mount, audit does not currently capture the pathname for the
device. This is because getname is not called before path_lookup. We
could modify audit_inode() to save the name from path_lookup for
"nameless" inodes. Note the FIXME in audit_inode().
The rest of the syscalls are mostly lacking information due to
path_lookup being called with LOOKUP_PARENT.
Created Objects:
sys_creat
sys_mkdir
sys_mknod
sys_open
The fsnotify_create() and fsnotify_mkdir() hooks have the same
placement as audit_notify_watch(), but audit needs the dentry's inode.
The fsnotify hooks could be modified to capture the dentry, instead of
just the dentry->d_name.name.
Removed Objects:
sys_rmdir
sys_unlink
The fsnotify_inoderemove() hook provides the inode, but see note below
on capturing attempted removals.
Symlinks:
sys_symlink
The inodes for the target path and symlink are not available from the
getname/path_lookup hooks. The PATH records do not indicate which
name is the symlink. Maybe this isn't necessary since you can tell
from the syscall args.
As previously mentioned, the fsnotify_create() hook could be modified
to capture the dentry instead of the dentry->d_name.name. This would
provide audit with the symlink's inode.
I haven't found a good way to capture the target inode. We don't do
it in the current implementation, which means we don't log events when
someone makes a symlink to a watched inode or pathname. It seems like
we should, though.
Renames:
sys_rename
The inode for the relevant object is not available from the
getname/path_lookup hooks. There is also no way to indicate the old
name versus the new name, other than by the syscall args.
The fsnotify_move() hook could be modified to capture the old and new
dentries, instead of just the names, which would provide audit with the
inode (twice). Inotify provides separate to/from events.
Capturing Unsuccessful Attempts:
CAPP dictates that audit must capture unsuccessful attempts from open,
rename, truncate, and unlink. The open and truncate calls are a
non-issue: the inode is obtained from the path_lookup hook.
The rename and unlink calls present the problem. Since the inode
wouldn't be captured unless the rename/remove was successful, an
inode-based filter wouldn't catch unsuccessful attempts.
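
The gap can be pictured with a small user-space model. Everything here is hypothetical (inode_stub, may_delete_stub, unlink_stub and the counters are stand-ins, not kernel interfaces); it only contrasts a hook that runs after a successful removal with one placed in the permission check.

```c
/* Stand-in for an inode: a number plus whether removal would be allowed. */
struct inode_stub {
	unsigned long ino;
	int removal_allowed;
};

unsigned long watched_ino;	/* inode an inode-based filter is watching */
int records_logged;		/* audit records the filter has produced */

/* Inode-based filter: log only when the watched inode is involved. */
static void audit_match(const struct inode_stub *inode)
{
	if (inode->ino == watched_ino)
		records_logged++;
}

/* Permission check; optionally hosts the audit hook. */
static int may_delete_stub(const struct inode_stub *inode, int hook_in_check)
{
	if (hook_in_check)
		audit_match(inode);	/* runs for failed attempts too */
	return inode->removal_allowed ? 0 : -1;
}

/* Model of an unlink: check permission, then notify on success only. */
int unlink_stub(const struct inode_stub *inode, int hook_in_check)
{
	int err = may_delete_stub(inode, hook_in_check);

	if (err)
		return err;		/* failure: success hook never runs */
	if (!hook_in_check)
		audit_match(inode);	/* post-success notification */
	return 0;
}
```

With watched_ino set to the stub's inode number and removal_allowed set to 0, unlink_stub(&inode, 0) fails without logging anything, while unlink_stub(&inode, 1) fails and still produces a record.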
Summary:
To summarize, I haven't found a situation requiring the current
permission() and exec_permission_lite() hooks.
The issues seem to be with symlinks and logging unsuccessful
rename/remove attempts.
I can think of three possible solutions for the latter:
(1) Require filters based on filenames to capture unsuccessful
attempts. This seems undesirable.
(2) Have an audit-specific hook in may_delete(), as is currently
done, but have it tie in to inode-based filters as well.
(3) Request an additional Inotify event that would not be included
in IN_ALL_EVENTS, so wouldn't impact Inotify's other users.
I think 2 or 3 are our best options, with the answer likely depending
on what Inotify is willing to add. It is probably doable to add
audit-specific hooks if they are few and if Inotify isn't interested.
I do not yet have a solution for capturing the target inode from a
symlink operation. However since it's not currently done, maybe it
isn't a hard requirement?
I'd appreciate any comments on my analysis.
Thanks,
Amy
19 years, 3 months
New development
by Steve Grubb
Hello,
I am in the process of reviewing the requirements for the next round of
development for the audit system. I've worked out a rough schedule for the
user space side of things. I will produce more documentation over the next
couple of days describing what is needed and what would be nice to add. I
would like for this to be an open discussion among all parties as this
affects the whole linux community.
The rough schedule for the next series goes something like this:
1.1 -> 1.2 event dispatcher, plugin framework, and some basic plugins
1.2 -> 1.3 label support + more plugins
1.3 -> 1.4 add new config options, summary reports, binary format
1.4 -> 1.5 audit explorer & gui config
There are several reasons for doing plugins first: partly the limited time
of the people working on it, and partly to give file system auditing a chance
to get upstream. This way we are working in parallel.
If you have ideas about nice things to add, let's start the discussion. We
don't need to talk about LSPP as that will be by-the-book. (I want that
discussion to be its own thread, but not yet. This is just pie in the sky
planning.) I'm looking for usability and neat to have items.
Another thing I'd like to point out is that the plugin architecture will let
us eventually layer an IDS on top of the audit system. This is a long range
goal that will take some time to get to.
-Steve
19 years, 3 months
audit_receive_skb
by Steve Grubb
Hi,
I was looking through the source to the .88 kernel and ran across this:
static int audit_receive_skb(struct sk_buff *skb)
{
int err;
struct nlmsghdr *nlh;
u32 rlen;
while (skb->len >= NLMSG_SPACE(0)) {
nlh = (struct nlmsghdr *)skb->data;
if (nlh->nlmsg_len < sizeof(*nlh) || skb->len <
nlh->nlmsg_len)
return 0;
rlen = NLMSG_ALIGN(nlh->nlmsg_len);
if (rlen > skb->len)
rlen = skb->len;
if ((err = audit_receive_msg(skb, nlh))) {
netlink_ack(skb, nlh, err);
} else if (nlh->nlmsg_flags & NLM_F_ACK)
netlink_ack(skb, nlh, 0);
skb_pull(skb, rlen);
}
return 0;
}
It only returns 0. Is this a mistake or should this be made void? The reason I
ask is that the return code is used like this:
if (audit_receive_skb(skb) && skb->len)
skb_queue_head(&sk->sk_receive_queue, skb);
else
kfree_skb(skb);
The way the code is, we will never put the skb back on the queue head. Should
this be refactored or do we have a problem in the .88 kernel?
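
For discussion, here is a user-space model of the same walk over a buffer of length-prefixed messages. It is not the kernel code: struct msg_hdr, MSG_ALIGN and walk_messages are stand-ins I made up for struct nlmsghdr, NLMSG_ALIGN and audit_receive_skb. It shows that the loop has no failure it needs to report, which is why a void return (with the caller always freeing the skb) looks like the natural refactoring.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for struct nlmsghdr; only the length field matters here. */
struct msg_hdr {
	uint32_t len;	/* total message length, header included */
	uint32_t type;
};

/* Round up to a 4-byte boundary, in the spirit of NLMSG_ALIGN. */
#define MSG_ALIGN(n)	(((n) + 3u) & ~3u)

/*
 * Walk every message in buf[0..buflen) and return how many were handled.
 * A malformed length stops the walk silently; no error reaches the caller.
 */
size_t walk_messages(const unsigned char *buf, size_t buflen)
{
	size_t handled = 0;

	while (buflen >= sizeof(struct msg_hdr)) {
		struct msg_hdr hdr;
		size_t rlen;

		memcpy(&hdr, buf, sizeof(hdr));	/* avoid unaligned reads */
		if (hdr.len < sizeof(hdr) || buflen < hdr.len)
			break;			/* malformed: give up */

		rlen = MSG_ALIGN(hdr.len);
		if (rlen > buflen)
			rlen = buflen;		/* last message, no padding */

		handled++;			/* a real handler runs here */
		buf += rlen;			/* equivalent of skb_pull() */
		buflen -= rlen;
	}
	return handled;
}
```

The count returned here exists only so the sketch can be checked; nothing in the loop produces a status that a caller could act on.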
-Steve
19 years, 3 months
Possible performance bug
by Steve Grubb
Hi,
I was looking at the case where a user boots up with the audit daemon installed.
It turns on auditing. This means that all processes that fork will start
getting a context built. Then the user decides to do a benchmark and turns
the audit system off with auditctl -e 0.
The system doesn't really get performance back as if auditing was never turned
on. If you look at audit_syscall_exit, there is this check:
    if (likely(!context))
        goto out;
Don't all the running processes still have a context? Shouldn't this also have
a check so that, if audit_enabled == 0, the context is reclaimed and context
is set to NULL? What reaps the context for these processes? They all still
seem to be penalized.
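To make the question concrete, here is a minimal userspace sketch of the check being asked about. It is not kernel code: struct task_model, struct audit_context_model, audit_enabled_model and both functions are hypothetical stand-ins, and the real exit path would also have to deal with locking and any partially collected record.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
struct audit_context_model {
    int in_syscall;
};

struct task_model {
    struct audit_context_model *audit_context;
};

/* Stand-in for the kernel's global audit_enabled flag. */
int audit_enabled_model = 1;

/* Fork-time hook: a context is only allocated while auditing is on. */
int model_audit_alloc(struct task_model *tsk)
{
    assert(tsk != NULL);
    tsk->audit_context = NULL;
    if (!audit_enabled_model)
        return 0;
    tsk->audit_context = calloc(1, sizeof(*tsk->audit_context));
    return tsk->audit_context ? 0 : -1;
}

/*
 * Syscall-exit hook with the extra check suggested above.
 * Returns 1 if the slow (record-building) path ran, 0 for the fast path.
 */
int model_syscall_exit(struct task_model *tsk)
{
    struct audit_context_model *context = tsk->audit_context;

    if (!context)
        return 0;                   /* never audited: fast path */

    if (!audit_enabled_model) {
        /* Auditing was switched off later: reap the stale context. */
        free(context);
        tsk->audit_context = NULL;
        return 0;
    }

    context->in_syscall = 0;        /* placeholder for the real exit work */
    return 1;
}
```

With this shape, a task that was forked while auditing was on pays for one more slow exit after auditctl -e 0 and then takes the same fast path as a task that never had a context.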
-Steve
19 years, 3 months
Audit Dispatcher Design
by Steve Grubb
Hello,
I am attaching an Open Office presentation that contains the slides for the
audit dispatcher preliminary design review. The audit dispatcher will be
implemented using C++ to provide some organization and abstraction for some
of the design elements.
The audit dispatcher will be configured by a file /etc/audisp.conf that will
instruct it on how to configure the input plugins and the output filter
plugin. Some plugins will be active - meaning that they have their own thread
of execution. Others will be passive and use the caller's thread.
The Filter plugin is a Composite of two classes - The filter and an output.
The filter part does the data transformation or filtering. The output plugin
takes the data passed to it and outputs it. The plugin class is a wrapper for
a shared object file that gets loaded and unloaded.
Events will be gathered by input plugins and placed into the application's
event queue. Filter plugins will have previously registered for callbacks for
new events. They will all receive the event and begin processing it. When and
if the event needs to be output, the filter will call its output plugin.
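The register-then-dispatch flow described above can be sketched roughly as follows. This is only an illustration written in plain C; the design itself calls for C++ classes, and every name here (struct event, struct dispatcher, dispatcher_register, dispatcher_dispatch, the example filter and output) is hypothetical. The event queue, threading and shared-object loading are left out, and the 1100 to 1199 range used by the example filter is an assumption.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_FILTERS 8

/* Hypothetical event record handed from input plugins to filters. */
struct event {
    int type;
    const char *text;
};

/* A filter decides whether an event should be output (nonzero = yes). */
typedef int (*filter_fn)(const struct event *ev);
/* An output plugin consumes events that its filter let through. */
typedef void (*output_fn)(const struct event *ev, void *state);

/* Composite of a filter and the output it feeds. */
struct filter_plugin {
    filter_fn filter;
    output_fn output;
    void *state;
};

struct dispatcher {
    struct filter_plugin filters[MAX_FILTERS];
    size_t nfilters;
};

/* Register a filter/output pair for callbacks on new events. */
int dispatcher_register(struct dispatcher *d, filter_fn f, output_fn o,
                        void *state)
{
    assert(d != NULL && f != NULL && o != NULL);
    if (d->nfilters >= MAX_FILTERS)
        return -1;
    d->filters[d->nfilters].filter = f;
    d->filters[d->nfilters].output = o;
    d->filters[d->nfilters].state = state;
    d->nfilters++;
    return 0;
}

/* Hand one event to every registered filter; returns outputs invoked. */
size_t dispatcher_dispatch(const struct dispatcher *d, const struct event *ev)
{
    size_t i, outputs = 0;

    for (i = 0; i < d->nfilters; i++) {
        const struct filter_plugin *p = &d->filters[i];

        if (p->filter(ev)) {
            p->output(ev, p->state);
            outputs++;
        }
    }
    return outputs;
}

/* Example filter: pass only types in an assumed user-space range. */
int filter_user_events(const struct event *ev)
{
    return ev->type >= 1100 && ev->type <= 1199;
}

/* Example output: just count the events it receives. */
void output_count(const struct event *ev, void *state)
{
    (void)ev;
    (*(int *)state)++;
}
```

All registration happens up front and nothing is allocated per event, which is in the spirit of restricting object creation to startup and shutdown.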
The audisp daemon will receive a reconfigure event whenever SIGHUP is sent to
the audit daemon. It will re-read its config and remove, add, or modify
plugins on the fly.
There are some rules regarding the implementation in C++. The ground rules
are: No dynamic class creation or deletion except at startup/shutdown; No
exceptions; and No templates.
This is a preliminary design. If there are any concerns, comments,
suggestions, please follow up on this. This was modeled with Umbrello - which
is part of Kdesdk. The PDR model will be placed on
people.redhat.com/~sgrubb/audit.
Thanks,
-Steve Grubb
19 years, 3 months
Performance issues
by Steve Grubb
Hello,
Attached is a patch against .88 kernel that should help performance. It
removes unconditional loads and changes a couple switch/case constructs to a
lookup & assignment.
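For readers who have not seen this kind of change, here is a minimal sketch of turning a switch/case into a lookup and assignment. The enum names and values are illustrative only and are not taken from the attached patch.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative action and state codes; not the kernel's definitions. */
enum action { ACTION_NEVER = 0, ACTION_POSSIBLE = 1, ACTION_ALWAYS = 2 };
enum state { STATE_DISABLED, STATE_BUILD_CONTEXT, STATE_RECORD_CONTEXT };

/* Before: one compare-and-branch chain (or jump table) per call. */
enum state state_from_switch(enum action a)
{
    switch (a) {
    case ACTION_NEVER:
        return STATE_DISABLED;
    case ACTION_POSSIBLE:
        return STATE_BUILD_CONTEXT;
    case ACTION_ALWAYS:
        return STATE_RECORD_CONTEXT;
    default:
        return STATE_DISABLED;
    }
}

/* After: a constant table indexed by the action code. */
static const enum state action_to_state[] = {
    [ACTION_NEVER]    = STATE_DISABLED,
    [ACTION_POSSIBLE] = STATE_BUILD_CONTEXT,
    [ACTION_ALWAYS]   = STATE_RECORD_CONTEXT,
};

enum state state_from_table(enum action a)
{
    size_t idx = (size_t)a;

    /* Bounds check keeps unknown codes on the same default as the switch. */
    if (idx >= sizeof(action_to_state) / sizeof(action_to_state[0]))
        return STATE_DISABLED;
    return action_to_state[idx];
}
```

Whether the table actually wins depends on the compiler and on how dense the case values are, so it is worth measuring rather than assuming.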
There are other performance issues that need checking. In calls that hook the
file system, it is probably better to check that audit is enabled at the
point of the hook rather than in the function:
+++ linux-2.6.9~pre75/fs/attr.c
@@ -68,6 +69,8 @@
     unsigned int ia_valid = attr->ia_valid;
     int error = 0;
+    audit_notify_watch(inode, MAY_WRITE);
+
     if (ia_valid & ATTR_SIZE) {
         if (attr->ia_size != i_size_read(inode)) {
             error = vmtruncate(inode, attr->ia_size);
+void audit_notify_watch(struct inode *inode, int mask)
+{
+    struct audit_inode_data *data;
+
+    if (likely(!audit_enabled))
+        return;
This means that the arguments have to be pushed onto the stack, the call
made, the enabled test performed, a return executed, and the stack popped,
even when auditing is off. It's probably faster to do:
+    if (unlikely(audit_enabled))
+        audit_notify_watch(inode, MAY_WRITE);
+
Same thing with audit_syscall_entry & audit_syscall_exit.
NOTE: I am in no way advocating rolling a .89 release just for this.
-Steve
19 years, 3 months
[RFC][PATCH] Inotify kernel API
by Amy Griffis
Attached is a patch (against Linus' git tree) that implements a basic
kernel API for inotify. Here is a description of the patch:
The Inotify kernel API provides these functions:
inotify_init - initialize an inotify instance
inotify_add_watch - add a watch on an inode
inotify_ignore - remove a watch on an inode
inotify_free - destroy an inotify instance
The kernel API differs from the userspace API in the following ways:
- Instead of a file descriptor, inotify_init() returns a pointer to
struct inotify_dev, which is an incomplete type to kernel consumers.
- The consumer provides a callback to inotify_init(), which is used
for filesystem event notification instead of the kevents used for
userspace.
- Watches are added on inodes rather than paths.
- inotify_add_watch() takes a callback argument, which is used to
provide the consumer with a quick-access method back to its own data
structure, rather than needing to hash the watch descriptor or walk
a list.
- The path is given to the event callback as an additional argument
rather than being appended to the inotify_event structure;
inotify_event.len is unused.
- User-based limits on number of watches, etc. are ignored.
Here is a list of other things I've been working on that are not
included in this patch:
- Adding inode information to the event callback.
- Allowing for adding/removing inotify watches from an event callback.
I've also sketched out some data structures and written some prototype
audit code that makes use of this patch.
Please take a look and let me know what you think!
Regards,
Amy
19 years, 3 months
Re: [Fwd: Re: audit]
by Steve Grubb
Hello,
I created the audit patch. I'll see if I can address some of these questions.
On Tuesday 06 September 2005 05:58, Peter Vrabec wrote:
> I'm looking again and again at the auditing changes and I discovered some
> IMO strange things in these changes. I'm just starting to read the
> documentation for auditing support so please correct me if I'm wrong.
>
> [src]$ grep AUDIT_ *c | awk '{ print $3}' | sort | uniq -c
> 129 (AUDIT_USER_CHAUTHTOK,
>
> Hmm .. *all places* where logging auditing records are injected are
> reported as AUDIT_USER_CHAUTHTOK .. even from error handling (?!?). Is it
> really correct?
Yes. The audit system right now is trying to standardize the messages. Some
may not be an exact fit and may change if needed. Because pam is so tightly
tied to authentication of users and allowing the change of account
attributes, the naming is pamish.
We are following CAPP guidelines which basically state that an audit event
should show: who made the changes, to what account, when, what the operation
was, and the outcome. So for the most part, nearly everything shadow utils
does is changing the authentication tokens in pam terminology. The outcome of
the operation is stated as failure in the event of error handling. But the
admin needs to know what was attempted.
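As a rough illustration of those five pieces of information, here is a small sketch that carries them in one record and renders them as a short text line. The struct, the function name and the key=value layout are all assumptions made for this example; they are not the libaudit API or the actual record format.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Hypothetical record holding the fields CAPP asks for. */
struct acct_event {
    unsigned int actor_uid;  /* who made the change */
    const char *account;     /* to what account (NULL if unknown) */
    time_t when;             /* when */
    const char *op;          /* what the operation was */
    int success;             /* the outcome */
};

/*
 * Render the event into buf. Returns the number of characters written,
 * or -1 if the buffer was too small or formatting failed.
 */
int format_acct_event(const struct acct_event *ev, char *buf, size_t len)
{
    int n;

    assert(ev != NULL && buf != NULL);
    n = snprintf(buf, len, "time=%ld uid=%u acct=%s op=\"%s\" res=%s",
                 (long)ev->when, ev->actor_uid,
                 ev->account ? ev->account : "?",
                 ev->op, ev->success ? "success" : "failed");
    if (n < 0 || (size_t)n >= len)
        return -1;
    return n;
}
```

A failed attempt, such as the permission-denied case, keeps the same operation text and differs only in the outcome field, which is how the admin can see what was attempted.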
> First from edge .. chage.c:
>
> if (!amroot && !lflg) {
>     fprintf (stderr, _("%s: Permission denied.\n"), Prog);
> #ifdef WITH_AUDIT
>     audit_logger (AUDIT_USER_CHAUTHTOK, Prog, "change age",
>                   NULL, getuid (), 0);
> #endif
>     exit (E_NOPERM);
> }
>
> Here the audit comment is "change age", as in the case of changing the user
> account age, but this is an *error* report and the change is *not* performed.
> Many other places where audit_logger() was injected are very similar.
What would be a better description of the operation? We cannot get too
descriptive as the shadow utils patch has about 325 messages added for
auditing. I also need the text to be short as each audit message consumes
disk space. So we are trying to be sensitive to that as well.
> From libaudit.h:
>
> #define AUDIT_USER_AUTH      1100 /* User space authentication */
> #define AUDIT_USER_ACCT      1101 /* User space acct change */
> #define AUDIT_USER_MGMT      1102 /* User space acct management */
> #define AUDIT_CRED_ACQ       1103 /* User space credential acquired */
> #define AUDIT_CRED_DISP      1104 /* User space credential disposed */
> #define AUDIT_USER_START     1105 /* User space session start */
> #define AUDIT_USER_END       1106 /* User space session end */
> #define AUDIT_USER_AVC       1107 /* User space avc message */
> #define AUDIT_USER_CHAUTHTOK 1108 /* User space acct attr changed */
> #define AUDIT_USER_ERR       1109 /* User space acct state err */
> #define AUDIT_CRED_REFR      1110 /* User space credential refreshed */
> #define AUDIT_USYS_CONFIG    1111 /* User space system config change */
>
> At first look at this list, logging all auditing records as
> AUDIT_USER_CHAUTHTOK is incorrect.
Remember this is pamish. We may need a new message type for adding and
deleting a user account or group. That makes more sense to me.
> Probably using "useradd -D <other_options>" would be good to report as
> AUDIT_USYS_CONFIG (?).
This is for changes to the system config like hwclock that are mandated by the
CAPP specification.
> Successful changes to account properties as
> AUDIT_USER_ACCT (what about changing group properties?).
I didn't see any properties other than adding a user to a group. This should
be recorded from the user's perspective as changes to the account.
> Probably start/stop of su, login, newgrp sessions would be good to mark as
> AUDIT_USER_START/AUDIT_USER_END (?).
Yes. I don't think newgrp has session start/end, but it probably should.
> After spending more time there will probably be many more questions like the above.
Please cc me on these questions as I can help explain what was done. There is
also an audit mail list just in case you are interested.
www.redhat.com/mailman/listinfo/linux-audit. I'm cc'ing this to that mail
list since it looks like I may have a few action items.
Hope this helps...
-Steve
19 years, 3 months