On Fri, Mar 10, 2006 at 06:37:02AM -0500, Alexander Viro wrote:
Several notes:
> +/* Remove given krule from its associated watch's rules list and clean up any
> + * last instances of associated watch and parent.
> + * Caller must hold exit filterlist lock */
> +static inline void audit_remove_watch(struct audit_krule *krule,
> + struct list_head *in_list)
> +{
> + struct audit_watch *watch = krule->watch;
> + struct audit_parent *parent = watch->parent;
> +
> + list_del(&krule->rlist);
> + if (list_empty(&watch->rules)) {
> + list_del(&watch->wlist);
> + audit_put_watch(watch);
> +
> + if (list_empty(&parent->watches)) {
> + /* put parent on the inotify un-registration list */
> + list_add(&parent->ilist, in_list);
> + audit_get_parent(parent);
> + }
> + }
> }
Umm... What guarantees that parent survives audit_put_watch()?
The put that balances the initial get is in audit_remove_parent().
Other than the case where we couldn't add the inotify watch, this is
only reached after receiving the IN_IGNORED event from inotify. We
receive IN_IGNORED after explicitly telling inotify to remove a watch
or if the parent is being removed from the filesystem. In the latter
case, we first receive an event that tells us the parent is going
away, to which we respond by taking the filterlist lock and removing
all watches & rules associated with the parent. We leave the parent
around until we receive the IN_IGNORED confirmation from inotify.
In the code above, the audit_netlink_mutex prevents another explicit
removal, and the filterlist lock requires any of the inotify events
preceding an IN_IGNORED to wait.
The inotify device semaphore prevents us from explicitly removing a
watch between the IN_UNMOUNT and IN_IGNORED events. Since we take an
extra parent reference before releasing the lock, we're okay in
audit_inotify_unregister() and the inotify_ignore() call will silently
fail.
The inotify device semaphore is dropped between processing the
IN_DELETE_SELF and IN_IGNORED events, so it is possible to do an
explicit removal between them. In the situation where we receive
IN_DELETE_SELF (a no-op when the parent.watches list is empty), then
explicitly remove the watch via inotify_ignore(), we won't receive a
second IN_IGNORED event because our watch will no longer be in the
inode's inotify_watches list. I believe this addresses all of the
possible removal scenarios.
Another thing that looks rather fishy: audit_inotify_register()
+ if (wd < 0) {
+ audit_remove_parent_watches(p);
+ audit_remove_parent(p);
+ audit_put_parent(p);
+ ret = wd;
+ } else {
followed by _another_ audit_put_parent(), balancing audit_get_parent()
before. Since audit_remove_parent() also drops a reference... Something
doesn't add up.
AFAICS, audit_get_watch() puts new parent on list, with refcount 2
and one watch attached. We bump refcount to 3. Then we have
audit_remove_parent_watches() do audit_put_watch() on the only watch,
which will drop its reference to parent, leaving us with refcount 2.
Then audit_remove_parent() will call audit_put_parent(), getting the
refcount to 1. Then we do audit_put_parent(), releaseing the last
reference and freeing p; *then* we do audit_put_parent(p) again. Oops...
The put after audit_remove_parent() is a mistake. Fixed.
Could you put at least short comments on refcounting in there?
Yes, will do.
One more: audit_put_nd() blocks, so
+ spin_lock(&flist->lock);
+ if (watch) {
+ err = audit_add_watch(&entry->rule, ndp, ndw, &inotify_list);
+ if (err) {
+ audit_put_nd(ndp, ndw);
deadlocks - audit_add_watch() won't drop the spinlock for us. Forgotten
spin_unlock()?
Yes, thanks.
In audit_list():
+ rcu_read_lock();
+ list_for_each_entry_rcu(e, &audit_filter_list[i].head, list) {
struct audit_rule_data *data;
data = audit_krule_to_data(&e->rule);
if (unlikely(!data))
break;
audit_send_reply(pid, seq, AUDIT_LIST_RULES, 0, 1,
- data, sizeof(*data));
+ data, sizeof(*data) + data->buflen);
audit_send_reply() quite obviously blocks. To quote include/linux/rcupdate.h:
/**
* rcu_read_lock - mark the beginning of an RCU read-side critical section.
*
...
*
* It is illegal to block while in an RCU read-side critical section.
*/
#define rcu_read_lock() preempt_disable()
In audit_list_rules(): ditto.
As discussed on IRC, will drop the per-filterlist spinlocks in favor
of the audit_netlink_mutex. The mutex will synchronize list
manipulations and blocking reads (list rules). The rcu iterator will
protect non-blocking reads (filtering).
Thanks for the review.
Amy