I want to reactivate this thread so that we can reach closure on this
subject and implement the solution as quickly as possible.
First, I will summarize what we are trying to do. Then I will state where
we left off by restating the original proposal and the responses to it.
Finally, I will offer a revised proposal to restart the discussion. This is
a long note.
The Background
One of the CAPP requirements, and probably an LSPP requirement as well, is
that when audit records cannot be generated for a particular process, the
process must be halted. We are considering two separate cases in which
audit records cannot be generated:
1) The first is when the audit log is full and the
audit subsystem cannot write the audit record.
2) The second is when the kernel cannot allocate
memory for the audit buffer.
One reason these two cases are treated separately is that in the first case
(disk full), the audit subsystem can know ahead of time that the audit record
cannot be written to disk (auditd can, for example, send a message informing
the audit subsystem of this situation). So the audit subsystem is able to
suspend the process (or all processes) before any auditable action is
performed. In contrast, in the second case (no kernel resources), the audit
subsystem cannot know ahead of time that kernel resources are exhausted. It
discovers that no resources are available only when it tries to generate the
audit record, and by then the auditable action has already taken place.
The Initial proposal
1) For handling disk full:
Whenever auditd detects that the disk is full (or that the log has reached
its limit), it sends an AUDIT_SUSPEND message to the kernel. On receipt of
this message, the kernel sets a flag, "disk_full_flag". If
disk_full_flag is set, audit_log_start calls audit_suspend to
place the process on a wait queue. Whenever disk_full_flag is
reset, all the processes on the wait queue are rescheduled.
2) For suspending the process whenever there are no kernel resources:
I was thinking of using sigsuspend whenever audit_log_lost is called,
depending on the "failure flag". The failure flag can currently be set
only to: i) do nothing, ii) print a message, or iii) panic. I was
thinking of adding a fourth option to this flag to suspend the processes.
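To make the fourth option concrete, here is a minimal userspace sketch
of the decision audit_log_lost would make. It is a model, not kernel code.
The first three constant names follow the AUDIT_FAIL_* values in the kernel's
audit header; AUDIT_FAIL_SUSPEND, the lost_action enum, and
audit_log_lost_model are names I made up for illustration.

```c
#include <stdio.h>

/* Failure-flag values. The first three mirror the kernel's AUDIT_FAIL_*
 * constants; AUDIT_FAIL_SUSPEND is hypothetical (the proposed fourth option). */
enum audit_fail_mode {
    AUDIT_FAIL_SILENT  = 0,  /* do nothing */
    AUDIT_FAIL_PRINTK  = 1,  /* print a message */
    AUDIT_FAIL_PANIC   = 2,  /* panic */
    AUDIT_FAIL_SUSPEND = 3   /* hypothetical: suspend the process */
};

/* What the caller should do next; illustrative only. */
enum lost_action { ACT_NONE, ACT_PRINT, ACT_PANIC, ACT_SUSPEND };

/* Model of the audit_log_lost decision. It only reports the chosen action;
 * it does not panic or suspend anything itself. */
enum lost_action audit_log_lost_model(enum audit_fail_mode mode, const char *why)
{
    switch (mode) {
    case AUDIT_FAIL_SILENT:
        return ACT_NONE;
    case AUDIT_FAIL_PRINTK:
        fprintf(stderr, "audit: record lost: %s\n", why);
        return ACT_PRINT;
    case AUDIT_FAIL_PANIC:
        return ACT_PANIC;
    case AUDIT_FAIL_SUSPEND:
        return ACT_SUSPEND;
    }
    return ACT_NONE;  /* unknown mode: treat as silent in this model */
}
```

The sketch deliberately leaves open how ACT_SUSPEND would be carried out,
since that is the part that depends on the calling context.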
Responses to the initial proposal
1) The audit_log* functions should not be changed, because
they can be called from different contexts (Chris White).
This can only safely be done from either:
a) audit_syscall_exit, or
b) some new audit_log* functions that
are explicitly identified as potentially blocking.
(Stephen Smalley)
2) sigsuspend is not safe (Stephen Smalley).
There may not be any local process associated
with the event; e.g., SELinux can generate audit data while processing
received packets. The processing you describe can only
safely be done from either:
a) audit_syscall_exit, or
b) some new audit_log* functions that
are explicitly identified as potentially blocking.
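One possible reading of option (a) is that the audit_log* path only records
that something went wrong, and the blocking is deferred to syscall exit,
where a process context is guaranteed. The userspace sketch below models
that idea under my own assumptions; the struct, its fields, and the *_model
function names are hypothetical and do not exist in the kernel.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-task audit state; field names are illustrative. */
struct audit_ctx_model {
    bool record_lost;  /* a record could not be generated for this task */
    bool suspended;    /* the task has been parked */
};

/* Non-blocking logging path. It may be called with no local process
 * (ctx == NULL), e.g. while processing a received packet, so it never
 * sleeps. Returns true if the record was generated. */
bool audit_log_model(struct audit_ctx_model *ctx, bool have_memory)
{
    if (have_memory)
        return true;
    if (ctx != NULL)
        ctx->record_lost = true;  /* remember the loss for syscall exit */
    return false;
}

/* Syscall-exit path. It always runs on behalf of a process, so this is
 * where suspending would be safe. Setting the flag stands in for sleeping. */
void audit_syscall_exit_model(struct audit_ctx_model *ctx)
{
    if (ctx->record_lost) {
        ctx->suspended = true;
        ctx->record_lost = false;
    }
}
```

Note that in this model an event with no associated process (ctx == NULL)
is simply lost; the sketch does not answer what should happen in that case.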
Current proposal
1) For handling disk full:
Instead of calling audit_suspend from audit_log_start,
I will call it from audit_syscall_entry
if the context is auditable. audit_suspend will place
the process on a wait queue until the disk_full_flag is reset. At that
time, all the processes on the wait queue will be awakened.
Hopefully this is acceptable and I can go ahead and
implement it.
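As a rough illustration of the flow I have in mind, here is a
single-threaded userspace model of the flag and the wait queue. It is only a
sketch: a real implementation would use the kernel's wait queue primitives
and proper locking. disk_full_flag, audit_suspend, and audit_syscall_entry
are the names used in this note; the structs, their fields, and the _model
suffixes are assumptions made for the example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a task; only what the model needs. */
struct task_model {
    int pid;
    bool auditable;                  /* is the task's context auditable? */
    bool suspended;                  /* parked on the wait queue? */
    struct task_model *next_waiter;  /* intrusive wait-queue link */
};

struct audit_state_model {
    bool disk_full_flag;             /* set while auditd reports disk full */
    struct task_model *wait_head;    /* tasks parked until the flag resets */
};

/* Park the task on the wait queue (stand-in for sleeping). */
void audit_suspend_model(struct audit_state_model *s, struct task_model *t)
{
    if (t->suspended)
        return;                      /* already queued */
    t->suspended = true;
    t->next_waiter = s->wait_head;
    s->wait_head = t;
}

/* Set or reset the flag; on reset, wake every parked task. */
void audit_set_disk_full_model(struct audit_state_model *s, bool full)
{
    s->disk_full_flag = full;
    if (full)
        return;
    while (s->wait_head != NULL) {
        struct task_model *t = s->wait_head;
        s->wait_head = t->next_waiter;
        t->next_waiter = NULL;
        t->suspended = false;
    }
}

/* Gate at syscall entry. Returns true if the syscall may proceed now. */
bool audit_syscall_entry_model(struct audit_state_model *s, struct task_model *t)
{
    if (t->auditable && s->disk_full_flag) {
        audit_suspend_model(s, t);
        return false;
    }
    return true;
}
```

The point of gating at syscall entry is that an auditable task is stopped
before the auditable action starts, while non-auditable tasks are unaffected.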
The open question is how SELinux should treat its audit
records in this case. For the current CAPP work this is not an issue; however,
it will be for the LSPP evaluation.
2) For suspending the process whenever there are no
kernel resources:
audit_log_lost is called from many places
and for many reasons (no memory, socket busy, etc.). I need to think a little
more about how to handle this if we don't want to sleep in any audit_log*
function. Any suggestions?
Mounir Bsaibes
Linux Security
Tel: (512) 838-1301
Cell: (512) 762-9957
Fax: (512) 838-8858
e-mail: bsaibes@us.ibm.com