I want to revive this thread so that we can reach closure on this
subject and implement the solution as quickly as possible.
First, I will summarize what we are trying to do. Then I will state
where we left off by restating the original proposal and the responses
to it. Finally, I will provide another proposal to restart the
discussion. This is a long note.
The Background
One of the CAPP requirements, and probably an LSPP requirement as well, is
that when audit records cannot be generated for a particular process, the
process needs to be halted. We are considering 2 separate cases in which
audit records cannot be generated.
1) The first is when the audit log is full and the audit subsystem cannot
write the audit record.
2) The second is when the kernel cannot allocate memory for the audit
buffer.
One of the reasons these 2 cases are treated separately is that in the
first case (disk full), the audit subsystem can know ahead of time that
the audit record cannot be written out to disk (auditd can, for example,
send a message informing the audit subsystem of this situation). So the
audit subsystem has the ability to suspend the process (or all the
processes) before they perform auditable action(s). In contrast, in the
second case (no kernel resources), the audit subsystem cannot know ahead
of time that the kernel resources are exhausted. It discovers that no
resources are available only when it tries to generate the audit record,
and by then the auditable action has already taken place.
The Initial proposal
1) For handling disk full:
Whenever a full disk (or a log that has reached its limit) is detected,
auditd sends an AUDIT_SUSPEND message to the kernel. On receipt of
this message, the kernel sets a flag, "disk_full_flag". While
disk_full_flag is set, audit_log_start calls audit_suspend to put the
calling process on a wait queue. Whenever disk_full_flag is reset,
all the processes on the wait queue are rescheduled.
2) For suspending the process whenever there are no kernel resources:
I was thinking of using sigsuspend whenever audit_log_lost is called,
depending on the "failure flag". The failure flag can currently be set
only to: i) do nothing, ii) print a message, or iii) panic. I was
thinking of adding a fourth option to this flag to suspend the processes.
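
To make the fourth option concrete, here is a minimal user-space sketch of
the failure-flag dispatch. It is a model only, not kernel code. The first
three values mirror AUDIT_FAIL_SILENT, AUDIT_FAIL_PRINTK and AUDIT_FAIL_PANIC
from <linux/audit.h>; FAIL_SUSPEND, the fail_action enum and the function
model_audit_log_lost are hypothetical names invented for this illustration.

```c
#include <stdio.h>

/* The first three values mirror AUDIT_FAIL_SILENT, AUDIT_FAIL_PRINTK and
 * AUDIT_FAIL_PANIC. FAIL_SUSPEND is hypothetical: the proposed option. */
enum fail_mode {
    FAIL_SILENT  = 0,
    FAIL_PRINTK  = 1,
    FAIL_PANIC   = 2,
    FAIL_SUSPEND = 3
};

/* Stand-in for the real side effects, so the sketch can run in user space. */
enum fail_action { ACT_NONE, ACT_MESSAGE, ACT_PANIC, ACT_SUSPEND };

static int lost_records; /* records that could not be generated so far */

/* Model of audit_log_lost(): count the loss, then act on the failure flag. */
enum fail_action model_audit_log_lost(enum fail_mode mode, const char *why)
{
    lost_records++;
    switch (mode) {
    case FAIL_SILENT:
        return ACT_NONE;
    case FAIL_PRINTK:
        fprintf(stderr, "audit: lost record (%s), %d lost so far\n",
                why, lost_records);
        return ACT_MESSAGE;
    case FAIL_PANIC:
        return ACT_PANIC;   /* the kernel would panic at this point */
    case FAIL_SUSPEND:
        return ACT_SUSPEND; /* proposed: suspend instead of continuing */
    }
    return ACT_NONE; /* unknown values are treated as "do nothing" */
}
```

The sketch deliberately leaves out how the suspension itself would be
carried out.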
Responses to the initial proposal
1) The audit_log* functions should not be changed, because they can be
called from different contexts (Chris White).
This can only safely be done from either:
a) audit_syscall_exit, or
b) some new audit_log* functions that are explicitly identified as
potentially blocking.
(Stephen Smalley)
2) Sigsuspend is not safe (Stephen Smalley).
There may not be any local process associated with the event, e.g.
SELinux can generate audit data during processing of received packets.
The processing you describe can only safely be done from either:
a) audit_syscall_exit, or
b) some new audit_log* functions that are explicitly identified as
potentially blocking.
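
The distinction drawn in these responses can be sketched in user space as
two entry points, one that never blocks and one that is explicitly allowed
to. This is a model under stated assumptions, not kernel code: struct
call_ctx, its fields, and both function names are hypothetical and exist
only for this illustration.

```c
#include <stdbool.h>

/* Hypothetical description of the caller's context. */
struct call_ctx {
    bool has_process; /* false when no local process is tied to the event,
                         e.g. during processing of received packets */
    bool may_sleep;   /* false when the caller is not allowed to block */
};

enum log_outcome { LOG_WRITTEN, LOG_LOST, LOG_CALLER_SUSPENDED };

/* Model of the existing audit_log* behaviour: never blocks, so it is
 * safe from any context, but a record that cannot be written is lost. */
enum log_outcome model_log(bool can_record)
{
    return can_record ? LOG_WRITTEN : LOG_LOST;
}

/* Model of a variant explicitly identified as potentially blocking.
 * It suspends the caller only when there is a process to suspend and
 * that process may sleep; otherwise it behaves like model_log(). */
enum log_outcome model_log_may_block(const struct call_ctx *ctx,
                                     bool can_record)
{
    if (can_record)
        return LOG_WRITTEN;
    if (ctx->has_process && ctx->may_sleep)
        return LOG_CALLER_SUSPENDED;
    return LOG_LOST;
}
```

With this split, existing callers keep their non-blocking behaviour, and
only callers that opt in to the second function can ever be suspended.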
Current proposal
1) For handling disk full:
Instead of calling audit_suspend from audit_log_start, I will call it
from audit_syscall_entry if the context is auditable. audit_suspend will
place the process on a wait queue until disk_full_flag is reset. At that
time, all the processes on the wait queue will be awakened.
Hopefully this is acceptable and I can go ahead and implement this.
The open question is how SELinux should treat its audit records in this case.
For the current CAPP work this is not an issue. However, it will be for
LSPP evaluation.
2) For suspending the process whenever there are no kernel resources:
audit_log_lost is called from many places and for many reasons: no memory,
socket busy, etc. I need to think a little more about this if we
don't want to sleep in any audit_log* function. Any suggestions?
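
For the disk-full case in item 1 of the current proposal, the intended
behaviour can be modelled in user space as below. This is a single-threaded
toy, not kernel code: an array of pids stands in for the wait queue, and
"parking" a process just means recording its pid. disk_full_flag is the
flag named in the proposal; MAX_WAITERS and all model_* function names are
hypothetical, as is the idea of a single resume call that resets the flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_WAITERS 64 /* toy limit for this model only */

static bool disk_full_flag;          /* set while the audit log is full */
static int wait_queue[MAX_WAITERS];  /* pids parked at syscall entry */
static size_t nr_waiting;

/* Model of auditd's AUDIT_SUSPEND message arriving in the kernel. */
void model_audit_suspend_msg(void)
{
    disk_full_flag = true;
}

/* Model of the check at audit_syscall_entry. Returns true if the process
 * may run its system call now, false if it was parked on the queue. */
bool model_syscall_entry(int pid, bool auditable)
{
    if (!auditable || !disk_full_flag)
        return true;
    assert(nr_waiting < MAX_WAITERS);
    wait_queue[nr_waiting++] = pid;
    return false;
}

/* Model of disk_full_flag being reset: every parked process is woken.
 * Copies the woken pids into 'woken' and returns how many there were. */
size_t model_audit_resume(int woken[MAX_WAITERS])
{
    size_t n = nr_waiting;

    disk_full_flag = false;
    for (size_t i = 0; i < n; i++)
        woken[i] = wait_queue[i];
    nr_waiting = 0;
    return n;
}
```

Because the check sits at syscall entry, a process is stopped before it
performs the auditable action, and non-auditable contexts are never parked.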
Mounir Bsaibes
Linux Security
Tel: (512) 838-1301
Cell: (512) 762-9957
Fax: (512) 838-8858
e-mail: bsaibes(a)us.ibm.com