On Thu, Dec 22, 2022 at 12:07 PM Paul Moore <paul(a)paul-moore.com> wrote:
On Thu, Dec 22, 2022 at 2:59 PM Paul Moore <paul(a)paul-moore.com> wrote:
>
> On Thu, Dec 22, 2022 at 2:40 PM <sdf(a)google.com> wrote:
> > On 12/22, Paul Moore wrote:
> > > On Thu, Dec 22, 2022 at 12:19 PM <sdf(a)google.com> wrote:
> > > > On 12/21, Paul Moore wrote:
> > > > > When changing the ebpf program put() routines to support being called
> > > > > from within IRQ context the program ID was reset to zero prior to
> > > > > generating the audit UNLOAD record, which obviously rendered the ID
> > > > > field bogus (always zero). This patch resolves this by adding a new
> > > > > field, bpf_prog_aux::id_audit, which is set when the ebpf program is
> > > > > allocated an ID and never reset, ensuring a valid ID field,
> > > > > regardless of the state of the original ID field, bpf_prog_aux::id.
> > > >
> > > > > I also modified the bpf_audit_prog() logic used to associate the
> > > > > AUDIT_BPF record with other associated records, e.g. @ctx != NULL.
> > > > > Instead of keying off the operation, it now keys off the execution
> > > > > context, e.g. '!in_irq() && !irqs_disabled()', which is much more
> > > > > appropriate and should help better connect the UNLOAD operations with
> > > > > the associated audit state (other audit records).
> > > >
> > > > [..]
> > > >
> > > > > As a note to future bug hunters, I did briefly consider removing the
> > > > > ID reset in bpf_prog_free_id(), as it would seem that once the
> > > > > program is removed from the idr pool it can no longer be found by its
> > > > > ID value, but commit ad8ad79f4f60 ("bpf: offload: free program id
> > > > > when device disappears") seems to imply that it is beneficial to
> > > > > reset the ID value. Perhaps as a secondary indicator that the ebpf
> > > > > program is unbound/orphaned.
> > > >
> > > > That seems like the way to go imho. Can we have some extra 'invalid_id'
> > > > bitfield in the bpf_prog so we can set it in bpf_prog_free_id and
> > > > check in bpf_prog_free_id (for this offloaded use-case)? Because
> > > > having two ids and then keeping track about which one to use, depending
> > > > on the context, seems more fragile?
> >
> > > I would definitely prefer to keep just a single ID value, and that was
> > > the first approach I took when drafting this patch, but when looking
> > > through the git log it looked like there was some desire to reset the
> > > ID to zero on free. Not being an expert on the ebpf kernel code I
> > > figured I would just write the patch up this way and make a comment
> > > about not zero'ing out the ID in the commit description so we could
> > > have a discussion about it.
> >
> > Yeah, the commit you reference is resetting the id for the offloaded
> > progs. But it also mentions that even though we reset the id,
> > it won't leak into the userspace:
> >
> > Note that orphaned offload programs will return -ENODEV on
> > BPF_OBJ_GET_INFO_BY_FD so user will never see ID 0.
> >
> > It talks about the "if (!aux->offload)" check in bpf_prog_offload_info_fill.
> > So I'm assuming that having some extra "this id is already free" signal
> > in the bpf_prog shouldn't be a problem here.
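
A minimal userspace sketch of the single-ID-plus-flag idea discussed above. This is not kernel code: struct freed_flag_prog, the id_freed field, and the model_*() helpers are hypothetical names made up for illustration, and the record string is only meant to resemble an audit record. The point is that the free path sets a flag instead of zeroing the ID, so the UNLOAD record can still carry the real ID while userspace-facing queries refuse to report it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical userspace stand-in for the ID-related state of a bpf_prog. */
struct freed_flag_prog {
	uint32_t id;       /* assigned once; never zeroed in this model */
	bool id_freed;     /* hypothetical "this id is already free" signal */
};

/* Models the free path: flag the ID as released instead of resetting it. */
static void model_free_id(struct freed_flag_prog *p)
{
	assert(p != NULL);
	p->id_freed = true;
}

/* Models the UNLOAD record: the ID is still meaningful after the free. */
static int model_audit_unload(const struct freed_flag_prog *p,
			      char *buf, size_t len)
{
	assert(p != NULL && buf != NULL);
	return snprintf(buf, len, "prog-id=%u op=UNLOAD", (unsigned int)p->id);
}

/* Models a userspace-facing query: a freed ID is never reported. */
static bool model_query_id(const struct freed_flag_prog *p, uint32_t *id_out)
{
	assert(p != NULL && id_out != NULL);
	if (p->id_freed)
		return false;
	*id_out = p->id;
	return true;
}
```

In this model there is only one ID to keep track of; whether it may be shown to a caller depends on the flag, not on which of two ID fields is consulted.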
>
> FWIW, the currently-work-in-progress v2 patch adds a getter for the ID
> with a WARN() check to flag callers who are trying to access a
> bad/free'd bpf_prog. Unfortunately it touches a decent chunk of code,
> but I think it might be a nice additional check at runtime.
>
> +u32 bpf_prog_get_id(const struct bpf_prog *prog)
> +{
> +	if (WARN(!prog->valid_id, "Attempting to use invalid eBPF program"))
> +		return 0;
> +	return prog->aux->__id;
> +}
I should add that the getter is currently a static inline in bpf.h.
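
For readers who want to poke at the getter's behavior outside the kernel, here is a rough userspace model of it. Everything in it is an assumption for illustration: model_warn() is a simplified stand-in for WARN() (it only counts and prints), and the getter_* struct names and the raw_id field (mirroring __id in the fragment above) are invented here. It is a sketch of the pattern, not the v2 patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Number of warnings emitted so far, so callers can observe them. */
static int warn_count;

/* Simplified stand-in for WARN(): report when cond is true, return cond. */
static bool model_warn(bool cond, const char *msg)
{
	if (cond) {
		warn_count++;
		fprintf(stderr, "WARNING: %s\n", msg);
	}
	return cond;
}

/* Hypothetical models of bpf_prog_aux and bpf_prog, reduced to ID state. */
struct getter_aux {
	uint32_t raw_id;   /* mirrors aux->__id in the quoted fragment */
};

struct getter_prog {
	bool valid_id;     /* cleared once the ID has been freed */
	struct getter_aux *aux;
};

/* Return the ID, or warn and return 0 if the ID is no longer valid. */
static inline uint32_t model_prog_get_id(const struct getter_prog *prog)
{
	assert(prog != NULL && prog->aux != NULL);
	if (model_warn(!prog->valid_id,
		       "Attempting to use invalid eBPF program"))
		return 0;
	return prog->aux->raw_id;
}
```

Routing every read through one accessor is what makes the check possible; the cost, as noted, is touching each existing direct user of the ID field.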
I don't see why we need to WARN on !valid_id, but I might be missing something.
There are no places currently where we report 'id == 0' to the
userspace, so we only need to take care of the offloaded case that
resets id to zero early (instead of resetting it during regular
__bpf_prog_put path).
> > > I'm not seeing any other comments, so I'll go ahead with putting
> > > together a v2 that sets an invalid flag/bit and I'll post that for
> > > further discussion/review.
--
paul-moore.com