On Fri, Jan 6, 2023 at 2:45 PM Stanislav Fomichev <sdf(a)google.com> wrote:
On Fri, Jan 6, 2023 at 7:44 AM Paul Moore <paul(a)paul-moore.com> wrote:
>
> When changing the ebpf program put() routines to support being called
> from within IRQ context, the program ID was reset to zero prior to
> calling the perf event and audit UNLOAD record generators, which
> resulted in problems as the ebpf program ID was bogus (always zero).
> This patch addresses this problem by removing an unnecessary call to
> bpf_prog_free_id() in __bpf_prog_offload_destroy() and adjusting
> __bpf_prog_put() to only call bpf_prog_free_id() after audit and perf
> have finished their bpf program unload tasks in
> bpf_prog_put_deferred(). For the record, no one can determine, or
> remember, why it was necessary to free the program ID, and remove it
> from the IDR, prior to executing bpf_prog_put_deferred();
> regardless, both Stanislav and Alexei agree that the approach in this
> patch should be safe.
>
> It is worth noting that when moving the bpf_prog_free_id() call, the
> do_idr_lock parameter was forced to true as the ebpf devs determined
> this was correct, since do_idr_lock should always be true. The
> do_idr_lock parameter will be removed in a follow-up patch, but it
> was kept here to keep the patch small in an effort to ease any stable
> backports.
>
> I also modified the bpf_audit_prog() logic used to associate the
> AUDIT_BPF record with other associated records, e.g. @ctx != NULL.
> Instead of keying off the operation, it now keys off the execution
> context, e.g. '!in_irq() && !irqs_disabled()', which is much more
> appropriate and should help better connect the UNLOAD operations with
> the associated audit state (other audit records).
>
> Cc: stable(a)vger.kernel.org
> Fixes: d809e134be7a ("bpf: Prepare bpf_prog_put() to be called from irq context.")
> Reported-by: Burn Alting <burn.alting(a)iinet.net.au>
> Reported-by: Jiri Olsa <olsajiri(a)gmail.com>
> Suggested-by: Stanislav Fomichev <sdf(a)google.com>
> Suggested-by: Alexei Starovoitov <alexei.starovoitov(a)gmail.com>
> Signed-off-by: Paul Moore <paul(a)paul-moore.com>
Acked-by: Stanislav Fomichev <sdf(a)google.com>

Thank you! There might be a chance it breaks test_offload.py (I don't
remember whether it checks this prog-is-removed-from-id part or not),
but I don't think it's fair to ask to address it :-)
Since it doesn't trigger in CI, I'll take another look next week when
doing a respin of my 'xdp-hints' series.

No problem, I'm glad we found a solution that works for everyone; and
thank you for chasing down any test changes that may be necessary.
I'd like to get this patch into Linus' tree sooner rather than later
as it fixes a kinda ugly problem. Would you be okay if this went in
via the bpf tree? With the appropriate ACKs I could send it to Linus
via the audit tree, but I think it would be much better to send it via
the bpf/netdev tree.
--
paul-moore.com