On Fri, Dec 23, 2022 at 8:49 PM Stanislav Fomichev <sdf(a)google.com> wrote:
On Fri, Dec 23, 2022 at 10:55 AM Paul Moore
<paul(a)paul-moore.com> wrote:
>
> When changing the ebpf program put() routines to support being called
> from within IRQ context the program ID was reset to zero prior to
> calling the perf event and audit UNLOAD record generators, which
> resulted in problems as the ebpf program ID was bogus (always zero).
> This patch resolves this by adding a new flag, bpf_prog::valid_id, to
> indicate when the bpf_prog_aux ID field is valid; it is set to true/1
> in bpf_prog_alloc_id() and set to false/0 in bpf_prog_free_id(). In
> order to help ensure that access to the bpf_prog_aux ID field takes
> into account the new valid_id flag, the bpf_prog_aux ID field is
> renamed to bpf_prog_aux::__id and a getter function,
> bpf_prog_get_id(), was created and all users of bpf_prog_aux::id were
> converted to the new caller. Exceptions to this include some of the
> internal ebpf functions and the xdp trace points, although the latter
> still take into account the valid_id flag.
>
> I also modified the bpf_audit_prog() logic used to associate the
> AUDIT_BPF record with other associated records, e.g. @ctx != NULL.
> Instead of keying off the operation, it now keys off the execution
> context, e.g. '!in_irg && !irqs_disabled()', which is much more
> appropriate and should help better connect the UNLOAD operations with
> the associated audit state (other audit records).
>
> Fixes: d809e134be7a ("bpf: Prepare bpf_prog_put() to be called from irq
context.")
> Reported-by: Burn Alting <burn.alting(a)iinet.net.au>
> Reported-by: Jiri Olsa <olsajiri(a)gmail.com>
> Signed-off-by: Paul Moore <paul(a)paul-moore.com>
>
> --
> * v2
> - change subj
> - add mention of the perf regression
> - drop the dedicated program audit ID
> - add the bpf_prog::valid_id flag, bpf_prog_get_id() getter
> - convert prog ID users to new ID getter
> * v1
> - subj was: "bpf: restore the ebpf audit UNLOAD id field"
> - initial draft
> ---
> drivers/net/netdevsim/bpf.c | 6 ++++--
> include/linux/bpf.h | 11 +++++++++--
> include/linux/bpf_verifier.h | 2 +-
> include/trace/events/xdp.h | 4 ++--
> kernel/bpf/arraymap.c | 2 +-
> kernel/bpf/bpf_struct_ops.c | 2 +-
> kernel/bpf/cgroup.c | 2 +-
> kernel/bpf/core.c | 2 +-
> kernel/bpf/cpumap.c | 2 +-
> kernel/bpf/devmap.c | 2 +-
> kernel/bpf/syscall.c | 27 +++++++++++++++------------
> kernel/events/core.c | 6 +++++-
> kernel/trace/bpf_trace.c | 2 +-
> net/core/dev.c | 2 +-
> net/core/filter.c | 3 ++-
> net/core/rtnetlink.c | 2 +-
> net/core/sock_map.c | 2 +-
> net/ipv6/seg6_local.c | 3 ++-
> net/sched/act_bpf.c | 2 +-
> net/sched/cls_bpf.c | 2 +-
> 20 files changed, 52 insertions(+), 34 deletions(-)
...
> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index 9e7d46d16032..18e965bd7db9 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h
> @@ -1688,6 +1689,12 @@ void bpf_prog_inc(struct bpf_prog *prog);
> struct bpf_prog * __must_check bpf_prog_inc_not_zero(struct bpf_prog *prog);
> void bpf_prog_put(struct bpf_prog *prog);
>
> +static inline u32 bpf_prog_get_id(const struct bpf_prog *prog)
> +{
> + if (WARN(!prog->valid_id, "Attempting to use an invalid eBPF
program"))
> + return 0;
> + return prog->aux->__id;
> +}
I'm still missing why we need to have this WARN and have a check at all.
I believe I explained my reasoning in the other posting, but as I also
mentioned, it's your subsystem so I don't really care about the
details as long as we fix the bug/regression in the ebpf code.
IIUC, we're actually too eager in resetting the id to 0, and need
to
keep that stale id around at least for perf/audit.
Agreed.
Why not have a flag only to protect against double-idr_remove
bpf_prog_free_id and keep the rest as is?
I'll send an updated patch next week with the only protection being a
check in bpf_prog_free_id().
--
paul-moore.com