On 02/21, Richard Guy Briggs wrote:
On 14/02/20, Oleg Nesterov wrote:
> On 01/23, Richard Guy Briggs wrote:
> >
> > task->tgid is an error-prone construct and results in duplicate maintenance.
> > Start its demise by modifying task_tgid_nr to not use it.
>
> Well, I disagree.
>
> Yes I agree that ->tgid should probably die. But this change itself doesn't
> help, it only makes task_tgid_nr() slower. We need to convert other users
> first, then consider this change along with ->tgid removal.
I thought I recently saw a statistic that there were only 7 instances of
->tgid, but a quick search comes up with >50 outside of audit!
Yes, so I don't think it makes sense to start from task_tgid_nr().
> > - return tsk->tgid;
> > + return pid_nr(task_tgid(tsk));
> > }
>
> And what protects task_tgid()? This is racy.
>
> The race is very unlikely, pid_nr() will likely hit pid == NULL if tsk
> exits. But still it can use the freed/unmapped/reused memory.
So at the very least, I'd need
static inline pid_t task_tgid_nr(struct task_struct *tsk)
{
	return pid_nr(is_alive(tsk) ? task_tgid(tsk) : NULL);
}
And it sounds like I might even need this since the status of is_alive
could change between the time I call it and the time I call task_tgid:
static inline pid_t task_tgid_nr(struct task_struct *tsk)
{
	pid_t pid;

	task_lock(tsk);
	pid = pid_nr(is_alive(tsk) ? task_tgid(tsk) : NULL);
	task_unlock(tsk);
	return pid;
}
Is task_lock() sufficient, or do I need the heavier
read_lock(&tasklist_lock)?
You need rcu_read_lock(), I think. In any case task_lock() has nothing to
do with pids.
> And even if we add rcu_read_lock(), the patch changes the semantics:
> task_tgid_nr() can return 0 if tsk has already exited. At least this should
> be documented, but you also need to audit the users.
Basically, callers would have to check whether a return value of 0 signals
a failure rather than the idle task.
Perhaps...
Oleg.
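
For readers following the thread, the semantics change mentioned above can be
sketched in plain userspace C. This is only a model under stated assumptions:
struct model_task and model_task_tgid_nr() are hypothetical stand-ins, not
kernel code, and the sketch ignores locking and RCU entirely. It only shows
why a caller would see 0 for a task that has already exited.

```c
#include <stddef.h>

/* Simplified stand-in for the kernel's struct pid (assumption: one number only). */
struct pid {
	int nr;
};

/* Hypothetical model of a task: tgid_pid becomes NULL once the task has exited. */
struct model_task {
	struct pid *tgid_pid;
};

/* Mirrors the behaviour relied on in the thread: a NULL pid maps to 0. */
static int pid_nr(const struct pid *pid)
{
	return pid ? pid->nr : 0;
}

/*
 * Model of the proposed task_tgid_nr(): the number is derived from the
 * pid pointer instead of a cached field, so 0 now also means "already
 * exited", not only "idle task".
 */
static int model_task_tgid_nr(const struct model_task *tsk)
{
	return pid_nr(tsk->tgid_pid);
}
```

In this model, a caller that treats the result as a valid tgid has to handle 0
explicitly, which is the reason the existing users would need to be audited.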