On 15/04/27, Eric W. Biederman wrote:
Richard Guy Briggs <rgb(a)redhat.com> writes:
> On 15/04/24, Eric W. Biederman wrote:
>> Richard Guy Briggs <rgb(a)redhat.com> writes:
>> > On 15/04/22, Richard Guy Briggs wrote:
>> >> On 15/04/20, Eric W. Biederman wrote:
>> >> > Richard Guy Briggs <rgb(a)redhat.com> writes:
>> > Do I even need to report the device number anymore since I am concluding
>> > s_dev is never set (or always zero) in the nsfs filesystem by
>> > mount_pseudo() and isn't even mountable?
>>
>> We still need the dev. We do have a device number; get_anon_bdev() fills it in.
>
> Fine, it has a device number. There appears to be only one of these
> allocated per kernel. I can get it from &nsfs->fs_supers (and take the
> first instance given by hlist_for_each_entry and verify there are no
> others). Why do I need it, again?

Because if we have to preserve the inode number over a migration event, I
want to preserve the fact that we are talking about inode numbers from a
superblock with a device number.

Otherwise known as: I am allergic to kernel global identifiers, because
they can be major pains. I don't want to have to go back and implement
a namespace for namespaces.

Alright, I'll change the device over to that... We can figure out how
to select the correct device number of nsfs instances if it increases
beyond one.
>> >> They are all covered:
>> >> sys_unshare > unshare_userns > create_user_ns
>> >> sys_unshare > unshare_nsproxy_namespaces > create_new_namespaces > copy_mnt_ns
>> >> sys_unshare > unshare_nsproxy_namespaces > create_new_namespaces > copy_utsname > clone_uts_ns
>> >> sys_unshare > unshare_nsproxy_namespaces > create_new_namespaces > copy_ipcs > get_ipc_ns
>> >> sys_unshare > unshare_nsproxy_namespaces > create_new_namespaces > copy_pid_ns > create_pid_namespace
>> >> sys_unshare > unshare_nsproxy_namespaces > create_new_namespaces > copy_net_ns
>>
>> Then why the special change to fork? That was not reflected on
>> the unshare path as far as I could see.
>
> Fork can specify more than one CLONE flag at once, so collecting them
> all in one statement seemed helpful. setns can only set one at a time.

unshare can also specify more than one CLONE flag at once.
I just pointed that out because that seemed really unsymmetrical.

Ah sorry, my mistake, I was thinking setns... I've added a call in
sys_unshare().
> Ok, understood, we can't just punt this one to a higher layer...
>
> So this comes back to a question above, which is how do we determine
> which device it is from? Sounds like we need something added to
> ns_common or one of the six namespace type structs.

Or we can just hard-code reading it off of the appropriate magic
filesystem. Probably what we want is a well-named helper function that
does the job.

There is a bit of overhead to read that, so I've added a dev_t member to
ns_common. Simplest way I found was to call iterate_supers() since
struct file_system_type *nsfs isn't exposed.

I just care that when we talk about these things we are talking about
inode numbers from a superblock that is associated with a given device
number. That way I don't have nightmares about dealing with a namespace
for namespaces.

Eric
- RGB
--
Richard Guy Briggs <rbriggs(a)redhat.com>
Senior Software Engineer, Kernel Security, AMER ENG Base Operating Systems, Red Hat
Remote, Ottawa, Canada
Voice: +1.647.777.2635, Internal: (81) 32635, Alt: +1.613.693.0684x3545