On Fri, Jun 2, 2017 at 10:18 PM, Arnd Bergmann <arnd(a)arndb.de> wrote:
On Fri, Jun 2, 2017 at 2:18 PM, Yan, Zheng <ukernel(a)gmail.com> wrote:
> On Fri, Jun 2, 2017 at 7:33 PM, Arnd Bergmann <arnd(a)arndb.de> wrote:
>> On Fri, Jun 2, 2017 at 1:18 PM, Yan, Zheng <ukernel(a)gmail.com> wrote:
>> What I meant is another related problem in ceph_mkdir() where the
>> i_ctime field of the parent inode is different between the persistent
>> representation in the mds and the in-memory representation.
>>
>
> I don't see any problem in the mkdir case. The parent inode's i_ctime in the
> mds is set to r_stamp. When the client receives the request reply, it sets
> its in-memory inode's ctime to the same time stamp.
Ok, I see it now, thanks for the clarification. Most other file systems do this
the other way round and update all fields in the in-memory inode structure
first and then write that to persistent storage, so I was getting confused about
the order of events here.
If I understand it all right, we have three different behaviors in ceph now,
though the differences are very minor and probably don't ever matter:
- in setattr(), we update ctime in the in-memory inode first and then send
the same time to the mds, and expect to set it again when the reply comes.
- in ceph_write_iter write() and mmap/page_mkwrite(), we call
file_update_time() to set i_mtime and i_ctime to the same
timestamp first once a write is observed by the fs and then take
two other timestamps that we send to the mds, and update the
in-memory inode a second time when the reply comes. ctime
is never older than mtime here, as far as I can tell, but it may
be newer when the timer interrupt happens between taking the
two stamps.

We don't use a request to send i_mtime/i_ctime to the mds in this case.
Instead, we use a cap flush message; i_mtime/i_ctime are encoded directly
in it. When the mds receives the cap flush message, it writes
i_mtime/i_ctime to persistent storage and sends a cap flush ack message
to the client. (When the client receives the cap flush ack message, it
does not update i_mtime/i_ctime.) So the issue you described does not
arise here.

- in all other calls, we only update the inode (and/or parent inode)
after the reply arrives.

There are two cases. 1. The client updates the in-memory inode's ctime and
sends the new ctime to the mds through a cap flush message. 2. The client
sets the mds request's r_stamp and sends the request to the mds. The mds
updates the relevant inodes' ctime and sends a reply to the client. The
client then updates the in-memory inodes' ctime according to the reply.

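To make the two cases concrete, here is a toy model in Python. It is only a
sketch of the ordering described above, not the kernel or mds code: the names
ToyMDS, Inode, client_cap_flush and client_request are made up for
illustration, and timestamps are plain numbers.

```python
from dataclasses import dataclass, field


@dataclass
class Inode:
    """Hypothetical stand-in for the client's in-memory inode."""
    ctime: float = 0.0


@dataclass
class ToyMDS:
    """Hypothetical stand-in for the mds: persistent ctime per inode number."""
    stored: dict = field(default_factory=dict)

    def handle_cap_flush(self, ino, ctime):
        # Case 1: the client already chose the ctime. Persist it and ack.
        # The ack carries no timestamp for the client to apply.
        self.stored[ino] = ctime
        return "ack"

    def handle_request(self, inos, r_stamp):
        # Case 2: the mds stamps every relevant inode with the request's
        # r_stamp and reports the result back in the reply.
        for ino in inos:
            self.stored[ino] = r_stamp
        return {ino: r_stamp for ino in inos}


def client_cap_flush(mds, ino, inode, now):
    # Case 1: update the in-memory inode first, then flush the same value.
    inode.ctime = now
    mds.handle_cap_flush(ino, inode.ctime)
    # Nothing is written back to inode.ctime when the ack arrives.


def client_request(mds, inodes, r_stamp):
    # Case 2: send r_stamp with the request; only touch the in-memory
    # inodes once the reply has arrived.
    reply = mds.handle_request(list(inodes), r_stamp)
    for ino, ctime in reply.items():
        inodes[ino].ctime = ctime
```

In both paths the in-memory ctime and the value held by the toy mds end up
identical; only the order in which the two copies are written differs.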
Regards
Yan, Zheng
Arnd