ROMIO and MPI_FILE_SYNC

The MPI specification notes that a call to MPI_FILE_SYNC ``causes all previous writes to fh by the calling process to be transferred to the storage device.'' Calls to MPI_FILE_CLOSE carry the same semantics. Further, ``if other processes have made updates to the storage device, then all such updates become visible to subsequent reads of fh by the calling process.''

The intended use of MPI_FILE_SYNC is to let all processes in the communicator used to open the file see each other's changes to the file (the second part of the specification). The specification's definition of ``storage device'' is vague, so a call to MPI_FILE_SYNC does not necessarily force data out to permanent storage.

Since users often use MPI_FILE_SYNC to attempt to force data out to permanent storage (i.e., disk), the ROMIO implementation of this call enforces stronger semantics for most underlying file systems by calling the appropriate file sync operation (e.g., fsync) when MPI_FILE_SYNC is called. However, it is still unwise to assume that all of the data has made it to disk, because some file systems (e.g., NFS) may not force data to disk when a client system makes a sync call.
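For readers unfamiliar with the file system level call mentioned above, the following sketch shows POSIX fsync used directly on a file descriptor. It is an illustration only and is not ROMIO source code; the helper name write_and_fsync and its return convention are assumptions made for this example.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not ROMIO code): write len bytes from buf to path,
 * then ask the operating system to flush them with fsync.
 * Returns 0 on success and -1 on any failure. */
int write_and_fsync(const char *path, const void *buf, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    /* write() may transfer fewer bytes than requested, so loop. */
    const char *p = buf;
    size_t left = len;
    while (left > 0) {
        ssize_t n = write(fd, p, left);
        if (n < 0) {
            close(fd);
            return -1;
        }
        p += n;
        left -= (size_t)n;
    }

    /* Request that the data be pushed toward permanent storage. A zero
     * return means the file system accepted the request; on some file
     * systems that may still not mean the data is on disk. */
    if (fsync(fd) != 0) {
        close(fd);
        return -1;
    }

    return close(fd);
}
```

A successful return here carries the same caveat as MPI_FILE_SYNC: the request was made, but whether the data is on disk depends on the underlying file system.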

For performance reasons we do not make this same file system call at MPI_FILE_CLOSE time. At close time ROMIO ensures any data has been written out to the ``storage device'' (file system) as defined in the standard, but does not try to push the data beyond this and into physical storage. Users should call MPI_FILE_SYNC before the close if they wish to encourage the underlying file system to push data to permanent storage.

Rob Latham 2016-08-01