RPM Community Forums

Mailing List Message of <rpm-devel>

Re: fsync()

From: Andy Green <andy@warmcat.com>
Date: Sat 30 Jun 2007 - 11:52:35 CEST
Message-ID: <468627E3.8040103@warmcat.com>

Russell Coker wrote:

> On Saturday 30 June 2007 00:35, Andy Green <andy@warmcat.com> wrote:
>> I don't think fsync() for individual files is really a fair answer,
> 
> Why not?

$ man fsync
...
DESCRIPTION
       fsync() transfers ("flushes") all modified in-core data of (i.e.,
       modified buffer cache pages for) the file referred to by the file
       descriptor fd to the disk device (or other permanent storage
       device) where that file resides.  The call blocks until the device
       reports that the transfer has completed.  It also flushes metadata
       information associated with the file (see stat(2)).
...

You're proposing that doing an fsync() after every unpacked file is
righteous for all cases?  RPM will slow down dramatically for no real
benefit.  If power is lost partway through an archive unpack, the package
is still in an inconsistent partial state on the drive, even though the
atomic unit of inconsistency is supposedly now one file.  One way or
another you end up with half a kernel package or whatever.
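
For illustration, here is a minimal sketch of what "fsync() after every
unpacked file" would look like.  This is not RPM's actual code;
write_file_fsync() is a hypothetical helper, and EINTR handling is left
out for brevity.  The point is only that each file pays for a blocking
round trip to the device before the next file can start.

```c
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper (not from RPM): write one unpacked file and
 * fsync() it before returning.  Returns 0 on success, -1 on error. */
int write_file_fsync(const char *path, const void *buf, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    /* write() may be partial, so loop until everything is handed over. */
    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            close(fd);
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }

    /* Blocks until the device reports that this one file's data and
     * metadata have been transferred. */
    if (fsync(fd) < 0) {
        close(fd);
        return -1;
    }
    return close(fd);
}
```

Calling this once per file in a large package means one blocking flush
per file, and a power cut between two calls still leaves the package
half unpacked.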

>> it's 
>> fine if it just uses the normal filesystem APIs per-file.  But after the
>> transaction is complete, and you walk away thinking you did complete an
>> rpm transaction, there is a case for adding a sync() to make sure
>> everything you think you have done is truly committed to physical
>> storage (maybe it does it already, I dunno).  On the one hand this is a
>> relatively low probability issue for a desktop box but on the other hand
>> it is pretty cheap.
> 
> The time taken for a sync() system call can be very large when you have a 
> system under high write load.  Under some older versions of Linux the time 
> taken for sync() appeared to be unbounded (it apparently kept looping through 
> the list of data to write while more data was being added to the list), a 
> brief test suggests that recent versions of Linux may have solved this.

Well then why mention this as an issue?

> sync() is not the way to get some files committed to disk.

Sure it is.  A sync() at the end is aimed at closing the window between
rpm completing a transaction (and feeding back that it is completed),
and the completed actions not being on physical storage.  With a single
sync() at the end you don't come back to the prompt from rpm until the
transaction is completed not only in cache but at the physical storage.
(In the case of HDDs neither fsync() nor sync() guarantees that the data
is committed from the HDD private cache to the nonvolatile storage, but
that should normally happen very shortly afterwards).
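
As a rough sketch of the "single sync() at the end" idea, assuming a
made-up struct payload_file and function unpack_then_sync() (neither is
part of RPM): every file goes through the normal buffered write path,
and only once the whole set is written does the code call sync().  Note
that POSIX only says sync() schedules the writes; Linux additionally
waits for them to complete before returning.

```c
#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical stand-in for one file in a transaction payload. */
struct payload_file {
    const char *path;
    const char *data;   /* NUL-terminated contents, for simplicity */
};

/* Write every file with plain write()/close(), then issue one sync().
 * Returns the number of files written, or -1 on the first error. */
int unpack_then_sync(const struct payload_file *files, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        int fd = open(files[i].path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0)
            return -1;

        const char *p = files[i].data;
        size_t len = strlen(p);
        while (len > 0) {
            ssize_t n = write(fd, p, len);
            if (n < 0) {
                close(fd);
                return -1;
            }
            p += n;
            len -= (size_t)n;
        }

        /* No per-file fsync(): the data may still be only in cache. */
        if (close(fd) < 0)
            return -1;
    }

    /* One flush for the whole transaction, before reporting success. */
    sync();
    return (int)count;
}
```

The trade-off is that sync() flushes everything dirty on the system, not
just these files, so its cost depends on whatever else is writing at the
time.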

Anyway I just mentioned it has value for embedded flash devices.  I
don't know whether it or fsync() has much value for PCs.

-Andy
Received on Sat Jun 30 11:52:40 2007