On Friday 29 June 2007 05:37, Jeff Johnson <n3npq@mac.com> wrote:
> On Jun 28, 2007, at 2:28 AM, Russell Coker wrote:
> > When upgrading a package, RPM version 4.4.2 in SUSE doesn't call
> > fsync()! It creates a temporary file (without using O_SYNC), writes
> > all the data to it, closes it, and then renames it to replace the
> > original file.
>
> The temporary file has the /path/to/file;12345678 transaction id
> appended?
I don't recall the name.
> A close should sync the data, should it not?
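Not necessarily. close() does not imply fsync(), so the data may still be
in the page cache when close() returns. Some filesystems (such as XFS) try
to deduce what a user-space program desires from its pattern of system calls
and implement it (e.g. a certain combination of create and rename can cause
the data to be sync'd sooner), but that is a heuristic and not a guarantee.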
> Or do you mean the rpmdb files?
No, I mean file data.
> SuSE has a very different usage case for a rpmdb, and insists on
> avoiding sync whenever possible for "performance" reasons.
Then SuSE are butt-heads.
> > Has this horrible mistake been fixed in the upstream tree?
>
> I believe the problem is a change in behavior in libio in glibc.
I believe that it has nothing to do with glibc or any other user-space code.
> But adding an explicit fsync() is trivial as soon as I can get a
> reproducer.
You want to be able to repeatably trigger a race-condition before you fix it?
To cause this race condition you must first use a file-system that is
optimised for performance (EG XFS) so that it will allow long cache
write-back times and also do write-related tasks after closing the file (EG
assigning disk blocks to the file after close() so that it knows the length).
Then put some load on the system while installing an RPM, and then trigger a
hardware reset shortly after rpm exits.
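
For illustration, here is a minimal sketch of the write, fsync(), rename
sequence that would close this window. It is not RPM's code; the function
name replace_file_durably() and the ".tmp" suffix are hypothetical, and it
ignores EINTR and short-path corner cases for brevity.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper, not taken from RPM: replace `path` with `len` bytes
 * of `data` so that after a crash the name refers to either the complete
 * old contents or the complete new contents. Returns 0 on success, -1 on
 * failure. */
int replace_file_durably(const char *path, const char *data, size_t len)
{
    char tmp[4096];

    /* Assumed naming scheme for the temporary file: "<path>.tmp". */
    if (snprintf(tmp, sizeof tmp, "%s.tmp", path) >= (int)sizeof tmp)
        return -1;

    int fd = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    /* write() may write fewer bytes than requested, so loop. */
    size_t off = 0;
    while (off < len) {
        ssize_t n = write(fd, data + off, len - off);
        if (n < 0) {
            close(fd);
            unlink(tmp);
            return -1;
        }
        off += (size_t)n;
    }

    /* The step that is missing: force the data to stable storage
     * BEFORE the rename makes the new file visible under `path`. */
    if (fsync(fd) != 0) {
        close(fd);
        unlink(tmp);
        return -1;
    }
    if (close(fd) != 0) {
        unlink(tmp);
        return -1;
    }

    /* Atomically replace the original file. */
    if (rename(tmp, path) != 0) {
        unlink(tmp);
        return -1;
    }
    return 0;
}
```

A stricter version would also fsync() the containing directory after the
rename so that the directory entry itself is on disk; I have left that out
to keep the sketch short.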
--
russell@coker.com.au
http://etbe.coker.com.au/ My Blog
http://www.coker.com.au/sponsorship.html Sponsoring Free Software development
Received on Fri Jun 29 15:52:07 2007