RPM Community Forums

Mailing List Message of <rpm-devel>

Re: life RPMDB got broken (RPM 5.1.0)

From: Jeff Johnson <n3npq@mac.com>
Date: Fri 25 Apr 2008 - 13:49:36 CEST
Message-id: <8152595F-6E39-4785-8426-9B3AF84C388A@mac.com>

On Apr 25, 2008, at 7:26 AM, Ralf S. Engelschall wrote:

> An OpenPKG instance I've recently upgraded to RPM 5.1.0 first worked
> just fine for a few days and today I noticed that on a simple "openpkg
> rpm -qa" I get mostly all packages in the output but near the end of
> the output I receive:
>
> | error: rpmdb: skipping h#     157 region trailer: BAD, tag 256 type 256 offset -256 count 256
>
> So, seems like an entry of a package in the RPMDB crashed for totally
> unknown reasons. There was no system crash or any other package
> manipulations recently, etc. Hence I've no clue how this can happen
> and I see it the first time.
>

There are no "unknown reasons" on computers. Well, gamma rays and such
just need to be lived with. But it's more likely a bug somewhere.

I think "crashed" is a bit excessive a term for an explicitly detected
problem, but YMMV.

Yes, problems can show up weeks/months after whatever was the cause.
However, "rpm -qa" verifies every digest/signature and loads every
header. Add an "rpm -qa" cron job in order to pin down the point in
time of the occurrence.
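
A minimal sketch of such a periodic check, assuming a POSIX shell. The
names RPM_CMD, LOG and check_rpmdb are hypothetical, made up for this
example; they are not rpm options. Whether "rpm -qa" exits non-zero
after skipping a header is not stated in this thread, so the sketch
also treats any output on stderr as a failure.

```shell
# Hypothetical knobs for this sketch; set RPM_CMD="openpkg rpm" on OpenPKG.
RPM_CMD="${RPM_CMD:-rpm}"
LOG="${LOG:-/var/tmp/rpmdb-check.log}"

# Run "rpm -qa", discard the package list, and append one timestamped
# status entry to $LOG. Returns 1 if rpm failed or wrote to stderr.
check_rpmdb() {
    # 2>&1 first, then >/dev/null: only stderr is captured.
    errs=$($RPM_CMD -qa 2>&1 >/dev/null)
    status=$?
    stamp=$(date -u '+%Y-%m-%dT%H:%M:%SZ')
    if [ "$status" -ne 0 ] || [ -n "$errs" ]; then
        printf '%s FAIL status=%s\n%s\n' "$stamp" "$status" "$errs" >>"$LOG"
        return 1
    fi
    printf '%s OK\n' "$stamp" >>"$LOG"
}
```

Saved as a script that ends with a call to check_rpmdb and run hourly
from cron, the first FAIL entry in the log would bracket the time of
the occurrence to within one hour.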

> I fixed the problem for me with a full BDB based dump/restore of the
> database, followed by a "rpm --rebuilddb" and a "rpm -i --justdb" of
> the single lost package. Nevertheless I wanted to drop you a note
> about this here. I can provide a copy of the RPMDB (in the state
> before I fixed it) in case someone like Jeff wants to peek at it in
> more detail...
>

FYI: dump/restore is needed iff "db_verify Packages" indicates a problem
with the db structure.
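
That decision can be sketched as follows, assuming the Berkeley DB
utilities db_verify, db_dump and db_load are on the PATH. The function
name verify_or_reload, the override variables and the Packages.new
output name are hypothetical choices for this example. It writes a new
file and leaves the original Packages untouched.

```shell
# Hypothetical override variables, e.g. for versioned utility names.
DB_VERIFY="${DB_VERIFY:-db_verify}"
DB_DUMP="${DB_DUMP:-db_dump}"
DB_LOAD="${DB_LOAD:-db_load}"

# Usage: verify_or_reload /path/to/rpmdb-directory
verify_or_reload() {
    dbdir=$1
    # Dump/restore is only needed if db_verify reports a problem.
    if $DB_VERIFY "$dbdir/Packages"; then
        echo "Packages: structure OK, no dump/restore needed"
        return 0
    fi
    echo "Packages: db_verify failed, reloading into Packages.new"
    # The pipeline's exit status is that of db_load only.
    $DB_DUMP "$dbdir/Packages" | $DB_LOAD "$dbdir/Packages.new" || return 1
    echo "Packages.new written; inspect it before replacing Packages"
}
```

The sketch stops before replacing Packages or running
"rpm --rebuilddb"; those steps are left to the operator.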

And removing the cache contained in __db* files is recommended for
careful work. OTOH, the cache is seldom corrupted, and usually there
are other indications that something is wrong, as failure to provide
reliable locking for the cache is the usual cause of cache corruption.
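
One cautious way to do that is to move the __db* files aside rather
than delete them, sketched below. The function name stash_db_cache and
the cache-backup.<pid> directory are hypothetical. The sketch assumes
that no rpm process has the database open while it runs.

```shell
# Usage: stash_db_cache /path/to/rpmdb-directory
stash_db_cache() {
    dbdir=$1
    backup="$dbdir/cache-backup.$$"   # hypothetical backup location
    # Expand the glob into the positional parameters.
    set -- "$dbdir"/__db*
    # If nothing matched, $1 is the literal pattern and does not exist.
    [ -e "$1" ] || { echo "no __db* files in $dbdir"; return 0; }
    mkdir "$backup" || return 1
    mv "$@" "$backup"/ && echo "moved $# cache file(s) to $backup"
}
```

Keeping the old files around makes it possible to compare them later
if the cache itself is suspected.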

Doing a post-mortem on an rpmdb is usually fruitless. What I really
need is some hint about how to reproduce. Your OpenPKG procedures are
wonderfully reliable compared to what I'm usually offered for
diagnosis.

So I need a bit more info to chase down rpmdb bugs.

73 de Jeff
Received on Fri Apr 25 13:49:43 2008