Re: indexing and index file copy

From: Bill Moseley <moseley(at)not-real.hank.org>
Date: Tue Aug 31 2004 - 14:10:30 GMT
On Tue, Aug 31, 2004 at 03:00:14AM -0700, Jonas Wolf wrote:
> I am using swish-e to run a web search which accesses the index files via 
> SWISH::API. I am indexing large amounts of data every night, which takes a 
> long time. We have users all around the world, so there will always be 
> access to the search page around the clock. Now normally everything works 
> fine, however, sometimes, indexing aborts and I get a file sharing 
> violation.
> 
> Failed to unlink 'index_nostem.idx' before renaming. : Permission denied

Windows?  And you are receiving that error because the index is open for
reading?  Geez.

Here's the deal: the index is created with a .temp file name.  Then,
when indexing is done, the files are renamed to drop the .temp suffix,
replacing the old index files.

On non-Windows machines that works fine[1] -- anyone that has the old
index open will still use that and it won't be deleted until it's
closed.  New opens will use the new index.  Everyone is happy and the
OS cleans up things automatically.

Of course, on Windows you can't do something that makes sense like
that.  You can't rename a file to a name that already exists.  So
in db_native.c there's this code to first remove the file:

#ifdef _WIN32
        if(!stat(newname, &stbuf) && ((stbuf.st_mode & S_IFMT) == S_IFREG))
            if (remove(newname))
                progerrno("Failed to unlink '%s' before renaming. : ", newname);
#endif
        if (rename(*filename, newname))
            progerrno("Failed to rename '%s' to '%s' : ", *filename, newname);

But even that fails, I guess, if the file is in use.  Why are you
using Windows?

So how do people normally handle this under Windows?  Is there a
"correct" way to do this under Windows?

[1] Almost fine: now that the index is made up of multiple files, the
renaming is not atomic across the whole set.  Also, programs that keep
the index open should probably check for inode changes every so often,
and then close and reopen the index if the file has changed.
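
A minimal sketch of that inode check on a POSIX system, to make the
idea concrete.  Nothing here is swish-e or SWISH::API code: struct
index_handle, index_open(), index_replaced() and index_refresh() are
hypothetical names, and a real index is several files, so each one
would need the same treatment.

```c
#define _POSIX_C_SOURCE 200809L   /* for fileno() and fstat() */
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical handle: an open index file plus the identity
 * (device + inode) it had at the moment it was opened. */
struct index_handle {
    FILE *fp;
    dev_t dev;
    ino_t ino;
};

/* Open `path` and remember which file we actually got.
 * Returns 0 on success, -1 on failure. */
int index_open(struct index_handle *h, const char *path)
{
    struct stat st;

    assert(h != NULL && path != NULL);
    h->fp = fopen(path, "rb");
    if (h->fp == NULL)
        return -1;
    if (fstat(fileno(h->fp), &st) != 0) {
        fclose(h->fp);
        h->fp = NULL;
        return -1;
    }
    h->dev = st.st_dev;
    h->ino = st.st_ino;
    return 0;
}

/* Returns 1 if `path` now names a different file than the one held
 * open, 0 if it is unchanged, -1 if `path` cannot be examined. */
int index_replaced(const struct index_handle *h, const char *path)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;
    return (st.st_dev != h->dev || st.st_ino != h->ino) ? 1 : 0;
}

/* Close and reopen only when the file was replaced.
 * Returns 1 if reopened, 0 if nothing changed, -1 on error
 * (in which case the old handle is left untouched). */
int index_refresh(struct index_handle *h, const char *path)
{
    struct index_handle fresh;
    int changed = index_replaced(h, path);

    if (changed != 1)
        return changed;
    if (index_open(&fresh, path) != 0)
        return -1;
    fclose(h->fp);
    *h = fresh;
    return 1;
}
```

A long-running search process could call index_refresh() before each
query, or every N queries, instead of holding the old index forever.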

> Does swish-e just try to do this once, and then aborts if it fails, or 
> does it attempt more than once? Note that this is swish-e 2.42 on Windows.

It only tries once.  It tried something and it said permission denied,
no "try back later" -- so it has to abort.
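
For comparison, a rough sketch of what retrying could look like.  This
is not what swish-e 2.42 does.  The name replace_with_retry(), the
attempt count and the delay are made up for illustration, and treating
EACCES as "file is busy" is only inferred from the "Permission denied"
message above.

```c
#define _POSIX_C_SOURCE 200809L   /* for nanosleep() on POSIX systems */
#include <assert.h>
#include <errno.h>
#include <stdio.h>

#ifdef _WIN32
#include <windows.h>
static void pause_ms(unsigned ms) { Sleep(ms); }
#else
#include <time.h>
static void pause_ms(unsigned ms)
{
    struct timespec ts;

    ts.tv_sec = ms / 1000;
    ts.tv_nsec = (long)(ms % 1000) * 1000000L;
    nanosleep(&ts, NULL);
}
#endif

/* Move `oldname` over `newname`, retrying while the failure looks
 * like "someone still has the file open" (EACCES).
 * Returns 0 on success.  Returns -1 with errno set when a different
 * error occurs or when all attempts have been used up. */
int replace_with_retry(const char *oldname, const char *newname,
                       int attempts, unsigned delay_ms)
{
    int i;

    assert(oldname != NULL && newname != NULL && attempts > 0);
    for (i = 0; i < attempts; i++) {
        int ok = 1;
#ifdef _WIN32
        /* Same workaround as db_native.c: remove the target first,
         * because rename() will not overwrite an existing file. */
        if (remove(newname) != 0 && errno != ENOENT)
            ok = 0;
#endif
        if (ok && rename(oldname, newname) == 0)
            return 0;
        if (errno != EACCES)
            return -1;              /* not a "busy" failure: give up */
        if (i + 1 < attempts)
            pause_ms(delay_ms);     /* wait before the next attempt */
    }
    return -1;                      /* still EACCES after all attempts */
}
```

Note that on Windows this still leaves a short window between remove()
and rename() in which no index file exists, so searchers would have to
cope with a failed open.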

> Of course I could disable the search while the files are copied across, 
> but maybe somebody has had this problem and a better solution?

Maybe one of the Windows users here has a suggestion.  Maybe some kind
of atomic lockfile that the search processes open in shared read mode,
and then in your reindex script request an exclusive lock, wait for it,
and then rename.  But you might wait forever.
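
To show the shape of that lockfile idea, here is a sketch using POSIX
flock(), since that is what I can vouch for.  It is an illustration
only: on Windows the analogous call would presumably be LockFileEx(),
and search_lock(), reindex_lock() and lock_release() are hypothetical
names.  The indexer uses a non-blocking request with a bounded number
of attempts so that it gives up rather than waiting forever.

```c
#define _POSIX_C_SOURCE 200809L   /* for nanosleep() */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/file.h>
#include <time.h>
#include <unistd.h>

/* Searcher side: hold a shared lock for the duration of a search.
 * Returns a descriptor to pass to lock_release(), or -1 on error. */
int search_lock(const char *lockfile)
{
    int fd;

    assert(lockfile != NULL);
    fd = open(lockfile, O_RDWR | O_CREAT, 0644);
    if (fd < 0)
        return -1;
    if (flock(fd, LOCK_SH) != 0) {  /* blocks only while an indexer holds LOCK_EX */
        close(fd);
        return -1;
    }
    return fd;
}

/* Indexer side: try for an exclusive lock without blocking, at most
 * `attempts` times with `delay_ms` between tries.
 * Returns a descriptor on success, -1 if searchers never let go. */
int reindex_lock(const char *lockfile, int attempts, unsigned delay_ms)
{
    struct timespec ts;
    int fd, i;

    assert(lockfile != NULL && attempts > 0);
    fd = open(lockfile, O_RDWR | O_CREAT, 0644);
    if (fd < 0)
        return -1;
    ts.tv_sec = delay_ms / 1000;
    ts.tv_nsec = (long)(delay_ms % 1000) * 1000000L;
    for (i = 0; i < attempts; i++) {
        if (flock(fd, LOCK_EX | LOCK_NB) == 0)
            return fd;              /* no searcher active: safe to rename */
        if (errno != EWOULDBLOCK)
            break;                  /* real error, not just "busy" */
        if (i + 1 < attempts)
            nanosleep(&ts, NULL);
    }
    close(fd);
    return -1;
}

/* Drop the lock and close the descriptor. */
void lock_release(int fd)
{
    flock(fd, LOCK_UN);
    close(fd);
}
```

The reindex script would call reindex_lock(), rename the .temp files
while holding the lock, then call lock_release().  With searches
arriving around the clock the exclusive lock may never be granted,
which is exactly the "wait forever" problem; the attempt limit only
turns that into a clean failure.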

What about this?

http://www-1.ibm.com/servers/eserver/linux/home.html?c=serversintro&n=Linux2001&t=ad


-- 
Bill Moseley
moseley@hank.org

Received on Tue Aug 31 07:11:39 2004