Re: wondering why?

From: Bill Moseley <moseley(at)>
Date: Wed Apr 28 2004 - 18:22:46 GMT
On Wed, Apr 28, 2004 at 11:40:38AM -0400, Weir James K Contr ASC/ENOI wrote:
> > Are you using the same exact program for both indexing and 
> > searching?  I ask because it looks like a data offset error.  
> > Another possibility is that you copied the index file but are 
> > using the wrong associated .prop file.  Did you copy or move 
> > the index by chance?

> I move the index file and the .prop file to another folder after I index.
> That is just in case I need to use the search while indexing, i.e. a temp
> folder.  Is moving the files not a good idea?

Swish-e does that for you.  When indexing it writes to .temp files of
the same name as the index and, when done, renames the files.  It's not
atomic, so there's a slight chance that someone could open the index
right in the middle of the rename, get the old index and the new
.prop file, and fail.

I suspect it's a tiny bit faster if you create the index in a temporary
directory and then rename the directories, but it still isn't atomic
since two files are opened.
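
As a rough illustration of the directory-rename approach, here is a minimal
Python sketch.  It is not part of Swish-e; the function name publish_index
and the ".old" suffix are made up for this example, and it assumes the
staging and live directories sit on the same filesystem.  It only narrows
the window: a searcher can still open the index file before the swap and
the .prop file after it.

```python
import os
import shutil


def publish_index(staging_dir, live_dir):
    """Swap a freshly built index directory into place using two renames.

    Hypothetical helper, not a Swish-e API. The swap is quick, but a reader
    that opens the index file and the .prop file separately can still see
    one file from the old directory and one from the new.
    """
    old_dir = live_dir + ".old"
    if os.path.isdir(old_dir):
        shutil.rmtree(old_dir)        # clear leftovers from a previous swap
    if os.path.isdir(live_dir):
        os.rename(live_dir, old_dir)  # move the current index aside
    os.rename(staging_dir, live_dir)  # put the new index in place
```

The previous index is kept under the ".old" name until the next swap, so it
can be moved back by hand if the new index turns out to be bad.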

> > You posted about this in early March.  Bill Schell found an 
> > off-by-one error that was causing problems, but I'm not sure 
> > if that's related to your issue as I think that was only when 
> > not using HTML2|XML2|TXT2 parser (i.e. using the old system 
> > where the entire file is read into memory).  You are using 
> > HTML* which will use libxml2.

> I am only indexing TXT files.  Should I use some other parser for that?

Yes, then you would be using the old system that reads the entire file
into memory and could get hit by that off-by-one bug.  The TXT2 parser
reads and indexes in chunks.
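
To show what "reads and indexes in chunks" means in general terms, here is
a small Python sketch.  It is only an illustration of chunked reading, not
how Swish-e or the TXT2 parser is actually implemented; the function name
and the chunk size are made up.  The point is that memory use depends on
the chunk size rather than on the size of the file.

```python
def iter_words_chunked(stream, chunk_size=4096):
    """Yield whitespace-separated words from a text stream, one chunk at a time.

    Illustrative only. Holds one chunk plus any word cut off at the chunk
    boundary, instead of the entire file.
    """
    carry = ""  # partial word left over from the previous chunk
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        buf = carry + chunk
        parts = buf.split()
        # If the buffer does not end in whitespace, the last word may
        # continue in the next chunk, so hold it back.
        if parts and not buf[-1].isspace():
            carry = parts.pop()
        else:
            carry = ""
        yield from parts
    if carry:
        yield carry
```

A caller would pass an open text file and feed each word to whatever does
the indexing, for example: for word in iter_words_chunked(open(path)).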

Bill Moseley
