Re: [swish-e] using incremental mode

From: Peter Karman <peter(at)>
Date: Tue Oct 23 2007 - 13:47:40 GMT

On 10/23/2007 07:58 AM, Judith Retief wrote:

>   1) The only documentation I can find on creating an incremental index is
> using -N (or Update-Mode), providing a filename - only files newer than the
> given file will be indexed. But we don't index files - our data resides in a
> database so we use stdin to provide the data to swish-e on the cmdline. Is
> it possible to create an incremental index if your data doesn't live in
> files?

You likely need to build an incremental version of swish-e. You won't see the
options using -h (help) unless you've built it with --enable-incremental.

Also check out the -S prog header for updating:
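
A minimal sketch of what that could look like for database-backed content, assuming
the header meant here is Update-Mode from the -S prog document format (each document
is a block of headers, a blank line, then exactly Content-Length bytes of content).
The function names, the db/<id> path scheme and the sample row are hypothetical, and
the "Update" value is an assumption to check against the docs for your build:

```python
import sys


def format_prog_doc(path_name, body, last_mtime=None, update_mode=None):
    """Return one document in the -S prog stream format, as bytes."""
    data = body.encode("utf-8")
    # Content-Length must count bytes, not characters.
    headers = [f"Path-Name: {path_name}", f"Content-Length: {len(data)}"]
    if last_mtime is not None:
        headers.append(f"Last-Mtime: {int(last_mtime)}")
    if update_mode is not None:
        # Assumption: only meaningful to an --enable-incremental build.
        headers.append(f"Update-Mode: {update_mode}")
    # A blank line separates the headers from the content.
    return ("\n".join(headers) + "\n\n").encode("utf-8") + data


def write_rows(rows, out):
    """Write (row_id, text, mtime) tuples to a binary stream."""
    for row_id, text, mtime in rows:
        out.write(format_prog_doc(f"db/{row_id}", text, mtime, "Update"))


if __name__ == "__main__":
    # Hypothetical row standing in for a real database query result.
    rows = [(1, "<html><body>changed record</body></html>", 1193147260)]
    write_rows(rows, sys.stdout.buffer)
```

The output would then be piped to swish-e reading from stdin with -S prog, as you
already do for full indexing. Selecting only the rows changed since the last run is
left to the database query.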

>   2) How stable is incremental indexing by now? It's been available, from
> what I can see, since December 2003. And there are some references in the
> archive of people that use it successfully. But there are also a post or two
> about stability problems, and the latest documentation still advises to ask
> on this forum before using. 

There are some reports of folks using it, but also some unresolved issues with
segfaults, etc. I don't personally use it because my doc collections are not
big enough at present to warrant it. The main developer of the feature has been
absent from Swish-e dev for the last couple of years, but has just started work
on a Berkeley DB-backed version of Swish-e that will enable better incremental
work. It's not production-ready, but if you were interested in trying it out,
I'm sure your feedback would be most helpful in getting it there.

Both the BDB version in branches/2.6 and the incremental version in trunk use
index formats that are incompatible with the current native 2.4/2.5 format in
svn trunk. So you'd have to re-index your doc corpus if you switch index formats.

You can check out the BDB version from svn here:
