On Tue, Jul 20, 2004 at 03:10:40AM -0700, Volker wrote:
> But what I still do not understand: each day I want to add
> MySQL-generated output to an existing index that has meanwhile grown to
> a size of 700MB.
> The script that retrieves information from the MySQL database and
> "feeds" swish-e works fine.
>
> But HOW can I add 10 new pages (my script feeds swish-e with HTML pages
> stored in a MySQL database, as mentioned above) to an existing index file?
Basically, the normal response to that question is: swish-e doesn't
support incremental indexing.
Here are your options:
1) recreate your index when needed. This stops being practical once
you push the limits of swish-e by indexing a huge number of docs.
2) work out a system where you only reindex once a (day|week|month)
and then create a temporary index containing just the docs added since
your last full indexing run. Then search both indexes:
    swish-e -w $query -f $main_index $files_since_main_was_indexed
3) do number 2 but merge the indexes. That will be faster when
searching and sorting by properties other than rank.
4) do some variation of #2 or #3 and use more indexes -- useful for
indexing lots of files that change often
5) try the incremental indexing option in swish-e. Run ./configure
--help to see how to build swish-e to support incremental indexing.
It won't be compatible with other versions of swish or other indexes,
and you can only *add* new files, not update or remove existing ones.
6) consider if what you are indexing is so big that maybe swish-e
isn't the right tool.
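For options 2 and 3, the daily routine could be sketched roughly as
below. This is only a sketch under assumptions: the index names and the
feed script path are hypothetical, the feed script is assumed to emit
only the docs added since the last full run, and -M is assumed to take
the output index as its last argument. By default it just prints the
commands (dry run); set SWISH=swish-e to actually run them.

```shell
#!/bin/sh
# Dry run by default: prints the swish-e commands instead of running them.
# Set SWISH=swish-e in the environment to execute for real.
SWISH="${SWISH:-echo swish-e}"

MAIN_INDEX=main.index       # hypothetical: the big, rarely rebuilt index
DAILY_INDEX=daily.index     # hypothetical: docs added since the last full run
MERGED_INDEX=merged.index   # hypothetical: output of the merge
FEED=./feed_new_docs.pl     # hypothetical: program that outputs only new docs

# Option 2, step 1: build a small index from the new docs only.
build_daily() {
    $SWISH -S prog -i "$FEED" -f "$DAILY_INDEX"
}

# Option 2, step 2: search the main and the daily index in one query.
search_both() {
    $SWISH -w "$1" -f "$MAIN_INDEX" "$DAILY_INDEX"
}

# Option 3: merge both indexes into a new one (output index goes last).
merge_indexes() {
    $SWISH -M "$MAIN_INDEX" "$DAILY_INDEX" "$MERGED_INDEX"
}
```

You would call build_daily from cron, use search_both for queries, and
run merge_indexes whenever the daily index gets large. If you then move
the merged index into place of the main one, remember that each index
has a companion .prop file that has to move with it.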
Is this a FAQ?
--
Bill Moseley
moseley@hank.org
Received on Tue Jul 20 09:01:55 2004