On Fri, Jun 27, 2003 at 06:43:24AM -0700, Aaron Bazar wrote:
> Hi!
>
> Is there a way to remove entries from the swish-e database? It does not have
> to be an automatic way.. can the file be edited?
No, not currently, and I doubt the index file could easily be edited
by hand. Reindexing is usually best.
> I have many URL's in my
> database, from many different domains... I would like to remove ALL the
> entries from the database that are from one of these domains.
>
> One idea, is to re-spider the URL in question, but give it a bogus IP
> address such that when the URL is spidered, no information is found. I could
> then merge this small file with my huge database file... would this delete
> all the entries?
If you wanted to "remove" a file, say "test.html", from the index, you
might be able to create a new "test.html" that contains only a dummy word
(I think swish-e won't index empty files) and then merge that index with
your initial index. The file will still be in the index, though, and it
will still be returned on "not" searches.
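
To show why the dummy file still turns up on "not" searches, here is a toy
model in Python. It is not swish-e code and says nothing about the real index
format. It assumes the merge keeps the dummy "test.html" in place of the
original, and the names build_index, search and search_not are made up for
the example.

```python
def build_index(docs):
    """Map each word to the set of file names that contain it."""
    index = {}
    for name, text in docs.items():
        for word in text.lower().split():
            index.setdefault(word, set()).add(name)
    return index


def search(index, word):
    """Files that contain the word."""
    return set(index.get(word.lower(), set()))


def search_not(index, all_files, word):
    """Files that do NOT contain the word."""
    return set(all_files) - index.get(word.lower(), set())


# Original content of the (toy) index.
original = {
    "test.html": "secret report",
    "other.html": "public report",
}

# Assumed merge result: the dummy test.html replaces the original one.
merged = dict(original)
merged["test.html"] = "dummyword"

index = build_index(merged)

hits = search(index, "secret")                 # old words are gone
not_hits = search_not(index, merged, "secret")  # file itself is still listed
```

The old words no longer match, but because "test.html" is still a known file,
any "not" query that excludes a word it lacks will list it.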
I have not looked at the merge code in quite a while. But I suppose the
code could be changed so that you pass in a list of file names and have
swish-e write out a new index with those files skipped. merge.c is the
place to look.
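
For the "all entries from one domain" case, the list of names to skip could be
built outside swish-e first. A rough Python sketch of just that selection
step follows. It does not read or write a swish-e index, and in_domain and
split_entries are hypothetical helpers, as are the example URLs.

```python
from urllib.parse import urlsplit


def in_domain(url, domain):
    """True if the URL's host is the domain itself or one of its subdomains."""
    host = (urlsplit(url).hostname or "").lower()
    domain = domain.lower().lstrip(".")
    return host == domain or host.endswith("." + domain)


def split_entries(urls, domain):
    """Split URLs into (kept, skipped) for the domain being removed."""
    kept, skipped = [], []
    for url in urls:
        (skipped if in_domain(url, domain) else kept).append(url)
    return kept, skipped


# Hypothetical list of URLs that are currently in the index.
urls = [
    "http://www.example.com/a.html",
    "http://example.org/b.html",
    "http://notexample.com/c.html",
]

kept, skipped = split_entries(urls, "example.com")
```

Matching on the parsed host name rather than on a plain substring keeps
"notexample.com" from being dropped along with "example.com". The "skipped"
list is what a modified merge would need, and the "kept" list is what you
would reindex.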
--
Bill Moseley
moseley@hank.org
Received on Mon Jun 30 03:43:36 2003