At 05:55 PM 09/29/02 -0700, Trond Nilsen wrote:
>Am I right in assuming that when Swish-E performs a search on multiple
>indexes that when the results are merged, they are done so with case
>sensitivity?
Results are not merged when searching multiple indexes.
~/swish-e/src $ ./swish-e -i index.c -f 1 -v0
~/swish-e/src $ ./swish-e -i index.c -f 2 -v0
~/swish-e/src $ ./swish-e -w not dkdkd -f 1 2 -H0
1000 index.c "index.c" 81446
1000 index.c "index.c" 81446
Are you talking about the -M type of merge, where the indexes are merged
first and the combined single index is then searched?
>So, is there any way to get Swish to ignore case when merging? I'm having
>trouble spidering a large site over which I have no editorial control, where
>the writers have been lazy and specified pages with both cases. I can solve
>the problem with some post-processing, but I figured I'd check first :)
If you are talking about -M merge then check out:
http://swish-e.org/current/docs/SWISH-CONFIG.html#item_PropertyNamesCompareCase
I think you can set swishdocpath to be compared case-insensitively.
The other thing is to lowercase the URL when spidering by editing the
spider program (swishspider or spider.pl).
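
As a rough illustration of that normalization (not code from swishspider or
spider.pl, which are Perl), here is a minimal Python sketch. The function
names lowercase_url and dedupe_urls are hypothetical, and leaving the query
string and fragment untouched is an assumption on my part; adjust to taste.
The same logic could also be used for post-processing a list of URLs.

```python
from urllib.parse import urlsplit, urlunsplit


def lowercase_url(url: str) -> str:
    """Lowercase the scheme, host and path of a URL (hypothetical helper).

    Assumption: query string and fragment are kept as-is, since they
    often carry case-sensitive values.
    """
    parts = urlsplit(url)
    return urlunsplit((
        parts.scheme.lower(),
        parts.netloc.lower(),
        parts.path.lower(),
        parts.query,
        parts.fragment,
    ))


def dedupe_urls(urls):
    """Return normalized URLs with case-only duplicates dropped, order kept."""
    seen = set()
    unique = []
    for url in urls:
        key = lowercase_url(url)
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique
```

Note this only makes sense if the web server itself treats paths
case-insensitively; otherwise the lowercased URL may not exist.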
The right solution is to convince the site owner to fix their broken URLs.
All it would take is a short Perl script...
--
Bill Moseley
mailto:moseley@hank.org
Received on Mon Sep 30 01:28:56 2002