Kaplan, Andrew H. wrote on 06/17/2004 07:43 AM:
> Here is the text of the swish.conf file without spider.pl:
>
> IndexDir /www
> StoreDescription HTML* <body> 200000
> MetaNames swishdocpath swishtitle
> ReplaceRules replace "/www/" "http://192.168.1.156/"
>
>>The command syntax that is used here is
>>/usr/local/bin/swish-e -c swish.conf -v 3
>>
>>This approach does appear to index the pdf and doc files, but error messages
>>appear saying the program is substituting embedded null characters in the pdf
>>and doc files that I am indexing. I did a check of the discussion lists and
>>the issue has to do with the fact the files being indexed are binary. I tried
>>adding several lines to the swish.conf file including IndexOnly, IndexContents
>>and NoContents. That did not make a difference. Does anyone have suggestions
>>on where to go from here?
So: indexing with spider.pl does NOT work.
Indexing without spider.pl DOES work.
Test the index created without the spider. Does searching your PDFs work?
If you turn off the -v 3 option, you don't get warnings.
If search works, and you get no warnings, then you don't have a problem.
Right? :)
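
Something like this is what I mean by testing. It is only a rough sketch: I am assuming you run it from the directory that holds swish.conf, that the index ended up with the default name index.swish-e, and that "report" is a placeholder for a word you know appears in one of your PDFs.

```shell
# Assumed values, adjust for your setup.
INDEX=index.swish-e   # default index name (assumption: no IndexFile in swish.conf)
WORD=report           # placeholder search word

if command -v swish-e >/dev/null 2>&1; then
    # Re-index without -v 3, so swish-e falls back to its default verbosity.
    swish-e -c swish.conf
    # Search the index; hits on PDF paths mean the PDF text was indexed.
    swish-e -f "$INDEX" -w "$WORD"
else
    echo "swish-e not found in PATH"
fi
```

If the second command lists your PDF files among the hits, the index is usable.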
--
Peter Karman - Software Publications Programmer - Cray Inc
phone: 651-605-9009 - mailto:karman@cray.com
Received on Thu Jun 17 13:30:18 2004