On Tue, May 13, 2003 at 08:12:00AM -0700, Jody Cleveland wrote:
> Hello,
>
> Well, I've got someone who wants me to index:
> \\vision\www\keetra\wip\digitization\picbooks\current\pdfs\
>
> Which is on our test Windows 2000 server. I run swish-e on a Red Hat 8 server
> and spider that location. When I do that, I get this message:
>
> ./spider.pl: Reading parameters from
> '/var/www/cgi-bin/search/vision/spider/visionSpiderConfig.pl'
>
> -- Starting to spider:
> http://199.242.176.180/www/keetra/wip/digitization/picbooks/current/pdfs/ --
>
> Summary for:
> http://199.242.176.180/www/keetra/wip/digitization/picbooks/current/pdfs/
> Skipped: 1 (1.0/sec)
> Indexing Data Source: "External-Program"
> Indexing "stdin"
>
> Removing very common words...
> no words removed.
> Writing main index...
> err: No unique words indexed!
> .
>
> So, since that didn't work, I had her copy all her files to
> http://199.242.176.180/picbooks and that works fine. Is swish-e only happy
> with one subdirectory, or is there a configuration somewhere I need to
> change?
Sorry, I don't really follow your question.
If you want to know why something is not sent to swish-e by the spider, set
SPIDER_DEBUG=skipped in the environment when you run it:

    SPIDER_DEBUG=skipped swish-e -S prog ....

and the spider will tell you why each URL was skipped.
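
In case the prefix form is unfamiliar, here is a minimal sketch of what it
does: the assignment puts SPIDER_DEBUG into the environment of that one
command only. The "sh -c" child below is a hypothetical stand-in for the
real swish-e / spider.pl run, used only so the effect can be seen without
an index to build.

```shell
# Make sure nothing is inherited from the current shell.
unset SPIDER_DEBUG

# Prefix form: the variable is set only for this one command.
# 'sh -c ...' is a stand-in for the real swish-e / spider.pl run.
with_prefix=$(SPIDER_DEBUG=skipped sh -c 'echo "${SPIDER_DEBUG:-unset}"')

# Same command without the prefix: the child sees no SPIDER_DEBUG.
without_prefix=$(sh -c 'echo "${SPIDER_DEBUG:-unset}"')

echo "with prefix:    $with_prefix"
echo "without prefix: $without_prefix"
```

To keep the setting for several runs in a row, "export SPIDER_DEBUG=skipped"
makes it apply to every command started from that shell session.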
--
Bill Moseley
moseley@hank.org
Received on Tue May 13 16:56:24 2003