spider.pl

From: Z <techlistreader(at)not-real.yahoo.com>
Date: Wed Aug 16 2006 - 16:02:33 GMT
I am trying to index with spider.pl using a conf file that is rather basic:

##################################################

# Use spider.pl for indexing (location of spider.pl set at installation time)
IndexDir spider.pl

# Use spider.pl's default configuration and specify the URL to spider
SwishProgParameters default http://dev.site.com/index.html

# Allow extra searching by title, path
Metanames swishtitle swishdocpath

# Set StoreDescription for each parser
#  to display context with search results
StoreDescription TXT* 10000
StoreDescription HTML* <body> 10000

##################################################

It doesn't seem to get any results, and this is the error I get:

##################################################
 
E:\INETPUB\WWWROOT\SITE\WINDOWS>perl spider | swish-e.exe -S prog -c test.conf

Indexing Data Source: "External-Program"
Indexing "spider.pl"
External Program found: E:\INETPUB\WWWROOT\SITE\WINDOWS\lib\swish-e/spider.pl
spider.pl: Reading parameters from 'SwishSpiderConfig_test.pl'
Skipping Server Config: http://dev.site.com/index.hmtl
E:\INETPUB\WWWROOT\SITE\WINDOWS\lib\swish-e\spider.pl: Reading parameters from 'default'

Summary for: http://dev.site.com/
Connection: Close: 1  (0.2/sec)
      Unique URLs: 1  (0.2/sec)
Removing very common words...
no words removed.
Writing main index...
err: No unique words indexed!
.

##################################################
 
I have tried a complete reinstall and get the same results.
 
Z
 		


Received on Wed Aug 16 09:02:36 2006