Hi Everyone!
I have a problem getting HTTP method indexing to work, while the
FS method works great. The symptoms of the problem
look similar to an old case from the "swish-e archive":
http://sunsite.berkeley.edu/SWISH-E/archive/0901.html
When I run the following script:
---------
cd /home/web/search
/home/bin/swish-e \
-S http \
-i http://www.mysite.com/index.php3?date=2000/06/10 \
-f /home/web/day.swe \
-c /home/web/search/pp.cfg
---------
the following output is produced:
---------
Indexing Data Source: "HTTP-Crawler"
retrieving http://www.mysite.com/index.php3?date=2000/06/10 (0)...
Removing very common words... no words removed.
Writing main index... no unique words indexed.
Writing file index... no files indexed.
Running time: 21 seconds.
Indexing done!
---------
The file /home/web/2000/06/10/day.swe is created, but it contains
no keywords.
When I found the thread http://sunsite.berkeley.edu/SWISH-E/archive/0901.html
in the archives, I thought that Perl needed reconfiguring. I have to say that I
am not the owner of the server and cannot configure the server software. But
to my surprise, I found that when I run the helper script directly:
/home/web/search/swishspider.pl ./ss http://www.mysite.com/index.php3?date=2000/06/10
I get the files ss.response, ss.links and so on, with status code 200.
So the helper works, but I can't understand why I cannot index this page through
swish-e (-S http). Maybe my config file is not correct
(I've double- and triple-checked it, but who knows).
Here are the config options for the HTTP method that are turned on:
---------
# DIRECTIVES for HTTP METHOD ONLY
MaxDepth 2
Delay 20
TmpDir /home/tmp
SpiderDirectory /home/web/search
---------
The other parameters, IndexDir and IndexFile, are given on the command line.
TmpDir has permissions 777; for debugging I set the permissions of the
/home/web/search directory to 777 too.
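For anyone who wants to double-check the same thing, here is a minimal sketch of how the directory permissions could be re-checked from the shell. The function name check_dir is hypothetical and not part of swish-e. It only tests access for the user who runs it, which may not be the user swish-e runs as.

```shell
#!/bin/sh
# Hypothetical helper (not part of swish-e): report whether a path is
# an existing directory that the current user can write to.
check_dir() {
    if [ ! -d "$1" ]; then
        echo "missing: $1"
        return 1
    fi
    if [ ! -w "$1" ]; then
        echo "not writable: $1"
        return 1
    fi
    echo "ok: $1"
}

# Check every directory given on the command line.
for dir in "$@"; do
    check_dir "$dir"
done
```

Saved under an assumed name such as check_dirs.sh, it could be run as: sh check_dirs.sh /home/tmp /home/web/search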
Could this still be a Perl fault? The helper script works.
Uh, long posting, but I hope someone who has more experience than
me can help.
Desperately waiting for hints,
Angel Parn
angel@mv.parnu.ee
Received on Mon Jun 12 06:07:41 2000