
# Re: AW: AW: Error on win2k when spidering

From: Bill Moseley <moseley(at)not-real.hank.org>
Date: Fri Jan 18 2002 - 19:24:33 GMT
[moving back to the list]

At 05:04 PM 01/18/02 +0100, andreas.spielvogel@conti.de wrote:
>I am still trying to run the http method. I thought that perhaps SWISH-E does
>not find spider.pl (or swishspider.pl; I don't actually know which is meant.
>And I found a config file which is even more confusing.). So I added
>	SpiderDirectory C:\\Programme\\SWISH-E

When confused, the best solution is to remove config options, not add them ;)

>	err: SpiderDirectory. C:\Programme\SWISH-E/ is not a directory

Yes, another reason not to like the -S http method.

Over the last few days I've posted a number of examples of spidering on
Windows using both -S http and -S prog.

Please review the archives for specific examples.

As for this SpiderDirectory problem: first, it assumes that the path
separator is the forward slash.  Swish deals with that better in other
places, but since I for one never use -S http (or Windows), I've never
seen this before.  Regardless, I probably would not use SpiderDirectory
anyway, since I'd just put swishspider.pl in my current directory.

Not exactly user friendly, is it?

Anyway, it's a bit odd that it doesn't work on Windows.  At least the check
to see if SpiderDirectory is a directory should work (I thought) with
either forward or backward slashes.

SpiderDirectory e:/     << works on my machine

SpiderDirectory e:/tmp  << doesn't work on my machine.
err: SpiderDirectory. e:/tmp/ is not a directory

although it really is a directory.  I don't have a way to debug that on
Windows.
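
One untested guess, not confirmed anywhere in this thread: the error
messages show that swish appends a trailing slash before the check
(e:/tmp becomes e:/tmp/), and the Windows C runtime's stat() is known to
reject a trailing separator on anything other than a drive root.  That
would match e:/ passing and e:/tmp failing.  A minimal Python sketch of
the kind of normalization that would sidestep it; strip_trailing_sep is a
hypothetical helper for illustration, not something in swish:

```python
def strip_trailing_sep(path: str) -> str:
    """Drop trailing / or \\ unless the path is a root such as 'e:/' or '/'."""
    if not path:
        return path
    stripped = path.rstrip("/\\")
    # Nothing left, or only a drive letter plus colon: the input was a root,
    # so keep exactly one separator.
    if stripped == "" or (len(stripped) == 2 and stripped[1] == ":"):
        return stripped + "/"
    return stripped


if __name__ == "__main__":
    for candidate in ("e:/", "e:/tmp/", "C:\\Programme\\SWISH-E/"):
        print(candidate, "->", strip_trailing_sep(candidate))
```

This is only an assumption about the cause; the workaround of leaving
SpiderDirectory out does not depend on it.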

>	SpiderDirectory C:\\Programme\\SWISH-E
>	SpiderDirectory C:\Programme\SWISH-E
>	SpiderDirectory C:\\Programme\\SWISH-E\\
>	SpiderDirectory C:\Programme\SWISH-E\
>	SpiderDirectory 'C:\\Programme\\SWISH-E'
>	....
>Nothing helps. The messages concerning the path vary, but the error stays
>the same.

Like they say, if it hurts, quit doing it.  So, don't use SpiderDirectory
and just put swishspider.pl in the current directory.

>Concerning the question of where to mention that perl is needed: I think it
>should be mentioned as a reference in all places where the http method or a
>perl script is used internally. And there should be one paragraph (like I
>found in TOMCAT) describing how perl has to be installed and in which
>configurations (OS and perl tools) it was tested (successfully and
>unsuccessfully).

And what's that paragraph?

Well, this will be a bit abstract, but -S http and -S prog work differently.

-S http expects to run perl and expects it to be in the path.  So it
doesn't matter where it's installed.
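
A quick way to confirm that precondition before blaming swish is to ask
whether "perl" resolves through PATH at all.  A small sketch in Python
(perl_on_path is just an illustrative name, and this checks the PATH of
whatever shell you run it from, which is assumed to be the same one you
start swish from):

```python
import shutil


def perl_on_path() -> bool:
    # shutil.which searches PATH (and PATHEXT on Windows) for the executable.
    return shutil.which("perl") is not None


if __name__ == "__main__":
    print("perl found" if perl_on_path() else "perl is not in PATH")
```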

-S prog, on the other hand, since it can run any program, will actually
make sure that what is set with -i or IndexDir is really a program (exists
and is executable).  Therefore, with -S prog you have to specify the full
path to the program.  BUT, that path is also used to run the program via
the shell (command.com) in Windows so you have to be careful how you
specify the program.

In other words, you could say (in Windows with -S prog)

IndexDir c:/path/to/program.exe

and swish would find that program and say it's ok to use, but then when
swish goes to run the program the command is passed through the shell and
command.com chokes on the forward slashes.

So, to get around that Windows issue, you have to use backslashes:

IndexDir c:\\path\\to\\program.exe
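
If you keep program paths with forward slashes elsewhere, a small script
can produce the backslash form for you.  A sketch in Python, assuming the
config file wants each backslash doubled as in the line above;
index_dir_line is a made-up helper name:

```python
from pathlib import PureWindowsPath


def index_dir_line(program: str) -> str:
    # PureWindowsPath accepts either separator and renders with backslashes,
    # on any operating system.
    native = str(PureWindowsPath(program))
    # Double each backslash for the config file.
    return "IndexDir " + native.replace("\\", "\\\\")


if __name__ == "__main__":
    print(index_dir_line("c:/path/to/program.exe"))
```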

But of course, I just went over this a few days ago.

Now, what was the question?

>A final question: how do I bring our communication to the list? Does it mean I
>don't respond to you but forward my response to the list? Or do I put the
>list in cc? Or do I have to send a special code? I think I didn't completely
>understand the procedure from the help file. And as I'm under very high
>time pressure I didn't have the chance to spend more time on it.

Just send messages to the list.

Or is the problem that you send messages to the list and they do not get posted?

--
Bill Moseley
mailto:moseley@hank.org

Received on Fri Jan 18 19:25:37 2002