Re: white space in directory names and filenames

From: <moseley(at)not-real.hank.org>
Date: Wed Oct 01 2003 - 06:31:54 GMT
On Tue, Sep 30, 2003 at 10:47:39PM -0700, jchen@hdc.org.nz wrote:
> I tried to install swish-e on win2k server, there is no problem indexing 
> html file, but I found swish-e can not indexing word document which has 
> white space in the file name, like "Wordfile.doc" & "copy of Wordfile.doc"

That's probably due to how the filter is written.

> FileFilter .pdf c:/wwwroot/cgi-bin/xpdf/pdftotext.exe '"%p" -'
> FileFilter .doc  C:/wwwroot/cgi-bin/catdoc/catdoc.exe "%p"


Try:

FileFilter .doc  C:/wwwroot/cgi-bin/catdoc/catdoc.exe '"%p"'

That will then include the double quotes as part of the command passed to your 
Windows shell.  The way you had it, those double quotes were only seen by 
swish-e and were stripped before the command reached the shell.
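
To make the two layers of quoting concrete, here is a rough sketch in Python.
It is only an illustration: shlex stands in for both the swish-e config
parser and the shell, which is an assumption (the real swish-e parser and the
Windows shell have their own rules), and the helper names config_args and
shell_argv are made up for this example.

```python
import shlex

def config_args(line):
    """Stand-in for a config parser that strips one layer of quotes.
    (Assumption: shlex is used here only to imitate that behaviour.)"""
    return shlex.split(line)

def shell_argv(template, path):
    """Substitute %p, then split the result the way a POSIX-style shell would."""
    return shlex.split(template.replace("%p", path))

path = "copy of Wordfile.doc"

# Option as originally written: the double quotes are eaten by the config layer.
unquoted = config_args('catdoc.exe "%p"')[1]

# Option wrapped in single quotes: the double quotes survive the config layer.
quoted = config_args("""catdoc.exe '"%p"'""")[1]

# The first form splits the file name into several arguments,
# the second keeps it as a single argument.
print(shell_argv(unquoted, path))
print(shell_argv(quoted, path))
```

With the first form the filter program would be handed three separate words
instead of one file name, which would explain why only files with spaces fail.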




I can't test on Windows right now, but I can try 2.4.0 on my laptop.
That is a different version of swish-e and a different OS, so it doesn't 
really apply... ;)

moseley@laptop:~$ cat c
SwishProgParameters default http://localhost/apache/index.html
IndexDir spider.pl

moseley@laptop:~$ swish-e -c c -S prog -v0         
/usr/local/lib/swish-e/spider.pl: Reading parameters from 'default'

Summary for: http://localhost/apache/index.html
Connection: Keep-Alive:      4  (4.0/sec)
           Total Bytes: 17,156  (17156.0/sec)
            Total Docs:      5  (5.0/sec)
           Unique URLs:      5  (5.0/sec)


moseley@laptop:~$ swish-e -w not dkdkd -H0
1000 http://localhost/apache/test.txt "test.txt" 12
1000 http://localhost/apache/doc with spaces.doc "doc with spaces.doc" 2172
1000 http://localhost/apache/test.doc "test.doc" 2172
1000 http://localhost/apache/test.pdf "test.pdf" 12593
1000 http://localhost/apache/index.html "title" 207

-- 
Bill Moseley
moseley@hank.org
Received on Wed Oct 1 06:31:56 2003