Nick scribbled on 5/6/05 3:49 PM:
> swish-e -c /etc/swish.conf -S prog -i DirTree.pl
> I tried that but I got this:
>
> Indexing Data Source: "External-Program"
> Indexing "DirTree.pl"
> External Program found: /usr/lib/swish-e/DirTree.pl
> Must supply at least one directory
> Usage:
> DirTree.pl [options] directory <directory...> | swish-e -S prog -i stdin
>
> Options:
> -verbose Display processing info
> -debug Enable debugging (including SWISH::Filter debugging)
> -man Display documentation
> -path Display location lib path set at installation
> -no_skip Process documents even if filtering fails
> -symlinks Follow symbolic links. Default is to NOT follow
> symlinks
>
> Removing very common words...
> no words removed.
> Writing main index...
> err: No unique words indexed!
DirTree.pl is complaining because it was never given a directory to walk.
Try adding this line to your existing config:
SwishProgParameters /home/shared
and comment out this line:
# IndexDir "/home/shared"
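With -S prog, swish-e runs the program named by -i and hands it whatever is in
SwishProgParameters as its arguments. As a rough sketch, and assuming
/home/shared is the only tree you want indexed and the rest of your config
stays as it is, the changed part of /etc/swish.conf would look something like:

    # The directory no longer goes in IndexDir.
    # IndexDir "/home/shared"

    # Passed to the external program (here DirTree.pl) as its arguments.
    SwishProgParameters /home/shared

Going by the usage message you pasted, piping the script's output in yourself
should get you the same result (untested here; the script path is taken from
your output):

    /usr/lib/swish-e/DirTree.pl /home/shared | swish-e -c /etc/swish.conf -S prog -i stdin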
> Is there any reason to use SWISH::Filter for performance, or is it just
> supposed to be easier? To me doing something like this in the config file
> makes more sense, as I understand what it is doing when I tell it about
> each type of file:
>
I think you're right, in principle. You must be a sysadmin-type: we tend not to
like the black box approach. ;)
SWISH::Filter lets you drop in new filters and, in theory, not change your
config. But doing it longhand like you have it should work too. Unless it doesn't...
> IndexContents TXT* .txt
> IndexContents HTML* .htm
> IndexContents HTML* .html
>
> FileFilter .pdf pdftotext "'%p' -"
> IndexContents TXT* .pdf
>
> FileFilter .doc catdoc
> IndexContents TXT* .doc
>
> FileFilter .ppt ppthtml
> IndexContents TXT* .ppt
>
>
> But of course I have something wrong in there since I am getting lots of
> errors from catdoc, and also I don't know how to put the excel one in
> there since I think it is a perl script.
>
--
Peter Karman . http://peknet.com/ . peter(at)not-real.peknet.com
Received on Fri May 6 14:05:59 2005