[swish-e] Index Doc, Excel, PDF Titles Only

From: <rmspamfilter(at)>
Date: Wed Sep 05 2007 - 19:06:43 GMT
I am trying to index Microsoft Word documents, Excel spreadsheets, and PDFs.
I do not want to index the content, just the titles.
I have the following config:

    # Example Swish-e Configuration file

    # Convert Word and PDF files to text before indexing
    FileFilter .doc       /usr/local/bin/catdoc "-s8859-1 -d8859-1 %p"
    FileFilter .pdf       pdftotext   "%p -"

    # Define *what* to index
    # IndexDir can point to directories and/or files
    # Here it's pointing to /opt/samba/CNR
    # Swish-e will also recurse into sub-directories.
    IndexDir /opt/samba/CNR

    # But only index the .doc and .pdf files
    IndexOnly .doc .pdf

    # Show basic info while indexing
    IndexReport 1

I know this indexes the content inside the files, which is what I do not
want. Right now it sort of works, but it is indexing the content. Also, I
have to use ReplaceRules replace to change the URL to the \\server\path
where the file is stored.
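
For reference, the rewrite I mean is roughly the following. This is only a
sketch: the target share is a placeholder, and the doubled backslashes assume
that a backslash has to be escaped inside a quoted string, which should be
checked against the Swish-e documentation.

    # Map the indexed filesystem path to the share that users open.
    # "\\\\server\\path" is a placeholder for the real \\server\path share.
    ReplaceRules replace "/opt/samba/CNR" "\\\\server\\path"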

Can somebody point me in the right direction for the configuration file?

Received on Wed Sep 5 15:06:43 2007