Hey guys,
I'm not sure how to display .doc metadata from the swish-e index. For
example, if I want to search for a Word document with the author Joe Doe,
this should in turn search the metaname author for "Joe Doe". I can get
this working with OpenOffice files because I can read their "meta.xml"
file, but I cannot do the same with Word documents (filtered with catdoc).
I am currently using the config provided below, with swish.cgi providing
the search interface.
P.S. Can anyone point me to some tutorials on configuring templates
(Template Toolkit) for the swish.cgi script? I am having trouble
configuring the output from the template.
Thank you,
Jonathan Tan
=============
IndexDir /var/www/test
IndexFile /var/www/test/index.swish-e
IndexName Documents
IndexOnly .xml .htm .html .txt .doc .rtf .sxw .sxc .sxi .odt
DefaultContents TXT
SwishProgParameters -S fs
ReplaceRules replace /var/www/test /test
ExtractPath subject regex !^/test/([^/]+)/.*$!$1!
# Allow extra searching by title, path
MetaNames swishtitle swishdocpath
UndefinedMetaTags auto
PropertyNames dc:creator dc:date
IndexContents TXT* .pdf
FileFilter .pdf "/usr/bin/pdftotext" "'%p' -"
IndexContents TXT* .doc
FileFilter .doc "/usr/bin/catdoc" "-s8859-1 -d8859-1 '%p'"
IndexContents TXT* .rtf
FileFilter .rtf "/usr/bin/catdoc" "'%p'"
#IndexContents TXT* .xls
#FileFilter .xls "/usr/bin/xls2csv" "'%p'"
FileFilterMatch "/usr/bin/unzip" "-p \"%p\" meta.xml" /\.(sxw|sxc|sxi|odt)$/i
IndexContents XML* .sxw .sxc .sxi .odt
StoreDescription XML* <text:p>
FileFilterMatch "/usr/bin/unzip" "-p \"%p\" content.xml" /\.(sxw|sxc|sxi|odt)$/i
IndexContents XML* .sxw .sxc .sxi .odt
StoreDescription XML* <text:p>
=========================
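
To make the question more concrete, here is a rough sketch of the kind of
wrapper filter that might get an author into the index for .doc files. This
is only an illustration, not a tested setup. It assumes that catdoc emits
body text only, and that swish-e's HTML parser picks up <meta> tags as
metanames (which "UndefinedMetaTags auto" in the config above should allow).
The function read_doc_metadata() is a hypothetical placeholder: the step of
actually reading the Word summary info (author, title) is exactly the part
I do not know how to do, so it returns nothing here. The script name and the
idea of pointing "FileFilter .doc" at it with "IndexContents HTML* .doc" are
also assumptions.

```python
import html
import subprocess
import sys


def wrap_as_html(body_text, meta):
    """Build a minimal HTML page whose <meta> tags carry the metadata."""
    meta_tags = "\n".join(
        '<meta name="{}" content="{}">'.format(
            html.escape(name, quote=True), html.escape(value, quote=True)
        )
        for name, value in sorted(meta.items())
    )
    title = html.escape(meta.get("title", ""))
    return (
        "<html>\n<head>\n<title>{}</title>\n{}\n</head>\n"
        "<body>\n<pre>{}</pre>\n</body>\n</html>\n"
    ).format(title, meta_tags, html.escape(body_text))


def read_doc_metadata(path):
    """Hypothetical placeholder: should return e.g. {"author": "Joe Doe"}.

    Reading the Word summary information needs some other tool or library;
    that part is not solved here, so no metadata is returned.
    """
    return {}


def main(path):
    # Same catdoc options as the FileFilter line in the config above.
    result = subprocess.run(
        ["catdoc", "-s8859-1", "-d8859-1", path],
        capture_output=True,
        check=True,
    )
    text = result.stdout.decode("latin-1")
    sys.stdout.write(wrap_as_html(text, read_doc_metadata(path)))


if __name__ == "__main__" and len(sys.argv) == 2:
    main(sys.argv[1])
```

If something like this worked, a search such as author=(Joe Doe) would
presumably hit the .doc files the same way it does for the OpenOffice ones.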
Received on Sun May 22 21:35:01 2005